linux-block.git
11 months ago bcachefs: fix error checking in bch2_fs_alloc()
Dan Carpenter [Thu, 14 Sep 2023 09:47:44 +0000 (12:47 +0300)]
bcachefs: fix error checking in bch2_fs_alloc()

There is a typo here where it uses ";" instead of "?:".  The result is
that bch2_fs_fs_io_direct_init() is called unconditionally and the errors
from it are not checked.
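
As a rough illustration of the pattern described above, the sketch below chains three init steps with the GNU C "?:" operator and contrasts it with the stray-";" shape. The init_a/init_b/init_c functions, the calls counter and the fail_* knobs are hypothetical stand-ins, not the bcachefs functions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical init steps: each returns 0 on success or a negative errno. */
int calls;           /* how many init steps actually ran */
int fail_b, fail_c;  /* test knobs that make a step fail */

int init_a(void) { calls++; return 0; }
int init_b(void) { calls++; return fail_b ? -ENOMEM : 0; }
int init_c(void) { calls++; return fail_c ? -EIO : 0; }

/*
 * Buggy shape: the stray ';' ends the chain early, so init_c() runs
 * unconditionally and its return value is thrown away.
 */
int init_all_buggy(void)
{
	int ret = init_a() ?: init_b();

	init_c();
	return ret;
}

/*
 * Fixed shape: "x ?: y" (a GNU C extension) yields x when x is non-zero
 * and only evaluates y otherwise, so the chain stops at the first error.
 */
int init_all_fixed(void)
{
	return init_a() ?: init_b() ?: init_c();
}
```

With init_b() failing, the fixed chain never reaches init_c(); with only init_c() failing, the buggy shape reports success even though a step failed.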

Fixes: 0060c68159fc ("bcachefs: Split up fs-io.[ch]")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Brian Foster <bfoster@redhat.com>
11 months ago bcachefs: chardev: fix an integer overflow (32 bit only)
Dan Carpenter [Thu, 14 Sep 2023 14:59:10 +0000 (17:59 +0300)]
bcachefs: chardev: fix an integer overflow (32 bit only)

On 32 bit systems, "sizeof(*arg) + replica_entries_bytes" can have an
integer overflow leading to memory corruption.  Use size_add() to
prevent this.
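
A minimal userspace sketch of the idea: size_add_sketch() is a stand-in that saturates to SIZE_MAX on overflow, which is how the kernel's size_add() is documented to behave. alloc_with_payload() is a hypothetical caller, not the ioctl code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for size_add(): return a + b, or SIZE_MAX if the sum would wrap. */
size_t size_add_sketch(size_t a, size_t b)
{
	size_t sum;

	if (__builtin_add_overflow(a, b, &sum))
		return SIZE_MAX;
	return sum;
}

/*
 * Hypothetical header-plus-payload allocation. A wrapped sum would ask for
 * a tiny buffer and later writes would run past it; a saturated sum is a
 * request the allocator is expected to refuse instead.
 */
void *alloc_with_payload(size_t header_bytes, size_t payload_bytes)
{
	return malloc(size_add_sketch(header_bytes, payload_bytes));
}
```

The point of saturating rather than wrapping is that the failure shows up as a failed allocation instead of an undersized buffer.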

Fixes: b44dd3797034 ("bcachefs: Redo filesystem usage ioctls")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: chardev: return -EFAULT if copy_to_user() fails
Dan Carpenter [Thu, 14 Sep 2023 14:58:07 +0000 (17:58 +0300)]
bcachefs: chardev: return -EFAULT if copy_to_user() fails

The copy_to_user() function returns the number of bytes remaining but
we want to return -EFAULT to the user.
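
The return convention can be sketched in userspace as follows. mock_copy_to_user() only imitates the "bytes not copied" return value, and its writable parameter is a test knob that does not exist in the real API; report_usage() is a hypothetical caller.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Mock with copy_to_user()'s return convention: the number of bytes that
 * could NOT be copied, 0 on full success. 'writable' limits how much of
 * the destination is usable (test knob only).
 */
unsigned long mock_copy_to_user(void *dst, const void *src,
				unsigned long n, unsigned long writable)
{
	unsigned long done = n < writable ? n : writable;

	memcpy(dst, src, done);
	return n - done;
}

/* Translate "some bytes left over" into -EFAULT instead of returning the count. */
long report_usage(void *user_buf, unsigned long writable)
{
	const char payload[8] = "bcachef";	/* 7 characters plus the NUL */

	if (mock_copy_to_user(user_buf, payload, sizeof(payload), writable))
		return -EFAULT;
	return 0;
}
```

Returning the raw result would hand a positive byte count back to the caller, which looks like success to code that only checks for negative values.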

Fixes: e0750d947352 ("bcachefs: Initial commit")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Change bucket_lock() to use bit_spin_lock()
Kent Overstreet [Thu, 14 Sep 2023 00:33:06 +0000 (20:33 -0400)]
bcachefs: Change bucket_lock() to use bit_spin_lock()

bucket_lock() previously open coded a spinlock, because we need to cram
a spinlock into a single byte.

But it turns out not all archs support xchg() on a single byte; since we
need struct bucket to be small, this means we have to play fun games
with casts and ifdefs for endianness.
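
For readers unfamiliar with bit locks, here is a userspace sketch using C11 atomics: one bit of a word serves as the lock while the other bits keep their data. This is only an illustration of the general technique, with hypothetical names. It is not the kernel's bit_spin_lock() and does not attempt the byte-within-word casts or endianness handling mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>

/* Spin until we are the one who flips 'bit' from 0 to 1. */
void bit_lock_sketch(atomic_ulong *word, unsigned bit)
{
	unsigned long mask = 1UL << bit;

	/* fetch_or returns the old value: if the bit was already set,
	 * another holder owns the lock and we retry. */
	while (atomic_fetch_or_explicit(word, mask, memory_order_acquire) & mask)
		;
}

/* Clear the lock bit, leaving the remaining bits untouched. */
void bit_unlock_sketch(atomic_ulong *word, unsigned bit)
{
	atomic_fetch_and_explicit(word, ~(1UL << bit), memory_order_release);
}

int bit_is_locked_sketch(atomic_ulong *word, unsigned bit)
{
	return (int)((atomic_load(word) >> bit) & 1);
}
```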

This fixes building on 32 bit arm, and likely other architectures.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: linux-bcachefs@vger.kernel.org
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Kill other unreachable() uses
Kent Overstreet [Thu, 14 Sep 2023 00:39:31 +0000 (20:39 -0400)]
bcachefs: Kill other unreachable() uses

Per previous commit, bare unreachable() considered harmful, convert to
BUG()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Remove undefined behavior in bch2_dev_buckets_reserved()
Josh Poimboeuf [Wed, 13 Sep 2023 21:08:29 +0000 (23:08 +0200)]
bcachefs: Remove undefined behavior in bch2_dev_buckets_reserved()

In general it's a good idea to avoid using bare unreachable() because it
introduces undefined behavior in compiled code.  In this case it even
confuses GCC into emitting an empty unused
bch2_dev_buckets_reserved.part.0() function.

Use BUG() instead, which is nice and defined.  While in theory it should
never trigger, if something were to go awry and the BCH_WATERMARK_NR
case were to actually hit, the failure mode is much more robust.
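
A small userspace sketch of the same idea: the "impossible" case ends in a loud, defined failure rather than undefined behavior. BUG_SKETCH() is an abort()-based stand-in for BUG(), and the enum and the reserve fractions are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical watermark levels; WM_NR is a count, never a valid input. */
enum watermark_sketch { WM_LOW, WM_NORMAL, WM_HIGH, WM_NR };

/* Stand-in for BUG(): report where we are and stop with defined behavior. */
#define BUG_SKETCH() do {						\
	fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__);		\
	abort();							\
} while (0)

unsigned reserved_for(enum watermark_sketch w, unsigned total)
{
	switch (w) {
	case WM_LOW:
		return total / 2;
	case WM_NORMAL:
		return total / 4;
	case WM_HIGH:
		return total / 8;
	case WM_NR:
		break;
	}
	/* Reached only for WM_NR or a corrupted value: fail predictably. */
	BUG_SKETCH();
}
```

With a bare unreachable hint in that position, the compiler may assume the path never executes and let control fall into whatever code follows, which matches the objtool warnings quoted below.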

Fixes the following warnings:

  vmlinux.o: warning: objtool: bch2_bucket_alloc_trans() falls through to next function bch2_reset_alloc_cursors()
  vmlinux.o: warning: objtool: bch2_dev_buckets_reserved.part.0() is missing an ELF size annotation

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Remove a redundant and harmless bch2_free_super() call
Christophe JAILLET [Wed, 13 Sep 2023 16:44:09 +0000 (18:44 +0200)]
bcachefs: Remove a redundant and harmless bch2_free_super() call

Remove a redundant call to bch2_free_super().

This is harmless because bch2_free_super() has a memset() at its end. So
a second call would only lead to kfree(NULL).

Remove the redundant call and only rely on the error handling path.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix use-after-free in bch2_dev_add()
Christophe JAILLET [Wed, 13 Sep 2023 16:44:08 +0000 (18:44 +0200)]
bcachefs: Fix use-after-free in bch2_dev_add()

If __bch2_dev_attach_bdev() fails, bch2_dev_free() is called twice.
Once here and another time in the error handling path.

This leads to several use-after-free.

Remove the redundant call and only rely on the error handling path.
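
The "only rely on the error handling path" shape can be sketched like this. All names (dev_sketch, dev_free_sketch, attach_sketch, dev_add_sketch) are hypothetical, and the frees counter exists only so a double release would be visible in a test.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct dev_sketch { int *buckets; };

int frees;	/* number of times a device was released */

void dev_free_sketch(struct dev_sketch *d)
{
	if (!d)
		return;
	free(d->buckets);
	free(d);
	frees++;
}

/* Stand-in for the attach step; fails on request. */
int attach_sketch(struct dev_sketch *d, int should_fail)
{
	(void)d;
	return should_fail ? -EIO : 0;
}

int dev_add_sketch(int attach_fails, struct dev_sketch **out)
{
	struct dev_sketch *d = calloc(1, sizeof(*d));
	int ret;

	if (!d)
		return -ENOMEM;

	d->buckets = calloc(16, sizeof(*d->buckets));
	if (!d->buckets) {
		ret = -ENOMEM;
		goto err;
	}

	ret = attach_sketch(d, attach_fails);
	if (ret)
		goto err;	/* no free here: the err label owns cleanup */

	*out = d;
	return 0;
err:
	dev_free_sketch(d);
	return ret;
}
```

Freeing at the failure site and again at the label is the double-release shape the commit removes.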

Fixes: 6a44735653d4 ("bcachefs: Improved superblock-related error messages")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: add module description to fix modpost warning
Brian Foster [Wed, 13 Sep 2023 14:14:30 +0000 (10:14 -0400)]
bcachefs: add module description to fix modpost warning

modpost produces the following warning:

WARNING: modpost: missing MODULE_DESCRIPTION() in fs/bcachefs/bcachefs.o

Add a module description for bcachefs.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Heap allocate btree_trans
Kent Overstreet [Tue, 12 Sep 2023 21:16:02 +0000 (17:16 -0400)]
bcachefs: Heap allocate btree_trans

We're using more stack than we'd like in a number of functions, and
btree_trans is the biggest object that we stack allocate.

But we have to do a heap allocation to initialize it anyway, so
there's no real downside to heap allocating the entire thing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix W=12 build errors
Kent Overstreet [Tue, 12 Sep 2023 22:41:22 +0000 (18:41 -0400)]
bcachefs: Fix W=12 build errors

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Remove unneeded semicolon
Yang Li [Wed, 13 Sep 2023 00:57:56 +0000 (08:57 +0800)]
bcachefs: Remove unneeded semicolon

./fs/bcachefs/btree_gc.c:1249:2-3: Unneeded semicolon
./fs/bcachefs/btree_gc.c:1521:2-3: Unneeded semicolon
./fs/bcachefs/btree_gc.c:1575:2-3: Unneeded semicolon
./fs/bcachefs/counters.c:46:2-3: Unneeded semicolon

Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Add a missing prefetch include
Kent Overstreet [Tue, 12 Sep 2023 22:41:09 +0000 (18:41 -0400)]
bcachefs: Add a missing prefetch include

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix -Wcompare-distinct-pointer-types in bch2_copygc_get_buckets()
Nathan Chancellor [Tue, 12 Sep 2023 19:15:44 +0000 (12:15 -0700)]
bcachefs: Fix -Wcompare-distinct-pointer-types in bch2_copygc_get_buckets()

When building bcachefs for 32-bit ARM, there is a warning when using
max() to compare an expression involving 'size_t' with an 'unsigned
long' literal:

  fs/bcachefs/movinggc.c:159:21: error: comparison of distinct pointer types ('typeof (16UL) *' (aka 'unsigned long *') and 'typeof (buckets_in_flight->nr / 4) *' (aka 'unsigned int *')) [-Werror,-Wcompare-distinct-pointer-types]
    159 |         size_t nr_to_get = max(16UL, buckets_in_flight->nr / 4);
        |                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  include/linux/minmax.h:76:19: note: expanded from macro 'max'
     76 | #define max(x, y)       __careful_cmp(x, y, >)
        |                         ^~~~~~~~~~~~~~~~~~~~~~
  include/linux/minmax.h:38:24: note: expanded from macro '__careful_cmp'
     38 |         __builtin_choose_expr(__safe_cmp(x, y), \
        |                               ^~~~~~~~~~~~~~~~
  include/linux/minmax.h:28:4: note: expanded from macro '__safe_cmp'
     28 |                 (__typecheck(x, y) && __no_side_effects(x, y))
        |                  ^~~~~~~~~~~~~~~~~
  include/linux/minmax.h:22:28: note: expanded from macro '__typecheck'
     22 |         (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
        |                    ~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~
  1 error generated.

On 64-bit architectures, size_t is 'unsigned long', so there is no
warning when comparing these two expressions. Use max_t(size_t, ...) for
this situation, eliminating the warning.
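
To make the fix concrete, here is a simplified userspace stand-in for a max_t()-style macro: both operands are cast to one named type before comparing, so it no longer matters whether size_t is 'unsigned int' or 'unsigned long'. max_t_sketch() and nr_to_get_sketch() are illustrative names, and unlike a production macro this one evaluates its arguments more than once.

```c
#include <assert.h>
#include <stddef.h>

/* Cast both sides to 'type', then compare. Arguments are evaluated twice. */
#define max_t_sketch(type, x, y) \
	((type)(x) > (type)(y) ? (type)(x) : (type)(y))

/* Mirrors the shape of the expression in the warning: at least 16, else nr / 4. */
size_t nr_to_get_sketch(size_t nr_in_flight)
{
	return max_t_sketch(size_t, 16UL, nr_in_flight / 4);
}
```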

Fixes: dd49018737d4 ("bcachefs: Rhashtable based buckets_in_flight for copygc")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix -Wcompare-distinct-pointer-types in do_encrypt()
Nathan Chancellor [Tue, 12 Sep 2023 19:15:43 +0000 (12:15 -0700)]
bcachefs: Fix -Wcompare-distinct-pointer-types in do_encrypt()

When building bcachefs for 32-bit ARM, there is a warning when using
min() to compare a variable of type 'size_t' with an expression of type
'unsigned long':

  fs/bcachefs/checksum.c:142:22: error: comparison of distinct pointer types ('typeof (len) *' (aka 'unsigned int *') and 'typeof (((1UL) << 12) - offset) *' (aka 'unsigned long *')) [-Werror,-Wcompare-distinct-pointer-types]
    142 |                         unsigned pg_len = min(len, PAGE_SIZE - offset);
        |                                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
  include/linux/minmax.h:69:19: note: expanded from macro 'min'
     69 | #define min(x, y)       __careful_cmp(x, y, <)
        |                         ^~~~~~~~~~~~~~~~~~~~~~
  include/linux/minmax.h:38:24: note: expanded from macro '__careful_cmp'
     38 |         __builtin_choose_expr(__safe_cmp(x, y), \
        |                               ^~~~~~~~~~~~~~~~
  include/linux/minmax.h:28:4: note: expanded from macro '__safe_cmp'
     28 |                 (__typecheck(x, y) && __no_side_effects(x, y))
        |                  ^~~~~~~~~~~~~~~~~
  include/linux/minmax.h:22:28: note: expanded from macro '__typecheck'
     22 |         (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
        |                    ~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~
  1 error generated.

On 64-bit architectures, size_t is 'unsigned long', so there is no
warning when comparing these two expressions. Use min_t(size_t, ...) for
this situation, eliminating the warning.

Fixes: 1fb50457684f ("bcachefs: Fix memory corruption in encryption path")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix -Wincompatible-function-pointer-types-strict from key_invalid callbacks
Nathan Chancellor [Tue, 12 Sep 2023 19:15:42 +0000 (12:15 -0700)]
bcachefs: Fix -Wincompatible-function-pointer-types-strict from key_invalid callbacks

When building bcachefs with -Wincompatible-function-pointer-types-strict,
a clang warning designed to catch issues with mismatched function
pointer types, which will be fatal at runtime due to kernel Control Flow
Integrity (kCFI), there are several instances along the lines of:

  fs/bcachefs/bkey_methods.c:118:2: error: incompatible function pointer types initializing 'int (*)(const struct bch_fs *, struct bkey_s_c, enum bkey_invalid_flags, struct printbuf *)' with an expression of type 'int (const struct bch_fs *, struct bkey_s_c, unsigned int, struct printbuf *)' [-Werror,-Wincompatible-function-pointer-types-strict]
    118 |         BCH_BKEY_TYPES()
        |         ^~~~~~~~~~~~~~~~
  fs/bcachefs/bcachefs_format.h:342:2: note: expanded from macro 'BCH_BKEY_TYPES'
    342 |         x(deleted,              0)                      \
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
  fs/bcachefs/bkey_methods.c:117:41: note: expanded from macro 'x'
    117 | #define x(name, nr) [KEY_TYPE_##name]   = bch2_bkey_ops_##name,
        |                                           ^~~~~~~~~~~~~~~~~~~~
  <scratch space>:206:1: note: expanded from here
    206 | bch2_bkey_ops_deleted
        | ^~~~~~~~~~~~~~~~~~~~~
  fs/bcachefs/bkey_methods.c:34:17: note: expanded from macro 'bch2_bkey_ops_deleted'
     34 |         .key_invalid = deleted_key_invalid,             \
        |                        ^~~~~~~~~~~~~~~~~~~

The flags parameter should be of type 'enum bkey_invalid_flags', not
'unsigned int'. Adjust the type everywhere so that there is no more
warning.
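
A reduced sketch of the requirement: every callback stored in the ops table has exactly the prototype of the function pointer type, including the enum-typed flags parameter. The enum, the table and both callbacks are hypothetical simplifications of the real key_invalid methods.

```c
#include <assert.h>

enum invalid_flags_sketch {
	INVALID_WRITE  = 1 << 0,
	INVALID_COMMIT = 1 << 1,
};

/* The one prototype that every callback must match exactly. */
typedef int (*key_invalid_fn)(long key, enum invalid_flags_sketch flags);

struct key_ops_sketch {
	key_invalid_fn key_invalid;
};

/* 'flags' is the enum type, not 'unsigned int', so the types agree. */
int plain_key_invalid(long key, enum invalid_flags_sketch flags)
{
	(void)flags;
	return key < 0 ? -1 : 0;
}

int strict_key_invalid(long key, enum invalid_flags_sketch flags)
{
	if (key < 0)
		return -1;
	/* Made-up rule: key 0 is rejected only on the write path. */
	return ((flags & INVALID_WRITE) && key == 0) ? -1 : 0;
}

const struct key_ops_sketch key_ops[] = {
	{ .key_invalid = plain_key_invalid },
	{ .key_invalid = strict_key_invalid },
};

int key_invalid_sketch(unsigned type, long key, enum invalid_flags_sketch flags)
{
	return key_ops[type].key_invalid(key, flags);
}
```

Under kCFI an indirect call is checked against the pointer's declared type, which is why a callback declared with 'unsigned int' in place of the enum is treated as a mismatch.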

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix -Wformat in bch2_bucket_gens_invalid()
Nathan Chancellor [Tue, 12 Sep 2023 19:15:41 +0000 (12:15 -0700)]
bcachefs: Fix -Wformat in bch2_bucket_gens_invalid()

When building bcachefs for 32-bit ARM, there is a compiler warning in
bch2_bucket_gens_invalid() due to use of an incorrect format specifier:

  fs/bcachefs/alloc_background.c:530:10: error: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Werror,-Wformat]
    529 |                 prt_printf(err, "bad val size (%lu != %zu)",
        |                                                ~~~
        |                                                %zu
    530 |                        bkey_val_bytes(k.k), sizeof(struct bch_bucket_gens));
        |                        ^~~~~~~~~~~~~~~~~~~
  fs/bcachefs/util.h:223:54: note: expanded from macro 'prt_printf'
    223 | #define prt_printf(_out, ...)           bch2_prt_printf(_out, __VA_ARGS__)
        |                                                               ^~~~~~~~~~~

On 64-bit architectures, size_t is 'unsigned long', so there is no
warning when using %lu but on 32-bit architectures, size_t is 'unsigned
int'. Use '%zu', the format specifier for 'size_t', to eliminate the
warning.
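
A short userspace sketch of the portable form: '%zu' is used for every size_t argument, so the format is correct whether size_t is 32 or 64 bits wide. struct gens_sketch and format_bad_size() are hypothetical stand-ins for the real structure and the prt_printf() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in structure with a known size of 256 bytes. */
struct gens_sketch {
	unsigned char gens[256];
};

/* Both arguments are size_t, so both conversions are %zu. */
int format_bad_size(char *out, size_t out_len, size_t val_bytes)
{
	return snprintf(out, out_len, "bad val size (%zu != %zu)",
			val_bytes, sizeof(struct gens_sketch));
}
```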

Fixes: 4be0d766a7e9 ("bcachefs: bucket_gens btree")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix -Wformat in bch2_alloc_v4_invalid()
Nathan Chancellor [Tue, 12 Sep 2023 19:15:40 +0000 (12:15 -0700)]
bcachefs: Fix -Wformat in bch2_alloc_v4_invalid()

When building bcachefs for 32-bit ARM, there is a compiler warning in
bch2_alloc_v4_invalid() due to use of an incorrect format specifier:

  fs/bcachefs/alloc_background.c:246:30: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat]
    245 |                 prt_printf(err, "bad val size (%u > %lu)",
        |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        |                                                     %u
    246 |                        alloc_v4_u64s(a.v), bkey_val_u64s(k.k));
        |                        ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
  fs/bcachefs/bkey.h:58:27: note: expanded from macro 'bkey_val_u64s'
     58 | #define bkey_val_u64s(_k)       ((_k)->u64s - BKEY_U64s)
        |                                 ^
  fs/bcachefs/util.h:223:54: note: expanded from macro 'prt_printf'
    223 | #define prt_printf(_out, ...)           bch2_prt_printf(_out, __VA_ARGS__)
        |                                                               ^~~~~~~~~~~

This expression is of type 'size_t'. On 64-bit architectures, size_t is
'unsigned long', so there is no warning when using %lu but on 32-bit
architectures, size_t is 'unsigned int'. Use '%zu', the format specifier
for 'size_t' to eliminate the warning.

Fixes: 11be8e8db283 ("bcachefs: New on disk format: Backpointers")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix -Wformat in bch2_btree_key_cache_to_text()
Nathan Chancellor [Tue, 12 Sep 2023 19:15:39 +0000 (12:15 -0700)]
bcachefs: Fix -Wformat in bch2_btree_key_cache_to_text()

When building bcachefs for 32-bit ARM, there is a compiler warning in
bch2_btree_key_cache_to_text() due to use of an incorrect format
specifier:

  fs/bcachefs/btree_key_cache.c:1060:36: error: format specifies type 'size_t' (aka 'unsigned int') but the argument has type 'long' [-Werror,-Wformat]
   1060 |         prt_printf(out, "nr_freed:\t%zu",       atomic_long_read(&c->nr_freed));
        |                                     ~~~         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        |                                     %ld
  fs/bcachefs/util.h:223:54: note: expanded from macro 'prt_printf'
    223 | #define prt_printf(_out, ...)           bch2_prt_printf(_out, __VA_ARGS__)
        |                                                               ^~~~~~~~~~~
  1 error generated.

On 64-bit architectures, size_t is 'unsigned long', so there is no
warning when using %zu but on 32-bit architectures, size_t is
'unsigned int'. Use '%lu' to match the other format specifiers used in
this function for printing values returned from atomic_long_read().

Fixes: 6d799930ce0f ("bcachefs: btree key cache pcpu freedlist")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix -Wformat in bch2_set_bucket_needs_journal_commit()
Nathan Chancellor [Tue, 12 Sep 2023 19:15:38 +0000 (12:15 -0700)]
bcachefs: Fix -Wformat in bch2_set_bucket_needs_journal_commit()

When building bcachefs for 32-bit ARM, there is a compiler warning in
bch2_set_bucket_needs_journal_commit() due to a debug print using the
wrong specifier:

  fs/bcachefs/buckets_waiting_for_journal.c:137:30: error: format specifies type 'size_t' (aka 'unsigned int') but the argument has type 'unsigned long' [-Werror,-Wformat]
    136 |         pr_debug("took %zu rehashes, table at %zu/%zu elements",
        |                                                   ~~~
        |                                                   %lu
    137 |                  nr_rehashes, nr_elements, 1UL << b->t->bits);
        |                                            ^~~~~~~~~~~~~~~~~
  include/linux/printk.h:579:26: note: expanded from macro 'pr_debug'
    579 |         dynamic_pr_debug(fmt, ##__VA_ARGS__)
        |                          ~~~    ^~~~~~~~~~~
  include/linux/dynamic_debug.h:270:22: note: expanded from macro 'dynamic_pr_debug'
    270 |                            pr_fmt(fmt), ##__VA_ARGS__)
        |                                   ~~~     ^~~~~~~~~~~
  include/linux/dynamic_debug.h:250:59: note: expanded from macro '_dynamic_func_call'
    250 |         _dynamic_func_call_cls(_DPRINTK_CLASS_DFLT, fmt, func, ##__VA_ARGS__)
        |                                                                  ^~~~~~~~~~~
  include/linux/dynamic_debug.h:248:65: note: expanded from macro '_dynamic_func_call_cls'
    248 |         __dynamic_func_call_cls(__UNIQUE_ID(ddebug), cls, fmt, func, ##__VA_ARGS__)
        |                                                                        ^~~~~~~~~~~
  include/linux/dynamic_debug.h:224:15: note: expanded from macro '__dynamic_func_call_cls'
    224 |                 func(&id, ##__VA_ARGS__);                       \
        |                             ^~~~~~~~~~~
  1 error generated.

On 64-bit architectures, size_t is 'unsigned long', so there is no
warning when using %zu but on 32-bit architectures, size_t is
'unsigned int'. Use the correct specifier to resolve the warning.

Fixes: 7a82e75ddaef ("bcachefs: New data structure for buckets waiting on journal commit")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix a handful of spelling mistakes in various messages
Colin Ian King [Tue, 12 Sep 2023 08:25:27 +0000 (09:25 +0100)]
bcachefs: Fix a handful of spelling mistakes in various messages

There are several spelling mistakes in error messages. Fix these.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: remove redundant pointer q
Colin Ian King [Tue, 12 Sep 2023 12:37:44 +0000 (13:37 +0100)]
bcachefs: remove redundant pointer q

The pointer q is being assigned a value but it is never read. The
assignment and pointer are redundant and can be removed.
Cleans up clang scan build warning:

fs/bcachefs/quota.c:813:2: warning: Value stored to 'q' is never
read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: remove duplicated assignment to variable offset_into_extent
Colin Ian King [Tue, 12 Sep 2023 12:37:43 +0000 (13:37 +0100)]
bcachefs: remove duplicated assignment to variable offset_into_extent

Variable offset_into_extent is being assigned to zero and a few
statements later it is being re-assigned again to the same value.
The second assignment is redundant and can be removed. Cleans up
clang-scan build warning:

fs/bcachefs/io.c:2722:3: warning: Value stored to 'offset_into_extent'
is never read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: remove redundant initializations of variables start_offset and end_offset
Colin Ian King [Tue, 12 Sep 2023 12:37:42 +0000 (13:37 +0100)]
bcachefs: remove redundant initializations of variables start_offset and end_offset

The variables start_offset and end_offset are being initialized with
values that are never read; they are re-assigned later on. The
initializations are redundant and can be removed.

Cleans up clang-scan build warnings:
fs/bcachefs/fs-io.c:243:11: warning: Value stored to 'start_offset' during
its initialization is never read [deadcode.DeadStores]
fs/bcachefs/fs-io.c:244:11: warning: Value stored to 'end_offset' during
its initialization is never read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: remove redundant initialization of pointer dst
Colin Ian King [Tue, 12 Sep 2023 12:37:41 +0000 (13:37 +0100)]
bcachefs: remove redundant initialization of pointer dst

The pointer dst is being initialized with a value that is never read,
it is being re-assigned later on when it is used in a while-loop.
The initialization is redundant and can be removed.

Cleans up clang-scan build warning:
fs/bcachefs/disk_groups.c:186:30: warning: Value stored to 'dst' during
its initialization is never read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: remove redundant initialization of pointer d
Colin Ian King [Tue, 12 Sep 2023 12:37:40 +0000 (13:37 +0100)]
bcachefs: remove redundant initialization of pointer d

The pointer d is being initialized with a value that is never read,
it is being re-assigned later on when it is used in a for-loop.
The initialization is redundant and can be removed.

Cleans up clang-scan build warning:
fs/bcachefs/buckets.c:1303:25: warning: Value stored to 'd' during its
initialization is never read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: trace_read_nopromote()
Kent Overstreet [Tue, 12 Sep 2023 00:44:33 +0000 (20:44 -0400)]
bcachefs: trace_read_nopromote()

Add a tracepoint to print the reason a read wasn't promoted.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Log finsert/fcollapse operations
Kent Overstreet [Sun, 10 Sep 2023 23:11:47 +0000 (19:11 -0400)]
bcachefs: Log finsert/fcollapse operations

Now that we have the logged operations btree, we can make
finsert/fcollapse atomic w.r.t. unclean shutdown as well.

This adds bch_logged_op_finsert to represent the state of an finsert or
fcollapse, which is a bit more complicated than truncate since we need
to track our position in the "shift extents" operation.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Log truncate operations
Kent Overstreet [Sun, 10 Sep 2023 20:42:30 +0000 (16:42 -0400)]
bcachefs: Log truncate operations

Previously, we guaranteed atomicity of truncate after unclean shutdown
with the BCH_INODE_I_SIZE_DIRTY flag - which required a full scan of the
inodes btree.

Recently the deleted inodes btree was added so that we no longer have to
scan for deleted inodes, but truncate was unfinished and that change
left it broken.

This patch uses the new logged operations btree to fix truncate
atomicity: at the start of a truncate, we now log an operation that can
be replayed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: BTREE_ID_logged_ops
Kent Overstreet [Sun, 27 Aug 2023 22:27:41 +0000 (18:27 -0400)]
bcachefs: BTREE_ID_logged_ops

Add a new btree for long running logged operations - i.e. for logging
operations that we can't do within a single btree transaction, so that
they can be resumed if we crash.

Keys in the logged operations btree will represent operations in
progress, with the state of the operation stored in the value.
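
As a toy model of the idea (not the on-disk format or the btree API), the sketch below keeps a single in-memory "log record" whose presence means an operation is in progress and whose fields hold the resume state. logged_shift_sketch, shift_run_sketch() and the budget parameter that simulates a crash are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical log record: in_progress marks a pending operation, and
 * pos/end are the saved state needed to resume it. */
struct logged_shift_sketch {
	int    in_progress;
	size_t pos;
	size_t end;
};

/*
 * Run (or resume) the operation: touch each cell in [pos, end) exactly
 * once. 'budget' caps the number of steps to simulate a crash part-way
 * through. In a real filesystem the step and the position update would
 * have to commit atomically; here they are simply adjacent statements.
 */
void shift_run_sketch(struct logged_shift_sketch *log, int *cells, size_t budget)
{
	while (log->in_progress) {
		if (log->pos == log->end) {
			log->in_progress = 0;	/* done: drop the log record */
			return;
		}
		if (budget == 0)
			return;			/* simulated crash */
		budget--;
		cells[log->pos] += 1;
		log->pos++;
	}
}
```

Calling shift_run_sketch() again after an interrupted run picks up at the recorded position, so no cell is processed twice.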

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: New io_misc.c helpers
Kent Overstreet [Mon, 4 Sep 2023 09:38:30 +0000 (05:38 -0400)]
bcachefs: New io_misc.c helpers

This pulls the non vfs specific parts of truncate and finsert/fcollapse
out of fs-io.c, and moves them to io_misc.c.

This is prep work for logging these operations, to make them atomic in
the event of a crash.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Break up io.c
Kent Overstreet [Sun, 10 Sep 2023 22:05:17 +0000 (18:05 -0400)]
bcachefs: Break up io.c

More reorganization, this splits up io.c into
 - io_read.c
 - io_misc.c - fallocate, fpunch, truncate
 - io_write.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: bch2_trans_update_get_key_cache()
Kent Overstreet [Mon, 11 Sep 2023 23:50:42 +0000 (19:50 -0400)]
bcachefs: bch2_trans_update_get_key_cache()

Factor out a slowpath into a separate function.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: __bch2_btree_insert() -> bch2_btree_insert_trans()
Kent Overstreet [Mon, 11 Sep 2023 23:48:07 +0000 (19:48 -0400)]
bcachefs: __bch2_btree_insert() -> bch2_btree_insert_trans()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Kill incorrect assertion
Kent Overstreet [Mon, 11 Sep 2023 18:34:56 +0000 (14:34 -0400)]
bcachefs: Kill incorrect assertion

In the bch2_fs_alloc() error path we call bch2_fs_free() without setting
BCH_FS_STOPPING - this is fine.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Convert more code to bch_err_msg()
Kent Overstreet [Mon, 11 Sep 2023 05:37:34 +0000 (01:37 -0400)]
bcachefs: Convert more code to bch_err_msg()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Kill missing inode warnings in bch2_quota_read()
Kent Overstreet [Mon, 11 Sep 2023 02:05:50 +0000 (22:05 -0400)]
bcachefs: Kill missing inode warnings in bch2_quota_read()

bch2_quota_read(), when scanning for inodes, may attempt to look up
inodes that have been deleted in the main subvolume - this is not an
error.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix bch_sb_handle type
Kent Overstreet [Sun, 10 Sep 2023 06:13:33 +0000 (02:13 -0400)]
bcachefs: Fix bch_sb_handle type

blk_mode_t was recently introduced; we should be using it now, instead
of fmode_t.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix bch2_propagate_key_to_snapshot_leaves()
Kent Overstreet [Sun, 10 Sep 2023 20:24:02 +0000 (16:24 -0400)]
bcachefs: Fix bch2_propagate_key_to_snapshot_leaves()

When we handle a transaction restart in a nested context, we need to
return -BCH_ERR_transaction_restart_nested because we invalidated the
outer context's iterators and locks.

bch2_propagate_key_to_snapshot_leaves() wasn't doing this; this patch
fixes it to use trans_was_restarted().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix silent enum conversion error
Kent Overstreet [Sun, 10 Sep 2023 01:14:54 +0000 (21:14 -0400)]
bcachefs: Fix silent enum conversion error

This changes mark_btree_node_locked() to take an enum
btree_node_locked_type, not a six_lock_type, since BTREE_NODE_UNLOCKED
is -1 which may cause problems converting back and forth to
six_lock_type if short enums are in use.

With this change, we never store BTREE_NODE_UNLOCKED in a six_lock_type
enum.
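
The separation can be sketched as two enums: one for real lock types and a wider one that adds an "unlocked" value of -1. Code that tracks lock state stores only the wider type and converts back only when a lock is known to be held. All names here are simplified, hypothetical versions of the ones in the commit message.

```c
#include <assert.h>

/* Lock types proper: no negative values. */
enum lock_type_sketch { LOCK_read, LOCK_intent, LOCK_write };

/* Lock state as tracked per node: the lock types plus "unlocked". */
enum node_locked_sketch {
	NODE_UNLOCKED      = -1,
	NODE_READ_LOCKED   = LOCK_read,
	NODE_INTENT_LOCKED = LOCK_intent,
	NODE_WRITE_LOCKED  = LOCK_write,
};

struct level_sketch {
	enum node_locked_sketch locked;
};

/* Takes the wider enum, so -1 is never squeezed into lock_type_sketch. */
void mark_node_locked_sketch(struct level_sketch *l, enum node_locked_sketch type)
{
	l->locked = type;
}

/* Converting back is only meaningful while a lock is actually held. */
enum lock_type_sketch held_lock_type_sketch(const struct level_sketch *l)
{
	assert(l->locked != NODE_UNLOCKED);
	return (enum lock_type_sketch)l->locked;
}
```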

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Array bounds fixes
Kent Overstreet [Sun, 10 Sep 2023 00:10:11 +0000 (20:10 -0400)]
bcachefs: Array bounds fixes

It's no longer legal to use a zero size array as a flexible array
member - this causes UBSAN to complain.

This patch switches our zero size arrays to normal flexible array
members when possible, and inserts casts in other places (e.g. where we
use the zero size array as a marker partway through an array).
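
For reference, a minimal sketch of the "normal flexible array member" form: the trailing array is declared as data[] rather than data[0], and the allocation size is computed from the header plus the element count. entries_sketch and entries_alloc_sketch() are hypothetical names, and the marker-partway-through-an-array case mentioned above is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct entries_sketch {
	size_t   nr;
	uint64_t data[];	/* C99 flexible array member, not data[0] */
};

/* Allocate a header followed by 'nr' zeroed elements; NULL on failure. */
struct entries_sketch *entries_alloc_sketch(size_t nr)
{
	struct entries_sketch *e;

	/* Refuse counts whose byte size would overflow size_t. */
	if (nr > (SIZE_MAX - sizeof(*e)) / sizeof(e->data[0]))
		return NULL;

	e = calloc(1, sizeof(*e) + nr * sizeof(e->data[0]));
	if (e)
		e->nr = nr;
	return e;
}
```

Declared this way, the compiler and bounds checkers know the array has a runtime-determined length instead of treating every index as out of range for a zero-length array.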

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: bch2_acl_to_text()
Kent Overstreet [Fri, 8 Sep 2023 22:14:08 +0000 (18:14 -0400)]
bcachefs: bch2_acl_to_text()

We can now print out acls from bch2_xattr_to_text(), when the xattr
contains an acl.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: restart journal reclaim thread on ro->rw transitions
Brian Foster [Wed, 30 Aug 2023 10:45:59 +0000 (06:45 -0400)]
bcachefs: restart journal reclaim thread on ro->rw transitions

Commit c2d5ff36065a4 ("bcachefs: Start journal reclaim thread
earlier") tweaked reclaim thread management to start a bit earlier
in the mount sequence by moving the start call from
__bch2_fs_read_write() to bch2_fs_journal_start(). This has the side
effect of never starting the reclaim thread on a ro->rw transition,
which can be observed by monitoring reclaim behavior via the
journal_reclaim tracepoints. I.e. once an fs has remounted ro->rw,
we only ever rely on direct reclaim from that point forward.

Since bch2_journal_reclaim_start() properly handles the case where
the reclaim thread has already been created, restore the start call
in the read-write helper. This allows the reclaim thread to start
early when appropriate and also exit/restart on remounts or freeze
cycles. In the latter case it may be possible to simply allow the
task to freeze rather than destroy it, but for now just fix the
immediate bug.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix snapshot_skiplist_good()
Kent Overstreet [Mon, 28 Aug 2023 19:17:31 +0000 (15:17 -0400)]
bcachefs: Fix snapshot_skiplist_good()

We weren't correctly checking snapshot skiplist nodes - we were checking
if they were in the same tree, not if they were an actual ancestor.
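
The difference between the two checks can be shown on a toy parent-pointer tree. This is only a sketch with hypothetical types: it does not reflect how bcachefs stores snapshot nodes or skiplists, and id 0 is used here to mean "no node".

```c
#include <assert.h>

/* Toy snapshot node table indexed by id; parent 0 means "no parent". */
struct snap_sketch {
	unsigned parent;
	unsigned tree;
};

/* The weaker check: both nodes merely belong to the same tree. */
int same_tree_sketch(const struct snap_sketch *t, unsigned a, unsigned b)
{
	return t[a].tree == t[b].tree;
}

/* The stronger check: 'ancestor' lies on the parent chain above 'id'. */
int is_ancestor_sketch(const struct snap_sketch *t, unsigned id, unsigned ancestor)
{
	for (id = t[id].parent; id; id = t[id].parent)
		if (id == ancestor)
			return 1;
	return 0;
}
```

Two siblings pass the same-tree test but neither is an ancestor of the other, which is the kind of case the weaker check lets through.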

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Kill stripe check in bch2_alloc_v4_invalid()
Kent Overstreet [Thu, 24 Aug 2023 21:07:50 +0000 (17:07 -0400)]
bcachefs: Kill stripe check in bch2_alloc_v4_invalid()

Since we set bucket data type to BCH_DATA_stripe based on the data
pointer, not just the stripe pointer, it doesn't make sense to check for
no stripe in the .key_invalid method - this is a situation that
shouldn't happen, but our other fsck/repair code handles it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Improve bch2_moving_ctxt_to_text()
Kent Overstreet [Thu, 24 Aug 2023 01:20:42 +0000 (21:20 -0400)]
bcachefs: Improve bch2_moving_ctxt_to_text()

Print more information out about moving contexts - fold in the output of
the redundant bch2_data_jobs_to_text(), and also include information
relevant to whether move_data() should be blocked.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Put bkey invalid check in commit path in a more useful place
Kent Overstreet [Wed, 23 Aug 2023 00:29:35 +0000 (20:29 -0400)]
bcachefs: Put bkey invalid check in commit path in a more useful place

When doing updates early in recovery, before we can go RW, we still want
to check that keys are valid at commit time - this moves key invalid
checking to before the "btree updates to journal" path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Always check alloc data type
Kent Overstreet [Tue, 22 Aug 2023 22:48:09 +0000 (18:48 -0400)]
bcachefs: Always check alloc data type

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix a double free on invalid bkey
Kent Overstreet [Tue, 22 Aug 2023 22:47:16 +0000 (18:47 -0400)]
bcachefs: Fix a double free on invalid bkey

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: bch2_propagate_key_to_snapshot_leaves()
Kent Overstreet [Sat, 19 Aug 2023 01:14:33 +0000 (21:14 -0400)]
bcachefs: bch2_propagate_key_to_snapshot_leaves()

If fsck finds a key that needs work done, the primary example being an
unlinked inode that needs to be deleted, and the key is in an internal
snapshot node, we have a bit of a conundrum.

The conundrum is that internal snapshot nodes are shared, and in general
we can't do updates in internal snapshot nodes because there may be
overwrites in some snapshots and not others, and this may affect other
keys referenced by this key (e.g. extents).

For example, we might be seeing an unlinked inode in an internal
snapshot node, but then in one child snapshot the inode might have been
reattached and might not be unlinked. Deleting the inode in the internal
snapshot node would be wrong, because then we'll delete all the extents
that the child snapshot references.

But if an unlinked inode does not have any overwrites in child
snapshots, we're fine: the same unlinked inode is visible in all child
snapshots, so we can do the deletion at the point of commonality in the
snapshot tree, i.e. the node where we found it.

This patch adds a new helper, bch2_propagate_key_to_snapshot_leaves(),
to handle the case where we need to update a key that does have
overwrites in child snapshots: we copy the key to leaf snapshot nodes,
and then rewind fsck and process the needed updates there.

With this, fsck can now always correctly handle unlinked inodes found in
internal snapshot nodes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Cleanup redundant snapshot nodes
Kent Overstreet [Fri, 18 Aug 2023 02:10:02 +0000 (22:10 -0400)]
bcachefs: Cleanup redundant snapshot nodes

After deleting snapshots, we may be left with a snapshot tree where
some nodes only have one child, and we have a linear chain.

Interior snapshot nodes are never used directly (i.e. they never have
subvolumes that point to them), they are only referred to by child
snapshot nodes - hence, they are redundant.

The existing code talks about redundant snapshot nodes as forming an
equivalence class; i.e. nodes for which snapshot_t->equiv is equal. In a
given equivalence class, we only ever need a single key at a given
position - i.e. multiple versions with different snapshot fields are
redundant.

The existing snapshot cleanup code deletes these redundant keys, but not
redundant nodes. It turns out this is buggy, because we assume that
after snapshot deletion finishes we should only have a single key per
equivalence class, but the btree update path doesn't preserve this -
overwriting keys in old snapshots doesn't check for the equivalence
class being equal, and thus we can end up with duplicate keys in the
same equivalence class and fsck complaining about snapshot deletion not
having run correctly.

The equivalence class notion has been leaking out of the core snapshots
code and into too much other code, i.e. fsck, so this patch takes a
different approach: snapshot deletion now moves keys to the node in an
equivalence class being kept (the leafiest node) and then deletes the
redundant nodes in the equivalence class.

Some work has to be done to correctly delete interior snapshot nodes;
snapshot node depth and skiplist fields for descendent nodes have to be
fixed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix btree write buffer with snapshots btrees
Kent Overstreet [Mon, 21 Aug 2023 23:57:34 +0000 (19:57 -0400)]
bcachefs: Fix btree write buffer with snapshots btrees

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix is_ancestor bitmap
Kent Overstreet [Thu, 13 Jul 2023 06:43:29 +0000 (02:43 -0400)]
bcachefs: Fix is_ancestor bitmap

The is_ancestor bitmap is an optimization for bch2_snapshot_is_ancestor;
once we get sufficiently close to the ancestor ID we're searching for we
test a bitmap.

But initialization of the is_ancestor bitmap was broken; we do it by
using bch2_snapshot_parent(), but we call that on nodes that haven't
been initialized yet with bch2_mark_snapshot().

Fix this by adding a separate loop in bch2_snapshots_read() for
initializing the is_ancestor bitmap, and also add some new debug asserts
for checking this sort of breakage in the future.
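
The ordering problem can be sketched in userspace C. Everything here is an
assumption made for illustration: struct snap, NO_PARENT, and a per-node
64-bit mask indexed by node number stand in for the real snapshot table and
bitmap layout. The point is only that the ancestor walk runs in a second
loop, after every parent link has been set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_PARENT UINT32_MAX	/* hypothetical marker for the root */

/* Hypothetical node: `ancestors` has bit i set if node i is an ancestor. */
struct snap {
	uint32_t parent;
	uint64_t ancestors;
};

static void snapshots_load(struct snap *t, const uint32_t *parents, unsigned nr)
{
	assert(nr <= 64);	/* one mask bit per node in this sketch */

	/* Pass 1: initialize every node before anything reads a parent link. */
	for (unsigned i = 0; i < nr; i++) {
		t[i].parent = parents[i];
		t[i].ancestors = 0;
	}

	/* Pass 2: all links are valid now, so walking up the tree is safe. */
	for (unsigned i = 0; i < nr; i++)
		for (uint32_t p = t[i].parent; p != NO_PARENT; p = t[p].parent)
			t[i].ancestors |= UINT64_C(1) << p;
}

static bool is_ancestor(const struct snap *t, unsigned id, unsigned ancestor)
{
	return (t[id].ancestors >> ancestor) & 1;
}
```

If the two loops were merged, a node stored before its parent would walk
through a table entry that had not been filled in yet.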

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: move check_pos_snapshot_overwritten() to snapshot.c
Kent Overstreet [Sat, 19 Aug 2023 01:13:44 +0000 (21:13 -0400)]
bcachefs: move check_pos_snapshot_overwritten() to snapshot.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix bch2_mount error path
Kent Overstreet [Fri, 18 Aug 2023 21:44:21 +0000 (17:44 -0400)]
bcachefs: Fix bch2_mount error path

In the bch2_mount() error path, we were calling
deactivate_locked_super(), which calls ->kill_sb(), which in our case
was calling bch2_fs_free() without __bch2_fs_stop().

This changes bch2_mount() to just call bch2_fs_stop() directly.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Delete a faulty assertion
Kent Overstreet [Fri, 18 Aug 2023 04:05:35 +0000 (00:05 -0400)]
bcachefs: Delete a faulty assertion

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Improve btree_path_relock_fail tracepoint
Kent Overstreet [Fri, 18 Aug 2023 02:04:20 +0000 (22:04 -0400)]
bcachefs: Improve btree_path_relock_fail tracepoint

In https://github.com/koverstreet/bcachefs/issues/450, we're seeing
unexplained btree_path_relock_fail events - according to the information
currently in the tracepoint, it appears the relock should be succeeding.

This adds lock counts to the tracepoint to help track it down.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix divide by zero in rebalance_work()
Kent Overstreet [Thu, 17 Aug 2023 20:35:58 +0000 (16:35 -0400)]
bcachefs: Fix divide by zero in rebalance_work()

This fixes https://github.com/koverstreet/bcachefs-tools/issues/159

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Split out snapshot.c
Kent Overstreet [Wed, 16 Aug 2023 20:54:33 +0000 (16:54 -0400)]
bcachefs: Split out snapshot.c

subvolume.c has gotten a bit large; this splits out a separate file just
for managing snapshot trees - BTREE_ID_snapshots.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: stack_trace_save_tsk() depends on CONFIG_STACKTRACE
Kent Overstreet [Wed, 16 Aug 2023 19:05:18 +0000 (15:05 -0400)]
bcachefs: stack_trace_save_tsk() depends on CONFIG_STACKTRACE

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix swallowing of data in buffered write path
Kent Overstreet [Tue, 15 Aug 2023 02:29:41 +0000 (22:29 -0400)]
bcachefs: Fix swallowing of data in buffered write path

In __bch2_buffered_write, if we fail to write to an entire !uptodate
folio, we have to back out the write, bail out and retry.

But we were missing an iov_iter_revert() call, so the data written to
the folio was lost and the rest of the write shifted to the wrong
offset.
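
The failure mode can be shown with a small userspace model. struct
src_iter, iter_copy(), iter_revert(), and write_full_block() are
hypothetical stand-ins, and the `limit` argument merely simulates a copy
that stops early; only the rewind step mirrors what iov_iter_revert() does
for a kernel iov_iter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an I/O iterator: a cursor over source bytes. */
struct src_iter {
	const char *data;
	size_t len;
	size_t pos;
};

/* Copy up to `want` bytes, but never more than `limit`; advance the cursor. */
static size_t iter_copy(struct src_iter *it, char *dst, size_t want, size_t limit)
{
	size_t n = it->len - it->pos;

	if (n > want)
		n = want;
	if (n > limit)
		n = limit;
	memcpy(dst, it->data + it->pos, n);
	it->pos += n;
	return n;
}

/* Rewind the cursor by `n` bytes. */
static void iter_revert(struct src_iter *it, size_t n)
{
	assert(n <= it->pos);
	it->pos -= n;
}

/*
 * Fill a whole destination block or nothing. On a short copy the cursor is
 * rewound, so a retry starts from the same source offset.
 * Returns 1 on success, 0 if the caller should retry.
 */
static int write_full_block(struct src_iter *it, char *dst, size_t block, size_t limit)
{
	size_t copied = iter_copy(it, dst, block, limit);

	if (copied < block) {
		iter_revert(it, copied);	/* without this, `copied` bytes are skipped */
		return 0;
	}
	return 1;
}
```

Dropping the iter_revert() call leaves the cursor `copied` bytes ahead, so
the retry would skip those bytes and write the rest at the wrong offset.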

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: fix up wonky error handling in bch2_seek_pagecache_hole()
Brian Foster [Mon, 14 Aug 2023 14:49:42 +0000 (10:49 -0400)]
bcachefs: fix up wonky error handling in bch2_seek_pagecache_hole()

The folio_hole_offset() helper returns a mix of bool and int types.
The latter is to support a possible -EAGAIN error code when using
nonblocking locks. This is not only confusing, but the only caller
also essentially ignores errors outside of stopping the range
iteration. This means an -EAGAIN error can't return directly from
folio_hole_offset() and may be lost via bch2_clamp_data_hole().

Fix up the error handling and make it more readable.
__filemap_get_folio() returns -ENOENT instead of NULL when no folio
exists, so reuse the same error code in folio_hole_offset(). Fix up
bch2_seek_pagecache_hole() to return the current offset on -ENOENT,
but otherwise return any unexpected error code up to the caller.
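
The resulting convention can be sketched in userspace C. enum probe,
page_probe(), and seek_hole() are hypothetical names for this example and
do not reproduce the kernel helpers; the sketch only shows one int return
type where -ENOENT means "hole found" and every other negative value is
passed up unchanged.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-page state for the sketch. */
enum probe { PROBE_DATA, PROBE_NO_PAGE, PROBE_WOULD_BLOCK };

/*
 * Returns 0 if the page at `idx` holds data, -ENOENT if no page is cached
 * (a hole), or -EAGAIN if the page could not be locked without blocking.
 */
static int page_probe(const enum probe *pages, long idx)
{
	switch (pages[idx]) {
	case PROBE_NO_PAGE:
		return -ENOENT;
	case PROBE_WOULD_BLOCK:
		return -EAGAIN;
	default:
		return 0;
	}
}

/*
 * Scan [start, end) for the first hole. Returns its index, `end` if there
 * is none, or a negative errno that the caller must handle.
 */
static long seek_hole(const enum probe *pages, long start, long end)
{
	assert(start <= end);

	for (long i = start; i < end; i++) {
		int ret = page_probe(pages, i);

		if (ret == -ENOENT)
			return i;	/* expected: this is the hole */
		if (ret < 0)
			return ret;	/* unexpected: pass it up unchanged */
	}
	return end;
}
```

A caller can then tell "no hole in range" apart from "retry with blocking
locks", which a mixed bool/int return made easy to lose.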

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix bkey format calculation
Kent Overstreet [Sun, 13 Aug 2023 23:34:02 +0000 (19:34 -0400)]
bcachefs: Fix bkey format calculation

For extents, we increase the number of bits of the size field to allow
extents to get bigger due to merging - but this code didn't check for
overflow.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix bch2_extent_fallocate()
Kent Overstreet [Sun, 13 Aug 2023 22:04:32 +0000 (18:04 -0400)]
bcachefs: Fix bch2_extent_fallocate()

 - There was no need for a retry loop in bch2_extent_fallocate(); if we
   have to retry we may be overwriting something different and we need
   to return an error and let the caller retry.
 - The bch2_alloc_sectors_start() error path was wrong, and wasn't
   running our cleanup at the end of the function

This also fixes a very rare open bucket leak due to the missing cleanup.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Zero btree_paths on allocation
Kent Overstreet [Sun, 13 Aug 2023 22:15:53 +0000 (18:15 -0400)]
bcachefs: Zero btree_paths on allocation

This fixes a bug in the cycle detector, bch2_check_for_deadlock() - we
have to make sure the node pointers in the btree paths array are set to
something not-garbage before another thread may see them.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix 'pointer to invalid device' check
Kent Overstreet [Sun, 13 Aug 2023 17:04:08 +0000 (13:04 -0400)]
bcachefs: Fix 'pointer to invalid device' check

This fixes the device removal tests, which have been failing at random
due to the fact that when we're running the .key_invalid checks in the
write path the key may actually no longer exist - we might be racing
with the keys being deleted.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Lower BCH_NAME_MAX to 512
Joshua Ashton [Sun, 13 Aug 2023 15:53:45 +0000 (16:53 +0100)]
bcachefs: Lower BCH_NAME_MAX to 512

To avoid shooting ourselves in the foot after merge, limit this to 512
for now: that leaves room for future revisions of the dirent format and
for storing multiple names for casefolding.

Previously this define was linked to the max size a d_name in
bch_dirent could be.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Optimize bch2_dirent_name_bytes
Joshua Ashton [Sat, 12 Aug 2023 21:26:30 +0000 (22:26 +0100)]
bcachefs: Optimize bch2_dirent_name_bytes

Avoid doing a full strnlen() to get the length of a dirent's name.

Given the fact that the name of dirents is stored at the end of the
bkey's value, and we know the length of that in u64s, we can find the
last u64 and figure out how many NUL bytes are at the end of the string.

On little endian systems this ends up being the leading zeros of the
last u64, whereas on big endian systems this ends up being the trailing
zeros of the last u64.
We can take that value in bits and divide it by 8 to get the number of
NUL bytes at the end.

There is no endian-fixup or other compatibility here as this is string
data interpreted as a u64.
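
A userspace sketch of the trick, under stated assumptions:
padded_name_len() is a hypothetical helper rather than the kernel function,
the buffer is NUL-padded to a nonzero multiple of 8 bytes with at most 8
bytes of padding, and the compiler provides the GCC/Clang builtins
__builtin_clzll() and __builtin_ctzll() plus the __BYTE_ORDER__ macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: `buf` holds a name that is NUL-padded out to
 * `padded_len` bytes. Returns the name length by inspecting only the
 * last 8 bytes instead of scanning the whole buffer.
 */
static size_t padded_name_len(const char *buf, size_t padded_len)
{
	uint64_t last;
	unsigned pad_bits;

	assert(padded_len > 0 && padded_len % sizeof(last) == 0);

	/* Load the final 8 bytes in native byte order, with no endian fixup. */
	memcpy(&last, buf + padded_len - sizeof(last), sizeof(last));

	/* clz/ctz of 0 is undefined, so an all-NUL word is handled first. */
	if (!last)
		return padded_len - sizeof(last);

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	/* The final bytes of the buffer are the most significant bytes. */
	pad_bits = (unsigned)__builtin_clzll(last);
#else
	/* The final bytes of the buffer are the least significant bytes. */
	pad_bits = (unsigned)__builtin_ctzll(last);
#endif

	/* Zero bits inside the last name byte are dropped by the division. */
	return padded_len - pad_bits / 8;
}
```

For an 8-byte buffer holding "hello" followed by three NULs, the helper
returns 5 after looking at a single word.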

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Introduce bch2_dirent_get_name
Joshua Ashton [Sat, 12 Aug 2023 21:26:29 +0000 (22:26 +0100)]
bcachefs: Introduce bch2_dirent_get_name

A nice cleanup that avoids a bunch of open-coding name/string usage
around dirent usage.

Will be used by casefolding impl in future commits.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: six locks: Guard against wakee exiting in __six_lock_wakeup()
Kent Overstreet [Sat, 12 Aug 2023 21:10:42 +0000 (17:10 -0400)]
bcachefs: six locks: Guard against wakee exiting in __six_lock_wakeup()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Don't open code closure_nr_remaining()
Kent Overstreet [Sat, 12 Aug 2023 20:51:45 +0000 (16:51 -0400)]
bcachefs: Don't open code closure_nr_remaining()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix lifetime in bch2_write_done(), add assertion
Kent Overstreet [Sat, 12 Aug 2023 20:52:33 +0000 (16:52 -0400)]
bcachefs: Fix lifetime in bch2_write_done(), add assertion

We're hunting for an open_bucket leak; add an assertion to help track it
down. Also, we can't use the bch_fs after dropping our write ref to it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Add a comment for should_drop_open_bucket()
Kent Overstreet [Sat, 12 Aug 2023 20:46:54 +0000 (16:46 -0400)]
bcachefs: Add a comment for should_drop_open_bucket()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: six locks: Fix missing barrier on wait->lock_acquired
Kent Overstreet [Sat, 12 Aug 2023 19:05:06 +0000 (15:05 -0400)]
bcachefs: six locks: Fix missing barrier on wait->lock_acquired

Six locks do lock handoff via the wakeup path: the thread doing the
wakeup also takes the lock on behalf of the waiter, which means the
waiter only has to look at its waitlist entry, and doesn't have to touch
the lock cacheline while another thread is using it.

Linus noticed that this needs a real barrier, which this patch fixes.

Also add a comment for the should_sleep_fn() error path.
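
The ordering requirement behind the handoff can be sketched with C11
atomics. This is an analogy under stated assumptions, not the patch itself:
struct waiter, hand_off_lock(), and lock_was_handed_off() are hypothetical,
and the kernel uses its own barrier primitives rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical waitlist entry; a real waiter has more fields. */
struct waiter {
	int		lock_state;	/* plain data written before the handoff */
	atomic_bool	lock_acquired;	/* flag the waiter checks */
};

/* Waker side: publish the state, then set the flag with release ordering. */
static void hand_off_lock(struct waiter *w, int state)
{
	assert(w);
	w->lock_state = state;
	atomic_store_explicit(&w->lock_acquired, true, memory_order_release);
}

/* Waiter side: the acquire load pairs with the release store above. */
static bool lock_was_handed_off(struct waiter *w, int *state)
{
	assert(w && state);
	if (!atomic_load_explicit(&w->lock_acquired, memory_order_acquire))
		return false;
	*state = w->lock_state;	/* ordered after the flag read */
	return true;
}
```

Without the release/acquire pairing, the waiter could observe the flag as
set while still reading stale data written before the handoff.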

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: linux-bcachefs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
11 months ago bcachefs: Check for directories in deleted inodes btree
Kent Overstreet [Sat, 12 Aug 2023 16:34:47 +0000 (12:34 -0400)]
bcachefs: Check for directories in deleted inodes btree

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Add btree_trans* to inode_set_fn
Joshua Ashton [Sat, 12 Aug 2023 14:47:45 +0000 (15:47 +0100)]
bcachefs: Add btree_trans* to inode_set_fn

This will be used when we need to re-hash a directory tree when setting
flags.

It is not possible to have concurrent btree_trans on a thread.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Improve bch2_write_points_to_text()
Kent Overstreet [Sat, 12 Aug 2023 16:13:19 +0000 (12:13 -0400)]
bcachefs: Improve bch2_write_points_to_text()

Now we also print the open_buckets owned by each write_point - this is
to help with debugging a shutdown hang.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix check_version_upgrade()
Kent Overstreet [Sat, 12 Aug 2023 02:22:31 +0000 (22:22 -0400)]
bcachefs: Fix check_version_upgrade()

We were failing to upgrade to the latest compatible version - whoops.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix 'journal not marked as containing replicas'
Kent Overstreet [Fri, 11 Aug 2023 23:30:38 +0000 (19:30 -0400)]
bcachefs: Fix 'journal not marked as containing replicas'

This fixes the replicas_write_errors test: the patch
  bcachefs: mark journal replicas before journal write submission

partially fixed replicas marking for the journal, but it broke the case
where one replica failed - this patch re-adds marking after the journal
write completes, when we know how many replicas succeeded.

Additionally, we do not consider it a fsck error when the very last
journal entry is not correctly marked, since there is an inherent race
there.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: btree_journal_iter.c
Kent Overstreet [Sat, 5 Aug 2023 20:08:44 +0000 (16:08 -0400)]
bcachefs: btree_journal_iter.c

Split out a new file from recovery.c for managing the list of keys we
read from the journal: before journal replay finishes the btree iterator
code needs to be able to iterate over and return keys from the journal
as well, so there's a fair bit of code here.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: sb-clean.c
Kent Overstreet [Sat, 5 Aug 2023 19:54:38 +0000 (15:54 -0400)]
bcachefs: sb-clean.c

Pull code for bch_sb_field_clean out into its own file.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Move bch_sb_field_crypt code to checksum.c
Kent Overstreet [Sat, 5 Aug 2023 19:43:00 +0000 (15:43 -0400)]
bcachefs: Move bch_sb_field_crypt code to checksum.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: sb-members.c
Kent Overstreet [Sat, 5 Aug 2023 19:40:21 +0000 (15:40 -0400)]
bcachefs: sb-members.c

Split out a new file for bch_sb_field_members - we'll likely want to
move more code here in the future.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Split up btree_update_leaf.c
Kent Overstreet [Sat, 5 Aug 2023 16:55:08 +0000 (12:55 -0400)]
bcachefs: Split up btree_update_leaf.c

We now have
  btree_trans_commit.c
  btree_update.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Split up fs-io.[ch]
Kent Overstreet [Thu, 3 Aug 2023 22:18:21 +0000 (18:18 -0400)]
bcachefs: Split up fs-io.[ch]

fs-io.c is too big - time for some reorganization
 - fs-dio.c: direct io
 - fs-pagecache.c: pagecache data structures (bch_folio), utility code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix assorted checkpatch nits
Kent Overstreet [Mon, 7 Aug 2023 16:04:05 +0000 (12:04 -0400)]
bcachefs: Fix assorted checkpatch nits

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix for sb buffer being misaligned
Kent Overstreet [Tue, 8 Aug 2023 00:44:56 +0000 (20:44 -0400)]
bcachefs: Fix for sb buffer being misaligned

On old kernels, kmalloc() may return an allocation that's not naturally
aligned - this resulted in a bug where we allocated a bio with not
enough biovecs. Fix this by using buf_pages().
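
As a sketch of why the buffer's alignment matters: a buffer that starts
partway into a page can touch one more page than its length alone
suggests. pages_spanned() and SKETCH_PAGE_SIZE are hypothetical names for
this example; buf_pages() is the helper the commit uses, and its exact
definition is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this example */

/*
 * Number of pages touched by `len` bytes starting at address `addr`.
 * The offset of `addr` within its page is added before rounding up.
 */
static size_t pages_spanned(uintptr_t addr, size_t len)
{
	size_t offset = addr & (SKETCH_PAGE_SIZE - 1);

	assert(len <= SIZE_MAX - 2 * SKETCH_PAGE_SIZE);	/* no overflow below */
	return (offset + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

A 4096-byte buffer at a page-aligned address spans one page, but the same
buffer starting 512 bytes into a page spans two, so sizing a bio from the
length alone comes up one biovec short.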

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Convert journal validation to bkey_invalid_flags
Kent Overstreet [Sun, 6 Aug 2023 16:43:31 +0000 (12:43 -0400)]
bcachefs: Convert journal validation to bkey_invalid_flags

This fixes a bug where we were already passing bkey_invalid_flags
around, but treating the parameter as just read/write - so the compat
code wasn't being run correctly.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Improve journal_entry_err_msg()
Kent Overstreet [Sun, 6 Aug 2023 14:57:25 +0000 (10:57 -0400)]
bcachefs: Improve journal_entry_err_msg()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: BCH_COMPAT_bformat_overflow_done no longer required
Kent Overstreet [Sun, 6 Aug 2023 14:04:37 +0000 (10:04 -0400)]
bcachefs: BCH_COMPAT_bformat_overflow_done no longer required

A while back, we changed bkey_format generation to ensure that the packed
representation could never represent fields larger than the unpacked
representation.

This was to ensure that bkey_packed_successor() always gave a sensible
result, but in the current code bkey_packed_successor() is only used in
a debug assertion - not for anything important.

This kills the requirement that we've gotten rid of those weird bkey
formats, and instead changes the assertion to check if we're dealing
with an old weird bkey format.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: kill EBUG_ON() redefinition in bkey.c
Kent Overstreet [Sun, 6 Aug 2023 14:02:41 +0000 (10:02 -0400)]
bcachefs: kill EBUG_ON() redefinition in bkey.c

Our debug mode assertions in bkey.c haven't been getting run, whoops.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Add logging to bch2_inode_peek() & related
Kent Overstreet [Sun, 6 Aug 2023 14:04:05 +0000 (10:04 -0400)]
bcachefs: Add logging to bch2_inode_peek() & related

Add error messages when we fail to lookup an inode, and also add a few
missing bch2_err_class() calls.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix lock thrashing in __bchfs_fallocate()
Kent Overstreet [Thu, 3 Aug 2023 07:39:49 +0000 (03:39 -0400)]
bcachefs: Fix lock thrashing in __bchfs_fallocate()

We've observed significant lock thrashing on fstests generic/083 in
fallocate, due to dropping and retaking btree locks when checking the
pagecache for data.

This adds a nonblocking mode to bch2_clamp_data_hole() that only uses
folio_trylock() and can thus be used safely while btree locks are
held, so we only have to drop btree locks as a fallback, on actual
lock contention.
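
The trylock-first pattern can be sketched in userspace C. struct
page_lock, page_trylock(), page_unlock(), and inspect_page() are
hypothetical and built on a C11 atomic flag; they only model the idea of
returning -EAGAIN on contention instead of waiting.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical page lock built on a C11 atomic flag. */
struct page_lock {
	atomic_flag held;
};

static bool page_trylock(struct page_lock *l)
{
	assert(l);
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void page_unlock(struct page_lock *l)
{
	assert(l);
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Inspect a page. In nonblocking mode a contended lock yields -EAGAIN
 * instead of waiting, so the call is safe while other locks are held.
 */
static int inspect_page(struct page_lock *l, bool nonblocking)
{
	if (!page_trylock(l)) {
		if (nonblocking)
			return -EAGAIN;
		while (!page_trylock(l))
			;	/* blocking mode: spin as a stand-in for sleeping */
	}
	/* ... look at the page here ... */
	page_unlock(l);
	return 0;
}
```

A caller holding outer locks would first try the nonblocking mode and only
on -EAGAIN drop those locks and retry in blocking mode.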

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix for bch2_copygc() spuriously returning -EEXIST
Kent Overstreet [Fri, 4 Aug 2023 14:51:02 +0000 (10:51 -0400)]
bcachefs: Fix for bch2_copygc() spuriously returning -EEXIST

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Convert btree_err_type to normal error codes
Kent Overstreet [Thu, 3 Aug 2023 23:36:28 +0000 (19:36 -0400)]
bcachefs: Convert btree_err_type to normal error codes

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix btree_err() macro
Kent Overstreet [Fri, 4 Aug 2023 00:32:46 +0000 (20:32 -0400)]
bcachefs: Fix btree_err() macro

The error code wasn't being propagated correctly; change it to match
fsck_err().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Ensure topology repair runs
Kent Overstreet [Fri, 4 Aug 2023 00:57:06 +0000 (20:57 -0400)]
bcachefs: Ensure topology repair runs

This fixes should_restart_for_topology_repair() - previously it was
returning false if the btree io path had already selected topology
repair to run, even if it hadn't run yet.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Log a message when running an explicit recovery pass
Kent Overstreet [Fri, 4 Aug 2023 00:37:32 +0000 (20:37 -0400)]
bcachefs: Log a message when running an explicit recovery pass

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Print out required recovery passes on version upgrade
Kent Overstreet [Thu, 3 Aug 2023 21:33:20 +0000 (17:33 -0400)]
bcachefs: Print out required recovery passes on version upgrade

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
11 months ago bcachefs: Fix shift by 64 in set_inc_field()
Kent Overstreet [Thu, 3 Aug 2023 20:38:36 +0000 (16:38 -0400)]
bcachefs: Fix shift by 64 in set_inc_field()

UBSAN was complaining about a shift by 64 in set_inc_field().

This only happened when the value being shifted was 0, so in theory
should be harmless - a shift by 64 (or register width) should logically
give a result of 0, but CPUs will in practice leave the input unchanged
when the number of bits to shift by wraps - and since our input here is
0, the output is still what we want.

But, it's still undefined behaviour and we need our UBSAN output to be
clean, so it needs to be fixed.
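
In C, shifting a 64-bit value by 64 or more is undefined, which is what
UBSAN reports. One common way to make the shift well defined is to branch
on the count; shl64_safe() below is a hypothetical helper showing that
pattern and is not necessarily how set_inc_field() was changed.

```c
#include <assert.h>
#include <stdint.h>

/* Left shift that is defined for every count from 0 to 64 inclusive. */
static uint64_t shl64_safe(uint64_t v, unsigned bits)
{
	assert(bits <= 64);

	/* A count equal to the type width is handled without shifting. */
	return bits < 64 ? v << bits : 0;
}
```

With this helper a shift by 64 yields 0 for any input, matching the
"logical" result described above instead of relying on CPU behaviour.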

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>