Kent Overstreet [Fri, 9 Feb 2024 02:10:32 +0000 (21:10 -0500)]
bcachefs: fsck_err() may now take a btree_trans
fsck_err() now optionally takes a btree_trans; if the current thread has
one, it is required that it be passed.
The next patch will use this to unlock when waiting for user input.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 29 May 2024 23:37:29 +0000 (19:37 -0400)]
bcachefs: btree_types bitmask cleanups
Make things more consistent and ensure that we're using u64 bitfields -
key types and btree ids are already around 32 bits.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 7 Apr 2024 03:58:01 +0000 (23:58 -0400)]
bcachefs: Delete old assertion for online fsck
the order in which btree_gc walks keys have changed, so we no longer
have the sort of issues with online fsck this assertion was warning
about.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 29 May 2024 22:54:39 +0000 (18:54 -0400)]
bcachefs: Initialize gc buckets in alloc trigger
Needed for online fsck; we need the trigger to initialize newly
allocated buckets and generation number changes while gc is running.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 29 May 2024 22:53:48 +0000 (18:53 -0400)]
bcachefs: Walk leaf to root in btree_gc
Next change will move gc_alloc_start initialization into the alloc
trigger, so we have to mark those first.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 29 May 2024 21:54:46 +0000 (17:54 -0400)]
bcachefs: Don't block journal when finishing check_allocations()
Blocking the journal was needed to finish checking old style accounting,
but that code is gone and it's not needed in the alloc rewrite,
mark_lock is sufficient for synchronization.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 29 May 2024 17:55:49 +0000 (13:55 -0400)]
bcachefs: bch2_fs_get_tree() cleanup
- improve error paths
- call bch2_fs_start() separately, after applying late-parsed options
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 29 May 2024 17:38:06 +0000 (13:38 -0400)]
bcachefs: Kill bch2_mount()
Fold into bch2_fs_get_tree()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 27 Dec 2023 16:33:21 +0000 (11:33 -0500)]
bcachefs: Eytzinger accumulation for accounting keys
The btree write buffer takes as input keys from the journal, sorts them,
deduplicates them, and flushes them back to the btree in sorted order.
The disk space accounting rewrite is moving accounting to normal btree
keys, with update (in this case deltas) accumulated in the write buffer
and then flushed to the btree; but this is going to increase the number
of keys handled by the write buffer by perhaps as much as a factor of
3x-5x.
The overhead from copying around and sorting this many keys would cause
a significant performance regression, but: there is huge locality in
updates to accounting keys that we can take advantage of.
Instead of appending accounting keys to the list of keys to be sorted,
this patch adds an eytzinger search tree of recently seen accounting
keys. We look up the accounting key in the eytzinger search tree and
apply the delta directly, adding it if it doesn't exist, and
periodically prune the eytzinger tree of unused entries.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 19 Mar 2024 04:04:52 +0000 (00:04 -0400)]
bcachefs: bch_acct_rebalance_work
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 29 Feb 2024 03:37:21 +0000 (22:37 -0500)]
bcachefs: bch_acct_btree
Add counters for how much disk space we're using per btree.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 12 Feb 2024 07:17:02 +0000 (02:17 -0500)]
bcachefs: bch_acct_snapshot
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 23 Feb 2024 22:23:41 +0000 (17:23 -0500)]
bcachefs: bch2_fs_usage_base_to_text()
Helper to show raw accounting in sysfs, mainly for debugging.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 25 Feb 2024 00:58:07 +0000 (19:58 -0500)]
bcachefs: bch2_fs_accounting_to_text()
Helper to show raw accounting in sysfs, mainly for debugging.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 25 Feb 2024 02:09:51 +0000 (21:09 -0500)]
bcachefs: Convert bch2_compression_stats_to_text() to new accounting
We no longer have to walk the whole btree to calculate compression
stats.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 7 Jan 2024 02:42:36 +0000 (21:42 -0500)]
bcachefs: bch_acct_compression
This adds per-compression-type accounting of compressed and uncompressed
size as well as number of extents - meaning we can now see compression
ratio (without walking the whole filesystem).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 18 Feb 2024 05:13:22 +0000 (00:13 -0500)]
bcachefs: bch2_verify_accounting_clean()
Verify that the in-memory accounting verifies the on-disk accounting
after a clean shutdown.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 12 Feb 2024 20:21:10 +0000 (15:21 -0500)]
bcachefs: Convert bch2_replicas_gc2() to new accounting
bch2_replicas_gc2() is used for garbage collection superblock replicas
entries that are empty - this converts it to the new accounting scheme.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 12 Feb 2024 03:48:05 +0000 (22:48 -0500)]
bcachefs: Convert gc to new accounting
Rewrite fsck/gc for the new accounting scheme.
This adds a second set of in-memory accounting counters for gc to use;
like with other parts of gc we run all trigger in TRIGGER_GC mode, then
compare what we calculated to existing in-memory accounting at the end.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 2 Jan 2024 05:22:57 +0000 (00:22 -0500)]
bcachefs: Kill replicas_journal_res
More dead code deletion
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 2 Jan 2024 05:15:16 +0000 (00:15 -0500)]
bcachefs: Kill fs_usage_online
More dead code deletion.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 25 Feb 2024 01:04:48 +0000 (20:04 -0500)]
bcachefs: Kill bch2_fs_usage_to_text()
Dead code.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 28 Dec 2023 03:09:25 +0000 (22:09 -0500)]
bcachefs: Delete journal-buf-sharded old style accounting
More deletion of dead code.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 1 Jan 2024 03:30:15 +0000 (22:30 -0500)]
bcachefs: Kill writing old accounting to journal
More ripping out of the old disk space accounting.
Note that the new disk space accounting is incompatible with the old,
and writing out old style disk space accounting with the new code is
infeasible.
This means upgrading and downgrading past this version requires
regenerating accounting.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 2 Jan 2024 04:36:23 +0000 (23:36 -0500)]
bcachefs: kill bch2_fs_usage_read()
With bch2_ioctl_fs_usage(), this is now dead code.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 7 Jan 2024 01:29:25 +0000 (20:29 -0500)]
bcachefs: Convert bch2_ioctl_fs_usage() to new accounting
This converts bch2_ioctl_fs_usage() to read from the new disk
accounting, via bch2_fs_replicas_usage_read().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 6 Jan 2024 02:23:07 +0000 (21:23 -0500)]
bcachefs: Kill bch2_fs_usage_initialize()
Deleting code for the old disk accounting scheme.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 2 Jan 2024 00:42:37 +0000 (19:42 -0500)]
bcachefs: dev_usage updated by new accounting
Reading disk accounting now requires an eytzinger lookup (see:
bch2_accounting_mem_read()), but the per-device counters are used
frequently enough that we'd like to still be able to read them with just
a percpu sum, as in the old code.
This patch special cases the device counters; when we update in-memory
accounting we also update the old style percpu counters if it's a deice
counter update.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 4 Jun 2024 22:31:13 +0000 (18:31 -0400)]
bcachefs: Coalesce accounting keys before journal replay
This fixes a performance regression in journal replay; without
colaescing accounting keys we have multiple keys at the same position,
which means journal_keys_peek_upto() has to skip past many overwritten
keys - turning journal replay into an O(n^2) algorithm.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 9 Nov 2023 19:22:46 +0000 (14:22 -0500)]
bcachefs: Disk space accounting rewrite
Main part of the disk accounting rewrite.
This is a wholesale rewrite of the existing disk space accounting, which
relies on percepu counters that are sharded by journal buffer, and
rolled up and added to each journal write.
With the new scheme, every set of counters is a distinct key in the
accounting btree; this fixes scaling limitations of the old scheme,
where counters took up space in each journal entry and required multiple
percpu counters.
Now, in memory accounting requires a single set of percpu counters - not
multiple for each in flight journal buffer - and in the future we'll
probably also have counters that don't use in memory percpu counters,
they're not strictly required.
An accounting update is now a normal btree update, using the btree write
buffer path. At transaction commit time, we apply accounting updates to
the in memory counters, which are percpu counters indexed in an
eytzinger tree by the accounting key.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 17 Nov 2023 05:23:07 +0000 (00:23 -0500)]
bcachefs: btree write buffer knows how to accumulate bch_accounting keys
Teach the btree write buffer how to accumulate accounting keys - instead
of having the newer key overwrite the older key as we do with other
updates, we need to add them together.
Also, add a flag so that write buffer flush knows when journal replay is
finished flushing accounting, and teach it to hold accounting keys until
that flag is set.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 28 Dec 2023 01:59:01 +0000 (20:59 -0500)]
bcachefs: Accumulate accounting keys in journal replay
Until accounting keys hit the btree, they are deltas, not new versions
of the existing key; this means we have to teach journal replay to
accumulate them.
Additionally, the journal doesn't track precisely which entries have
been flushed to the btree; it only tracks a range of entries that may
possibly still need to be flushed.
That means we need to compare accounting keys against the version in the
btree and only flush updates that are newer.
There's another wrinkle with the write buffer: if the write buffer
starts flushing accounting keys before journal replay has finished
flushing accounting keys, journal replay will see the version number
from the new updates and updates from the journal will be lost.
To avoid this, journal replay has to flush accounting keys first, and
we'll be adding a flag so that write buffer flush knows to hold
accounting keys until then.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 27 Dec 2023 23:31:46 +0000 (18:31 -0500)]
bcachefs: KEY_TYPE_accounting
New key type for the disk space accounting rewrite.
- Holds a variable sized array of u64s (may be more than one for
accounting e.g. compressed and uncompressed size, or buckets and
sectors for a given data type)
- Updates are deltas, not new versions of the key: this means updates
to accounting can happen via the btree write buffer, which we'll be
teaching to accumulate deltas.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Thomas Bertschinger [Tue, 28 May 2024 04:36:11 +0000 (22:36 -0600)]
bcachefs: use new mount API
This updates bcachefs to use the new mount API:
- Update the file_system_type to use the new init_fs_context()
function.
- Define the new fs_context_operations functions.
- No longer register bch2_mount() and bch2_remount(); these are now
called via the new fs_context functions.
- Define a new helper type, bch2_opts_parse that includes a struct
bch_opts and additionally a printbuf used to save options that can't
be parsed until after the FS is opened. This enables us to parse as
many options as possible prior to opening the filesystem while saving
those options that need the open FS for later parsing.
Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Thomas Bertschinger [Tue, 28 May 2024 04:36:10 +0000 (22:36 -0600)]
bcachefs: Add error code to defer option parsing
This introduces a new error code, option_needs_open_fs, which is used to
indicate that an attempt was made to parse a mount option prior to
opening a filesystem, when that mount option requires an open filesystem
in order to be validated.
Returning this error results in bch2_parse_one_mount_opt() saving that
option for later parsing, after the filesystem is opened.
Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Thomas Bertschinger [Tue, 28 May 2024 04:36:09 +0000 (22:36 -0600)]
bcachefs: add printbuf arg to bch2_parse_mount_opts()
Mount options that take the name of a device that may be part of a
filesystem, for example "metadata_target", cannot be validated until
after the filesystem has been opened. However, an attempt to parse those
options may be made prior to the filesystem being opened.
This change adds a printbuf parameter to bch2_parse_mount_opts() which
will be used to save those mount options, when they are supplied prior
to the FS being opened, so that they can be parsed later.
This functionality is not currently needed, but will be used after
bcachefs starts using the new mount API to parse mount options. This is
because using the new mount API, we will process mount options prior to
opening the FS, but the new API doesn't provide a convenient way to
"replay" mount option parsing. So we save these options ourselves to
accomplish this.
This change also splits out the code to parse a single option into
bch2_parse_one_mount_opt(), which will be useful when using the new
mount API which deals with a single mount option at a time.
Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 26 Apr 2024 00:45:00 +0000 (20:45 -0400)]
bcachefs: metadata version bucket_stripe_sectors
New on disk format version for bch_alloc->stripe_sectors and
BCH_DATA_unstriped - accounting for unstriped data in stripe buckets.
Upgrade/downgrade requires regenerating alloc info - but only if erasure
coding is in use.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 23 Nov 2023 21:34:03 +0000 (16:34 -0500)]
bcachefs: BCH_DATA_unstriped
Add a new pseudo data type, to track buckets that are members of a
stripe, but have unstriped data in them.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 23 Nov 2023 22:21:23 +0000 (17:21 -0500)]
bcachefs: bch_alloc->stripe_sectors
Add a separate counter to bch_alloc_v4 for amount of striped data; this
lets us separately track striped and unstriped data in a bucket, which
lets us see when erasure coding has failed to update extents with stripe
pointers, and also find buckets to continue updating if we crash mid way
through creating a new stripe.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 26 May 2024 22:11:37 +0000 (18:11 -0400)]
bcachefs: check_key_has_inode()
Consolidate duplicated checks for extents/dirents/xattrs - these keys
should all have a corresponding inode of the correct type.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Thomas Bertschinger [Sat, 25 May 2024 19:36:19 +0000 (13:36 -0600)]
bcachefs: allow passing full device path for target options
The output of mount options such as "metadata_target" in `/proc/mounts`
uses the full path to the device.
mount(8) from util-linux uses the output from `/proc/mounts` to pass
existing mount options when performing a remount, so bcachefs should
accept as input the same form that it prints as output.
Without this change:
$ mount -t bcachefs -o metadata_target=vdb /dev/vdb /mnt
$ strace mount -o remount /mnt
...
fsconfig(4, FSCONFIG_SET_STRING, "metadata_target", "/dev/vdb", 0) = -1 EINVAL (Invalid argument)
...
Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 27 May 2024 02:20:34 +0000 (22:20 -0400)]
bcachefs: bch2_printbuf_strip_trailing_newline()
Add a new helper to fix inode_to_text()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Thomas Bertschinger [Sun, 26 May 2024 19:08:20 +0000 (13:08 -0600)]
bcachefs: don't expose "read_only" as a mount option
When "read_only" is exposed as a mount option, it is redundant with the
standard option "ro" and gives users multiple ways to specify that a
bcachefs filesystem should be mounted read-only. This presents the risk
of having inconsistent options specified.
This can be seen when remounting a read-only filesystem in read-write
mode, using mount(8) from util-linux. Because mount(8) parses the
existing mount options from `/proc/mounts` and applies them when
remounting, it can end up applying both "read_only" and "rw":
$ mount img -o ro /mnt
$ strace mount -o remount,rw /mnt
...
fsconfig(4, FSCONFIG_SET_FLAG, "read_only", NULL, 0) = 0
fsconfig(4, FSCONFIG_SET_FLAG, "rw", NULL, 0) = 0
...
Making "read_only" no longer a mount option means this edge case cannot
occur.
Fixes:
62719cf33c3a ("bcachefs: Fix nochanges/read_only interaction")
Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Thomas Bertschinger [Sun, 26 May 2024 19:08:19 +0000 (13:08 -0600)]
bcachefs: make offline fsck set read_only fs flag
A subsequent change will remove "read_only" as a mount option in favor
of the standard option "ro", meaning the userspace fsck command cannot
pass it to the fsck ioctl. Instead, in offline fsck, set "read_only"
kernel-side without trying to parse it as a mount option.
For compatibility with versions of the "bcachefs fsck" command that try
to pass the "read_only" mount opt, remove it from the mount options
string prior to parsing when it is present.
Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 24 May 2024 22:04:22 +0000 (18:04 -0400)]
bcachefs: btree_ptr_sectors_written() now takes bkey_s_c
this is for the userspace metadata dump tool
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 10 May 2024 13:27:09 +0000 (09:27 -0400)]
bcachefs: Check for bsets past bch_btree_ptr_v2.sectors_written
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Uros Bizjak [Thu, 23 May 2024 09:19:26 +0000 (11:19 +0200)]
bcachefs: Use try_cmpxchg() family of functions instead of cmpxchg()
Use try_cmpxchg() family of functions instead of
cmpxchg (*ptr, old, new) == old. x86 CMPXCHG instruction returns
success in ZF flag, so this change saves a compare after cmpxchg
(and related move instruction in front of cmpxchg).
Also, try_cmpxchg() implicitly assigns old *ptr value to "old" when
cmpxchg fails. There is no need to re-read the value in the loop.
No functional change intended.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 7 Jun 2024 20:35:42 +0000 (16:35 -0400)]
bcachefs: add might_sleep() annotations for fsck_err()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 17 Jun 2024 17:58:17 +0000 (13:58 -0400)]
bcachefs: fix missing include
fs-common.h needs dirent.h for enum bch_rename_mode
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Youling Tang [Thu, 25 Apr 2024 09:16:59 +0000 (17:16 +0800)]
bcachefs: Use filemap_read() to simplify the execution flow
Using filemap_read() can reduce unnecessary code execution
for non IOCB_DIRECT paths.
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Youling Tang [Thu, 18 Apr 2024 00:50:55 +0000 (08:50 +0800)]
bcachefs: Align the display format of `btrees/inodes/keys`
Before patch:
```
#cat btrees/inodes/keys
u64s 17 type inode_v3 0:4096:U32_MAX len 0 ver 0: mode=40755
flags= (
16300000)
bi_size=0
```
After patch:
```
#cat btrees/inodes/keys
u64s 17 type inode_v3 0:4096:U32_MAX len 0 ver 0:
mode=40755
flags=(
16300000)
bi_size=0
```
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Youling Tang [Thu, 18 Apr 2024 08:31:03 +0000 (16:31 +0800)]
bcachefs: Fix missing spaces in journal_entry_dev_usage_to_text
Fixed missing spaces displayed in journal_entry_dev_usage_to_text
while adjusting the display format to improve readability.
before:
```
# bcachefs list_journal -a -t alloc:1:0 /dev/sdb
...
dev_usage: dev=0free: buckets=233180 sectors=0 fragmented=0sb: buckets=13 sectors=6152 fragmented=504journal: buckets=1847 sectors=945664 fragmented=0btree: buckets=20 sectors=10240 fragmented=0user: buckets=1419 sectors=726513 fragmented=15cached: buckets=0 sectors=0 fragmented=0parity: buckets=0 sectors=0 fragmented=0stripe: buckets=0 sectors=0 fragmented=0need_gc_gens: buckets=0 sectors=0 fragmented=0need_discard: buckets=1 sectors=0 fragmented=0
```
after:
```
# bcachefs list_journal -a -t alloc:1:0 /dev/sdb
...
dev_usage: dev=0
free: buckets=233180 sectors=0 fragmented=0
sb: buckets=13 sectors=6152 fragmented=504
journal: buckets=1847 sectors=945664 fragmented=0
btree: buckets=20 sectors=10240 fragmented=0
user: buckets=1419 sectors=726513 fragmented=15
cached: buckets=0 sectors=0 fragmented=0
parity: buckets=0 sectors=0 fragmented=0
stripe: buckets=0 sectors=0 fragmented=0
need_gc_gens: buckets=0 sectors=0 fragmented=0
need_discard: buckets=1 sectors=0 fragmented=0
```
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 4 Jul 2024 21:10:29 +0000 (17:10 -0400)]
bcachefs: fix ei_update_lock lock ordering
ei_update_lock is largely vestigal and will probably be removed, but
we're not ready for that just yet.
this fixes some lockdep splats with the new lockdep support for btree
node locks; they're harmless, since we were taking ei_update_lock before
actually locking any btree nodes, but "any btree nodes locked" are now
tracked at the btree_trans level.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 11 Jul 2024 20:30:41 +0000 (16:30 -0400)]
bcachefs: bch2_btree_reserve_cache_to_text()
Add a pretty printer so the btree reserve cache can be seen in sysfs; as
it pins open_buckets we need it for tracking down open_buckets issues.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 11 Jul 2024 20:11:45 +0000 (16:11 -0400)]
bcachefs: sysfs trigger_freelist_wakeup
another debugging knob
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 29 Jun 2024 01:40:00 +0000 (21:40 -0400)]
bcachefs: sysfs internal/trigger_journal_writes
another debugging knob - trigger the journal to do ready journal writes
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 11 Jul 2024 20:00:46 +0000 (16:00 -0400)]
bcachefs: add capacity, reserved to fs_alloc_debug_to_text()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 29 Jun 2024 18:37:46 +0000 (14:37 -0400)]
bcachefs: uninline fallocate functions
better stack traces
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 10 Jul 2024 22:03:11 +0000 (18:03 -0400)]
bcachefs: btree ids are 64 bit bitmasks
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 11 Jul 2024 18:08:38 +0000 (14:08 -0400)]
bcachefs: Print allocator stuck on timeout in fallocate path
same as in io_write.c, if we're waiting on the allocator for an
excessive amount of time, print what's going on
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Sun, 14 Jul 2024 22:43:32 +0000 (15:43 -0700)]
Linux 6.10
Linus Torvalds [Sun, 14 Jul 2024 22:29:35 +0000 (15:29 -0700)]
Merge tag 'kbuild-fixes-v6.10-4' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Make scripts/ld-version.sh robust against the latest LLD
- Fix warnings in rpm-pkg with device tree support
- Fix warnings in fortify tests with KASAN
* tag 'kbuild-fixes-v6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
fortify: fix warnings in fortify tests with KASAN
kbuild: rpm-pkg: avoid the warnings with dtb's listed twice
kbuild: Make ld-version.sh more robust against version string changes
Masahiro Yamada [Sun, 14 Jul 2024 17:04:32 +0000 (02:04 +0900)]
fortify: fix warnings in fortify tests with KASAN
When a software KASAN mode is enabled, the fortify tests emit warnings
on some architectures.
For example, for ARCH=arm, the combination of CONFIG_FORTIFY_SOURCE=y
and CONFIG_KASAN=y produces the following warnings:
TEST lib/test_fortify/read_overflow-memchr.log
warning: unsafe memchr() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memchr.c
TEST lib/test_fortify/read_overflow-memchr_inv.log
warning: unsafe memchr_inv() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memchr_inv.c
TEST lib/test_fortify/read_overflow-memcmp.log
warning: unsafe memcmp() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memcmp.c
TEST lib/test_fortify/read_overflow-memscan.log
warning: unsafe memscan() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memscan.c
TEST lib/test_fortify/read_overflow2-memcmp.log
warning: unsafe memcmp() usage lacked '__read_overflow2' warning in lib/test_fortify/read_overflow2-memcmp.c
[ more and more similar warnings... ]
Commit
9c2d1328f88a ("kbuild: provide reasonable defaults for tool
coverage") removed KASAN flags from non-kernel objects by default.
It was an intended behavior because lib/test_fortify/*.c are unit
tests that are not linked to the kernel.
As it turns out, some architectures require -fsanitize=kernel-(hw)address
to define __SANITIZE_ADDRESS__ for the fortify tests.
Without __SANITIZE_ADDRESS__ defined, arch/arm/include/asm/string.h
defines __NO_FORTIFY, thus excluding <linux/fortify-string.h>.
This issue does not occur on x86 thanks to commit
4ec4190be4cf
("kasan, x86: don't rename memintrinsics in uninstrumented files"),
but there are still some architectures that define __NO_FORTIFY
in such a situation.
Set KASAN_SANITIZE=y explicitly to the fortify tests.
Fixes:
9c2d1328f88a ("kbuild: provide reasonable defaults for tool coverage")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Closes: https://lore.kernel.org/all/
0e8dee26-41cc-41ae-9493-
10cd1a8e3268@app.fastmail.com/
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Jose Ignacio Tornos Martinez [Thu, 11 Jul 2024 16:49:19 +0000 (18:49 +0200)]
kbuild: rpm-pkg: avoid the warnings with dtb's listed twice
After
8d1001f7bdd0 (kbuild: rpm-pkg: fix build error with CONFIG_MODULES=n),
the following warning "warning: File listed twice: *.dtb" is appearing for
every dtb file that is included.
The reason is that the commented commit already adds the folder
/lib/modules/%{KERNELRELEASE} in kernel.list file so the folder
/lib/modules/%{KERNELRELEASE}/dtb is no longer necessary, just remove it.
Fixes:
8d1001f7bdd0 ("kbuild: rpm-pkg: fix build error with CONFIG_MODULES=n")
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Nathan Chancellor [Mon, 8 Jul 2024 05:06:47 +0000 (22:06 -0700)]
kbuild: Make ld-version.sh more robust against version string changes
After [1] in upstream LLVM, ld.lld's version output became slightly
different when the cmake configuration option LLVM_APPEND_VC_REV is
disabled.
Before:
Debian LLD 19.0.0 (compatible with GNU linkers)
After:
Debian LLD 19.0.0, compatible with GNU linkers
This results in ld-version.sh failing with
scripts/ld-version.sh: 18: arithmetic expression: expecting EOF: "10000 * 19 + 100 * 0 + 0,"
because the trailing comma is included in the patch level part of the
expression. While [1] has been partially reverted in [2] to avoid this
breakage (as it impacts the configuration stage and it is present in all
LTS branches), it would be good to make ld-version.sh more robust
against such miniscule changes like this one.
Use POSIX shell parameter expansion [3] to remove the largest suffix
after just numbers and periods, replacing of the current removal of
everything after a hyphen. ld-version.sh continues to work for a number
of distributions (Arch Linux, Debian, and Fedora) and the kernel.org
toolchains and no longer errors on a version of ld.lld with [1].
Fixes:
02aff8592204 ("kbuild: check the minimum linker version in Kconfig")
Link: https://github.com/llvm/llvm-project/commit/0f9fbbb63cfcd2069441aa2ebef622c9716f8dbb
Link: https://github.com/llvm/llvm-project/commit/649cdfc4b6781a350dfc87d9b2a4b5a4c3395909
Link: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html
Suggested-by: Fangrui Song <maskray@google.com>
Reviewed-by: Fangrui Song <maskray@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Linus Torvalds [Sun, 14 Jul 2024 17:18:25 +0000 (10:18 -0700)]
Merge tag 'sched_urgent_for_v6.10' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Fix a performance regression when measuring the CPU time of a thread
(clock_gettime(CLOCK_THREAD_CPUTIME_ID,...)) due to the addition of
PSI IRQ time accounting in the hotpath
- Fix a task_struct leak due to missing to decrement the refcount when
the task is enqueued before the timer which is supposed to do that,
expires
- Revert an attempt to expedite detaching of movable tasks, as finding
those could become very costly. Turns out the original issue wasn't
even hit by anyone
* tag 'sched_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath
sched/deadline: Fix task_struct reference leak
Revert "sched/fair: Make sure to try to detach at least one movable task"
Linus Torvalds [Sun, 14 Jul 2024 17:11:20 +0000 (10:11 -0700)]
Merge tag 'x86_urgent_for_v6.10' of git://git./linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
- Make sure TF is cleared before calling other functions (BHI
mitigation in this case) in the SYSENTER compat handler, as
otherwise it will warn about being in single-step mode
* tag 'x86_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bhi: Avoid warning in #DB handler due to BHI mitigation
Linus Torvalds [Sat, 13 Jul 2024 23:34:22 +0000 (16:34 -0700)]
Merge tag 'i2c-for-6.10-rc8' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Fixes for the I2C testunit, the Renesas R-Car driver and some
MAINTAINERS corrections"
* tag 'i2c-for-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: testunit: avoid re-issued work after read message
i2c: rcar: ensure Gen3+ reset does not disturb local targets
i2c: mark HostNotify target address as used
i2c: testunit: correct Kconfig description
MAINTAINERS: VIRTIO I2C loses a maintainer, gains a reviewer
MAINTAINERS: delete entries for Thor Thayer
i2c: rcar: clear NO_RXDMA flag after resetting
i2c: rcar: bring hardware to known state when probing
Linus Torvalds [Sat, 13 Jul 2024 20:00:25 +0000 (13:00 -0700)]
Merge tag '6.10-rc7-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fix from Steve French:
"Small fix, also for stable"
* tag '6.10-rc7-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix setting SecurityFlags to true
Steve French [Tue, 9 Jul 2024 23:07:35 +0000 (18:07 -0500)]
cifs: fix setting SecurityFlags to true
If you try to set /proc/fs/cifs/SecurityFlags to 1 it
will set them to CIFSSEC_MUST_NTLMV2 which no longer is
relevant (the less secure ones like lanman have been removed
from cifs.ko) and is also missing some flags (like for
signing and encryption) and can even cause mount to fail,
so change this to set it to Kerberos in this case.
Also change the description of the SecurityFlags to remove mention
of flags which are no longer supported.
Cc: stable@vger.kernel.org
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Wolfram Sang [Sat, 13 Jul 2024 08:50:55 +0000 (10:50 +0200)]
Merge tag 'i2c-host-fixes-6.10-rc8' of git://git./linux/kernel/git/andi.shyti/linux into i2c/for-current
This tag includes three fixes for the Renesas R-Car driver:
1. Ensures the device is in a known state after probing.
2. Allows clearing the NO_RXDMA flag after a reset.
3. Forces a reset before any transfer on Gen3+ platforms to
prevent disruption of the configuration during parallel
transfers.
Linus Torvalds [Sat, 13 Jul 2024 01:33:33 +0000 (18:33 -0700)]
Merge tag 'net-6.10-rc8-2' of git://git./linux/kernel/git/netdev/net
Pull more networking fixes from Jakub Kicinski:
"A quick follow up to yesterday's pull. We got a regressions report for
the bnxt patch as soon as it got to your tree. The ethtool fix is also
good to have, although it's an older regression.
Current release - regressions:
- eth: bnxt_en: fix crash in bnxt_get_max_rss_ctx_ring() on older HW
when user tries to decrease the ring count
Previous releases - regressions:
- ethtool: fix RSS setting, accept "no change" setting if the driver
doesn't support the new features
- eth: i40e: remove needless retries of NVM update, don't wait 20min
when we know the firmware update won't succeed"
* tag 'net-6.10-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net:
bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring()
octeontx2-af: fix issue with IPv4 match for RSS
octeontx2-af: fix issue with IPv6 ext match for RSS
octeontx2-af: fix detection of IP layer
octeontx2-af: fix a issue with cpt_lf_alloc mailbox
octeontx2-af: replace cpt slot with lf id on reg write
i40e: fix: remove needless retries of NVM update
net: ethtool: Fix RSS setting
Michael Chan [Fri, 12 Jul 2024 17:53:18 +0000 (10:53 -0700)]
bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring()
On older chips not supporting multiple RSS contexts, reducing
ethtool channels will crash:
BUG: kernel NULL pointer dereference, address:
00000000000000b8
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 7032 Comm: ethtool Tainted: G S 6.10.0-rc4 #1
Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
RIP: 0010:bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en]
Code: c3 d3 eb 4c 8b 83 38 01 00 00 48 8d bb 38 01 00 00 4c 39 c7 74 42 41 8d 54 24 ff 31 c0 0f b7 d2 4c 8d 4c 12 02 66 85 ed 74 1d <49> 8b 90 b8 00 00 00 49 8d 34 11 0f b7 0a 66 39 c8 0f 42 c1 48 83
RSP: 0018:
ffffaaa501d23ba8 EFLAGS:
00010202
RAX:
0000000000000000 RBX:
ffff8efdf600c940 RCX:
0000000000000000
RDX:
000000000000007f RSI:
ffffffffacf429c4 RDI:
ffff8efdf600ca78
RBP:
0000000000000080 R08:
0000000000000000 R09:
0000000000000100
R10:
0000000000000001 R11:
ffffaaa501d238c0 R12:
0000000000000080
R13:
0000000000000000 R14:
ffff8efdf600c000 R15:
0000000000000006
FS:
00007f977a7d2740(0000) GS:
ffff8f041f840000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000000000b8 CR3:
00000002320aa004 CR4:
00000000003706f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<TASK>
? __die_body+0x15/0x60
? page_fault_oops+0x157/0x440
? do_user_addr_fault+0x60/0x770
? _raw_spin_lock_irqsave+0x12/0x40
? exc_page_fault+0x61/0x120
? asm_exc_page_fault+0x22/0x30
? bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en]
? bnxt_get_max_rss_ctx_ring+0x25/0x90 [bnxt_en]
bnxt_set_channels+0x9d/0x340 [bnxt_en]
ethtool_set_channels+0x14b/0x210
__dev_ethtool+0xdf8/0x2890
? preempt_count_add+0x6a/0xa0
? percpu_counter_add_batch+0x23/0x90
? filemap_map_pages+0x417/0x4a0
? avc_has_extended_perms+0x185/0x420
? __pfx_udp_ioctl+0x10/0x10
? sk_ioctl+0x55/0xf0
? kmalloc_trace_noprof+0xe0/0x210
? dev_ethtool+0x54/0x170
dev_ethtool+0xa2/0x170
dev_ioctl+0xbe/0x530
sock_do_ioctl+0xa3/0xf0
sock_ioctl+0x20d/0x2e0
bp->rss_ctx_list is not initialized if the chip or firmware does not
support multiple RSS contexts. Fix it by adding a check in
bnxt_get_max_rss_ctx_ring() before proceeding to reference
bp->rss_ctx_list.
Fixes:
0d1b7d6c9274 ("bnxt: fix crashes when reducing ring count with active RSS contexts")
Reported-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/netdev/ZpFEJeNpwxW1aW9k@gmail.com/
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240712175318.166811-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Fri, 12 Jul 2024 19:08:42 +0000 (12:08 -0700)]
Merge tag 'for-6.10-rc7-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Fix a regression in extent map shrinker behaviour.
In the past weeks we got reports from users that there are huge
latency spikes or freezes. This was bisected to newly added shrinker
of extent maps (it was added to fix a build up of the structures in
memory).
I'm assuming that the freezes would happen to many users after release
so I'd like to get it merged now so it's in 6.10. Although the diff
size is not small the changes are relatively straightforward, the
reporters verified the fixes and we did testing on our side.
The fixes:
- adjust behaviour under memory pressure and check lock or scheduling
conditions, bail out if needed
- synchronize tracking of the scanning progress so inode ranges are
not skipped or work duplicated
- do a delayed iput when scanning a root so evicting an inode does
not slow things down in case of lots of dirty data, also fix
lockdep warning, a deadlock could happen when writing the dirty
data would need to start a transaction"
* tag 'for-6.10-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: avoid races when tracking progress for extent map shrinking
btrfs: stop extent map shrinker if reschedule is needed
btrfs: use delayed iput during extent map shrinking
Linus Torvalds [Fri, 12 Jul 2024 17:39:29 +0000 (10:39 -0700)]
Merge tag 'ceph-for-6.10-rc8' of https://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"A fix for a possible use-after-free following "rbd unmap" or "umount"
marked for stable and two kernel-doc fixups"
* tag 'ceph-for-6.10-rc8' of https://github.com/ceph/ceph-client:
libceph: fix crush_choose_firstn() kernel-doc warnings
libceph: suppress crush_choose_indep() kernel-doc warnings
libceph: fix race between delayed_work() and ceph_monc_stop()
Linus Torvalds [Fri, 12 Jul 2024 17:29:49 +0000 (10:29 -0700)]
Merge tag 'pmdomain-v6.10-rc2' of git://git./linux/kernel/git/ulfh/linux-pm
Pull pmdomain fix from Ulf Hansson:
- qcom: Skip retention level for rpmhpd's
* tag 'pmdomain-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: qcom: rpmhpd: Skip retention level for Power Domains
Linus Torvalds [Fri, 12 Jul 2024 17:26:48 +0000 (10:26 -0700)]
Merge tag 'mmc-v6.10-rc4-2' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- davinci_mmc: Prevent transmitted data size from exceeding sgm's
length
- sdhci: Fix max_seg_size for 64KiB PAGE_SIZE
* tag 'mmc-v6.10-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: davinci_mmc: Prevent transmitted data size from exceeding sgm's length
mmc: sdhci: Fix max_seg_size for 64KiB PAGE_SIZE
Linus Torvalds [Fri, 12 Jul 2024 16:00:25 +0000 (09:00 -0700)]
Merge tag 'arm-fixes-6.10-3' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"Most of these changes are Qualcomm SoC specific and came in just after
I sent out the last set of fixes. This includes two regression fixes
for SoC drivers, a defconfig change to ensure the Lenovo X13s is
usable and 11 changes to DT files to fix regressions and minor
platform specific issues.
Tony and Chunyan step back from their respective maintainership roles
on the omap and unisoc platforms, and Christophe in turn takes over
maintaining some of the Freescale SoC drivers that he has been taking
care of in practice already.
Lastly, there are two trivial fixes for the davinci and sunxi
platforms"
* tag 'arm-fixes-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
MAINTAINERS: Update FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY
MAINTAINERS: Add more maintainers for omaps
ARM: davinci: Convert comma to semicolon
MAINTAINERS: Move myself from SPRD Maintainer to Reviewer
Revert "dt-bindings: cache: qcom,llcc: correct QDU1000 reg entries"
arm64: dts: qcom: qdu1000: Fix LLCC reg property
arm64: dts: qcom: sm6115: add iommu for sdhc_1
arm64: dts: qcom: x1e80100-crd: fix DAI used for headset recording
arm64: dts: qcom: x1e80100-crd: fix WCD audio codec TX port mapping
soc: qcom: pmic_glink: disable UCSI on sc8280xp
arm64: defconfig: enable Elan i2c-hid driver
arm64: dts: qcom: sc8280xp-crd: use external pull up for touch reset
arm64: dts: qcom: sc8280xp-x13s: fix touchscreen power on
arm64: dts: qcom: x1e80100: Fix PCIe 6a reg offsets and add MHI
arm64: dts: qcom: sa8775p: Correct IRQ number of EL2 non-secure physical timer
arm64: dts: allwinner: Fix PMIC interrupt number
arm64: dts: qcom: sc8280xp: Set status = "reserved" on PSHOLD
arm64: dts: qcom: x1e80100-*: Allocate some CMA buffers
arm64: dts: qcom: sc8180x: Fix LLCC reg property again
Linus Torvalds [Fri, 12 Jul 2024 15:45:27 +0000 (08:45 -0700)]
Merge tag 'char-misc-6.10-final' of git://git./linux/kernel/git/gregkh/char-misc
Pull char / misc driver fixes from Greg KH:
"Here are some small remaining driver fixes for 6.10-final that have
all been in linux-next for a while and resolve reported issues.
Included in here are:
- mei driver fixes (and a spelling fix at the end just to be clean)
- iio driver fixes for reported problems
- fastrpc bugfixes
- nvmem small fixes"
* tag 'char-misc-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
mei: vsc: Fix spelling error
mei: vsc: Enhance SPI transfer of IVSC ROM
mei: vsc: Utilize the appropriate byte order swap function
mei: vsc: Prevent timeout error with added delay post-firmware download
mei: vsc: Enhance IVSC chipset stability during warm reboot
nvmem: core: limit cell sysfs permissions to main attribute ones
nvmem: core: only change name to fram for current attribute
nvmem: meson-efuse: Fix return value of nvmem callbacks
nvmem: rmem: Fix return value of rmem_read()
misc: microchip: pci1xxxx: Fix return value of nvmem callbacks
hpet: Support 32-bit userspace
misc: fastrpc: Restrict untrusted app to attach to privileged PD
misc: fastrpc: Fix ownership reassignment of remote heap
misc: fastrpc: Fix memory leak in audio daemon attach operation
misc: fastrpc: Avoid updating PD type for capability request
misc: fastrpc: Copy the complete capability structure to user
misc: fastrpc: Fix DSP capabilities request
iio: light: apds9306: Fix error handing
iio: trigger: Fix condition for own trigger
Linus Torvalds [Fri, 12 Jul 2024 15:39:44 +0000 (08:39 -0700)]
Merge tag 'tty-6.10-final' of git://git./linux/kernel/git/gregkh/tty
Pull tty / serial fixes from Greg KH:
"Here are some small serial driver fixes for 6.10-final. Included in
here are:
- qcom-geni fixes for a much much much discussed issue and everyone
now seems to be agreed that this is the proper way forward to
resolve the reported lockups
- imx serial driver bugfixes
- 8250_omap errata fix
- ma35d1 serial driver bugfix
All of these have been in linux-next for over a week with no reported
issues"
* tag 'tty-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: qcom-geni: do not kill the machine on fifo underrun
serial: qcom-geni: fix hard lockup on buffer flush
serial: qcom-geni: fix soft lockup on sw flow control and suspend
serial: imx: ensure RTS signal is not left active after shutdown
tty: serial: ma35d1: Add a NULL check for of_node
serial: 8250_omap: Fix Errata i2310 with RX FIFO level check
serial: imx: only set receiver level if it is zero
Linus Torvalds [Fri, 12 Jul 2024 15:35:56 +0000 (08:35 -0700)]
Merge tag 'usb-6.10-final' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB driver fixes and new device ids for
6.10-final. Included in here are:
- new usb-serial device ids for reported devices
- syzbot-triggered duplicate endpoint bugfix
- gadget bugfix for configfs memory overwrite
- xhci resume bugfix
- new device quirk added
- usb core error path bugfix
All of these have been in linux-next (most for a while) with no
reported issues"
* tag 'usb-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: mos7840: fix crash on resume
USB: serial: option: add Rolling RW350-GL variants
USB: serial: option: add support for Foxconn T99W651
USB: serial: option: add Netprisma LCUK54 series modules
usb: gadget: configfs: Prevent OOB read/write in usb_string_copy()
usb: dwc3: pci: add support for the Intel Panther Lake
usb: core: add missing of_node_put() in usb_of_has_devices_or_graph
USB: Add USB_QUIRK_NO_SET_INTF quirk for START BP-850k
USB: core: Fix duplicate endpoint bug by clearing reserved bits in the descriptor
xhci: always resume roothubs if xHC was reset during resume
USB: serial: option: add Telit generic core-dump composition
USB: serial: option: add Fibocom FM350-GL
USB: serial: option: add Telit FN912 rmnet compositions
Linus Torvalds [Fri, 12 Jul 2024 15:32:40 +0000 (08:32 -0700)]
Merge tag 'sound-6.10' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"The majority of changes here are small device-specific fixes for ASoC
SOF / Intel and usual HD-audio quirks.
The only significant high LOC is found in the Cirrus firmware driver,
but all those are for hardening against malicious firmware blobs, and
they look fine for taking as a last minute fix, too"
* tag 'sound-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Enable Mute LED on HP 250 G7
firmware: cs_dsp: Use strnlen() on name fields in V1 wmfw files
ALSA: hda/realtek: Limit mic boost on VAIO PRO PX
ALSA: hda: cs35l41: Fix swapped l/r audio channels for Lenovo ThinBook 13x Gen4
ASoC: SOF: Intel: hda-pcm: Limit the maximum number of periods by MAX_BDL_ENTRIES
ASoC: rt711-sdw: add missing readable registers
ASoC: SOF: Intel: hda: fix null deref on system suspend entry
ALSA: hda/realtek: add quirk for Clevo V5[46]0TU
firmware: cs_dsp: Prevent buffer overrun when processing V2 alg headers
firmware: cs_dsp: Validate payload length before processing block
firmware: cs_dsp: Return error if block header overflows file
firmware: cs_dsp: Fix overflow checking of wmfw header
Linus Torvalds [Fri, 12 Jul 2024 15:22:43 +0000 (08:22 -0700)]
Merge tag 'bcachefs-2024-07-12' of https://evilpiepirate.org/git/bcachefs
Pull more bcachefs fixes from Kent Overstreet:
- revert the SLAB_ACCOUNT patch, something crazy is going on in memcg
and someone forgot to test
- minor fixes: missing rcu_read_lock(), scheduling while atomic (in an
emergency shutdown path)
- two lockdep fixes; these could have gone earlier, but were left to
bake awhile
* tag 'bcachefs-2024-07-12' of https://evilpiepirate.org/git/bcachefs:
bcachefs: bch2_gc_btree() should not use btree_root_lock
bcachefs: Set PF_MEMALLOC_NOFS when trans->locked
bcachefs; Use trans_unlock_long() when waiting on allocator
Revert "bcachefs: Mark bch_inode_info as SLAB_ACCOUNT"
bcachefs: fix scheduling while atomic in break_cycle()
bcachefs: Fix RCU splat
David S. Miller [Fri, 12 Jul 2024 12:42:02 +0000 (13:42 +0100)]
Merge branch 'octeontx2-cpt-rss-cfg-fixes' into main
Srujana Challa says:
====================
Fixes for CPT and RSS configuration
This series of patches fixes various issues related to CPT
configuration and RSS configuration.
v1->v2:
- Excluded the patch "octeontx2-af: reduce cpt flt interrupt vectors for
cn10kb" to submit it to net-next.
- Addressed the review comments.
Kiran Kumar K (1):
octeontx2-af: Fix issue with IPv6 ext match for RSS
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Satheesh Paul [Wed, 10 Jul 2024 07:51:27 +0000 (13:21 +0530)]
octeontx2-af: fix issue with IPv4 match for RSS
While performing RSS based on IPv4, packets with
IPv4 options are not being considered. Adding changes
to match both plain IPv4 and IPv4 with option header.
Fixes:
41a7aa7b800d ("octeontx2-af: NIX Rx flowkey configuration for RSS")
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kiran Kumar K [Wed, 10 Jul 2024 07:51:26 +0000 (13:21 +0530)]
octeontx2-af: fix issue with IPv6 ext match for RSS
While performing RSS based on IPv6, extension ltype
is not being considered. This will be problem for
fragmented packets or packets with extension header.
Adding changes to match IPv6 ext header along with IPv6
ltype.
Fixes:
41a7aa7b800d ("octeontx2-af: NIX Rx flowkey configuration for RSS")
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Mazur [Wed, 10 Jul 2024 07:51:25 +0000 (13:21 +0530)]
octeontx2-af: fix detection of IP layer
Checksum and length checks are not enabled for IPv4 header with
options and IPv6 with extension headers.
To fix this a change in enum npc_kpu_lc_ltype is required which will
allow adjustment of LTYPE_MASK to detect all types of IP headers.
Fixes:
21e6699e5cd6 ("octeontx2-af: Add NPC KPU profile")
Signed-off-by: Michal Mazur <mmazur2@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Srujana Challa [Wed, 10 Jul 2024 07:51:24 +0000 (13:21 +0530)]
octeontx2-af: fix a issue with cpt_lf_alloc mailbox
This patch fixes CPT_LF_ALLOC mailbox error due to
incompatible mailbox message format. Specifically, it
corrects the `blkaddr` field type from `int` to `u8`.
Fixes:
de2854c87c64 ("octeontx2-af: Mailbox changes for 98xx CPT block")
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nithin Dabilpuram [Wed, 10 Jul 2024 07:51:23 +0000 (13:21 +0530)]
octeontx2-af: replace cpt slot with lf id on reg write
Replace slot id with global CPT lf id on reg read/write as
CPTPF/VF driver would send slot number instead of global
lf id in the reg offset. And also update the mailbox response
with the global lf's register offset.
Fixes:
ae454086e3c2 ("octeontx2-af: add mailbox interface for CPT")
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe Leroy [Fri, 12 Jul 2024 11:14:26 +0000 (13:14 +0200)]
MAINTAINERS: Update FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY
FREESCALE SOC DRIVERS has been orphaned since
commit
eaac25d026a1 ("MAINTAINERS: Drop Li Yang as their email address
stopped working")
QUICC ENGINE LIBRARY has Qiang Zhao as maintainer but he hasn't
responded for years and when Li Yang was still maintaining FREESCALE
SOC DRIVERS he was also handling QUICC ENGINE LIBRARY directly.
As a maintainer of LINUX FOR POWERPC EMBEDDED PPC8XX AND PPC83XX, I
also need FREESCALE SOC DRIVERS to be actively maintained, so add
myself as maintainer of FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY.
See below link for more context.
Link: https://lore.kernel.org/linuxppc-dev/20240219153016.ntltc76bphwrv6hn@skbuf/T/#mf6d4a5eef79e8eae7ae0456a2794c01e630a6756
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tony Lindgren [Tue, 9 Jul 2024 13:59:29 +0000 (16:59 +0300)]
MAINTAINERS: Add more maintainers for omaps
There are many generations of omaps to maintain, and I will be only active
as a hobbyist with time permitting. Let's add more maintainers to ensure
continued Linux support.
TI is interested in maintaining the active SoCs such as am3, am4 and
dra7. And the hobbyists are interested in maintaining some of the older
devices, mainly based on omap3 and 4 SoCs.
Kevin and Roger have agreed to maintain the active TI parts. Both Kevin
and Roger have been working on the omap variants for a long time, and
have a good understanding of the hardware.
Aaro and Andreas have agreed to maintain the community devices. Both Aaro
and Andreas have long experience on working with the earlier TI SoCs.
While at it, let's also change me to be a reviewer for the omap1, and
drop the link to my old omap web page.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Acked-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Wolfram Sang [Thu, 11 Jul 2024 12:08:19 +0000 (14:08 +0200)]
i2c: testunit: avoid re-issued work after read message
The to-be-fixed commit rightfully prevented that the registers will be
cleared. However, the index must be cleared. Otherwise a read message
will re-issue the last work. Fix it and add a comment describing the
situation.
Fixes:
c422b6a63024 ("i2c: testunit: don't erase registers after STOP")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Aleksandr Loktionov [Wed, 10 Jul 2024 22:44:54 +0000 (15:44 -0700)]
i40e: fix: remove needless retries of NVM update
Remove wrong EIO to EGAIN conversion and pass all errors as is.
After commit
230f3d53a547 ("i40e: remove i40e_status"), which should only
replace F/W specific error codes with Linux kernel generic, all EIO errors
suddenly started to be converted into EAGAIN which leads nvmupdate to retry
until it timeouts and sometimes fails after more than 20 minutes in the
middle of NVM update, so NVM becomes corrupted.
The bug affects users only at the time when they try to update NVM, and
only F/W versions that generate errors while nvmupdate. For example, X710DA2
with 0x8000ECB7 F/W is affected, but there are probably more...
Command for reproduction is just NVM update:
./nvmupdate64
In the log instead of:
i40e_nvmupd_exec_aq err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_ENOMEM)
appears:
i40e_nvmupd_exec_aq err -EIO aq_err I40E_AQ_RC_ENOMEM
i40e: eeprom check failed (-5), Tx/Rx traffic disabled
The problematic code did silently convert EIO into EAGAIN which forced
nvmupdate to ignore EAGAIN error and retry the same operation until timeout.
That's why NVM update takes 20+ minutes to finish with the fail in the end.
Fixes:
230f3d53a547 ("i40e: remove i40e_status")
Co-developed-by: Kelvin Kang <kelvin.kang@intel.com>
Signed-off-by: Kelvin Kang <kelvin.kang@intel.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240710224455.188502-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Saeed Mahameed [Wed, 10 Jul 2024 22:55:38 +0000 (15:55 -0700)]
net: ethtool: Fix RSS setting
When user submits a rxfh set command without touching XFRM_SYM_XOR,
rxfh.input_xfrm is set to RXH_XFRM_NO_CHANGE, which is equal to 0xff.
Testing if (rxfh.input_xfrm & RXH_XFRM_SYM_XOR &&
!ops->cap_rss_sym_xor_supported)
return -EOPNOTSUPP;
Will always be true on devices that don't set cap_rss_sym_xor_supported,
since rxfh.input_xfrm & RXH_XFRM_SYM_XOR is always true, if input_xfrm
was not set, i.e RXH_XFRM_NO_CHANGE=0xff, which will result in failure
of any command that doesn't require any change of XFRM, e.g RSS context
or hash function changes.
To avoid this breakage, test if rxfh.input_xfrm != RXH_XFRM_NO_CHANGE
before testing other conditions. Note that the problem will only trigger
with XFRM-aware userspace, old ethtool CLI would continue to work.
Fixes:
0dd415d15505 ("net: ethtool: add a NO_CHANGE uAPI for new RXFH's input_xfrm")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://patch.msgid.link/20240710225538.43368-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kent Overstreet [Fri, 5 Jul 2024 01:02:16 +0000 (21:02 -0400)]
bcachefs: bch2_gc_btree() should not use btree_root_lock
btree_root_lock is for the root keys in btree_root, not the pointers to
the nodes themselves; this fixes a lock ordering issue between
btree_root_lock and btree node locks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 4 Jul 2024 00:35:36 +0000 (20:35 -0400)]
bcachefs: Set PF_MEMALLOC_NOFS when trans->locked
proper lock ordering is: fs_reclaim -> btree node locks
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 11 Jul 2024 20:14:11 +0000 (16:14 -0400)]
bcachefs; Use trans_unlock_long() when waiting on allocator
not using unlock_long() blocks key cache reclaim, and the allocator may
take awhile
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 12 Jul 2024 00:01:31 +0000 (20:01 -0400)]
Revert "bcachefs: Mark bch_inode_info as SLAB_ACCOUNT"
This reverts commit
86d81ec5f5f05846c7c6e48ffb964b24cba2e669.
This wasn't tested with memcg enabled, it immediately hits a null ptr
deref in list_lru_add().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Wolfram Sang [Thu, 11 Jul 2024 08:30:44 +0000 (10:30 +0200)]
i2c: rcar: ensure Gen3+ reset does not disturb local targets
R-Car Gen3+ needs a reset before every controller transfer. That erases
configuration of a potentially in parallel running local target
instance. To avoid this disruption, avoid controller transfers if a
local target is running. Also, disable SMBusHostNotify because it
requires being a controller and local target at the same time.
Fixes:
3b770017b03a ("i2c: rcar: handle RXDMA HW behaviour on Gen3")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Linus Torvalds [Thu, 11 Jul 2024 22:11:14 +0000 (15:11 -0700)]
Merge tag 'for-6.10/dm-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mikulas Patocka:
- Fix broken discard for device mapper VDO target
* tag 'for-6.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm vdo: replace max_discard_sectors with max_hw_discard_sectors