Wojciech Gładysz [Wed, 3 Jul 2024 07:01:12 +0000 (09:01 +0200)]
ext4: sanity check for NULL pointer after ext4_force_shutdown
Test case: 2 threads write short inline data to a file.
In ext4_page_mkwrite the resulting inline data is converted.
Handling ext4_grp_locked_error with description "block bitmap
and bg descriptor inconsistent: X vs Y free clusters" calls
ext4_force_shutdown. The conversion clears
EXT4_STATE_MAY_INLINE_DATA but fails for
ext4_destroy_inline_data_nolock and ext4_mark_iloc_dirty due
to ext4_forced_shutdown. The restoration of inline data fails
for the same reason not setting EXT4_STATE_MAY_INLINE_DATA.
Without the flag set a regular process path in ext4_da_write_end
follows trying to dereference page folio private pointer that has
not been set. The fix calls early return with -EIO error shall the
pointer to private be NULL.
Sample crash report:
Unable to handle kernel paging request at virtual address
dfff800000000004
KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
Data abort info:
ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[
dfff800000000004] address between user and kernel address ranges
Internal error: Oops:
0000000096000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 20274 Comm: syz-executor185 Not tainted
6.9.0-rc7-syzkaller-gfda5695d692c #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
pstate:
80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __block_commit_write+0x64/0x2b0 fs/buffer.c:2167
lr : __block_commit_write+0x3c/0x2b0 fs/buffer.c:2160
sp :
ffff8000a1957600
x29:
ffff8000a1957610 x28:
dfff800000000000 x27:
ffff0000e30e34b0
x26:
0000000000000000 x25:
dfff800000000000 x24:
dfff800000000000
x23:
fffffdffc397c9e0 x22:
0000000000000020 x21:
0000000000000020
x20:
0000000000000040 x19:
fffffdffc397c9c0 x18:
1fffe000367bd196
x17:
ffff80008eead000 x16:
ffff80008ae89e3c x15:
00000000200000c0
x14:
1fffe0001cbe4e04 x13:
0000000000000000 x12:
0000000000000000
x11:
0000000000000001 x10:
0000000000ff0100 x9 :
0000000000000000
x8 :
0000000000000004 x7 :
0000000000000000 x6 :
0000000000000000
x5 :
fffffdffc397c9c0 x4 :
0000000000000020 x3 :
0000000000000020
x2 :
0000000000000040 x1 :
0000000000000020 x0 :
fffffdffc397c9c0
Call trace:
__block_commit_write+0x64/0x2b0 fs/buffer.c:2167
block_write_end+0xb4/0x104 fs/buffer.c:2253
ext4_da_do_write_end fs/ext4/inode.c:2955 [inline]
ext4_da_write_end+0x2c4/0xa40 fs/ext4/inode.c:3028
generic_perform_write+0x394/0x588 mm/filemap.c:3985
ext4_buffered_write_iter+0x2c0/0x4ec fs/ext4/file.c:299
ext4_file_write_iter+0x188/0x1780
call_write_iter include/linux/fs.h:2110 [inline]
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0x968/0xc3c fs/read_write.c:590
ksys_write+0x15c/0x26c fs/read_write.c:643
__do_sys_write fs/read_write.c:655 [inline]
__se_sys_write fs/read_write.c:652 [inline]
__arm64_sys_write+0x7c/0x90 fs/read_write.c:652
__invoke_syscall arch/arm64/kernel/syscall.c:34 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:48
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:133
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:152
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
Code:
97f85911 f94002da 91008356 d343fec8 (
38796908)
---[ end trace
0000000000000000 ]---
----------------
Code disassembly (best guess):
0:
97f85911 bl 0xffffffffffe16444
4:
f94002da ldr x26, [x22]
8:
91008356 add x22, x26, #0x20
c:
d343fec8 lsr x8, x22, #3
* 10:
38796908 ldrb w8, [x8, x25] <-- trapping instruction
Reported-by: syzbot+18df508cf00a0598d9a6@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=
18df508cf00a0598d9a6
Link: https://lore.kernel.org/all/000000000000f19a1406109eb5c5@google.com/T/
Signed-off-by: Wojciech Gładysz <wojciech.gladysz@infogain.com>
Link: https://patch.msgid.link/20240703070112.10235-1-wojciech.gladysz@infogain.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jan Kara [Mon, 1 Jul 2024 13:28:00 +0000 (15:28 +0200)]
jbd2: increase maximum transaction size
Originally, we were quite conservative in limiting maximum transaction
size to a quarter of the journal because we were not accounting
transaction descriptor and revoke blocks. These days we do properly
account them and reserve space for them from the total transaction
credits. Thus there's no need to be so conservative and we can increase
the maximum transaction size to one third of the journal (even half
should work fine in principle but the performance will likely suffer in
that case). This also fixes failures to grow filesystems with tiny
journals.
Link: CA+hUFcuGs04JHZ_WzA1zGN57+ehL2qmHOt5a7RMpo+rv6Vyxtw@mail.gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20240701132800.7158-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jan Kara [Mon, 24 Jun 2024 17:01:20 +0000 (19:01 +0200)]
jbd2: drop pointless shrinker batch initialization
In jbd2_journal_init_common() we set batch size of a shrinker shrinking
checkpointed buffers to journal->j_max_transaction_buffers. But that is
guaranteed to be 0 at that point so we effectively stay with the default
shrinker batch size of 128. It has been like this since introduction of
jbd2 shrinkers so just drop the pointless initialization.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20240624170127.3253-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jan Kara [Mon, 24 Jun 2024 17:01:19 +0000 (19:01 +0200)]
jbd2: avoid infinite transaction commit loop
Commit
9f356e5a4f12 ("jbd2: Account descriptor blocks into
t_outstanding_credits") started to account descriptor blocks into
transactions outstanding credits. However it didn't appropriately
decrease the maximum amount of credits available to userspace. Thus if
the filesystem requests a transaction smaller than
j_max_transaction_buffers but large enough that when descriptor blocks
are added the size exceeds j_max_transaction_buffers, we confuse
add_transaction_credits() into thinking previous handles have grown the
transaction too much and enter infinite journal commit loop in
start_this_handle() -> add_transaction_credits() trying to create
transaction with enough credits available.
Fix the problem by properly accounting for transaction space reserved
for descriptor blocks when verifying requested transaction handle size.
CC: stable@vger.kernel.org
Fixes:
9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits")
Reported-by: Alexander Coffin <alex.coffin@maticrobots.com>
Link: https://lore.kernel.org/all/CA+hUFcuGs04JHZ_WzA1zGN57+ehL2qmHOt5a7RMpo+rv6Vyxtw@mail.gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20240624170127.3253-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jan Kara [Mon, 24 Jun 2024 17:01:18 +0000 (19:01 +0200)]
jbd2: precompute number of transaction descriptor blocks
Instead of computing the number of descriptor blocks a transaction can
have each time we need it (which is currently when starting each
transaction but will become more frequent later) precompute the number
once during journal initialization together with maximum transaction
size. We perform the precomputation whenever journal feature set is
updated similarly as for computation of
journal->j_revoke_records_per_block.
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20240624170127.3253-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jan Kara [Mon, 24 Jun 2024 17:01:17 +0000 (19:01 +0200)]
jbd2: make jbd2_journal_get_max_txn_bufs() internal
There's no reason to have jbd2_journal_get_max_txn_bufs() public
function. Currently all users are internal and can use
journal->j_max_transaction_buffers instead. This saves some unnecessary
recomputations of the limit as a bonus which becomes important as this
function gets more complex in the following patch.
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20240624170127.3253-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Ye Bin [Thu, 20 Jun 2024 07:24:05 +0000 (15:24 +0800)]
jbd2: avoid mount failed when commit block is partial submitted
We encountered a problem that the file system could not be mounted in
the power-off scenario. The analysis of the file system mirror shows that
only part of the data is written to the last commit block.
The valid data of the commit block is concentrated in the first sector.
However, the data of the entire block is involved in the checksum calculation.
For different hardware, the minimum atomic unit may be different.
If the checksum of a committed block is incorrect, clear the data except the
'commit_header' and then calculate the checksum. If the checkusm is correct,
it is considered that the block is partially committed, Then continue to replay
journal.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240620072405.3533701-1-yebin@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jan Kara [Thu, 13 Jun 2024 15:02:34 +0000 (17:02 +0200)]
ext4: avoid writing unitialized memory to disk in EA inodes
If the extended attribute size is not a multiple of block size, the last
block in the EA inode will have uninitialized tail which will get
written to disk. We will never expose the data to userspace but still
this is not a good practice so just zero out the tail of the block as it
isn't going to cause a noticeable performance overhead.
Fixes:
e50e5129f384 ("ext4: xattr-in-inode support")
Reported-by: syzbot+9c1fe13fcb51574b249b@syzkaller.appspotmail.com
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240613150234.25176-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Luis Henriques (SUSE) [Tue, 18 Jun 2024 14:43:12 +0000 (15:43 +0100)]
ext4: don't track ranges in fast_commit if inode has inlined data
When fast-commit needs to track ranges, it has to handle inodes that have
inlined data in a different way because ext4_fc_write_inode_data(), in the
actual commit path, will attempt to map the required blocks for the range.
However, inodes that have inlined data will have it's data stored in
inode->i_block and, eventually, in the extended attribute space.
Unfortunately, because fast commit doesn't currently support extended
attributes, the solution is to mark this commit as ineligible.
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039883
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Tested-by: Ben Hutchings <benh@debian.org>
Fixes:
9725958bb75c ("ext4: fast commit may miss tracking unwritten range during ftruncate")
Link: https://patch.msgid.link/20240618144312.17786-1-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Luis Henriques (SUSE) [Wed, 29 May 2024 09:20:30 +0000 (10:20 +0100)]
ext4: fix possible tid_t sequence overflows
In the fast commit code there are a few places where tid_t variables are
being compared without taking into account the fact that these sequence
numbers may wrap. Fix this issue by using the helper functions tid_gt()
and tid_geq().
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://patch.msgid.link/20240529092030.9557-3-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Luis Henriques (SUSE) [Mon, 27 May 2024 16:14:47 +0000 (17:14 +0100)]
ext4: use ext4_update_inode_fsync_trans() helper in inode creation
Call helper function ext4_update_inode_fsync_trans() instead of open
coding it in __ext4_new_inode(). This helper checks both that the handle
is valid *and* that it hasn't been aborted due to some fatal error in the
journalling layer, using is_handle_aborted().
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Link: https://patch.msgid.link/20240527161447.21434-1-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jeff Johnson [Mon, 27 May 2024 18:02:29 +0000 (11:02 -0700)]
ext4: add missing MODULE_DESCRIPTION()
Fix the 'make W=1' warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/ext4/ext4-inode-test.o
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240527-md-fs-ext4-v1-1-07aad5936bb1@quicinc.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jeff Johnson [Sun, 26 May 2024 18:53:49 +0000 (11:53 -0700)]
jbd2: add missing MODULE_DESCRIPTION()
Fix the 'make W=1' warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/jbd2/jbd2.o
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://patch.msgid.link/20240526-md-fs-jbd2-v1-1-7bba6665327d@quicinc.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Kees Cook [Thu, 23 May 2024 22:54:12 +0000 (15:54 -0700)]
ext4: use memtostr_pad() for s_volume_name
As with the other strings in struct ext4_super_block, s_volume_name is
not NUL terminated. The other strings were marked in commit
072ebb3bffe6
("ext4: add nonstring annotations to ext4.h"). Using strscpy() isn't
the right replacement for strncpy(); it should use memtostr_pad()
instead.
Reported-by: syzbot+50835f73143cc2905b9e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/
00000000000019f4c00619192c05@google.com/
Fixes:
744a56389f73 ("ext4: replace deprecated strncpy with alternatives")
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://patch.msgid.link/20240523225408.work.904-kees@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Mon, 20 May 2024 13:18:31 +0000 (21:18 +0800)]
jbd2: speed up jbd2_transaction_committed()
jbd2_transaction_committed() is used to check whether a transaction with
the given tid has already committed, it holds j_state_lock in read mode
and check the tid of current running transaction and committing
transaction, but holding the j_state_lock is expensive.
We have already stored the sequence number of the most recently
committed transaction in journal t->j_commit_sequence, we could do this
check by comparing it with the given tid instead. If the given tid isn't
smaller than j_commit_sequence, we can ensure that the given transaction
has been committed. That way we could drop the expensive lock and
achieve about 10% ~ 20% performance gains in concurrent DIOs on may
virtual machine with 100G ramdisk.
fio -filename=/mnt/foo -direct=1 -iodepth=10 -rw=$rw -ioengine=libaio \
-bs=4k -size=10G -numjobs=10 -runtime=60 -overwrite=1 -name=test \
-group_reporting
Before:
overwrite IOPS=88.2k, BW=344MiB/s
read IOPS=95.7k, BW=374MiB/s
rand overwrite IOPS=98.7k, BW=386MiB/s
randread IOPS=102k, BW=397MiB/s
After:
overwrite IOPS=105k, BW=410MiB/s
read IOPS=112k, BW=436MiB/s
rand overwrite IOPS=104k, BW=404MiB/s
randread IOPS=111k, BW=432MiB/s
CC: Dave Chinner <david@fromorbit.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Link: https://lore.kernel.org/linux-ext4/ZjILCPNZRHeazSqV@dread.disaster.area/
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/20240520131831.2910790-1-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Fri, 17 May 2024 12:40:05 +0000 (20:40 +0800)]
ext4: make ext4_da_map_blocks() buffer_head unaware
After calling the ext4_da_map_blocks(), a delalloc extent state could
be identified through the EXT4_MAP_DELAYED flag in map. So factor out
buffer_head related handles in ext4_da_map_blocks(), make this function
buffer_head unaware and becomes a common helper, and also update the
stale function commtents, preparing for the iomap da write path in the
future.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-11-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Fri, 17 May 2024 12:40:04 +0000 (20:40 +0800)]
ext4: make ext4_insert_delayed_block() insert multi-blocks
Rename ext4_insert_delayed_block() to ext4_insert_delayed_blocks(),
pass length parameter to make it insert multiple delalloc blocks at a
time. For non-bigalloc case, just reserve len blocks and insert delalloc
extent. For bigalloc case, we can ensure that the clusters in the middle
of a extent must be unallocated, we only need to check whether the start
and end clusters are delayed/allocated. We should subtract the space for
the start and/or end block(s) if they are allocated.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-10-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Fri, 17 May 2024 12:40:03 +0000 (20:40 +0800)]
ext4: factor out a helper to check the cluster allocation state
Factor out a common helper ext4_clu_alloc_state(), check whether the
cluster containing a delalloc block to be added has been allocated or
has delalloc reservation, no logic changes.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-9-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Fri, 17 May 2024 12:40:02 +0000 (20:40 +0800)]
ext4: make ext4_da_reserve_space() reserve multi-clusters
Add 'nr_resv' parameter to ext4_da_reserve_space(), which indicates the
number of clusters wants to reserve, make it reserve multiple clusters
at a time.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-8-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Fri, 17 May 2024 12:40:01 +0000 (20:40 +0800)]
ext4: make ext4_es_insert_delayed_block() insert multi-blocks
Rename ext4_es_insert_delayed_block() to ext4_es_insert_delayed_extent()
and pass length parameter to make it insert multiple delalloc blocks at
a time. For the case of bigalloc, split the allocated parameter to
lclu_allocated and end_allocated. lclu_allocated indicates the
allocation state of the cluster which is containing the lblk,
end_allocated indicates the allocation state of the extent end, clusters
in the middle of delay allocated extent must be unallocated.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-7-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Fri, 17 May 2024 12:40:00 +0000 (20:40 +0800)]
ext4: drop iblock parameter
The start block of the delalloc extent to be inserted is equal to
map->m_lblk, just drop the duplicate iblock input parameter.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/20240517124005.347221-6-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Fri, 17 May 2024 12:39:59 +0000 (20:39 +0800)]
ext4: trim delalloc extent
In ext4_da_map_blocks(), we could find four kind of extents in the
extent status tree: hole, unwritten, written and delayed extent. Now we
only trim the map len if we found an unwritten extent or a written
extent. This is okay now since map->m_len is always set to one and we
always insert one delayed block at a time. But this will become isn't
okay for other two cases if ext4_insert_delayed_block() and
ext4_da_map_blocks() support inserting multiple map->len blocks later.
1. If we found a hole in the extent status tree which es->es_len is
shorter than the length we want to write, we should trim the
map->m_len to prevent adding extra delay more blocks than we
expected. For example, assume we write data [A, C) to a file that
contains a hole extent [A, B) and a written extent [B, D) in the
cache.
A B C D
before da write: ...hhhhhh|wwwwww....
Then we will get extent [A, B), we should trim map->m_len to B-A
before inserting new delalloc blocks, if not, the range [B, C) will
be duplicated.
2. If we found a delayed extent in the extent status tree which
es->es_len is shorter than the length we want to write, we should
trim the map->m_len to es->es_len and return directly since the front
part of this map has been delayed, we can't insert the delalloc
extent that contains the latter part in this round, we should return
the delayed length and the caller should increase the position and
call ext4_da_map_blocks() again. For example, assume we write data
[A, C) to a file that contains a delayed extent [A, B) in the cache.
A B C
before da write: ...dddddd|hhh....
Then we will get delayed extent [A, B), we should also trim map->m_len
to B-A and return, if not, we will incorrectly assume that the write
is complete and won't insert [B, C).
So we need to always trim the map->m_len if the found es->es_len in the
extent status tree is shorter than the map->m_len, prearing for
inserting a extent with multiple delalloc blocks. This patch only does a
pre-fix, the handle is crude and ext4_da_map_blocks() deserve a cleanup,
we will do that later.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-5-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Fri, 17 May 2024 12:39:58 +0000 (20:39 +0800)]
ext4: warn if delalloc counters are not zero on inactive
The per-inode i_reserved_data_blocks count the reserved delalloc blocks
in a regular file, it should be zero when destroying the file. The
per-fs s_dirtyclusters_counter count all reserved delalloc blocks in a
filesystem, it also should be zero when umounting the filesystem. Now we
have only an error message if the i_reserved_data_blocks is not zero,
which is unable to be simply captured, so add WARN_ON_ONCE to make it
more visable.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-4-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Fri, 17 May 2024 12:39:57 +0000 (20:39 +0800)]
ext4: check the extent status again before inserting delalloc block
ext4_da_map_blocks looks up for any extent entry in the extent status
tree (w/o i_data_sem) and then the looks up for any ondisk extent
mapping (with i_data_sem in read mode).
If it finds a hole in the extent status tree or if it couldn't find any
entry at all, it then takes the i_data_sem in write mode to add a da
entry into the extent status tree. This can actually race with page
mkwrite & fallocate path.
Note that this is ok between
1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the
folio lock
2. ext4 buffered write path v/s ext4 fallocate because of the inode
lock.
But this can race between ext4_page_mkwrite() & ext4 fallocate path
ext4_page_mkwrite() ext4_fallocate()
block_page_mkwrite()
ext4_da_map_blocks()
//find hole in extent status tree
ext4_alloc_file_blocks()
ext4_map_blocks()
//allocate block and unwritten extent
ext4_insert_delayed_block()
ext4_da_reserve_space()
//reserve one more block
ext4_es_insert_delayed_block()
//drop unwritten extent and add delayed extent by mistake
Then, the delalloc extent is wrong until writeback and the extra
reserved block can't be released any more and it triggers below warning:
EXT4-fs (pmem2): Inode 13 (
00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!
Fix the problem by looking up extent status tree again while the
i_data_sem is held in write mode. If it still can't find any entry, then
we insert a new da entry into the extent status tree.
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Fri, 17 May 2024 12:39:56 +0000 (20:39 +0800)]
ext4: factor out a common helper to query extent map
Factor out a new common helper ext4_map_query_blocks() from the
ext4_da_map_blocks(), it query and return the extent map status on the
inode's extent path, no logic changes.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/20240517124005.347221-2-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Luis Henriques (SUSE) [Wed, 15 May 2024 08:28:57 +0000 (09:28 +0100)]
ext4: fix infinite loop when replaying fast_commit
When doing fast_commit replay an infinite loop may occur due to an
uninitialized extent_status struct. ext4_ext_determine_insert_hole() does
not detect the replay and calls ext4_es_find_extent_range(), which will
return immediately without initializing the 'es' variable.
Because 'es' contains garbage, an integer overflow may happen causing an
infinite loop in this function, easily reproducible using fstest generic/039.
This commit fixes this issue by unconditionally initializing the structure
in function ext4_es_find_extent_range().
Thanks to Zhang Yi, for figuring out the real problem!
Fixes:
8016e29f4362 ("ext4: fast commit recovery path")
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20240515082857.32730-1-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Kemeng Shi [Tue, 14 May 2024 11:24:38 +0000 (19:24 +0800)]
jbd2: remove unnecessary "should_sleep" in kjournald2
We only need to sleep if no running transaction is expired. Simply remove
unnecessary "should_sleep".
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-10-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Kemeng Shi [Tue, 14 May 2024 11:24:37 +0000 (19:24 +0800)]
jbd2: remove dead check of JBD2_UNMOUNT in kjournald2
We always set JBD2_UNMOUNT with j_state_lock held in journal_kill_thread.
In kjournald2, we check JBD2_UNMOUNT flag two times under the same
j_state_lock. Then the second check is unnecessary.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-9-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Kemeng Shi [Tue, 14 May 2024 11:24:36 +0000 (19:24 +0800)]
jbd2: remove dead equality check of j_commit_[sequence/request] in kjournald2
The j_commit_[sequence/request] are updated with j_state_lock held during
runtime. In kjournald2, two equality checks of j_commit_[sequence/request]
are under the same j_state_lock, then the second check is unnecessary.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-8-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Kemeng Shi [Tue, 14 May 2024 11:24:35 +0000 (19:24 +0800)]
jbd2: use bh_in instead of jh2bh(jh_in) to simplify code
We save jh2bh(jh_in) to bh_in, so use bh_in directly instead of
jh2bh(jh_in) to simplify the code.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-7-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Kemeng Shi [Tue, 14 May 2024 11:24:34 +0000 (19:24 +0800)]
jbd2: remove unneeded kmap to do escape in jbd2_journal_write_metadata_buffer
The data to do escape could be accessed directly from b_frozen_data,
just remove unneeded kmap.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-6-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Kemeng Shi [Tue, 14 May 2024 11:24:33 +0000 (19:24 +0800)]
jbd2: jump to new copy_done tag when b_frozen_data is created concurrently
If b_frozen_data is created concurrently, we can update new_folio and
new_offset with b_frozen_data and then move forward
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-5-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Kemeng Shi [Tue, 14 May 2024 11:24:32 +0000 (19:24 +0800)]
jbd2: remove unnedded "need_copy_out" in jbd2_journal_write_metadata_buffer
As we only need to copy out when we should do escape, need_copy_out
could be simply replaced by "do_escape".
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-4-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Kemeng Shi [Tue, 14 May 2024 11:24:31 +0000 (19:24 +0800)]
jbd2: remove unused return info from jbd2_journal_write_metadata_buffer
The done_copy_out info from jbd2_journal_write_metadata_buffer is not
used. Simply remove it.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-3-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Kemeng Shi [Tue, 14 May 2024 11:24:30 +0000 (19:24 +0800)]
jbd2: avoid memleak in jbd2_journal_write_metadata_buffer
The new_bh is from alloc_buffer_head, we should call free_buffer_head to
free it in error case.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-2-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Xiaxi Shen [Wed, 1 May 2024 03:30:17 +0000 (20:30 -0700)]
ext4: fix uninitialized variable in ext4_inlinedir_to_tree
Syzbot has found an uninit-value bug in ext4_inlinedir_to_tree
This error happens because ext4_inlinedir_to_tree does not
handle the case when ext4fs_dirhash returns an error
This can be avoided by checking the return value of ext4fs_dirhash
and propagating the error,
similar to how it's done with ext4_htree_store_dirent
Signed-off-by: Xiaxi Shen <shenxiaxi26@gmail.com>
Reported-and-tested-by: syzbot+eaba5abe296837a640c0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=
eaba5abe296837a640c0
Link: https://patch.msgid.link/20240501033017.220000-1-shenxiaxi26@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Thorsten Blum [Tue, 2 Apr 2024 10:51:58 +0000 (12:51 +0200)]
jbd2: use str_plural() to fix Coccinelle warning
Fixes the following Coccinelle/coccicheck warning reported by
string_choices.cocci:
opportunity for str_plural(dropped)
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Link: https://patch.msgid.link/20240402105157.254389-2-thorsten.blum@toblux.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Li zeming [Tue, 2 Apr 2024 02:23:00 +0000 (10:23 +0800)]
ext4: block_validity: Remove unnecessary ‘NULL’ values from new_node
new_node is assigned first, so it does not need to initialize the
assignment.
Signed-off-by: Li zeming <zeming@nfschina.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://patch.msgid.link/20240402022300.25858-1-zeming@nfschina.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Linus Torvalds [Sun, 23 Jun 2024 21:08:54 +0000 (17:08 -0400)]
Linux 6.10-rc5
Linus Torvalds [Sun, 23 Jun 2024 15:06:01 +0000 (11:06 -0400)]
Merge tag 'i2c-for-6.10-rc5' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"The core gains placeholders for recently added functions when
CONFIG_I2C is not defined as well documentation fixes to start using
inclusive terminology.
The drivers get paths in DT bindings fixed as well as proper interrupt
handling for the ocores driver"
* tag 'i2c-for-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
docs: i2c: summary: be clearer with 'controller/target' and 'adapter/client' pairs
docs: i2c: summary: document 'local' and 'remote' targets
docs: i2c: summary: document use of inclusive language
docs: i2c: summary: update speed mode description
docs: i2c: summary: update I2C specification link
docs: i2c: summary: start sentences consistently.
i2c: Add nop fwnode operations
i2c: ocores: set IACK bit after core is enabled
dt-bindings: i2c: google,cros-ec-i2c-tunnel: correct path to i2c-controller schema
dt-bindings: i2c: atmel,at91sam: correct path to i2c-controller schema
Linus Torvalds [Sun, 23 Jun 2024 15:01:57 +0000 (11:01 -0400)]
Merge tag '6.10-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
"Five smb3 client fixes
- three nets/fiolios cifs fixes
- fix typo in module parameters description
- fix incorrect swap warning"
* tag '6.10-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Move the 'pid' from the subreq to the req
cifs: Only pick a channel once per read request
cifs: Defer read completion
cifs: fix typo in module parameter enable_gcm_256
cifs: drop the incorrect assertion in cifs_swap_rw()
Linus Torvalds [Sun, 23 Jun 2024 14:32:24 +0000 (10:32 -0400)]
Merge tag 'fixes-2024-06-23' of git://git./linux/kernel/git/rppt/memblock
Pull memblock fix from Mike Rapoport:
"Fix fragility in checks for unset node ID.
Use numa_valid_node() function to verify that nid is a valid node
ID instead of inconsistent comparisons with either NUMA_NO_NODE or
MAX_NUMNODES"
* tag 'fixes-2024-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock: use numa_valid_node() helper to check for invalid node ID
Linus Torvalds [Sun, 23 Jun 2024 14:20:02 +0000 (07:20 -0700)]
Merge tag 'mips-fixes_6.10_2' of git://git./linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
- fix lseek in o32 compat mode
- fix for microMIPS MT ASE helpers
* tag 'mips-fixes_6.10_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
mips: fix compat_sys_lseek syscall
MIPS: mipsmtregs: Fix target register for MFTC0
Linus Torvalds [Sun, 23 Jun 2024 14:16:13 +0000 (07:16 -0700)]
Merge tag 'x86_urgent_for_v6.10_rc5' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- An ARM-relevant fix to not free default RMIDs of a resource control
group
- A randconfig build fix for the VMware virtual GPU driver
* tag 'x86_urgent_for_v6.10_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Don't try to free nonexistent RMIDs
drm/vmwgfx: Fix missing HYPERVISOR_GUEST dependency
Linus Torvalds [Sun, 23 Jun 2024 14:13:23 +0000 (07:13 -0700)]
Merge tag 'powerpc-6.10-3' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Prevent use-after-free in 64-bit KVM VFIO
- Add generated Power8 crypto asm to .gitignore
Thanks to Al Viro and Nathan Lynch.
* tag 'powerpc-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
KVM: PPC: Book3S HV: Prevent UAF in kvm_spapr_tce_attach_iommu_group()
powerpc/crypto: Add generated P8 asm to .gitignore
Wolfram Sang [Sun, 23 Jun 2024 00:13:27 +0000 (02:13 +0200)]
Merge tag 'i2c-host-fixes-6.10-rc5' of git://git./linux/kernel/git/andi.shyti/linux into i2c/for-current
This pull request fixes the paths of the dt-schema to their
complete locations for the ChromeOS EC tunnel driver and the
Atmel at91sam drivers.
Additionally, the OpenCores driver receives a fix for an issue
that dates back to version 2.6.18. Specifically, the interrupts
need to be acknowledged (clearing all pending interrupts) after
enabling the core.
Linus Torvalds [Sat, 22 Jun 2024 22:29:49 +0000 (15:29 -0700)]
Merge tag 'rust-fixes-6.10' of https://github.com/Rust-for-Linux/linux
Pull rust fix from Miguel Ojeda:
- Avoid unused import warning in 'rusttest'.
* tag 'rust-fixes-6.10' of https://github.com/Rust-for-Linux/linux:
rust: avoid unused import warning in `rusttest`
Linus Torvalds [Sat, 22 Jun 2024 21:02:16 +0000 (14:02 -0700)]
Merge tag 'regulator-fix-v6.10-rc4' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A few driver specific fixes for incorrect device descriptions, plus a
fix for a missing symbol export which causes build failures for some
newly added drivers in other trees"
* tag 'regulator-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: axp20x: AXP717: fix LDO supply rails and off-by-ones
regulator:
bd71815: fix ramp values
regulator: core: Fix modpost error "regulator_get_regmap" undefined
regulator: tps6594-regulator: Fix the number of irqs for TPS65224 and TPS6594
Linus Torvalds [Sat, 22 Jun 2024 20:58:47 +0000 (13:58 -0700)]
Merge tag 'spi-fix-v6.10-rc4' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A number of fixes that have built up for SPI, a bunch of driver
specific ones including an unfortunate revert of an optimisation for
the i.MX driver which was causing issues with some configurations,
plus a couple of core fixes for the rarely used octal mode and for a
bad interaction between multi-CS support and target mode"
* tag 'spi-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-imx: imx51: revert burst length calculation back to bits_per_word
spi: Fix SPI slave probe failure
spi: Fix OCTAL mode support
spi: stm32: qspi: Clamp stm32_qspi_get_mode() output to CCR_BUSWIDTH_4
spi: stm32: qspi: Fix dual flash mode sanity test in stm32_qspi_setup()
spi: cs42l43: Drop cs35l56 SPI speed down to 11MHz
spi: cs42l43: Correct SPI root clock speed
Linus Torvalds [Sat, 22 Jun 2024 20:55:56 +0000 (13:55 -0700)]
Merge tag 'nfsd-6.10-2' of git://git./linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Fix crashes triggered by administrative operations on the server
* tag 'nfsd-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: grab nfsd_mutex in nfsd_nl_rpc_status_get_dumpit()
nfsd: fix oops when reading pool_stats before server is started
Linus Torvalds [Sat, 22 Jun 2024 16:07:56 +0000 (09:07 -0700)]
Merge tag 'xfs-6.10-fixes-4' of git://git./fs/xfs/xfs-linux
Pull xfs fix from Chandan Babu:
- Fix assertion failure due to a race between unlink and cluster buffer
instantiation.
* tag 'xfs-6.10-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix unlink vs cluster buffer instantiation race
Linus Torvalds [Sat, 22 Jun 2024 16:02:39 +0000 (09:02 -0700)]
Merge tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"Lots of (mostly boring) fixes for syzbot bugs and rare(r) CI bugs.
The LRU_TIME_BITS fix was slightly more involved; we only have 48 bits
for the LRU position (we would prefer 64), so wraparound is possible
for the cached data LRUs on a filesystem that has done sufficient
(petabytes) reads; this is now handled.
One notable user reported bugfix, where we were forgetting to
correctly set the bucket data type, which should have been
BCH_DATA_need_gc_gens instead of BCH_DATA_free; this was causing us to
go emergency read-only on a filesystem that had seen heavy enough use
to see bucket gen wraparoud.
We're now starting to fix simple (safe) errors without requiring user
intervention - i.e. a small incremental step towards full self
healing.
This is currently limited to just certain allocation information
counters, and the error is still logged in the superblock; see that
patch for more information. ("bcachefs: Fix safe errors by default")"
* tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefs: (22 commits)
bcachefs: Move the ei_flags setting to after initialization
bcachefs: Fix a UAF after write_super()
bcachefs: Use bch2_print_string_as_lines for long err
bcachefs: Fix I_NEW warning in race path in bch2_inode_insert()
bcachefs: Replace bare EEXIST with private error codes
bcachefs: Fix missing alloc_data_type_set()
closures: Change BUG_ON() to WARN_ON()
bcachefs: fix alignment of VMA for memory mapped files on THP
bcachefs: Fix safe errors by default
bcachefs: Fix bch2_trans_put()
bcachefs: set_worker_desc() for delete_dead_snapshots
bcachefs: Fix bch2_sb_downgrade_update()
bcachefs: Handle cached data LRU wraparound
bcachefs: Guard against overflowing LRU_TIME_BITS
bcachefs: delete_dead_snapshots() doesn't need to go RW
bcachefs: Fix early init error path in journal code
bcachefs: Check for invalid btree IDs
bcachefs: Fix btree ID bitmasks
bcachefs: Fix shift overflow in read_one_super()
bcachefs: Fix a locking bug in the do_discard_fast() path
...
Linus Torvalds [Sat, 22 Jun 2024 15:16:17 +0000 (08:16 -0700)]
Merge tag 'ata-6.10-rc5' of git://git./linux/kernel/git/libata/linux
Pull ata fix from Niklas Cassel:
- We currently enable DIPM (device initiated power management) in the
device (using a SET FEATURES call to the device), regardless if the
HBA supports any LPM states or not. It seems counter intuitive, and
potentially dangerous to enable a device side feature, when the HBA
does not have the corresponding support. Thus, make sure that we do
not enable DIPM if the HBA does not support any LPM states.
* tag 'ata-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: ahci: Do not enable LPM if no LPM states are supported by the HBA
Linus Torvalds [Sat, 22 Jun 2024 15:03:47 +0000 (08:03 -0700)]
Merge tag 'pwm/for-6.10-rc5-fixes-take2' of git://git./linux/kernel/git/ukleinek/linux
Pull pwm fixes from Uwe Kleine-König:
"Three fixes for the pwm-stm32 driver.
The first patch prevents an integer wrap-around for small periods. In
the second patch the calculation of the prescaler is fixed which
resulted in values for the ARR register that don't fit into the
corresponding register bit field. The last commit improves an error
message that was wrongly copied from another error path"
* tag 'pwm/for-6.10-rc5-fixes-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
pwm: stm32: Fix error message to not describe the previous error path
pwm: stm32: Fix calculation of prescaler
pwm: stm32: Refuse too small period requests
Linus Torvalds [Sat, 22 Jun 2024 14:58:21 +0000 (07:58 -0700)]
Merge tag 'arm-fixes-6.10' of git://git./linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"There are seven oneline patches that each address a distinct problem
on the NXP i.MX platform, mostly the popular i.MX8M variant.
The only other two fixes are for error handling on the psci firmware
driver and SD card support on the milkv duo riscv board"
* tag 'arm-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
firmware: psci: Fix return value from psci_system_suspend()
riscv: dts: sophgo: disable write-protection for milkv duo
arm64: dts: imx8qm-mek: fix gpio number for reg_usdhc2_vmmc
arm64: dts: freescale: imx8mm-verdin: enable hysteresis on slow input pin
arm64: dts: imx93-11x11-evk: Remove the 'no-sdio' property
arm64: dts: freescale: imx8mp-venice-gw73xx-2x: fix BT shutdown GPIO
arm: dts: imx53-qsb-hdmi: Disable panel instead of deleting node
arm64: dts: imx8mp: Fix TC9595 input clock on DH i.MX8M Plus DHCOM SoM
arm64: dts: freescale: imx8mm-verdin: Fix GPU speed
Linus Torvalds [Sat, 22 Jun 2024 14:55:06 +0000 (07:55 -0700)]
Merge tag 'loongarch-fixes-6.10-2' of git://git./linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Some hw breakpoint fixes, an objtool build warnging fix, and a trivial
cleanup"
* tag 'loongarch-fixes-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Remove an unneeded semicolon
LoongArch: Fix multiple hardware watchpoint issues
LoongArch: Trigger user-space watchpoints correctly
LoongArch: Fix watchpoint setting error
LoongArch: Only allow OBJTOOL & ORC unwinder if toolchain supports -mthin-add-sub
Linus Torvalds [Sat, 22 Jun 2024 14:41:57 +0000 (07:41 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Fix dangling references to a redistributor region if the vgic was
prematurely destroyed.
- Properly mark FFA buffers as released, ensuring that both parties
can make forward progress.
x86:
- Allow getting/setting MSRs for SEV-ES guests, if they're using the
pre-6.9 KVM_SEV_ES_INIT API.
- Always sync pending posted interrupts to the IRR prior to IOAPIC
route updates, so that EOIs are intercepted properly if the old
routing table requested that.
Generic:
- Avoid __fls(0)
- Fix reference leak on hwpoisoned page
- Fix a race in kvm_vcpu_on_spin() by ensuring loads and stores are
atomic.
- Fix bug in __kvm_handle_hva_range() where KVM calls a function
pointer that was intended to be a marker only (nothing bad happens
but kind of a mine and also technically undefined behavior)
- Do not bother accounting allocations that are small and freed
before getting back to userspace.
Selftests:
- Fix compilation for RISC-V.
- Fix a "shift too big" goof in the KVM_SEV_INIT2 selftest.
- Compute the max mappable gfn for KVM selftests on x86 using
GuestMaxPhyAddr from KVM's supported CPUID (if it's available)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SEV-ES: Fix svm_get_msr()/svm_set_msr() for KVM_SEV_ES_INIT guests
KVM: Discard zero mask with function kvm_dirty_ring_reset
virt: guest_memfd: fix reference leak on hwpoisoned page
kvm: do not account temporary allocations to kmem
MAINTAINERS: Drop Wanpeng Li as a Reviewer for KVM Paravirt support
KVM: x86: Always sync PIR to IRR prior to scanning I/O APIC routes
KVM: Stop processing *all* memslots when "null" mmu_notifier handler is found
KVM: arm64: FFA: Release hyp rx buffer
KVM: selftests: Fix RISC-V compilation
KVM: arm64: Disassociate vcpus from redistributor region on teardown
KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin()
KVM: selftests: x86: Prioritize getting max_gfn from GuestPhysBits
KVM: selftests: Fix shift of 32 bit unsigned int more than 32 bits
Uwe Kleine-König [Fri, 21 Jun 2024 14:37:14 +0000 (16:37 +0200)]
pwm: stm32: Fix error message to not describe the previous error path
"Failed to lock the clock" is an appropriate error message for
clk_rate_exclusive_get() failing, but not for the clock running too
fast for the driver's calculations.
Adapt the error message accordingly.
Fixes:
d44d635635a7 ("pwm: stm32: Fix for settings using period > UINT32_MAX")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/285182163211203fc823a65b180761f46e828dcb.1718979150.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Fri, 21 Jun 2024 14:37:13 +0000 (16:37 +0200)]
pwm: stm32: Fix calculation of prescaler
A small prescaler is beneficial, as this improves the resolution of the
duty_cycle configuration. However if the prescaler is too small, the
maximal possible period becomes considerably smaller than the requested
value.
One situation where this goes wrong is the following: With a parent
clock rate of
208877930 Hz and max_arr = 0xffff = 65535, a request for
period = 941243 ns currently results in PSC = 1. The value for ARR is
then calculated to
ARR = 941243 *
208877930 / (
1000000000 * 2) - 1 = 98301
This value is bigger than 65535 however and so doesn't fit into the
respective register field. In this particular case the PWM was
configured for a period of 313733.
4806027616 ns (with ARR = 98301 &
0xffff). Even if ARR was configured to its maximal value, only period =
627495.
6861167669 ns would be achievable.
Fix the calculation accordingly and adapt the comment to match the new
algorithm.
With the calculation fixed the above case results in PSC = 2 and so an
actual period of 941229.
1667195285 ns.
Fixes:
8002fbeef1e4 ("pwm: stm32: Calculate prescaler with a division instead of a loop")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/b4d96b79917617434a540df45f20cb5de4142f88.1718979150.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Wolfram Sang [Fri, 21 Jun 2024 07:30:13 +0000 (09:30 +0200)]
docs: i2c: summary: be clearer with 'controller/target' and 'adapter/client' pairs
This not only includes rewording, but also where to put which emphasis
on terms in this document.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Wolfram Sang [Fri, 21 Jun 2024 07:30:12 +0000 (09:30 +0200)]
docs: i2c: summary: document 'local' and 'remote' targets
Because Linux can be a target as well, add terminology to differentiate
between Linux being the target and Linux accessing targets.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Wolfram Sang [Fri, 21 Jun 2024 07:30:11 +0000 (09:30 +0200)]
docs: i2c: summary: document use of inclusive language
We now have the updated I2C specs and our own Code of Conduct, so we
have all we need to switch over to the inclusive terminology. Define
them here.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Wolfram Sang [Fri, 21 Jun 2024 07:30:10 +0000 (09:30 +0200)]
docs: i2c: summary: update speed mode description
Fastest I2C mode is 5 MHz. Update the docs and reword the paragraph
slightly.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Wolfram Sang [Fri, 21 Jun 2024 07:30:09 +0000 (09:30 +0200)]
docs: i2c: summary: update I2C specification link
Luckily, the specs are directly downloadable again, so update the link.
Also update its title to the original name "I²C".
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Wolfram Sang [Fri, 21 Jun 2024 07:30:08 +0000 (09:30 +0200)]
docs: i2c: summary: start sentences consistently.
Change the first paragraphs to contain only one space after the end of
the previous sentence like in the rest of the document.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Linus Torvalds [Fri, 21 Jun 2024 21:28:28 +0000 (14:28 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two fixes: one in the ufs driver fixing an obvious memory leak and the
other (with a core flag based update) trying to prevent USB crashes by
stopping the core from issuing a request for the I/O Hints mode page"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: usb: uas: Do not query the IO Advice Hints Grouping mode page for USB/UAS devices
scsi: core: Introduce the BLIST_SKIP_IO_HINTS flag
scsi: ufs: core: Free memory allocated for model before reinit
Linus Torvalds [Fri, 21 Jun 2024 21:11:50 +0000 (14:11 -0700)]
Merge tag 'drm-fixes-2024-06-22' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Still pretty quiet, two weeks worth of amdgpu fixes, with one i915 and
one xe. I didn't get the drm-misc-fixes tree PR this week, but there
was only one fix queued and I think it can wait another week, so seems
pretty normal.
xe:
- Fix for invalid register access
i915:
- Fix conditions for joiner usage, it's not possible with eDP MSO
amdgpu:
- Fix display idle optimization race
- Fix GPUVM TLB flush locking scope
- IPS fix
- GFX 9.4.3 harvesting fix
- Runtime pm fix for shared buffers
- DCN 3.5.x fixes
- USB4 fix
- RISC-V clang fix
- Silence UBSAN warnings
- MES11 fix
- PSP 14.0.x fix"
* tag 'drm-fixes-2024-06-22' of https://gitlab.freedesktop.org/drm/kernel:
drm/xe/vf: Don't touch GuC irq registers if using memory irqs
drm/amdgpu: init TA fw for psp v14
drm/amdgpu: cleanup MES11 command submission
drm/amdgpu: fix UBSAN warning in kv_dpm.c
drm/radeon: fix UBSAN warning in kv_dpm.c
drm/amd/display: Disable CONFIG_DRM_AMD_DC_FP for RISC-V with clang
drm/amd/display: Attempt to avoid empty TUs when endpoint is DPIA
drm/amd/display: change dram_clock_latency to 34us for dcn35
drm/amd/display: Change dram_clock_latency to 34us for dcn351
drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2
drm/amdgpu: Indicate CU havest info to CP
drm/amd/display: prevent register access while in IPS
drm/amdgpu: fix locking scope when flushing tlb
drm/amd/display: Remove redundant idle optimization check
drm/i915/mso: using joiner is not possible with eDP MSO
Linus Torvalds [Fri, 21 Jun 2024 21:06:14 +0000 (14:06 -0700)]
Merge tag 'ovl-fixes-6.10-rc5' of git://git./linux/kernel/git/overlayfs/vfs
Pull overlayfs fixes from Miklos Szeredi:
"Fix two bugs, one originating in this cycle and one from 6.6"
* tag 'ovl-fixes-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
ovl: fix encoding fid for lower only root
ovl: fix copy-up in tmpfile
Linus Torvalds [Fri, 21 Jun 2024 21:01:03 +0000 (14:01 -0700)]
Merge tag 'io_uring-6.10-
20240621' of git://git.kernel.dk/linux
Pull io_uring fix from Jens Axboe:
"Just a single cleanup for the fixed buffer iov_iter import.
More cosmetic than anything else, but let's get it cleaned up as it's
confusing"
* tag 'io_uring-6.10-
20240621' of git://git.kernel.dk/linux:
io_uring/rsrc: fix incorrect assignment of iter->nr_segs in io_import_fixed
Linus Torvalds [Fri, 21 Jun 2024 20:55:38 +0000 (13:55 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Small bug fixes:
- Prevent a crash in bnxt if the en and rdma drivers disagree on the
MSI vectors
- Have rxe memcpy inline data from the correct address
- Fix rxe's validation of UD packets
- Several mlx5 mr cache issues: bad lock balancing on error, missing
propagation of the ATS property to the HW, wrong bucketing of freed
mrs in some cases
- Incorrect goto error unwind in mlx5 driver probe
- Missed userspace input validation in mlx5 SRQ create
- Incorrect uABI in MANA rejecting valid optional MR creation flags"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/mana_ib: Ignore optional access flags for MRs
RDMA/mlx5: Add check for srq max_sge attribute
RDMA/mlx5: Fix unwind flow as part of mlx5_ib_stage_init_init
RDMA/mlx5: Ensure created mkeys always have a populated rb_key
RDMA/mlx5: Follow rb_key.ats when creating new mkeys
RDMA/mlx5: Remove extra unlock on error path
RDMA/rxe: Fix responder length checking for UD request packets
RDMA/rxe: Fix data copy for IB_SEND_INLINE
RDMA/bnxt_re: Fix the max msix vectors macro
Linus Torvalds [Fri, 21 Jun 2024 18:26:43 +0000 (11:26 -0700)]
Merge tag 'sound-6.10-rc5-2' of git://git./linux/kernel/git/tiwai/sound
Pull more sound fixes from Takashi Iwai:
"A follow-up fix for a random build issue, as well as another trivial
HD-audio quirk"
* tag 'sound-6.10-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda: Use imply for suggesting CONFIG_SERIAL_MULTI_INSTANTIATE
ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14AHP9
Linus Torvalds [Fri, 21 Jun 2024 18:20:37 +0000 (11:20 -0700)]
Merge tag 'acpi-6.10-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These address a possible NULL pointer dereference in the ACPICA code
and quirk camera enumeration on multiple platforms where incorrect
data are present in the platform firmware.
Specifics:
- Undo an ACPICA code change that attempted to keep operation regions
within a page boundary, but allowed accesses to unmapped memory to
occur (Raju Rangoju)
- Ignore MIPI camera graph port nodes created with the help of the
information from the ACPI tables on all Dell Tiger, Alder and
Raptor Lake models as that information is reported to be invalid on
the platforms in question (Hans de Goede)
- Use new Intel CPU model matching macros in the MIPI DisCo for
Imaging part of ACPI device enumeration (Hans de Goede)"
* tag 'acpi-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: mipi-disco-img: Switch to new Intel CPU model defines
ACPI: scan: Ignore camera graph port nodes on all Dell Tiger, Alder and Raptor Lake models
ACPICA: Revert "ACPICA: avoid Info: mapping multiple BARs. Your kernel is fine."
Linus Torvalds [Fri, 21 Jun 2024 18:16:56 +0000 (11:16 -0700)]
Merge tag 'thermal-6.10-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
"These fix the Mediatek lvts_thermal driver, the Intel int340x driver,
and the thermal core (two issues related to system suspend).
Specifics:
- Remove the filtered mode for mt8188 from lvts_thermal as it is not
supported on this platform and fail the lvts_thermal initialization
when the golden temperature is zero as that means the efuse data is
not correctly set (Julien Panis)
- Update the processor_thermal part of the Intel int340x driver to
support shared interrupts as the processor thermal device interrupt
may in fact be shared with PCI devices (Srinivas Pandruvada)
- Synchronize the suspend-prepare and post-suspend actions of the
thermal PM notifier to avoid a destructive race condition and
change the priority of that notifier to the minimum to avoid
interference between the work items spawned by it and the other
PM notifiers during system resume (Rafael Wysocki)"
* tag 'thermal-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: int340x: processor_thermal: Support shared interrupts
thermal: core: Change PM notifier priority to the minimum
thermal: core: Synchronize suspend-prepare and post-suspend actions
thermal/drivers/mediatek/lvts_thermal: Return error in case of invalid efuse data
thermal/drivers/mediatek/lvts_thermal: Remove filtered mode for mt8188
Linus Torvalds [Fri, 21 Jun 2024 18:07:15 +0000 (11:07 -0700)]
Merge tag 'dmaengine-fix-6.10' of git://git./linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
- kmemleak, error path handling and missing kmem_cache_destroy() fixes
for ioatdma driver
- use after free fix for idxd driver
- data synchronisation fix for xdma isr handling
- fsl driver channel constraints and linking two fsl module fixes
* tag 'dmaengine-fix-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: ioatdma: Fix missing kmem_cache_destroy()
dt-bindings: dma: fsl-edma: fix dma-channels constraints
dmaengine: fsl-edma: avoid linking both modules
dmaengine: ioatdma: Fix kmemleak in ioat_pci_probe()
dmaengine: ioatdma: Fix error path in ioat3_dma_probe()
dmaengine: ioatdma: Fix leaking on version mismatch
dmaengine: ti: k3-udma-glue: Fix of_k3_udma_glue_parse_chn_by_id()
dmaengine: idxd: Fix possible Use-After-Free in irq_process_work_list
dmaengine: xilinx: xdma: Fix data synchronisation in xdma_channel_isr()
Linus Torvalds [Fri, 21 Jun 2024 18:03:35 +0000 (11:03 -0700)]
Merge tag 'phy-fixes-6.10' of git://git./linux/kernel/git/phy/linux-phy
Pull phy fixes from Vinod Koul:
- Qualcomm QMP driver fixes for missing register offsets and correct N4
offsets for registers
* tag 'phy-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
phy: qcom: qmp-combo: Switch from V6 to V6 N4 register offsets
phy: qcom-qmp: pcs: Add missing v6 N4 register offsets
phy: qcom-qmp: qserdes-txrx: Add missing registers offsets
Linus Torvalds [Fri, 21 Jun 2024 17:58:57 +0000 (10:58 -0700)]
Merge tag 'soundwire-6.10-fixes' of git://git./linux/kernel/git/vkoul/soundwire
Pull soundwire fix from Vinod Koul:
- Single fix for calling fwnode_handle_put() on the
returned fwnode pointer
* tag 'soundwire-6.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: fix usages of device_get_named_child_node()
Paolo Bonzini [Fri, 21 Jun 2024 16:48:44 +0000 (12:48 -0400)]
Merge tag 'kvm-riscv-fixes-6.10-2' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 6.10, take #2
- Fix compilation for KVM selftests
Uwe Kleine-König [Fri, 21 Jun 2024 14:37:12 +0000 (16:37 +0200)]
pwm: stm32: Refuse too small period requests
If period_ns is small, prd might well become 0. Catch that case because
otherwise with
regmap_write(priv->regmap, TIM_ARR, prd - 1);
a few lines down quite a big period is configured.
Fixes:
7edf7369205b ("pwm: Add driver for STM32 plaftorm")
Cc: stable@vger.kernel.org
Reviewed-by: Trevor Gamblin <tgamblin@baylibre.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/b86f62f099983646f97eeb6bfc0117bb2d0c340d.1718979150.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Youling Tang [Tue, 4 Jun 2024 08:46:10 +0000 (16:46 +0800)]
bcachefs: Move the ei_flags setting to after initialization
`inode->ei_flags` setting and cleaning should be done after initialization,
otherwise the operation is invalid.
Fixes:
9ca4853b98af ("bcachefs: Fix quota support for snapshots")
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 20 Jun 2024 23:42:39 +0000 (19:42 -0400)]
bcachefs: Fix a UAF after write_super()
write_super() may reallocate the superblock buffer - but
bch_sb_field_ext was referencing it; don't use it after the write_super
call.
Reported-by: syzbot+8992fc10a192067b8d8a@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 20 Jun 2024 17:10:34 +0000 (13:10 -0400)]
bcachefs: Use bch2_print_string_as_lines for long err
printk strings get truncated to 1024 bytes; if we have a long error
message (journal debug info) we need to use a helper.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 20 Jun 2024 17:20:49 +0000 (13:20 -0400)]
bcachefs: Fix I_NEW warning in race path in bch2_inode_insert()
discard_new_inode() is the correct interface for tearing down an indoe
that was fully created but not made visible to other threads, but it
expects I_NEW to be set, which we don't use.
Reported-by: https://github.com/koverstreet/bcachefs/issues/690
Fixes: bcachefs: Fix race path in bch2_inode_insert()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 27 May 2024 02:52:22 +0000 (22:52 -0400)]
bcachefs: Replace bare EEXIST with private error codes
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 20 Jun 2024 14:04:35 +0000 (10:04 -0400)]
bcachefs: Fix missing alloc_data_type_set()
Incorrect bucket state transition in the discard path; when incrementing
a bucket's generation number that had already been discarded, we were
forgetting to check if it should be need_gc_gens, not free.
This was caught by the .invalid checks in the transaction commit path,
causing us to go emergency read only.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 20 Jun 2024 13:45:09 +0000 (09:45 -0400)]
closures: Change BUG_ON() to WARN_ON()
If a BUG_ON() can be hit in the wild, it shouldn't be a BUG_ON()
For reference, this has popped up once in the CI, and we'll need more
info to debug it:
03240 ------------[ cut here ]------------
03240 kernel BUG at lib/closure.c:21!
03240 kernel BUG at lib/closure.c:21!
03240 Internal error: Oops - BUG:
00000000f2000800 [#1] SMP
03240 Modules linked in:
03240 CPU: 15 PID: 40534 Comm: kworker/u80:1 Not tainted
6.10.0-rc4-ktest-ga56da69799bd #25570
03240 Hardware name: linux,dummy-virt (DT)
03240 Workqueue: btree_update btree_interior_update_work
03240 pstate:
00001005 (nzcv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
03240 pc : closure_put+0x224/0x2a0
03240 lr : closure_put+0x24/0x2a0
03240 sp :
ffff0000d12071c0
03240 x29:
ffff0000d12071c0 x28:
dfff800000000000 x27:
ffff0000d1207360
03240 x26:
0000000000000040 x25:
0000000000000040 x24:
0000000000000040
03240 x23:
ffff0000c1f20180 x22:
0000000000000000 x21:
ffff0000c1f20168
03240 x20:
0000000040000000 x19:
ffff0000c1f20140 x18:
0000000000000001
03240 x17:
0000000000003aa0 x16:
0000000000003ad0 x15:
1fffe0001c326974
03240 x14:
0000000000000a1e x13:
0000000000000000 x12:
1fffe000183e402d
03240 x11:
ffff6000183e402d x10:
dfff800000000000 x9 :
ffff6000183e402e
03240 x8 :
0000000000000001 x7 :
00009fffe7c1bfd3 x6 :
ffff0000c1f2016b
03240 x5 :
ffff0000c1f20168 x4 :
ffff6000183e402e x3 :
ffff800081391954
03240 x2 :
0000000000000001 x1 :
0000000000000000 x0 :
00000000a8000000
03240 Call trace:
03240 closure_put+0x224/0x2a0
03240 bch2_check_for_deadlock+0x910/0x1028
03240 bch2_six_check_for_deadlock+0x1c/0x30
03240 six_lock_slowpath.isra.0+0x29c/0xed0
03240 six_lock_ip_waiter+0xa8/0xf8
03240 __bch2_btree_node_lock_write+0x14c/0x298
03240 bch2_trans_lock_write+0x6d4/0xb10
03240 __bch2_trans_commit+0x135c/0x5520
03240 btree_interior_update_work+0x1248/0x1c10
03240 process_scheduled_works+0x53c/0xd90
03240 worker_thread+0x370/0x8c8
03240 kthread+0x258/0x2e8
03240 ret_from_fork+0x10/0x20
03240 Code:
aa1303e0 d63f0020 a94363f7 17ffff8c (
d4210000)
03240 ---[ end trace
0000000000000000 ]---
03240 Kernel panic - not syncing: Oops - BUG: Fatal exception
03240 SMP: stopping secondary CPUs
03241 SMP: failed to stop secondary CPUs 13,15
03241 Kernel Offset: disabled
03241 CPU features: 0x00,
00000003,
80000008,
4240500b
03241 Memory Limit: none
03241 ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]---
03246 ========= FAILED TIMEOUT copygc_torture_no_checksum in 7200s
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Konstantin Taranov [Wed, 5 Jun 2024 08:16:08 +0000 (01:16 -0700)]
RDMA/mana_ib: Ignore optional access flags for MRs
Ignore optional ib_access_flags when an MR is created.
Fixes:
0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Link: https://lore.kernel.org/r/1717575368-14879-1-git-send-email-kotaranov@linux.microsoft.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Patrisious Haddad [Tue, 28 May 2024 12:52:56 +0000 (15:52 +0300)]
RDMA/mlx5: Add check for srq max_sge attribute
max_sge attribute is passed by the user, and is inserted and used
unchecked, so verify that the value doesn't exceed maximum allowed value
before using it.
Fixes:
e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Link: https://lore.kernel.org/r/277ccc29e8d57bfd53ddeb2ac633f2760cf8cdd0.1716900410.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Yishai Hadas [Tue, 28 May 2024 12:52:55 +0000 (15:52 +0300)]
RDMA/mlx5: Fix unwind flow as part of mlx5_ib_stage_init_init
Fix unwind flow as part of mlx5_ib_stage_init_init to use the correct
goto upon an error.
Fixes:
758ce14aee82 ("RDMA/mlx5: Implement MACsec gid addition and deletion")
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
Link: https://lore.kernel.org/r/aa40615116eda14ec9eca21d52017d632ea89188.1716900410.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Jason Gunthorpe [Tue, 28 May 2024 12:52:54 +0000 (15:52 +0300)]
RDMA/mlx5: Ensure created mkeys always have a populated rb_key
cachable and mmkey.rb_key together are used by mlx5_revoke_mr() to put the
MR/mkey back into the cache. In all cases they should be set correctly.
alloc_cacheable_mr() was setting cachable but not filling rb_key,
resulting in cache_ent_find_and_store() bucketing them all into a 0 length
entry.
implicit_get_child_mr()/mlx5_ib_alloc_implicit_mr() failed to set cachable
or rb_key at all, so the cache was not working at all for implicit ODP.
Cc: stable@vger.kernel.org
Fixes:
8c1185fef68c ("RDMA/mlx5: Change check for cacheable mkeys")
Fixes:
dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7778c02dfa0999a30d6746c79a23dd7140a9c729.1716900410.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Jason Gunthorpe [Tue, 28 May 2024 12:52:53 +0000 (15:52 +0300)]
RDMA/mlx5: Follow rb_key.ats when creating new mkeys
When a cache ent already exists but doesn't have any mkeys in it the cache
will automatically create a new one based on the specification in the
ent->rb_key.
ent->ats was missed when creating the new key and so ma_translation_mode
was not being set even though the ent requires it.
Cc: stable@vger.kernel.org
Fixes:
73d09b2fe833 ("RDMA/mlx5: Introduce mlx5r_cache_rb_key")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://lore.kernel.org/r/7c5613458ecb89fbe5606b7aa4c8d990bdea5b9a.1716900410.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Jason Gunthorpe [Tue, 28 May 2024 12:52:52 +0000 (15:52 +0300)]
RDMA/mlx5: Remove extra unlock on error path
The below commit lifted the locking out of this function but left this
error path unlock behind resulting in unbalanced locking. Remove the
missed unlock too.
Cc: stable@vger.kernel.org
Fixes:
627122280c87 ("RDMA/mlx5: Add work to remove temporary entries from the cache")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://lore.kernel.org/r/78090c210c750f47219b95248f9f782f34548bb1.1716900410.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Paolo Bonzini [Fri, 21 Jun 2024 12:03:55 +0000 (08:03 -0400)]
Merge tag 'kvm-x86-fixes-6.10-rcN' of https://github.com/kvm-x86/linux into HEAD
KVM fixes for 6.10
- Fix a "shift too big" goof in the KVM_SEV_INIT2 selftest.
- Compute the max mappable gfn for KVM selftests on x86 using GuestMaxPhyAddr
from KVM's supported CPUID (if it's available).
- Fix a race in kvm_vcpu_on_spin() by ensuring loads and stores are atomic.
- Fix technically benign bug in __kvm_handle_hva_range() where KVM consumes
the return from a void-returning function as if it were a boolean.
Sakari Ailus [Fri, 14 Jun 2024 08:14:18 +0000 (11:14 +0300)]
i2c: Add nop fwnode operations
Add nop variants of i2c_find_device_by_fwnode(),
i2c_find_adapter_by_fwnode() and i2c_get_adapter_by_fwnode() for use
without CONFIG_I2C.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Michael Roth [Tue, 4 Jun 2024 23:35:10 +0000 (18:35 -0500)]
KVM: SEV-ES: Fix svm_get_msr()/svm_set_msr() for KVM_SEV_ES_INIT guests
With commit
27bd5fdc24c0 ("KVM: SEV-ES: Prevent MSR access post VMSA
encryption"), older VMMs like QEMU 9.0 and older will fail when booting
SEV-ES guests with something like the following error:
qemu-system-x86_64: error: failed to get MSR 0x174
qemu-system-x86_64: ../qemu.git/target/i386/kvm/kvm.c:3950: kvm_get_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
This is because older VMMs that might still call
svm_get_msr()/svm_set_msr() for SEV-ES guests after guest boot even if
those interfaces were essentially just noops because of the vCPU state
being encrypted and stored separately in the VMSA. Now those VMMs will
get an -EINVAL and generally crash.
Newer VMMs that are aware of KVM_SEV_INIT2 however are already aware of
the stricter limitations of what vCPU state can be sync'd during
guest run-time, so newer QEMU for instance will work both for legacy
KVM_SEV_ES_INIT interface as well as KVM_SEV_INIT2.
So when using KVM_SEV_INIT2 it's okay to assume userspace can deal with
-EINVAL, whereas for legacy KVM_SEV_ES_INIT the kernel might be dealing
with either an older VMM and so it needs to assume that returning
-EINVAL might break the VMM.
Address this by only returning -EINVAL if the guest was started with
KVM_SEV_INIT2. Otherwise, just silently return.
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Nikunj A Dadhania <nikunj@amd.com>
Reported-by: Srikanth Aithal <sraithal@amd.com>
Closes: https://lore.kernel.org/lkml/37usuu4yu4ok7be2hqexhmcyopluuiqj3k266z4gajc2rcj4yo@eujb23qc3zcm/
Fixes:
27bd5fdc24c0 ("KVM: SEV-ES: Prevent MSR access post VMSA encryption")
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <
20240604233510.764949-1-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rafael J. Wysocki [Fri, 21 Jun 2024 10:55:12 +0000 (12:55 +0200)]
Merge branch 'acpi-scan'
Merge ACPI device enumeration fixes for 6.10-rc5:
- Ignore MIPI camera graph port nodes created with the help of the
information from the ACPI tables on all Dell Tiger, Alder and Raptor
Lake models as that information is reported to be invalid on the
systems in question (Hans de Goede).
- Use new Intel CPU model matching macros in the MIPI DisCo for Imaging
part of ACPI device enumeration (Hans de Goede).
* acpi-scan:
ACPI: mipi-disco-img: Switch to new Intel CPU model defines
ACPI: scan: Ignore camera graph port nodes on all Dell Tiger, Alder and Raptor Lake models
Takashi Iwai [Fri, 21 Jun 2024 07:39:09 +0000 (09:39 +0200)]
ALSA: hda: Use imply for suggesting CONFIG_SERIAL_MULTI_INSTANTIATE
The recent fix introduced a reverse selection of
CONFIG_SERIAL_MULTI_INSTANTIATE, but its condition isn't always met.
Use a weak reverse selection to suggest the config for avoiding such
inconsistencies, instead.
Fixes:
9b1effff19cd ("ALSA: hda: cs35l56: Select SERIAL_MULTI_INSTANTIATE")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/
202406210732.ozgk8IMK-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/
202406211244.oLhoF3My-lkp@intel.com/
Reviewed-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20240621073915.19576-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Arnd Bergmann [Thu, 20 Jun 2024 16:23:04 +0000 (18:23 +0200)]
mips: fix compat_sys_lseek syscall
This is almost compatible, but passing a negative offset should result
in a EINVAL error, but on mips o32 compat mode would seek to a large
32-bit byte offset.
Use compat_sys_lseek() to correctly sign-extend the argument.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Jiaxun Yang [Sun, 16 Jun 2024 13:25:02 +0000 (14:25 +0100)]
MIPS: mipsmtregs: Fix target register for MFTC0
Target register of mftc0 should be __res instead of $1, this is
a leftover from old .insn code.
Fixes:
dd6d29a61489 ("MIPS: Implement microMIPS MT ASE helpers")
Cc: stable@vger.kernel.org
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Pablo Caño [Thu, 20 Jun 2024 15:25:33 +0000 (17:25 +0200)]
ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14AHP9
Lenovo Yoga Pro 7 14AHP9 (PCI SSID 17aa:3891) seems requiring a similar workaround like Yoga 9 model and Yoga 7 Pro 14APH8 for the bass speaker.
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/all/20231207182035.30248-1-tiwai@suse.de/
Signed-off-by: Pablo Caño <pablocpascual@gmail.com>
Link: https://patch.msgid.link/20240620152533.76712-1-pablocpascual@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Yang Li [Fri, 21 Jun 2024 02:18:40 +0000 (10:18 +0800)]
LoongArch: KVM: Remove an unneeded semicolon
Remove an unneeded semicolon to avoid build warnings:
./arch/loongarch/kvm/exit.c:764:2-3: Unneeded semicolon
Cc: stable@vger.kernel.org
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9343
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>