Masahiro Yamada [Mon, 20 Jan 2025 08:10:31 +0000 (17:10 +0900)]
kconfig: fix memory leak in sym_warn_unmet_dep()
The string allocated in sym_warn_unmet_dep() is never freed, leading
to a memory leak when an unmet dependency is detected.
Fixes:
f8f69dc0b4e0 ("kconfig: make unmet dependency warnings readable")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Masahiro Yamada [Mon, 20 Jan 2025 07:59:14 +0000 (16:59 +0900)]
kconfig: fix file name in warnings when loading KCONFIG_DEFCONFIG_LIST
Most 'make *config' commands use .config as the base configuration file.
When .config does not exist, Kconfig tries to load a file listed in
KCONFIG_DEFCONFIG_LIST instead.
However, since commit
b75b0a819af9 ("kconfig: change defconfig_list
option to environment variable"), warning messages have displayed an
incorrect file name in such cases.
Below is a demonstration using Debian Trixie. While loading
/boot/config-6.12.9-amd64, the warning messages incorrectly show .config
as the file name.
With this commit, the correct file name is displayed in warnings.
[Before]
$ rm -f .config
$ make config
#
# using defaults found in /boot/config-6.12.9-amd64
#
.config:6804:warning: symbol value 'm' invalid for FB_BACKLIGHT
.config:9895:warning: symbol value 'm' invalid for ANDROID_BINDER_IPC
[After]
$ rm -f .config
$ make config
#
# using defaults found in /boot/config-6.12.9-amd64
#
/boot/config-6.12.9-amd64:6804:warning: symbol value 'm' invalid for FB_BACKLIGHT
/boot/config-6.12.9-amd64:9895:warning: symbol value 'm' invalid for ANDROID_BINDER_IPC
Fixes:
b75b0a819af9 ("kconfig: change defconfig_list option to environment variable")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:55 +0000 (00:00 +0900)]
genksyms: fix syntax error for attribute before init-declarator
A longstanding issue with genksyms is that it has hidden syntax errors.
For example, genksyms fails to parse the following valid code:
int x, __attribute__((__section__(".init.data")))y;
Here, only 'y' is annotated by the attribute, although I am not aware
of actual uses of this pattern in the kernel tree.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
$ echo 'int x, __attribute__((__section__(".init.data")))y;' | scripts/genksyms/genksyms -w
<stdin>:1: syntax error
This commit allows attributes to be placed between a comma and
init_declarator.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:54 +0000 (00:00 +0900)]
genksyms: fix syntax error for builtin (u)int*x*_t types
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, genksyms fails to parse the following code in
arch/arm64/lib/xor-neon.c:
static inline uint64x2_t eor3(uint64x2_t p, uint64x2_t q, uint64x2_t r)
{
[ snip ]
}
The syntax error occurs because genksyms does not recognize the
uint64x2_t keyword.
This commit adds support for builtin types described in Arm Neon
Intrinsics Reference.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:53 +0000 (00:00 +0900)]
genksyms: fix syntax error for attribute after 'union'
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ fs/lockd/svc.i
$ cat fs/lockd/svc.i | scripts/genksyms/genksyms -w
[ snip ]
./include/net/addrconf.h:35: syntax error
The syntax error occurs in the following code in include/net/addrconf.h:
union __packed {
[ snip ]
};
The issue arises from __packed, which is defined as
__attribute__((__packed__)), immediately after the 'union' keyword.
This commit allows the 'union' keyword to be followed by attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:52 +0000 (00:00 +0900)]
genksyms: fix syntax error for attribute after 'struct'
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ arch/x86/kernel/cpu/mshyperv.i
$ cat arch/x86/kernel/cpu/mshyperv.i | scripts/genksyms/genksyms -w
[ snip ]
./arch/x86/include/asm/svm.h:122: syntax error
The syntax error occurs in the following code in arch/x86/include/asm/svm.h:
struct __attribute__ ((__packed__)) vmcb_control_area {
[ snip ]
};
The issue arises from __attribute__ immediately after the 'struct'
keyword.
This commit allows the 'struct' keyword to be followed by attributes.
The lexer must be adjusted because dont_want_brace_phase should not be
decremented while processing attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:51 +0000 (00:00 +0900)]
genksyms: fix syntax error for attribute after abstact_declarator
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ kernel/module/main.i
$ cat kernel/module/main.i | scripts/genksyms/genksyms -w
[ snip ]
kernel/module/main.c:97: syntax error
The syntax error occurs in the following code in kernel/module/main.c:
static void __mod_update_bounds(enum mod_mem_type type __maybe_unused, void *base,
unsigned int size, struct mod_tree_root *tree)
{
[ snip ]
}
The issue arises from __maybe_unused, which is defined as
__attribute__((__unused__)).
This commit allows direct_abstract_declarator to be followed with
attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:50 +0000 (00:00 +0900)]
genksyms: fix syntax error for attribute before nested_declarator
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ drivers/acpi/prmt.i
$ cat drivers/acpi/prmt.i | scripts/genksyms/genksyms -w
[ snip ]
drivers/acpi/prmt.c:56: syntax error
The syntax error occurs in the following code in drivers/acpi/prmt.c:
struct prm_handler_info {
[ snip ]
efi_status_t (__efiapi *handler_addr)(u64, void *);
[ snip ]
};
The issue arises from __efiapi, which is defined as either
__attribute__((ms_abi)) or __attribute__((regparm(0))).
This commit allows nested_declarator to be prefixed with attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:49 +0000 (00:00 +0900)]
genksyms: fix syntax error for attribute before abstract_declarator
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ init/main.i
$ cat init/main.i | scripts/genksyms/genksyms -w
[ snip ]
./include/linux/efi.h:1225: syntax error
The syntax error occurs in the following code in include/linux/efi.h:
efi_status_t
efi_call_acpi_prm_handler(efi_status_t (__efiapi *handler_addr)(u64, void *),
u64 param_buffer_addr, void *context);
The issue arises from __efiapi, which is defined as either
__attribute__((ms_abi)) or __attribute__((regparm(0))).
This commit allows abstract_declarator to be prefixed with attributes.
To avoid conflicts, I tweaked the rule for decl_specifier_seq. Due to
this change, a standalone attribute cannot become decl_specifier_seq.
Otherwise, I do not know how to resolve the conflicts.
The following code, which was previously accepted by genksyms, will now
result in a syntax error:
void my_func(__attribute__((unused))x);
I do not think it is a big deal because GCC also fails to parse it.
$ echo 'void my_func(__attribute__((unused))x);' | gcc -c -x c -
<stdin>:1:37: error: unknown type name 'x'
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:48 +0000 (00:00 +0900)]
genksyms: decouple ATTRIBUTE_PHRASE from type-qualifier
The __attribute__ keyword can appear in more contexts than 'const' or
'volatile'.
To avoid grammatical conflicts with future changes, ATTRIBUTE_PHRASE
should not be reduced into type_qualifier.
No functional changes are intended.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:47 +0000 (00:00 +0900)]
genksyms: record attributes consistently for init-declarator
I believe the missing action here is a bug.
For rules with no explicit action, the following default is used:
{ $$ = $1; }
However, in this case, $1 is the value of attribute_opt itself. As a
result, the value of attribute_opt is always NULL.
The following test code demonstrates inconsistent behavior.
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
The attribute is recorded only when followed by an initializer.
This commit adds the correct action to propagate the value of the
ATTRIBUTE_PHRASE token.
With this change, the attribute in the example above is consistently
recorded for both 'x' and 'y'.
[Before]
$ cat <<EOF | scripts/genksyms/genksyms -d
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
EOF
Defn for type0 x == <int x >
Defn for type0 y == <int y __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Hash table occupancy 2/4096 = 0.
000488281
[After]
$ cat <<EOF | scripts/genksyms/genksyms -d
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
EOF
Defn for type0 x == <int x __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Defn for type0 y == <int y __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Hash table occupancy 2/4096 = 0.
000488281
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:46 +0000 (00:00 +0900)]
genksyms: restrict direct-declarator to take one parameter-type-list
Similar to the previous commit, this change makes the parser logic a
little more accurate.
Currently, genksyms accepts the following invalid code:
struct foo {
int (*callback)(int)(int)(int);
};
A direct-declarator should not recursively absorb multiple
( parameter-type-list ) constructs.
In the example above, (*callback) should be followed by at most one
(int).
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:45 +0000 (00:00 +0900)]
genksyms: restrict direct-abstract-declarator to take one parameter-type-list
While there is no more grammatical ambiguity in genksyms, the parser
logic is still inaccurate.
For example, genksyms accepts the following invalid C code:
void my_func(int ()(int));
This should result in a syntax error because () cannot be reduced to
<direct-abstract-declarator>.
( <abstract-declarator> ) can be reduced, but <abstract-declarator>
must not be empty in the following grammar from K&R [1]:
<direct-abstract-declarator> ::= ( <abstract-declarator> )
| {<direct-abstract-declarator>}? [ {<constant-expression>}? ]
| {<direct-abstract-declarator>}? ( {<parameter-type-list>}? )
Furthermore, genksyms accepts the following weird code:
void my_func(int (*callback)(int)(int)(int));
The parser allows <direct-abstract-declarator> to recursively absorb
multiple ( {<parameter-type-list>}? ), but this behavior is incorrect.
In the example above, (*callback) should be followed by at most one
(int).
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:44 +0000 (00:00 +0900)]
genksyms: remove Makefile hack
This workaround was introduced for suppressing the reduce/reduce conflict
warnings because the %expect-rr directive, which is applicable only to GLR
parsers, cannot be used for genksyms.
Since there are no longer any conflicts, this Makefile hack is now
unnecessary.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:43 +0000 (00:00 +0900)]
genksyms: fix last 3 shift/reduce conflicts
The genksyms parser has ambiguities in its grammar, which are currently
suppressed by a workaround in scripts/genksyms/Makefile.
Building genksyms with W=1 generates the following warnings:
YACC scripts/genksyms/parse.tab.[ch]
scripts/genksyms/parse.y: warning: 3 shift/reduce conflicts [-Wconflicts-sr]
scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
The ambiguity arises when decl_specifier_seq is followed by '(' because
the following two interpretations are possible:
- decl_specifier_seq direct_abstract_declarator '(' parameter_declaration_clause ')'
- decl_specifier_seq '(' abstract_declarator ')'
This issue occurs because the current parser allows an empty string to
be reduced to direct_abstract_declarator, which is incorrect.
K&R [1] explains the correct grammar:
<parameter-declaration> ::= {<declaration-specifier>}+ <declarator>
| {<declaration-specifier>}+ <abstract-declarator>
| {<declaration-specifier>}+
<abstract-declarator> ::= <pointer>
| <pointer> <direct-abstract-declarator>
| <direct-abstract-declarator>
<direct-abstract-declarator> ::= ( <abstract-declarator> )
| {<direct-abstract-declarator>}? [ {<constant-expression>}? ]
| {<direct-abstract-declarator>}? ( {<parameter-type-list>}? )
This commit resolves all remaining conflicts.
We need to consider the difference between the following two examples:
[Example 1] ( <abstract-declarator> ) can become <direct-abstract-declarator>
void my_func(int (foo));
... is equivalent to:
void my_func(int foo);
[Example 2] ( <parameter-type-list> ) can become <direct-abstract-declarator>
typedef int foo;
void my_func(int (foo));
... is equivalent to:
void my_func(int (*callback)(int));
Please note that the function declaration is identical in both examples,
but the preceding typedef creates the distinction. I introduced a new
term, open_paren, to enable the type lookup immediately after the '('
token. Without this, we cannot distinguish between [Example 1] and
[Example 2].
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:42 +0000 (00:00 +0900)]
genksyms: fix 6 shift/reduce conflicts and 5 reduce/reduce conflicts
The genksyms parser has ambiguities in its grammar, which are currently
suppressed by a workaround in scripts/genksyms/Makefile.
Building genksyms with W=1 generates the following warnings:
YACC scripts/genksyms/parse.tab.[ch]
scripts/genksyms/parse.y: warning: 9 shift/reduce conflicts [-Wconflicts-sr]
scripts/genksyms/parse.y: warning: 5 reduce/reduce conflicts [-Wconflicts-rr]
scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
The comment in the parser describes the current problem:
/* This wasn't really a typedef name but an identifier that
shadows one. */
Consider the following simple C code:
typedef int foo;
void my_func(foo foo) {}
In the function parameter list (foo foo), the first 'foo' is a type
specifier (typedef'ed as 'int'), while the second 'foo' is an identifier.
However, the lexer cannot distinguish between the two. Since 'foo' is
already typedef'ed, the lexer returns TYPE for both instances, instead
of returning IDENT for the second one.
To support shadowed identifiers, TYPE can be reduced to either a
simple_type_specifier or a direct_abstract_declarator, which creates
a grammatical ambiguity.
Without analyzing the grammar context, it is very difficult to resolve
this correctly.
This commit introduces a flag, dont_want_type_specifier, which allows
the parser to inform the lexer whether an identifier is expected. When
dont_want_type_specifier is true, the type lookup is suppressed, and
the lexer returns IDENT regardless of any preceding typedef.
After this commit, only 3 shift/reduce conflicts will remain.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:41 +0000 (00:00 +0900)]
genksyms: reduce type_qualifier directly to decl_specifier
A type_qualifier (const, volatile, etc.) is not a type_specifier.
According to K&R [1], a type-qualifier should be directly reduced to
a declaration-specifier.
<declaration-specifier> ::= <storage-class-specifier>
| <type-specifier>
| <type-qualifier>
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:40 +0000 (00:00 +0900)]
genksyms: rename cvar_qualifier to type_qualifier
I believe "cvar" stands for "Const, Volatile, Attribute, or Restrict".
This is called "type-qualifier" in K&R. [1]
Adopt this more generic naming.
No functional changes are intended.
[1] https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Masahiro Yamada [Mon, 13 Jan 2025 15:00:39 +0000 (00:00 +0900)]
genksyms: rename m_abstract_declarator to abstract_declarator
This is called "abstract-declarator" in K&R. [1]
I am not sure what "m_" stands for, but the name is clear enough
without it.
No functional changes are intended.
[1] https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
Torsten Hilbrich [Mon, 13 Jan 2025 06:01:29 +0000 (07:01 +0100)]
kbuild: Fix signing issue for external modules
When running the sign script the kernel is within the source directory
of external modules. This caused issues when the kernel uses relative
paths, like:
make[5]: Entering directory '/build/client/devel/kernel/work/linux-2.6'
make[6]: Entering directory '/build/client/devel/addmodules/vtx/work/vtx'
INSTALL /build/client/devel/addmodules/vtx/_/lib/modules/6.13.0-devel+/extra/vtx.ko
SIGN /build/client/devel/addmodules/vtx/_/lib/modules/6.13.0-devel+/extra/vtx.ko
/bin/sh: 1: scripts/sign-file: not found
DEPMOD /build/client/devel/addmodules/vtx/_/lib/modules/6.13.0-devel+
Working around it by using absolute pathes here.
Fixes:
13b25489b6f8 ("kbuild: change working directory to external module directory with M=")
Signed-off-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Sun, 22 Dec 2024 00:15:00 +0000 (09:15 +0900)]
ARC: migrate to the generic rule for built-in DTB
Commit
654102df2ac2 ("kbuild: add generic support for built-in boot
DTBs") introduced generic support for built-in DTBs.
Select GENERIC_BUILTIN_DTB to use the generic rule.
To keep consistency across architectures, this commit also renames
CONFIG_ARC_BUILTIN_DTB_NAME to CONFIG_BUILTIN_DTB_NAME.
Now, "nsim_700" is the default value for CONFIG_BUILTIN_DTB_NAME, rather
than a fallback in case it is empty.
Acked-by: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 17:37:05 +0000 (17:37 +0000)]
rust: Use gendwarfksyms + extended modversions for CONFIG_MODVERSIONS
Previously, two things stopped Rust from using MODVERSIONS:
1. Rust symbols are occasionally too long to be represented in the
original versions table
2. Rust types cannot be properly hashed by the existing genksyms
approach because:
* Looking up type definitions in Rust is more complex than C
* Type layout is potentially dependent on the compiler in Rust,
not just the source type declaration.
CONFIG_EXTENDED_MODVERSIONS addresses the first point, and
CONFIG_GENDWARFKSYMS the second. If Rust wants to use MODVERSIONS, allow
it to do so by selecting both features.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Co-developed-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Matthew Maurer [Fri, 3 Jan 2025 17:37:04 +0000 (17:37 +0000)]
Documentation/kbuild: Document storage of symbol information
Document where exported and imported symbols are kept, format options,
and limitations.
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Matthew Maurer [Fri, 3 Jan 2025 17:37:03 +0000 (17:37 +0000)]
modpost: Allow extended modversions without basic MODVERSIONS
If you know that your kernel modules will only ever be loaded by a newer
kernel, you can disable BASIC_MODVERSIONS to save space. This also
allows easy creation of test modules to see how tooling will respond to
modules that only have the new format.
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Matthew Maurer [Fri, 3 Jan 2025 17:37:02 +0000 (17:37 +0000)]
modpost: Produce extended MODVERSIONS information
Generate both the existing modversions format and the new extended one
when running modpost. Presence of this metadata in the final .ko is
guarded by CONFIG_EXTENDED_MODVERSIONS.
We no longer generate an error on long symbols in modpost if
CONFIG_EXTENDED_MODVERSIONS is set, as they can now be appropriately
encoded in the extended section. These symbols will be skipped in the
previous encoding. An error will still be generated if
CONFIG_EXTENDED_MODVERSIONS is not set.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Matthew Maurer [Fri, 3 Jan 2025 17:37:01 +0000 (17:37 +0000)]
modules: Support extended MODVERSIONS info
Adds a new format for MODVERSIONS which stores each field in a separate
ELF section. This initially adds support for variable length names, but
could later be used to add additional fields to MODVERSIONS in a
backwards compatible way if needed. Any new fields will be ignored by
old user tooling, unlike the current format where user tooling cannot
tolerate adjustments to the format (for example making the name field
longer).
Since PPC munges its version records to strip leading dots, we reproduce
the munging for the new format. Other architectures do not appear to
have architecture-specific usage of this information.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:40 +0000 (20:45 +0000)]
Documentation/kbuild: Add DWARF module versioning
Add documentation for gendwarfksyms changes, and the kABI stability
features that can be useful for distributions even though they're not
used in mainline kernels.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:39 +0000 (20:45 +0000)]
kbuild: Add gendwarfksyms as an alternative to genksyms
When MODVERSIONS is enabled, allow selecting gendwarfksyms as the
implementation, but default to genksyms.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:38 +0000 (20:45 +0000)]
export: Add __gendwarfksyms_ptr_ references to exported symbols
With gendwarfksyms, we need each TU where the EXPORT_SYMBOL() macro
is used to also contain DWARF type information for the symbols it
exports. However, as a TU can also export external symbols and
compilers may choose not to emit debugging information for symbols not
defined in the current TU, the missing types will result in missing
symbol versions. Stand-alone assembly code also doesn't contain type
information for exported symbols, so we need to compile a temporary
object file with asm-prototypes.h instead, and similarly need to
ensure the DWARF in the temporary object file contains the necessary
types.
To always emit type information for external exports, add explicit
__gendwarfksyms_ptr_<symbol> references to them in EXPORT_SYMBOL().
gendwarfksyms will use the type information for __gendwarfksyms_ptr_*
if needed. Discard the pointers from the final binary to avoid further
bloat.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:37 +0000 (20:45 +0000)]
gendwarfksyms: Add support for symbol type pointers
The compiler may choose not to emit type information in DWARF for
external symbols. Clang, for example, does this for symbols not
defined in the current TU.
To provide a way to work around this issue, add support for
__gendwarfksyms_ptr_<symbol> pointers that force the compiler to emit
the necessary type information in DWARF also for the missing symbols.
Example usage:
#define GENDWARFKSYMS_PTR(sym) \
static typeof(sym) *__gendwarfksyms_ptr_##sym __used \
__section(".discard.gendwarfksyms") = &sym;
extern int external_symbol(void);
GENDWARFKSYMS_PTR(external_symbol);
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:36 +0000 (20:45 +0000)]
gendwarfksyms: Add support for reserved and ignored fields
Distributions that want to maintain a stable kABI need the ability
to make ABI compatible changes to kernel data structures without
affecting symbol versions, either because of LTS updates or backports.
With genksyms, developers would typically hide these changes from
version calculation with #ifndef __GENKSYMS__, which would result
in the symbol version not changing even though the actual type has
changed. When we process precompiled object files, this isn't an
option.
Change union processing to recognize field name prefixes that allow
the user to ignore the union completely during symbol versioning with
a __kabi_ignored prefix in a field name, or to replace the type of a
placeholder field using a __kabi_reserved field name prefix.
For example, assume we want to add a new field to an existing
alignment hole in a data structure, and ignore the new field when
calculating symbol versions:
struct struct1 {
int a;
/* a 4-byte alignment hole */
unsigned long b;
};
To add `int n` to the alignment hole, we can add a union that includes
a __kabi_ignored field that causes gendwarfksyms to ignore the entire
union:
struct struct1 {
int a;
union {
char __kabi_ignored_0;
int n;
};
unsigned long b;
};
With --stable, both structs produce the same symbol version.
Alternatively, when a distribution expects future modification to a
data structure, they can explicitly add reserved fields:
struct struct2 {
long a;
long __kabi_reserved_0; /* reserved for future use */
};
To take the field into use, we can again replace it with a union, with
one of the fields keeping the __kabi_reserved name prefix to indicate
the original type:
struct struct2 {
long a;
union {
long __kabi_reserved_0;
struct {
int b;
int v;
};
};
Here gendwarfksyms --stable replaces the union with the type of the
placeholder field when calculating versions.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:35 +0000 (20:45 +0000)]
gendwarfksyms: Add support for kABI rules
Distributions that want to maintain a stable kABI need the ability
to make ABI compatible changes to kernel without affecting symbol
versions, either because of LTS updates or backports.
With genksyms, developers would typically hide these changes from
version calculation with #ifndef __GENKSYMS__, which would result
in the symbol version not changing even though the actual type has
changed. When we process precompiled object files, this isn't an
option.
To support this use case, add a --stable command line flag that
gates kABI stability features that are not needed in mainline
kernels, but can be useful for distributions, and add support for
kABI rules, which can be used to restrict gendwarfksyms output.
The rules are specified as a set of null-terminated strings stored
in the .discard.gendwarfksyms.kabi_rules section. Each rule consists
of four strings as follows:
"version\0type\0target\0value"
The version string ensures the structure can be changed in a
backwards compatible way. The type string indicates the type of the
rule, and target and value strings contain rule-specific data.
Initially support two simple rules:
1. Declaration-only types
A type declaration can change into a full definition when
additional includes are pulled in to the TU, which changes the
versions of any symbol that references the type. Add support
for defining declaration-only types whose definition is not
expanded during versioning.
2. Ignored enumerators
It's possible to add new enum fields without changing the ABI,
but as the fields are included in symbol versioning, this would
change the versions. Add support for ignoring specific fields.
3. Overridden enumerator values
Add support for overriding enumerator values when calculating
versions. This may be needed when the last field of the enum
is used as a sentinel and new fields must be added before it.
Add examples for using the rules under the examples/ directory.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:34 +0000 (20:45 +0000)]
gendwarfksyms: Add symbol versioning
Calculate symbol versions from the fully expanded type strings in
type_map, and output the versions in a genksyms-compatible format.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:33 +0000 (20:45 +0000)]
gendwarfksyms: Add symtypes output
Add support for producing genksyms-style symtypes files. Process
die_map to find the longest expansions for each type, and use symtypes
references in type definitions. The basic file format is similar to
genksyms, with two notable exceptions:
1. Type names with spaces (common with Rust) in references are
wrapped in single quotes. E.g.:
s#'core::result::Result<u8, core::num::error::ParseIntError>'
2. The actual type definition is the simple parsed DWARF format we
output with --dump-dies, not the preprocessed C-style format
genksyms produces.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:32 +0000 (20:45 +0000)]
gendwarfksyms: Add die_map debugging
Debugging the DWARF processing can be somewhat challenging, so add
more detailed debugging output for die_map operations. Add the
--dump-die-map flag, which adds color coded tags to the output for
die_map changes.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:31 +0000 (20:45 +0000)]
gendwarfksyms: Limit structure expansion
Expand each structure type only once per exported symbol. This
is necessary to support self-referential structures, which would
otherwise result in infinite recursion, and it's sufficient for
catching ABI changes.
Types defined in .c files are opaque to external users and thus
cannot affect the ABI. Consider type definitions in .c files to
be declarations to prevent opaque types from changing symbol
versions.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:30 +0000 (20:45 +0000)]
gendwarfksyms: Expand structure types
Recursively expand DWARF structure types, i.e. structs, unions, and
enums. Also include relevant DWARF attributes in type strings to
encode structure layout, for example.
Example output with --dump-dies:
subprogram (
formal_parameter structure_type &str {
member pointer_type {
base_type u8 byte_size(1) encoding(7)
} data_ptr data_member_location(0) ,
member base_type usize byte_size(8) encoding(7) length data_member_location(8)
} byte_size(16) alignment(8) msg
)
-> base_type void
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:29 +0000 (20:45 +0000)]
gendwarfksyms: Expand array_type
Add support for expanding DW_TAG_array_type, and the subrange type
indicating array size.
Example source code:
const char *s[34];
Output with --dump-dies:
variable array_type[34] {
pointer_type {
const_type {
base_type char byte_size(1) encoding(6)
}
} byte_size(8)
}
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:28 +0000 (20:45 +0000)]
gendwarfksyms: Expand subroutine_type
Add support for expanding DW_TAG_subroutine_type and the parameters
in DW_TAG_formal_parameter. Use this to also expand subprograms.
Example output with --dump-dies:
subprogram (
formal_parameter pointer_type {
const_type {
base_type char byte_size(1) encoding(6)
}
}
)
-> base_type unsigned long byte_size(8) encoding(7)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:27 +0000 (20:45 +0000)]
gendwarfksyms: Expand type modifiers and typedefs
Add support for expanding DWARF type modifiers, such as pointers,
const values etc., and typedefs. These types all have DW_AT_type
attribute pointing to the underlying type, and thus produce similar
output.
Also add linebreaks and indentation to debugging output to make it
more readable.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:26 +0000 (20:45 +0000)]
gendwarfksyms: Add a cache for processed DIEs
Basic types in DWARF repeat frequently and traversing the DIEs using
libdw is relatively slow. Add a simple hashtable based cache for the
processed DIEs.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:25 +0000 (20:45 +0000)]
gendwarfksyms: Expand base_type
Start making gendwarfksyms more useful by adding support for
expanding DW_TAG_base_type types and basic DWARF attributes.
Example:
$ echo loops_per_jiffy | \
scripts/gendwarfksyms/gendwarfksyms \
--debug --dump-dies vmlinux.o
...
gendwarfksyms: process_symbol: loops_per_jiffy
variable base_type unsigned long byte_size(8) encoding(7)
...
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:24 +0000 (20:45 +0000)]
gendwarfksyms: Add address matching
The compiler may choose not to emit type information in DWARF for all
aliases, but it's possible for each alias to be exported separately.
To ensure we find type information for the aliases as well, read
{section, address} tuples from the symbol table and match symbols also
by address.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Sami Tolvanen [Fri, 3 Jan 2025 20:45:23 +0000 (20:45 +0000)]
tools: Add gendwarfksyms
Add a basic DWARF parser, which uses libdw to traverse the debugging
information in an object file and looks for functions and variables.
In follow-up patches, this will be expanded to produce symbol versions
for CONFIG_MODVERSIONS from DWARF.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Fri, 3 Jan 2025 07:30:43 +0000 (16:30 +0900)]
genksyms: use uint32_t instead of unsigned long for calculating CRC
Currently, 'unsigned long' is used for intermediate variables when
calculating CRCs.
The size of 'long' differs depending on the architecture: it is 32 bits
on 32-bit architectures and 64 bits on 64-bit architectures.
The CRC values generated by genksyms represent the compatibility of
exported symbols. Therefore, reproducibility is important. In other
words, we need to ensure that the output is the same when the kernel
source is identical, regardless of whether genksyms is running on a
32-bit or 64-bit build machine.
Fortunately, the output from genksyms is not affected by the build
machine's architecture because only the lower 32 bits of the
'unsigned long' variables are used.
To make it even clearer that the CRC calculation is independent of
the build machine's architecture, this commit explicitly uses the
fixed-width type, uint32_t.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Fri, 3 Jan 2025 07:30:42 +0000 (16:30 +0900)]
genksyms: use generic macros for hash table implementation
Use macros provided by hashtable.h
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Fri, 3 Jan 2025 07:30:41 +0000 (16:30 +0900)]
genksyms: refactor the return points in the for-loop in __add_symbol()
free_list() must be called before returning from this for-loop.
Swap 'break' and the combination of free_list() and 'return'.
This reduces the code and minimizes the risk of introducing memory
leaks in future changes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Fri, 3 Jan 2025 07:30:40 +0000 (16:30 +0900)]
genksyms: reduce the indentation in the for-loop in __add_symbol()
To improve readability, reduce the indentation as follows:
- Use 'continue' earlier when the symbol does not match
- flip !sym->is_declared to flatten the if-else chain
No functional changes are intended.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Fri, 3 Jan 2025 07:30:39 +0000 (16:30 +0900)]
genksyms: fix memory leak when the same symbol is read from *.symref file
When a symbol that is already registered is read again from *.symref
file, __add_symbol() removes the previous one from the hash table without
freeing it.
[Test Case]
$ cat foo.c
#include <linux/export.h>
void foo(void);
void foo(void) {}
EXPORT_SYMBOL(foo);
$ cat foo.symref
foo void foo ( void )
foo void foo ( void )
When a symbol is removed from the hash table, it must be freed along
with its ->name and ->defn members. However, sym->name cannot be freed
because it is sometimes shared with node->string, but not always. If
sym->name and node->string share the same memory, free(sym->name) could
lead to a double-free bug.
To resolve this issue, always assign a strdup'ed string to sym->name.
Fixes:
64e6c1e12372 ("genksyms: track symbol checksum changes")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Fri, 3 Jan 2025 07:30:38 +0000 (16:30 +0900)]
genksyms: fix memory leak when the same symbol is added from source
When a symbol that is already registered is added again, __add_symbol()
returns without freeing the symbol definition, making it unreachable.
The following test cases demonstrate different memory leak points.
[Test Case 1]
Forward declaration with exactly the same definition
$ cat foo.c
#include <linux/export.h>
void foo(void);
void foo(void) {}
EXPORT_SYMBOL(foo);
[Test Case 2]
Forward declaration with a different definition (e.g. attribute)
$ cat foo.c
#include <linux/export.h>
void foo(void);
__attribute__((__section__(".ref.text"))) void foo(void) {}
EXPORT_SYMBOL(foo);
[Test Case 3]
Preserving an overridden symbol (compile with KBUILD_PRESERVE=1)
$ cat foo.c
#include <linux/export.h>
void foo(void);
void foo(void) { }
EXPORT_SYMBOL(foo);
$ cat foo.symref
override foo void foo ( int )
The memory leaks in Test Case 1 and 2 have existed since the introduction
of genksyms into the kernel tree. [1]
The memory leak in Test Case 3 was introduced by commit
5dae9a550a74
("genksyms: allow to ignore symbol checksum changes").
When multiple init_declarators are reduced to an init_declarator_list,
the decl_spec must be duplicated. Otherwise, the following Test Case 4
would result in a double-free bug.
[Test Case 4]
$ cat foo.c
#include <linux/export.h>
extern int foo, bar;
int foo, bar;
EXPORT_SYMBOL(foo);
In this case, 'foo' and 'bar' share the same decl_spec, 'int'. It must
be unshared before being passed to add_symbol().
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=
46bd1da672d66ccd8a639d3c1f8a166048cca608
Fixes:
5dae9a550a74 ("genksyms: allow to ignore symbol checksum changes")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Sat, 28 Dec 2024 15:45:29 +0000 (00:45 +0900)]
modpost: zero-pad CRC values in modversion_info array
I do not think the '#' flag is useful here because adding the explicit
'0x' is clearer. Add the '0' flag to zero-pad the CRC values.
This change gives better alignment in the generated *.mod.c files.
There is no impact to the compiled modules.
[Before]
$ grep -A5 modversion_info fs/efivarfs/efivarfs.mod.c
static const struct modversion_info ____versions[]
__used __section("__versions") = {
{ 0x907d14d, "blocking_notifier_chain_register" },
{ 0x53d3b64, "simple_inode_init_ts" },
{ 0x65487097, "__x86_indirect_thunk_rax" },
{ 0x122c3a7e, "_printk" },
[After]
$ grep -A5 modversion_info fs/efivarfs/efivarfs.mod.c
static const struct modversion_info ____versions[]
__used __section("__versions") = {
{ 0x0907d14d, "blocking_notifier_chain_register" },
{ 0x053d3b64, "simple_inode_init_ts" },
{ 0x65487097, "__x86_indirect_thunk_rax" },
{ 0x122c3a7e, "_printk" },
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Sat, 28 Dec 2024 15:45:28 +0000 (00:45 +0900)]
module: get symbol CRC back to unsigned
Commit
71810db27c1c ("modversions: treat symbol CRCs as 32 bit
quantities") changed the CRC fields to s32 because the __kcrctab and
__kcrctab_gpl sections contained relative references to the actual
CRC values stored in the .rodata section when CONFIG_MODULE_REL_CRCS=y.
Commit
7b4537199a4a ("kbuild: link symbol CRCs at final link, removing
CONFIG_MODULE_REL_CRCS") removed this complexity. Now, the __kcrctab
and __kcrctab_gpl sections directly contain the CRC values in all cases.
The genksyms tool outputs unsigned 32-bit CRC values, so u32 is preferred
over s32.
No functional changes are intended.
Regardless of this change, the CRC value is assigned to the u32 variable
'crcval' before the comparison, as seen in kernel/module/version.c:
crcval = *crc;
It was previously mandatory (but now optional) in order to avoid sign
extension because the following line previously compared 'unsigned long'
and 's32':
if (versions[i].crc == crcval)
return 1;
versions[i].crc is still 'unsigned long' for backward compatibility.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Rolf Eike Beer [Thu, 19 Dec 2024 07:20:34 +0000 (08:20 +0100)]
kconfig: qconf: use preferred form of QString API
A QString constructed from a character literal of length 0, i.e. "", is not
"null" for historical reasons. This does not matter here so use the preferred
method isEmpty() instead.
Also directly construct empty QString objects instead of passing in an empty
character literal that has to be parsed into an empty object first.
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Link: https://doc.qt.io/qt-6/qstring.html#distinction-between-null-and-empty-strings
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
HONG Yifan [Wed, 18 Dec 2024 20:20:11 +0000 (20:20 +0000)]
kheaders: prevent `find` from seeing perl temp files
Symptom:
The command
find ... | xargs ... perl -i
occasionally triggers error messages like the following, with the build
still succeeding:
Can't open <redacted>/kernel/.tmp_dir/include/dt-bindings/clock/XXNX4nW9: No such file or directory.
Analysis:
With strace, the root cause has been identified to be `perl -i` creating
temporary files inside ${tmpdir}, which causes `find` to see the
temporary files and emit the names. `find` is likely implemented with
readdir. POSIX `readdir` says:
If a file is removed from or added to the directory after the most
recent call to opendir() or rewinddir(), whether a subsequent call
to readdir() returns an entry for that file is unspecified.
So if the libc that `find` links against choose to return that entry
in readdir(), a possible sequence of events is the following:
1. find emits foo.h
2. xargs executes `perl -i foo.h`
3. perl (pid=100) creates temporary file `XXXXXXXX`
4. find sees file `XXXXXXXX` and emit it
5. PID 100 exits, cleaning up the temporary file `XXXXXXXX`
6. xargs executes `perl -i XXXXXXXX`
7. perl (pid=200) tries to read the file, but it doesn't exist any more.
... triggering the error message.
One can reproduce the bug with the following command (assuming PWD
contains the list of headers in kheaders.tar.xz)
for i in $(seq 100); do
find -type f -print0 |
xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;';
done
With a `find` linking against musl libc, the error message is emitted
6/100 times.
The fix:
This change stores the results of `find` before feeding them into xargs.
find and xargs will no longer be able to see temporary files that perl
creates after this change.
Signed-off-by: HONG Yifan <elsk@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Wed, 18 Dec 2024 10:37:08 +0000 (19:37 +0900)]
kheaders: use 'tar' instead of 'cpio' for copying files
The 'cpio' command is used solely for copying header files to the
temporary directory. However, there is no strong reason to use 'cpio'
for this purpose. For example, scripts/package/install-extmod-build
uses the 'tar' command to copy files.
This commit replaces the use of 'cpio' with 'tar' because 'tar' is
already used in this script to generate kheaders_data.tar.xz anyway.
Performance-wide, there is no significant difference between 'cpio'
and 'tar'.
[Before]
$ rm -fr kheaders; mkdir kheaders
$ time sh -c '
for f in include arch/x86/include
do
find "$f" -name "*.h"
done | cpio --quiet -pd kheaders
'
real 0m0.148s
user 0m0.021s
sys 0m0.140s
[After]
$ rm -fr kheaders; mkdir kheaders
$ time sh -c '
for f in include arch/x86/include
do
find "$f" -name "*.h"
done | tar -c -f - -T - | tar -xf - -C kheaders
'
real 0m0.098s
user 0m0.024s
sys 0m0.131s
Revert commit
69ef0920bdd3 ("Docs: Add cpio requirement to changes.rst")
because 'cpio' is not used anywhere else during the kernel build.
Please note that the built-in initramfs is created by the in-tree tool,
usr/gen_init_cpio, so it does not rely on the external 'cpio' command
at all.
Remove 'cpio' from the package build dependencies as well.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Wed, 18 Dec 2024 10:37:07 +0000 (19:37 +0900)]
kheaders: rename the 'cpio_dir' variable to 'tmpdir'
The next commit will get rid of the use of 'cpio' command, as there is
no strong reason to use it just for copying files.
Before that, this commit renames the 'cpio_dir' variable to 'tmpdir'.
No functional changes are intended.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Wed, 18 Dec 2024 10:37:06 +0000 (19:37 +0900)]
kheaders: avoid unnecessary process forks of grep
Exclude include/generated/{utsversion.h,autoconf.h} by using the -path
option to reduce the cost of forking new processes.
No functional changes are intended.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Wed, 18 Dec 2024 10:37:05 +0000 (19:37 +0900)]
kheaders: exclude include/generated/utsversion.h from kheaders_data.tar.xz
CONFIG_IKHEADERS has a reproducibility issue because the contents of
kernel/kheaders_data.tar.xz can vary depending on how you build the
kernel.
If you build the kernel with CONFIG_IKHEADERS enabled from a pristine
state, the tarball does not include include/generated/utsversion.h.
$ make -s mrproper
$ make -s defconfig
$ scripts/config -e CONFIG_IKHEADERS
$ make -s
$ tar Jtf kernel/kheaders_data.tar.xz | grep utsversion
However, if you build the kernel with CONFIG_IKHEADERS disabled first
and then enable it later, the tarball does include
include/generated/utsversion.h.
$ make -s mrproper
$ make -s defconfig
$ make -s
$ scripts/config -e CONFIG_IKHEADERS
$ make -s
$ tar Jtf kernel/kheaders_data.tar.xz | grep utsversion
./include/generated/utsversion.h
It is not predictable whether a stale include/generated/utsversion.h
remains when kheaders_data.tar.xz is generated.
For better reproducibility, include/generated/utsversions.h should
always be omitted. It is not necessary for the kheaders anyway.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Tue, 10 Dec 2024 10:24:41 +0000 (19:24 +0900)]
kbuild: suppress stdout from merge_config for silent builds
merge_config does not respect the Make's -s (--silent) option.
Let's sink the stdout from merge_config for silent builds.
This commit does not cater to the direct invocation of merge_config.sh
(e.g. arch/mips/Makefile).
Reported-by: Leon Romanovsky <leon@kernel.org>
Closes: https://lore.kernel.org/all/
e534ce33b0e1060eb85ece8429810f087b034c88.
1733234008.git.leonro@nvidia.com/
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Masahiro Yamada [Tue, 10 Dec 2024 10:06:17 +0000 (19:06 +0900)]
kbuild: refactor cross-compiling linux-headers package
Since commit
13b25489b6f8 ("kbuild: change working directory to external
module directory with M="), when cross-building host programs for the
linux-headers package, the "Entering directory" and "Leaving directory"
messages appear multiple times, and each object path shown is relative
to the working directory. This makes it difficult to track which objects
are being rebuilt.
In hindsight, using the external module build (M=) was not a good idea.
This commit simplifies the script by leveraging the run-command target,
resulting in a cleaner build log again.
[Before]
$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- bindeb-pkg
[ snip ]
Rebuilding host programs with aarch64-linux-gnu-gcc...
make[5]: Entering directory '/home/masahiro/linux'
make[6]: Entering directory '/home/masahiro/linux/debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+'
HOSTCC scripts/kallsyms
HOSTCC scripts/sorttable
HOSTCC scripts/asn1_compiler
make[6]: Leaving directory '/home/masahiro/linux/debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+'
make[5]: Leaving directory '/home/masahiro/linux'
make[5]: Entering directory '/home/masahiro/linux'
make[6]: Entering directory '/home/masahiro/linux/debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+'
HOSTCC scripts/basic/fixdep
HOSTCC scripts/mod/modpost.o
HOSTCC scripts/mod/file2alias.o
HOSTCC scripts/mod/sumversion.o
HOSTCC scripts/mod/symsearch.o
HOSTLD scripts/mod/modpost
make[6]: Leaving directory '/home/masahiro/linux/debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+'
make[5]: Leaving directory '/home/masahiro/linux'
[After]
$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- bindeb-pkg
[ snip ]
HOSTCC debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+/scripts/basic/fixdep
HOSTCC debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+/scripts/kallsyms
HOSTCC debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+/scripts/sorttable
HOSTCC debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+/scripts/asn1_compiler
HOSTCC debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+/scripts/mod/modpost.o
HOSTCC debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+/scripts/mod/file2alias.o
HOSTCC debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+/scripts/mod/sumversion.o
HOSTCC debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+/scripts/mod/symsearch.o
HOSTLD debian/linux-headers-6.13.0-rc1+/usr/src/linux-headers-6.13.0-rc1+/scripts/mod/modpost
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Johannes Schauer Marin Rodrigues [Tue, 3 Dec 2024 16:17:35 +0000 (17:17 +0100)]
kbuild: deb-pkg: allow hooks also in /usr/share/kernel
By passing an additional directory to run-parts, allow Debian and its
derivatives to ship maintainer scripts in /usr while at the same time
allowing the local admin to override or disable them by placing hooks of
the same name in /etc. This adds support for the mechanism described in
the UAPI Configuration Files Specification for kernel hooks. The same
idea is also used by udev, systemd or modprobe for their config files.
https://uapi-group.org/specifications/specs/configuration_files_specification/
This functionality relies on run-parts 5.21 or later. It is the
responsibility of packages installing hooks into /usr/share/kernel to
also declare a Depends: debianutils (>= 5.21).
KDEB_HOOKDIR can be used to change the list of directories that is
searched. By default, /etc/kernel and /usr/share/kernel are hook
directories. Since the list of directories in KDEB_HOOKDIR is separated
by spaces, the paths must not contain the space character themselves.
Signed-off-by: Johannes Schauer Marin Rodrigues <josch@mister-muffin.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Tue, 3 Dec 2024 13:59:58 +0000 (22:59 +0900)]
kbuild: deb-pkg: do not include empty hook directories
The linux-image package currently includes empty hook directories
(/etc/kernel/{pre,post}{inst,rm}.d/ by default).
These directories were perhaps intended as a fail-safe in case no
hook scripts exist there.
However, they are really unnecessary because the run-parts command is
already guarded by the following check:
test -d ${debhookdir}/${script}.d && run-parts ...
The only difference is that the run-parts command either runs for empty
directories (resulting in a no-op) or is skipped entirely.
The maintainer scripts will succeed without these dummy directories.
The linux-image packages from the Debian kernel do not contain
/etc/kernel/*.d/, either.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Linus Torvalds [Sun, 5 Jan 2025 22:13:40 +0000 (14:13 -0800)]
Linux 6.13-rc6
Linus Torvalds [Sun, 5 Jan 2025 18:52:47 +0000 (10:52 -0800)]
Merge tag 'kbuild-fixes-v6.13-3' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix escaping of '$' in scripts/mksysmap
- Fix a modpost crash observed with the latest binutils
- Fix 'provides' in the linux-api-headers pacman package
* tag 'kbuild-fixes-v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: pacman-pkg: provide versioned linux-api-headers package
modpost: work around unaligned data access error
modpost: refactor do_vmbus_entry()
modpost: fix the missed iteration for the max bit in do_input()
scripts/mksysmap: Fix escape chars '$'
Linus Torvalds [Sun, 5 Jan 2025 18:37:45 +0000 (10:37 -0800)]
Merge tag 'mm-hotfixes-stable-2025-01-04-18-02' of git://git./linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"25 hotfixes. 16 are cc:stable. 18 are MM and 7 are non-MM.
The usual bunch of singletons and two doubletons - please see the
relevant changelogs for details"
* tag 'mm-hotfixes-stable-2025-01-04-18-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits)
MAINTAINERS: change Arınç _NAL's name and email address
scripts/sorttable: fix orc_sort_cmp() to maintain symmetry and transitivity
mm/util: make memdup_user_nul() similar to memdup_user()
mm, madvise: fix potential workingset node list_lru leaks
mm/damon/core: fix ignored quota goals and filters of newly committed schemes
mm/damon/core: fix new damon_target objects leaks on damon_commit_targets()
mm/list_lru: fix false warning of negative counter
vmstat: disable vmstat_work on vmstat_cpu_down_prep()
mm: shmem: fix the update of 'shmem_falloc->nr_unswapped'
mm: shmem: fix incorrect index alignment for within_size policy
percpu: remove intermediate variable in PERCPU_PTR()
mm: zswap: fix race between [de]compression and CPU hotunplug
ocfs2: fix slab-use-after-free due to dangling pointer dqi_priv
fs/proc/task_mmu: fix pagemap flags with PMD THP entries on 32bit
kcov: mark in_softirq_really() as __always_inline
docs: mm: fix the incorrect 'FileHugeMapped' field
mailmap: modify the entry for Mathieu Othacehe
mm/kmemleak: fix sleeping function called from invalid context at print message
mm: hugetlb: independent PMD page table shared count
maple_tree: reload mas before the second call for mas_empty_area
...
Linus Torvalds [Sun, 5 Jan 2025 18:28:34 +0000 (10:28 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A randconfig build fix and a performance fix:
- Fix the CONFIG_RESET_CONTROLLER=n path signature of
clk_imx8mp_audiomix_reset_controller_register() to appease
randconfig
- Speed up the sdhci clk on TH1520 by a factor of 4 by adding
a fixed factor clk"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: clk-imx8mp-audiomix: fix function signature
clk: thead: Fix TH1520 emmc and shdci clock rate
Thomas Weißschuh [Fri, 3 Jan 2025 18:20:23 +0000 (19:20 +0100)]
kbuild: pacman-pkg: provide versioned linux-api-headers package
The Arch Linux glibc package contains a versioned dependency on
"linux-api-headers". If the linux-api-headers package provided by
pacman-pkg does not specify an explicit version this dependency is not
satisfied.
Fix the dependency by providing an explicit version.
Fixes:
c8578539deba ("kbuild: add script and target to generate pacman package")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Linus Torvalds [Sat, 4 Jan 2025 18:59:10 +0000 (10:59 -0800)]
Merge tag 'linux-watchdog-6.13-rc6' of git://linux-watchdog.org/linux-watchdog
Pull watchdog fix from Wim Van Sebroeck:
- fix error message during stm32 driver probe
* tag 'linux-watchdog-6.13-rc6' of git://www.linux-watchdog.org/linux-watchdog:
watchdog: stm32_iwdg: fix error message during driver probe
Linus Torvalds [Fri, 3 Jan 2025 23:09:12 +0000 (15:09 -0800)]
Merge tag 'sched_ext-for-6.13-rc5-fixes' of git://git./linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
- Fix a bug where bpf_iter_scx_dsq_new() was not initializing the
iterator's flags and could inadvertently enable e.g. reverse
iteration
- Fix a bug where scx_ops_bypass() could call irq_restore twice
- Add Andrea and Changwoo as maintainers for better review coverage
- selftests and tools/sched_ext build and other fixes
* tag 'sched_ext-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: Fix dsq_local_on selftest
sched_ext: initialize kit->cursor.flags
sched_ext: Fix invalid irq restore in scx_ops_bypass()
MAINTAINERS: add me as reviewer for sched_ext
MAINTAINERS: add self as reviewer for sched_ext
scx: Fix maximal BPF selftest prog
sched_ext: fix application of sizeof to pointer
selftests/sched_ext: fix build after renames in sched_ext API
sched_ext: Add __weak to fix the build errors
Linus Torvalds [Fri, 3 Jan 2025 23:03:56 +0000 (15:03 -0800)]
Merge tag 'wq-for-6.13-rc5-fixes' of git://git./linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:
- Suppress a corner case spurious flush dependency warning
- Two trivial changes
* tag 'wq-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: add printf attribute to __alloc_workqueue()
workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker
rust: add safety comment in workqueue traits
Linus Torvalds [Fri, 3 Jan 2025 22:58:57 +0000 (14:58 -0800)]
Merge tag 'block-6.13-
20250103' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"Collection of fixes for block. Particularly the target name overflow
has been a bit annoying, as it results in overwriting random memory
and hence shows up as triggering various other bugs.
- NVMe pull request via Keith:
- Fix device specific quirk for PRP list alignment (Robert)
- Fix target name overflow (Leo)
- Fix target write granularity (Luis)
- Fix target sleeping in atomic context (Nilay)
- Remove unnecessary tcp queue teardown (Chunguang)
- Simple cdrom typo fix"
* tag 'block-6.13-
20250103' of git://git.kernel.dk/linux:
cdrom: Fix typo, 'devicen' to 'device'
nvme-tcp: remove nvme_tcp_destroy_io_queues()
nvmet-loop: avoid using mutex in IO hotpath
nvmet: propagate npwg topology
nvmet: Don't overflow subsysnqn
nvme-pci: 512 byte aligned dma pool segment quirk
Linus Torvalds [Fri, 3 Jan 2025 22:45:59 +0000 (14:45 -0800)]
Merge tag 'io_uring-6.13-
20250103' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Fix an issue with the read multishot support and posting of CQEs from
io-wq context
- Fix a regression introduced in this cycle, where making the timeout
lock a raw one uncovered another locking dependency. As a result,
move the timeout flushing outside of the timeout lock, punting them
to a local list first
- Fix use of an uninitialized variable in io_async_msghdr. Doesn't
really matter functionally, but silences a valid KMSAN complaint that
it's not always initialized
- Fix use of incrementally provided buffers for read on non-pollable
files, where the buffer always gets committed upfront. Unfortunately
the buffer address isn't resolved first, so the read ends up using
the updated rather than the current value
* tag 'io_uring-6.13-
20250103' of git://git.kernel.dk/linux:
io_uring/kbuf: use pre-committed buffer address for non-pollable file
io_uring/net: always initialize kmsg->msg.msg_inq upfront
io_uring/timeout: flush timeouts outside of the timeout lock
io_uring/rw: fix downgraded mshot read
Linus Torvalds [Fri, 3 Jan 2025 22:36:54 +0000 (14:36 -0800)]
Merge tag 'net-6.13-rc6' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireles and netfilter.
Nothing major here. Over the last two weeks we gathered only around
two-thirds of our normal weekly fix count, but delaying sending these
until -rc7 seemed like a really bad idea.
AFAIK we have no bugs under investigation. One or two reverts for
stuff for which we haven't gotten a proper fix will likely come in the
next PR.
Current release - fix to a fix:
- netfilter: nft_set_hash: unaligned atomic read on struct
nft_set_ext
- eth: gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup
Previous releases - regressions:
- net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets
- mptcp:
- fix sleeping rcvmsg sleeping forever after bad recvbuffer adjust
- fix TCP options overflow
- prevent excessive coalescing on receive, fix throughput
- net: fix memory leak in tcp_conn_request() if map insertion fails
- wifi: cw1200: fix potential NULL dereference after conversion to
GPIO descriptors
- phy: micrel: dynamically control external clock of KSZ PHY, fix
suspend behavior
Previous releases - always broken:
- af_packet: fix VLAN handling with MSG_PEEK
- net: restrict SO_REUSEPORT to inet sockets
- netdev-genl: avoid empty messages in NAPI get
- dsa: microchip: fix set_ageing_time function on KSZ9477 and LAN937X
- eth:
- gve: XDP fixes around transmit, queue wakeup etc.
- ti: icssg-prueth: fix firmware load sequence to prevent time
jump which breaks timesync related operations
Misc:
- netlink: specs: mptcp: add missing attr and improve documentation"
* tag 'net-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init
net: ti: icssg-prueth: Fix firmware load sequence.
mptcp: prevent excessive coalescing on receive
mptcp: don't always assume copied data in mptcp_cleanup_rbuf()
mptcp: fix recvbuffer adjust on sleeping rcvmsg
ila: serialize calls to nf_register_net_hooks()
af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK
af_packet: fix vlan_get_tci() vs MSG_PEEK
net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init()
net: restrict SO_REUSEPORT to inet sockets
net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets
net: sfc: Correct key_len for efx_tc_ct_zone_ht_params
net: wwan: t7xx: Fix FSM command timeout issue
sky2: Add device ID 11ab:4373 for Marvell
88E8075
mptcp: fix TCP options overflow.
net: mv643xx_eth: fix an OF node reference leak
gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup
eth: bcmsysport: fix call balance of priv->clk handling routines
net: llc: reset skb->transport_header
netlink: specs: mptcp: fix missing doc
...
Linus Torvalds [Fri, 3 Jan 2025 22:16:25 +0000 (14:16 -0800)]
Merge tag 'nios2_update_for_v6.14' of git://git./linux/kernel/git/dinguyen/linux
Pull nios2 fixlet from Dinh Nguyen:
- Use str_yes_no() helper function
* tag 'nios2_update_for_v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
nios2: Use str_yes_no() helper in show_cpuinfo()
Linus Torvalds [Fri, 3 Jan 2025 19:09:35 +0000 (11:09 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"A lot of fixes accumulated over the holiday break:
- Static tool fixes, value is already proven to be NULL, possible
integer overflow
- Many bnxt_re fixes:
- Crashes due to a mismatch in the maximum SGE list size
- Don't waste memory for user QPs by creating kernel-only
structures
- Fix compatability issues with older HW in some of the new HW
features recently introduced: RTS->RTS feature, work around 9096
- Do not allow destroy_qp to fail
- Validate QP MTU against device limits
- Add missing validation on madatory QP attributes for RTR->RTS
- Report port_num in query_qp as required by the spec
- Fix creation of QPs of the maximum queue size, and in the
variable mode
- Allow all QPs to be used on newer HW by limiting a work around
only to HW it affects
- Use the correct MSN table size for variable mode QPs
- Add missing locking in create_qp() accessing the qp_tbl
- Form WQE buffers correctly when some of the buffers are 0 hop
- Don't crash on QP destroy if the userspace doesn't setup the
dip_ctx
- Add the missing QP flush handler call on the DWQE path to avoid
hanging on error recovery
- Consistently use ENXIO for return codes if the devices is
fatally errored
- Try again to fix VLAN support on iwarp, previous fix was reverted
due to breaking other cards
- Correct error path return code for rdma netlink events
- Remove the seperate net_device pointer in siw and rxe which
syzkaller found a way to UAF
- Fix a UAF of a stack ib_sge in rtrs
- Fix a regression where old mlx5 devices and FW were wrongly
activing new device features and failing"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (28 commits)
RDMA/mlx5: Enable multiplane mode only when it is supported
RDMA/bnxt_re: Fix error recovery sequence
RDMA/rtrs: Ensure 'ib_sge list' is accessible
RDMA/rxe: Remove the direct link to net_device
RDMA/hns: Fix missing flush CQE for DWQE
RDMA/hns: Fix warning storm caused by invalid input in IO path
RDMA/hns: Fix accessing invalid dip_ctx during destroying QP
RDMA/hns: Fix mapping error of zero-hop WQE buffer
RDMA/bnxt_re: Fix the locking while accessing the QP table
RDMA/bnxt_re: Fix MSN table size for variable wqe mode
RDMA/bnxt_re: Add send queue size check for variable wqe
RDMA/bnxt_re: Disable use of reserved wqes
RDMA/bnxt_re: Fix max_qp_wrs reported
RDMA/siw: Remove direct link to net_device
RDMA/nldev: Set error code in rdma_nl_notify_event
RDMA/bnxt_re: Fix reporting hw_ver in query_device
RDMA/bnxt_re: Fix to export port num to ib_query_qp
RDMA/bnxt_re: Fix setting mandatory attributes for modify_qp
RDMA/bnxt_re: Add check for path mtu in modify_qp
RDMA/bnxt_re: Fix the check for 9060 condition
...
Linus Torvalds [Fri, 3 Jan 2025 18:57:57 +0000 (10:57 -0800)]
Merge tag 'pinctrl-v6.13-2' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- A small Kconfig fixup for the i.MX.
In principle this could come in from the SoC tree but the bug was
introduced from the pin control tree so let's fix it from here.
- Fix a sleep in atomic context in the MCP23xxx GPIO expander by
disabling the regmap locking and using explicit mutex locks.
* tag 'pinctrl-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: mcp23s08: Fix sleeping in atomic context due to regmap locking
ARM: imx: Re-introduce the PINCTRL selection
Linus Torvalds [Fri, 3 Jan 2025 18:54:51 +0000 (10:54 -0800)]
Merge tag 'sound-6.13-rc6' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"The first new year pull request: no surprises, all small fixes,
including:
- Follow-up fixes for the new compress-offload API extension
- A couple of fixes for MIDI 2.0 UMP handling
- A trivial race fix for OSS sequencer emulation ioctls
- USB-audio and HD-audio fixes / quirks"
* tag 'sound-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: seq: Check UMP support for midi_version change
ALSA hda/realtek: Add quirk for Framework F111:000C
Revert "ALSA: ump: Don't enumeration invalid groups for legacy rawmidi"
ALSA: seq: oss: Fix races at processing SysEx messages
ALSA: compress_offload: fix remaining descriptor races in sound/core/compress_offload.c
ALSA: compress_offload: Drop unneeded no_free_ptr()
ALSA: hda/tas2781: Ignore SUBSYS_ID not found for tas2563 projects
ALSA: usb-audio: US16x08: Initialize array before use
Linus Torvalds [Fri, 3 Jan 2025 18:06:44 +0000 (10:06 -0800)]
Merge tag 'drm-fixes-2025-01-03' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Happy New Year.
It was fairly quiet for holidays period, certainly nothing that worth
getting off the couch before I needed to, this is for the past two
weeks, i915, xe and some adv7511, I expect we will see some amdgpu etc
happening next week, but otherwise all quiet.
i915:
- Fix C10 pll programming sequence [cx0_phy]
- Fix power gate sequence. [dg1]
xe:
- uapi: Revert some devcoredump file format changes breaking a mesa
debug tool
- Fixes around waits when moving to system
- Fix a typo when checking for LMEM provisioning
- Fix a fault on fd close after unbind
- A couple of OA fixes squashed for stable backporting
adv7511:
- fix UAF
- drop single lane support
- audio infoframe fix"
* tag 'drm-fixes-2025-01-03' of https://gitlab.freedesktop.org/drm/kernel:
xe/oa: Fix query mode of operation for OAR/OAC
drm/i915/dg1: Fix power gate sequence.
drm/i915/cx0_phy: Fix C10 pll programming sequence
drm/xe: Fix fault on fd close after unbind
drm/xe/pf: Use correct function to check LMEM provisioning
drm/xe: Wait for migration job before unmapping pages
drm/xe: Use non-interruptible wait when moving BO to system
drm/xe: Revert some changes that break a mesa debug tool
drm: adv7511: Drop dsi single lane support
dt-bindings: display: adi,adv7533: Drop single lane support
drm: adv7511: Fix use-after-free in adv7533_attach_dsi()
drm/bridge: adv7511_audio: Update Audio InfoFrame properly
Linus Torvalds [Fri, 3 Jan 2025 18:04:43 +0000 (10:04 -0800)]
Merge tag 'ftrace-v6.13-rc5-2' of git://git./linux/kernel/git/trace/linux-trace
Pull ftrace fixes from Steven Rostedt:
- Add needed READ_ONCE() around access to the fgraph array element
The updates to the fgraph array can happen when callbacks are
registered and unregistered. The __ftrace_return_to_handler() can
handle reading either the old value or the new value. But once it
reads that value it must stay consistent otherwise the check that
looks to see if the value is a stub may show false, but if the
compiler decides to re-read after that check, it can be true which
can cause the code to crash later on.
- Make function profiler use the top level ops for filtering again
When function graph became available for instances, its filter ops
became independent from the top level set_ftrace_filter. In the
process the function profiler received its own filter ops as well.
But the function profiler uses the top level set_ftrace_filter file
and does not have one of its own. In giving it its own filter ops, it
lost any user interface it once had. Make it use the top level
set_ftrace_filter file again. This fixes a regression.
* tag 'ftrace-v6.13-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ftrace: Fix function profiler's filtering functionality
fgraph: Add READ_ONCE() when accessing fgraph_array[]
Jens Axboe [Fri, 3 Jan 2025 16:29:09 +0000 (09:29 -0700)]
io_uring/kbuf: use pre-committed buffer address for non-pollable file
For non-pollable files, buffer ring consumption will commit upfront.
This is fine, but io_ring_buffer_select() will return the address of the
buffer after having committed it. For incrementally consumed buffers,
this is incorrect as it will modify the buffer address.
Store the pre-committed value and return that. If that isn't done, then
the initial part of the buffer is not used and the application will
correctly assume the content arrived at the start of the userspace
buffer, but the kernel will have put it later in the buffer. Or it can
cause a spurious -EFAULT returned in the CQE, depending on the buffer
size. As bounds are suitably checked for doing the actual IO, no adverse
side effects are possible - it's just a data misplacement within the
existing buffer.
Reported-by: Gwendal Fernet <gwendalfernet@gmail.com>
Cc: stable@vger.kernel.org
Fixes:
ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Mark Zhang [Thu, 19 Dec 2024 12:23:36 +0000 (14:23 +0200)]
RDMA/mlx5: Enable multiplane mode only when it is supported
Driver queries vport_cxt.num_plane and enables multiplane when it is
greater then 0, but some old FWs (versions from x.40.1000 till x.42.1000),
report vport_cxt.num_plane = 1 unexpectedly.
Fix it by querying num_plane only when HCA_CAP2.multiplane bit is set.
Fixes:
2a5db20fa532 ("RDMA/mlx5: Add support to multi-plane device and port")
Link: https://patch.msgid.link/r/1ef901acdf564716fcf550453cf5e94f343777ec.1734610916.git.leon@kernel.org
Cc: stable@vger.kernel.org
Reported-by: Francesco Poli <invernomuto@paranoici.org>
Closes: https://lore.kernel.org/all/nvs4i2v7o6vn6zhmtq4sgazy2hu5kiulukxcntdelggmznnl7h@so3oul6uwgbl/
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
David S. Miller [Fri, 3 Jan 2025 11:54:06 +0000 (11:54 +0000)]
Merge branch 'net-iep-clock-module-fixes'
Meghana Malladi says:
====================
IEP clock module bug fixes
This series has some bug fixes for IEP module needed by PPS and
timesync operations.
Patch 1/2 fixes firmware load sequence to run all the firmwares
when either of the ethernet interfaces is up. Move all the code
common for firmware bringup under common functions.
Patch 2/2 fixes distorted PPS signal when the ethernet interfaces
are brough down and up. This patch also fixes enabling PPS signal
after bringing the interface up, without disabling PPS.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Meghana Malladi [Mon, 23 Dec 2024 15:15:50 +0000 (20:45 +0530)]
net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init
When ICSSG interfaces are brought down and brought up again, the
pru cores are shut down and booted again, flushing out all the memories
and start again in a clean state. Hence it is expected that the
IEP_CMP_CFG register needs to be flushed during iep_init() to ensure
that the existing residual configuration doesn't cause any unusual
behavior. If the register is not cleared, existing IEP_CMP_CFG set for
CMP1 will result in SYNC0_OUT signal based on the SYNC_OUT register values.
After bringing the interface up, calling PPS enable doesn't work as
the driver believes PPS is already enabled, (iep->pps_enabled is not
cleared during interface bring down) and driver will just return true
even though there is no signal. Fix this by disabling pps and perout.
Fixes:
c1e0230eeaab ("net: ti: icss-iep: Add IEP driver")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
MD Danish Anwar [Mon, 23 Dec 2024 15:15:49 +0000 (20:45 +0530)]
net: ti: icssg-prueth: Fix firmware load sequence.
Timesync related operations are ran in PRU0 cores for both ICSSG SLICE0
and SLICE1. Currently whenever any ICSSG interface comes up we load the
respective firmwares to PRU cores and whenever interface goes down, we
stop the resective cores. Due to this, when SLICE0 goes down while
SLICE1 is still active, PRU0 firmwares are unloaded and PRU0 core is
stopped. This results in clock jump for SLICE1 interface as the timesync
related operations are no longer running.
As there are interdependencies between SLICE0 and SLICE1 firmwares,
fix this by running both PRU0 and PRU1 firmwares as long as at least 1
ICSSG interface is up. Add new flag in prueth struct to check if all
firmwares are running and remove the old flag (fw_running).
Use emacs_initialized as reference count to load the firmwares for the
first and last interface up/down. Moving init_emac_mode and fw_offload_mode
API outside of icssg_config to icssg_common_start API as they need
to be called only once per firmware boot.
Change prueth_emac_restart() to return error code and add error prints
inside the caller of this functions in case of any failures.
Move prueth_emac_stop() from common to sr1 driver.
sr1 and sr2 drivers have different logic handling for stopping
the firmwares. While sr1 driver is dependent on emac structure
to stop the corresponding pru cores for that slice, for sr2
all the pru cores of both the slices are stopped and is not
dependent on emac. So the prueth_emac_stop() function is no
longer common and can be moved to sr1 driver.
Fixes:
c1e0230eeaab ("net: ti: icss-iep: Add IEP driver")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 3 Jan 2025 02:44:05 +0000 (18:44 -0800)]
Merge branch 'mptcp-rx-path-fixes'
Matthieu Baerts says:
====================
mptcp: rx path fixes
Here are 3 different fixes, all related to the MPTCP receive buffer:
- Patch 1: fix receive buffer space when recvmsg() blocks after
receiving some data. For a fix introduced in v6.12, backported to
v6.1.
- Patch 2: mptcp_cleanup_rbuf() can be called when no data has been
copied. For 5.11.
- Patch 3: prevent excessive coalescing on receive, which can affect the
throughput badly. It looks better to wait a bit before backporting
this one to stable versions, to get more results. For 5.10.
====================
Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-0-8608af434ceb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Mon, 30 Dec 2024 18:12:32 +0000 (19:12 +0100)]
mptcp: prevent excessive coalescing on receive
Currently the skb size after coalescing is only limited by the skb
layout (the skb must not carry frag_list). A single coalesced skb
covering several MSS can potentially fill completely the receive
buffer. In such a case, the snd win will zero until the receive buffer
will be empty again, affecting tput badly.
Fixes:
8268ed4c9d19 ("mptcp: introduce and use mptcp_try_coalesce()")
Cc: stable@vger.kernel.org # please delay 2 weeks after 6.13-final release
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-3-8608af434ceb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Mon, 30 Dec 2024 18:12:31 +0000 (19:12 +0100)]
mptcp: don't always assume copied data in mptcp_cleanup_rbuf()
Under some corner cases the MPTCP protocol can end-up invoking
mptcp_cleanup_rbuf() when no data has been copied, but such helper
assumes the opposite condition.
Explicitly drop such assumption and performs the costly call only
when strictly needed - before releasing the msk socket lock.
Fixes:
fd8976790a6c ("mptcp: be careful on MPTCP-level ack.")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-2-8608af434ceb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Mon, 30 Dec 2024 18:12:30 +0000 (19:12 +0100)]
mptcp: fix recvbuffer adjust on sleeping rcvmsg
If the recvmsg() blocks after receiving some data - i.e. due to
SO_RCVLOWAT - the MPTCP code will attempt multiple times to
adjust the receive buffer size, wrongly accounting every time the
cumulative of received data - instead of accounting only for the
delta.
Address the issue moving mptcp_rcv_space_adjust just after the
data reception and passing it only the just received bytes.
This also removes an unneeded difference between the TCP and MPTCP
RX code path implementation.
Fixes:
581302298524 ("mptcp: error out earlier on disconnect")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-1-8608af434ceb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Mon, 30 Dec 2024 16:28:49 +0000 (16:28 +0000)]
ila: serialize calls to nf_register_net_hooks()
syzbot found a race in ila_add_mapping() [1]
commit
031ae72825ce ("ila: call nf_unregister_net_hooks() sooner")
attempted to fix a similar issue.
Looking at the syzbot repro, we have concurrent ILA_CMD_ADD commands.
Add a mutex to make sure at most one thread is calling nf_register_net_hooks().
[1]
BUG: KASAN: slab-use-after-free in rht_key_hashfn include/linux/rhashtable.h:159 [inline]
BUG: KASAN: slab-use-after-free in __rhashtable_lookup.constprop.0+0x426/0x550 include/linux/rhashtable.h:604
Read of size 4 at addr
ffff888028f40008 by task dhcpcd/5501
CPU: 1 UID: 0 PID: 5501 Comm: dhcpcd Not tainted
6.13.0-rc4-syzkaller-00054-gd6ef8b40d075 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xc3/0x620 mm/kasan/report.c:489
kasan_report+0xd9/0x110 mm/kasan/report.c:602
rht_key_hashfn include/linux/rhashtable.h:159 [inline]
__rhashtable_lookup.constprop.0+0x426/0x550 include/linux/rhashtable.h:604
rhashtable_lookup include/linux/rhashtable.h:646 [inline]
rhashtable_lookup_fast include/linux/rhashtable.h:672 [inline]
ila_lookup_wildcards net/ipv6/ila/ila_xlat.c:127 [inline]
ila_xlat_addr net/ipv6/ila/ila_xlat.c:652 [inline]
ila_nf_input+0x1ee/0x620 net/ipv6/ila/ila_xlat.c:185
nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
nf_hook_slow+0xbb/0x200 net/netfilter/core.c:626
nf_hook.constprop.0+0x42e/0x750 include/linux/netfilter.h:269
NF_HOOK include/linux/netfilter.h:312 [inline]
ipv6_rcv+0xa4/0x680 net/ipv6/ip6_input.c:309
__netif_receive_skb_one_core+0x12e/0x1e0 net/core/dev.c:5672
__netif_receive_skb+0x1d/0x160 net/core/dev.c:5785
process_backlog+0x443/0x15f0 net/core/dev.c:6117
__napi_poll.constprop.0+0xb7/0x550 net/core/dev.c:6883
napi_poll net/core/dev.c:6952 [inline]
net_rx_action+0xa94/0x1010 net/core/dev.c:7074
handle_softirqs+0x213/0x8f0 kernel/softirq.c:561
__do_softirq kernel/softirq.c:595 [inline]
invoke_softirq kernel/softirq.c:435 [inline]
__irq_exit_rcu+0x109/0x170 kernel/softirq.c:662
irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
sysvec_apic_timer_interrupt+0xa4/0xc0 arch/x86/kernel/apic/apic.c:1049
Fixes:
7f00feaf1076 ("ila: Add generic ILA translation facility")
Reported-by: syzbot+47e761d22ecf745f72b9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/
6772c9ae.
050a0220.2f3838.04c7.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Tom Herbert <tom@herbertland.com>
Link: https://patch.msgid.link/20241230162849.2795486-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Mon, 30 Dec 2024 16:10:04 +0000 (16:10 +0000)]
af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK
Blamed commit forgot MSG_PEEK case, allowing a crash [1] as found
by syzbot.
Rework vlan_get_protocol_dgram() to not touch skb at all,
so that it can be used from many cpus on the same skb.
Add a const qualifier to skb argument.
[1]
skbuff: skb_under_panic: text:
ffffffff8a8ccd05 len:29 put:14 head:
ffff88807fc8e400 data:
ffff88807fc8e3f4 tail:0x11 end:0x140 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:206 !
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 1 UID: 0 PID: 5892 Comm: syz-executor883 Not tainted
6.13.0-rc4-syzkaller-00054-gd6ef8b40d075 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:skb_panic net/core/skbuff.c:206 [inline]
RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216
Code: 0b 8d 48 c7 c6 86 d5 25 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 5a 69 79 f7 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3
RSP: 0018:
ffffc900038d7638 EFLAGS:
00010282
RAX:
0000000000000087 RBX:
dffffc0000000000 RCX:
609ffd18ea660600
RDX:
0000000000000000 RSI:
0000000080000000 RDI:
0000000000000000
RBP:
ffff88802483c8d0 R08:
ffffffff817f0a8c R09:
1ffff9200071ae60
R10:
dffffc0000000000 R11:
fffff5200071ae61 R12:
0000000000000140
R13:
ffff88807fc8e400 R14:
ffff88807fc8e3f4 R15:
0000000000000011
FS:
00007fbac5e006c0(0000) GS:
ffff8880b8700000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007fbac5e00d58 CR3:
000000001238e000 CR4:
00000000003526f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<TASK>
skb_push+0xe5/0x100 net/core/skbuff.c:2636
vlan_get_protocol_dgram+0x165/0x290 net/packet/af_packet.c:585
packet_recvmsg+0x948/0x1ef0 net/packet/af_packet.c:3552
sock_recvmsg_nosec net/socket.c:1033 [inline]
sock_recvmsg+0x22f/0x280 net/socket.c:1055
____sys_recvmsg+0x1c6/0x480 net/socket.c:2803
___sys_recvmsg net/socket.c:2845 [inline]
do_recvmmsg+0x426/0xab0 net/socket.c:2940
__sys_recvmmsg net/socket.c:3014 [inline]
__do_sys_recvmmsg net/socket.c:3037 [inline]
__se_sys_recvmmsg net/socket.c:3030 [inline]
__x64_sys_recvmmsg+0x199/0x250 net/socket.c:3030
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes:
79eecf631c14 ("af_packet: Handle outgoing VLAN packets without hardware offloading")
Reported-by: syzbot+74f70bb1cb968bf09e4f@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/
6772c485.
050a0220.2f3838.04c5.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Chengen Du <chengen.du@canonical.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20241230161004.2681892-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Mon, 30 Dec 2024 16:10:03 +0000 (16:10 +0000)]
af_packet: fix vlan_get_tci() vs MSG_PEEK
Blamed commit forgot MSG_PEEK case, allowing a crash [1] as found
by syzbot.
Rework vlan_get_tci() to not touch skb at all,
so that it can be used from many cpus on the same skb.
Add a const qualifier to skb argument.
[1]
skbuff: skb_under_panic: text:
ffffffff8a8da482 len:32 put:14 head:
ffff88807a1d5800 data:
ffff88807a1d5810 tail:0x14 end:0x140 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:206 !
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 0 UID: 0 PID: 5880 Comm: syz-executor172 Not tainted
6.13.0-rc3-syzkaller-00762-g9268abe611b0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:skb_panic net/core/skbuff.c:206 [inline]
RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216
Code: 0b 8d 48 c7 c6 9e 6c 26 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 3a 5a 79 f7 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3
RSP: 0018:
ffffc90003baf5b8 EFLAGS:
00010286
RAX:
0000000000000087 RBX:
dffffc0000000000 RCX:
8565c1eec37aa000
RDX:
0000000000000000 RSI:
0000000080000000 RDI:
0000000000000000
RBP:
ffff88802616fb50 R08:
ffffffff817f0a4c R09:
1ffff92000775e50
R10:
dffffc0000000000 R11:
fffff52000775e51 R12:
0000000000000140
R13:
ffff88807a1d5800 R14:
ffff88807a1d5810 R15:
0000000000000014
FS:
00007fa03261f6c0(0000) GS:
ffff8880b8600000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007ffd65753000 CR3:
0000000031720000 CR4:
00000000003526f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<TASK>
skb_push+0xe5/0x100 net/core/skbuff.c:2636
vlan_get_tci+0x272/0x550 net/packet/af_packet.c:565
packet_recvmsg+0x13c9/0x1ef0 net/packet/af_packet.c:3616
sock_recvmsg_nosec net/socket.c:1044 [inline]
sock_recvmsg+0x22f/0x280 net/socket.c:1066
____sys_recvmsg+0x1c6/0x480 net/socket.c:2814
___sys_recvmsg net/socket.c:2856 [inline]
do_recvmmsg+0x426/0xab0 net/socket.c:2951
__sys_recvmmsg net/socket.c:3025 [inline]
__do_sys_recvmmsg net/socket.c:3048 [inline]
__se_sys_recvmmsg net/socket.c:3041 [inline]
__x64_sys_recvmmsg+0x199/0x250 net/socket.c:3041
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
Fixes:
79eecf631c14 ("af_packet: Handle outgoing VLAN packets without hardware offloading")
Reported-by: syzbot+8400677f3fd43f37d3bc@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/
6772c485.
050a0220.2f3838.04c6.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Chengen Du <chengen.du@canonical.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20241230161004.2681892-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maciej S. Szmigiero [Sun, 29 Dec 2024 16:46:58 +0000 (17:46 +0100)]
net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init()
ipc_mmio_init() used the post-decrement operator in its loop continuing
condition of "retries" counter being "> 0", which meant that when this
condition caused loop exit "retries" counter reached -1.
But the later valid exec stage failure check only tests for "retries"
counter being exactly zero, so it didn't trigger in this case (but
would wrongly trigger if the code reaches a valid exec stage in the
very last loop iteration).
Fix this by using the pre-decrement operator instead, so the loop counter
is exactly zero on valid exec stage failure.
Fixes:
dc0514f5d828 ("net: iosm: mmio scratchpad")
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Link: https://patch.msgid.link/8b19125a825f9dcdd81c667c1e5c48ba28d505a6.1735490770.git.mail@maciej.szmigiero.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 31 Dec 2024 16:05:27 +0000 (16:05 +0000)]
net: restrict SO_REUSEPORT to inet sockets
After blamed commit, crypto sockets could accidentally be destroyed
from RCU call back, as spotted by zyzbot [1].
Trying to acquire a mutex in RCU callback is not allowed.
Restrict SO_REUSEPORT socket option to inet sockets.
v1 of this patch supported TCP, UDP and SCTP sockets,
but fcnal-test.sh test needed RAW and ICMP support.
[1]
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 24, name: ksoftirqd/1
preempt_count: 100, expected: 0
RCU nest depth: 0, expected: 0
1 lock held by ksoftirqd/1/24:
#0:
ffffffff8e937ba0 (rcu_callback){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
#0:
ffffffff8e937ba0 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2561 [inline]
#0:
ffffffff8e937ba0 (rcu_callback){....}-{0:0}, at: rcu_core+0xa37/0x17a0 kernel/rcu/tree.c:2823
Preemption disabled at:
[<
ffffffff8161c8c8>] softirq_handle_begin kernel/softirq.c:402 [inline]
[<
ffffffff8161c8c8>] handle_softirqs+0x128/0x9b0 kernel/softirq.c:537
CPU: 1 UID: 0 PID: 24 Comm: ksoftirqd/1 Not tainted
6.13.0-rc3-syzkaller-00174-ga024e377efed #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
__might_resched+0x5d4/0x780 kernel/sched/core.c:8758
__mutex_lock_common kernel/locking/mutex.c:562 [inline]
__mutex_lock+0x131/0xee0 kernel/locking/mutex.c:735
crypto_put_default_null_skcipher+0x18/0x70 crypto/crypto_null.c:179
aead_release+0x3d/0x50 crypto/algif_aead.c:489
alg_do_release crypto/af_alg.c:118 [inline]
alg_sock_destruct+0x86/0xc0 crypto/af_alg.c:502
__sk_destruct+0x58/0x5f0 net/core/sock.c:2260
rcu_do_batch kernel/rcu/tree.c:2567 [inline]
rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2823
handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561
run_ksoftirqd+0xca/0x130 kernel/softirq.c:950
smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Fixes:
8c7138b33e5c ("net: Unpublish sk from sk_reuseport_cb before call_rcu")
Reported-by: syzbot+b3e02953598f447d4d2a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/
6772f2f4.
050a0220.2f3838.04cb.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20241231160527.3994168-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Willem de Bruijn [Wed, 1 Jan 2025 16:47:40 +0000 (11:47 -0500)]
net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets
The blamed commit disabled hardware offoad of IPv6 packets with
extension headers on devices that advertise NETIF_F_IPV6_CSUM,
based on the definition of that feature in skbuff.h:
* * - %NETIF_F_IPV6_CSUM
* - Driver (device) is only able to checksum plain
* TCP or UDP packets over IPv6. These are specifically
* unencapsulated packets of the form IPv6|TCP or
* IPv6|UDP where the Next Header field in the IPv6
* header is either TCP or UDP. IPv6 extension headers
* are not supported with this feature. This feature
* cannot be set in features for a device with
* NETIF_F_HW_CSUM also set. This feature is being
* DEPRECATED (see below).
The change causes skb_warn_bad_offload to fire for BIG TCP
packets.
[ 496.310233] WARNING: CPU: 13 PID: 23472 at net/core/dev.c:3129 skb_warn_bad_offload+0xc4/0xe0
[ 496.310297] ? skb_warn_bad_offload+0xc4/0xe0
[ 496.310300] skb_checksum_help+0x129/0x1f0
[ 496.310303] skb_csum_hwoffload_help+0x150/0x1b0
[ 496.310306] validate_xmit_skb+0x159/0x270
[ 496.310309] validate_xmit_skb_list+0x41/0x70
[ 496.310312] sch_direct_xmit+0x5c/0x250
[ 496.310317] __qdisc_run+0x388/0x620
BIG TCP introduced an IPV6_TLV_JUMBO IPv6 extension header to
communicate packet length, as this is an IPv6 jumbogram. But, the
feature is only enabled on devices that support BIG TCP TSO. The
header is only present for PF_PACKET taps like tcpdump, and not
transmitted by physical devices.
For this specific case of extension headers that are not
transmitted, return to the situation before the blamed commit
and support hardware offload.
ipv6_has_hopopt_jumbo() tests not only whether this header is present,
but also that it is the only extension header before a terminal (L4)
header.
Fixes:
04c20a9356f2 ("net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Closes: https://lore.kernel.org/netdev/CANn89iK1hdC3Nt8KPhOtTF8vCPc1AHDCtse_BTNki1pWxAByTQ@mail.gmail.com/
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250101164909.1331680-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Liang Jie [Mon, 30 Dec 2024 09:37:09 +0000 (17:37 +0800)]
net: sfc: Correct key_len for efx_tc_ct_zone_ht_params
In efx_tc_ct_zone_ht_params, the key_len was previously set to
offsetof(struct efx_tc_ct_zone, linkage). This calculation is incorrect
because it includes any padding between the zone field and the linkage
field due to structure alignment, which can vary between systems.
This patch updates key_len to use sizeof_field(struct efx_tc_ct_zone, zone)
, ensuring that the hash table correctly uses the zone as the key. This fix
prevents potential hash lookup errors and improves connection tracking
reliability.
Fixes:
c3bb5c6acd4e ("sfc: functions to register for conntrack zone offload")
Signed-off-by: Liang Jie <liangjie@lixiang.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20241230093709.3226854-1-buaajxlj@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dave Airlie [Fri, 3 Jan 2025 00:57:00 +0000 (10:57 +1000)]
Merge tag 'drm-xe-fixes-2025-01-02' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- A couple of OA fixes squashed for stable backporting (Umesh)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z3bur0RmH6-70YSh@fedora
Dave Airlie [Fri, 3 Jan 2025 00:43:36 +0000 (10:43 +1000)]
Merge tag 'drm-misc-fixes-2025-01-02' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.13-rc6:
- Only fixes for adv7511 driver, including a use-after-free.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f58429b7-5f11-4b78-b577-de32b41299ea@linux.intel.com
Dave Airlie [Fri, 3 Jan 2025 00:40:43 +0000 (10:40 +1000)]
Merge tag 'drm-intel-fixes-2024-12-25' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Fix C10 pll programming sequence [cx0_phy] (Suraj Kandpal)
- Fix power gate sequence. [dg1] (Rodrigo Vivi)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z2wKf7tmElKFdnoP@linux
Dave Airlie [Fri, 3 Jan 2025 00:28:43 +0000 (10:28 +1000)]
Merge tag 'drm-xe-fixes-2024-12-23' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
UAPI Changes:
- Revert some devcoredump file format changes
breaking a mesa debug tool (John)
Driver Changes:
- Fixes around waits when moving to system (Nirmoy)
- Fix a typo when checking for LMEM provisioning (Michal)
- Fix a fault on fd close after unbind (Lucas)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z2mjt7OTfH76cgua@fedora
Jens Axboe [Thu, 2 Jan 2025 23:32:51 +0000 (16:32 -0700)]
io_uring/net: always initialize kmsg->msg.msg_inq upfront
syzbot reports that ->msg_inq may get used uinitialized from the
following path:
BUG: KMSAN: uninit-value in io_recv_buf_select io_uring/net.c:1094 [inline]
BUG: KMSAN: uninit-value in io_recv+0x930/0x1f90 io_uring/net.c:1158
io_recv_buf_select io_uring/net.c:1094 [inline]
io_recv+0x930/0x1f90 io_uring/net.c:1158
io_issue_sqe+0x420/0x2130 io_uring/io_uring.c:1740
io_queue_sqe io_uring/io_uring.c:1950 [inline]
io_req_task_submit+0xfa/0x1d0 io_uring/io_uring.c:1374
io_handle_tw_list+0x55f/0x5c0 io_uring/io_uring.c:1057
tctx_task_work_run+0x109/0x3e0 io_uring/io_uring.c:1121
tctx_task_work+0x6d/0xc0 io_uring/io_uring.c:1139
task_work_run+0x268/0x310 kernel/task_work.c:239
io_run_task_work+0x43a/0x4a0 io_uring/io_uring.h:343
io_cqring_wait io_uring/io_uring.c:2527 [inline]
__do_sys_io_uring_enter io_uring/io_uring.c:3439 [inline]
__se_sys_io_uring_enter+0x204f/0x4ce0 io_uring/io_uring.c:3330
__x64_sys_io_uring_enter+0x11f/0x1a0 io_uring/io_uring.c:3330
x64_sys_call+0xce5/0x3c30 arch/x86/include/generated/asm/syscalls_64.h:427
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
and it is correct, as it's never initialized upfront. Hence the first
submission can end up using it uninitialized, if the recv wasn't
successful and the networking stack didn't honor ->msg_get_inq being set
and filling in the output value of ->msg_inq as requested.
Set it to 0 upfront when it's allocated, just to silence this KMSAN
warning. There's no side effect of using it uninitialized, it'll just
potentially cause the next receive to use a recv value hint that's not
accurate.
Fixes:
c6f32c7d9e09 ("io_uring/net: get rid of ->prep_async() for receive side")
Reported-by: syzbot+068ff190354d2f74892f@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>