thermal: core: Delay exposing sysfs interface
author Lucas De Marchi <lucas.demarchi@intel.com>
Sat, 8 Mar 2025 01:02:01 +0000 (17:02 -0800)
committer Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Wed, 12 Mar 2025 20:24:33 +0000 (21:24 +0100)
There's a race between initializing the governor and userspace accessing
the sysfs interface. From time to time the Intel graphics CI shows this
signature:

<1>[] #PF: error_code(0x0000) - not-present page
<6>[] PGD 0 P4D 0
<4>[] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[] CPU: 3 UID: 0 PID: 562 Comm: thermald Not tainted 6.14.0-rc4-CI_DRM_16208-g7e37396f86d8+ #1
<4>[] Hardware name: Intel Corporation Twin Lake Client Platform/AlderLake-N LP5 RVP, BIOS TWLNFWI1.R00.5222.A01.2405290634 05/29/2024
<4>[] RIP: 0010:policy_show+0x1a/0x40

thermald tries to read the policy file after the sysfs files have been
created but before thermal_set_governor() has set the governor, which
causes the NULL pointer dereference in policy_show().

As is already done for the hwmon interface, delay exposing the sysfs
files until the governor is set.
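
The ordering problem can be illustrated with a small userspace sketch.
It is not the kernel code: struct zone, struct governor, the visible
flag, register_old() and register_new() are hypothetical stand-ins, and
the "step_wise" name is used only as an example governor. The sketch
models a reader that arrives as soon as the files become visible and
returns an error code where the real callback would dereference NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel structures, for illustration only. */
struct governor {
	const char *name;
};

struct zone {
	const struct governor *governor;	/* NULL until a governor is attached */
	bool visible;				/* models "sysfs files exist" */
};

static const struct governor example_governor = { .name = "step_wise" };

/*
 * Models a show callback that assumes the governor is already set.
 * Returns -2 if the file is not visible yet, -1 where an unchecked
 * callback would dereference NULL, or the number of bytes written.
 */
static int policy_show(const struct zone *tz, char *buf, size_t len)
{
	if (!tz->visible)
		return -2;
	if (!tz->governor)
		return -1;
	return snprintf(buf, len, "%s\n", tz->governor->name);
}

/* Old ordering: expose the files first, attach the governor afterwards. */
static int register_old(struct zone *tz, char *buf, size_t len)
{
	int seen;

	tz->visible = true;
	seen = policy_show(tz, buf, len);	/* reader arrives in the window */
	tz->governor = &example_governor;
	return seen;
}

/* New ordering: attach the governor first, then expose the files. */
static int register_new(struct zone *tz, char *buf, size_t len)
{
	tz->governor = &example_governor;
	tz->visible = true;
	return policy_show(tz, buf, len);	/* earliest possible read */
}
```

With the old ordering the simulated reader hits the unset governor; with
the new ordering the earliest possible read already sees a valid one.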

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13655
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patch.msgid.link/20250307-thermal-sysfs-race-v1-1-8a3d4d4ac9c4@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
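
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 2328ac0d8561b1d051a0dea6d7565c145d584cbb..f96ca2710928864d9df9a0eab4e88cb4d69f6cbc 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c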
@@ -1589,26 +1589,26 @@ thermal_zone_device_register_with_trips(const char *type,
 
        tz->state = TZ_STATE_FLAG_INIT;
 
+       result = dev_set_name(&tz->device, "thermal_zone%d", tz->id);
+       if (result)
+               goto remove_id;
+
+       thermal_zone_device_init(tz);
+
+       result = thermal_zone_init_governor(tz);
+       if (result)
+               goto remove_id;
+
        /* sys I/F */
        /* Add nodes that are always present via .groups */
        result = thermal_zone_create_device_groups(tz);
        if (result)
                goto remove_id;
 
-       result = dev_set_name(&tz->device, "thermal_zone%d", tz->id);
-       if (result) {
-               thermal_zone_destroy_device_groups(tz);
-               goto remove_id;
-       }
-       thermal_zone_device_init(tz);
        result = device_register(&tz->device);
        if (result)
                goto release_device;
 
-       result = thermal_zone_init_governor(tz);
-       if (result)
-               goto unregister;
-
        if (!tz->tzp || !tz->tzp->no_hwmon) {
                result = thermal_add_hwmon_sysfs(tz);
                if (result)