linux-block.git
11 days agoMerge branch 'pci/controller/cadence'
Bjorn Helgaas [Wed, 4 Jun 2025 15:50:04 +0000 (10:50 -0500)]
Merge branch 'pci/controller/cadence'

- Drop a runtime PM 'put' to resolve a runtime atomic count underflow (Hans
  Zhang)

- Use shared PCIE_MSG_CODE_* definitions and remove duplicate
  cdns_pcie_msg_code definitions (Hans Zhang)

- Make the cadence core buildable as a module (Kishon Vijay Abraham I)

- Add cdns_pcie_host_disable() and cdns_pcie_ep_disable() for use by
  loadable drivers when they are removed (Siddharth Vadapalli)

- Make j721e buildable as a loadable and removable module (Siddharth
  Vadapalli)

- Fix j721e host/endpoint dependencies that result in link failures in
  some configs (Arnd Bergmann)

* pci/controller/cadence:
  PCI: j721e: Fix host/endpoint dependencies
  PCI: j721e: Add support to build as a loadable module
  PCI: cadence-ep: Introduce cdns_pcie_ep_disable() helper for cleanup
  PCI: cadence-host: Introduce cdns_pcie_host_disable() helper for cleanup
  PCI: cadence: Add support to build pcie-cadence library as a kernel module
  PCI: cadence: Remove duplicate message code definitions
  PCI: cadence: Fix runtime atomic count underflow

11 days agoMerge branch 'pci/controller/apple'
Bjorn Helgaas [Wed, 4 Jun 2025 15:50:04 +0000 (10:50 -0500)]
Merge branch 'pci/controller/apple'

- Skip ports disabled in DT when setting up ports (Janne Grunau)

- Add t6020 compatible string (Alyssa Rosenzweig)

- Extract ECAM bridge creation helper from pci_host_common_probe() to
  separate driver-specific things like MSI from PCI things (Marc Zyngier)

- Dynamically allocate RID-to_SID bitmap to prepare for SoCs with varying
  capabilities (Marc Zyngier)

- Directly set/clear INTx mask bits because T602x dropped the accessors
  that could do this without locking (Marc Zyngier)

- Move port PHY registers to their own reg items to accommodate T602x,
  which moves them around; retain default offsets for existing DTs that
  lack phy%d entries with the reg offsets (Hector Martin)

- Stop polling for core refclk, which doesn't work on T602x and the
  bootloader has already done anyway (Hector Martin)

- Use gpiod_set_value_cansleep() when asserting PERST# in probe because
  we're allowed to sleep there (Hector Martin)

- Move register offsets into SoC-specific structure (Hector Martin)

- Add T602x PCIe support (Hector Martin)

* pci/controller/apple:
  PCI: apple: Add T602x PCIe support
  PCI: apple: Abstract register offsets via a SoC-specific structure
  PCI: apple: Use gpiod_set_value_cansleep in probe flow
  PCI: apple: Drop poll for CORE_RC_PHYIF_STAT_REFCLK
  PCI: apple: Move port PHY registers to their own reg items
  PCI: apple: Fix missing OF node reference in apple_pcie_setup_port
  PCI: apple: Move away from INTMSK{SET,CLR} for INTx and private interrupts
  PCI: apple: Dynamically allocate RID-to_SID bitmap
  PCI: apple: Move over to standalone probing
  PCI: ecam: Allow cfg->priv to be pre-populated from the root port device
  PCI: host-generic: Extract an ECAM bridge creation helper from pci_host_common_probe()
  dt-bindings: pci: apple,pcie: Add t6020 compatible string
  PCI: apple: Set only available ports up

11 days agoMerge branch 'pci/endpoint'
Bjorn Helgaas [Wed, 4 Jun 2025 15:50:03 +0000 (10:50 -0500)]
Merge branch 'pci/endpoint'

- For fixed-size BARs, retain both the actual size and the possibly larger
  size allocated to accommodate iATU alignment requirements (Jerome Brunet)

- Simplify ctrl/SPAD space allocation and avoid allocating more space than
  needed (Jerome Brunet)

- Correct MSI-X PBA offset calculations for DesignWare and Cadence endpoint
  controllers (Niklas Cassel)

- Align the return value (number of interrupts) encoding for
  pci_epc_get_msi()/pci_epc_ops::get_msi() and
  pci_epc_get_msix()/pci_epc_ops::get_msix() (Niklas Cassel)

- Align the nr_irqs parameter encoding for
  pci_epc_set_msi()/pci_epc_ops::set_msi() and
  pci_epc_set_msix()/pci_epc_ops::set_msix() (Niklas Cassel)

* pci/endpoint:
  PCI: endpoint: Align pci_epc_set_msix(), pci_epc_ops::set_msix() nr_irqs encoding
  PCI: endpoint: Align pci_epc_set_msi(), pci_epc_ops::set_msi() nr_irqs encoding
  PCI: endpoint: Align pci_epc_get_msix(), pci_epc_ops::get_msix() return value encoding
  PCI: endpoint: Align pci_epc_get_msi(), pci_epc_ops::get_msi() return value encoding
  PCI: cadence-ep: Correct PBA offset in .set_msix() callback
  PCI: dwc: ep: Correct PBA offset in .set_msix() callback
  PCI: endpoint: pci-epf-vntb: Simplify ctrl/SPAD space allocation
  PCI: endpoint: Retain fixed-size BAR size as well as aligned size

11 days agoMerge branch 'pci/virtualization'
Bjorn Helgaas [Wed, 4 Jun 2025 15:50:03 +0000 (10:50 -0500)]
Merge branch 'pci/virtualization'

- Add an ACS quirk for Loongson Root Ports that don't advertise ACS but
  don't allow peer-to-peer transactions between Root Ports; the quirk
  allows each Root Port to be in a separate IOMMU group (Huacai Chen)

* pci/virtualization:
  PCI: Add ACS quirk for Loongson PCIe

11 days agoMerge branch 'pci/reset'
Bjorn Helgaas [Wed, 4 Jun 2025 15:50:02 +0000 (10:50 -0500)]
Merge branch 'pci/reset'

- Fix locking issue in the slot reset path (Ilpo Järvinen)

* pci/reset:
  PCI: Fix lock symmetry in pci_slot_unlock()

11 days agoMerge branch 'pci/pwrctrl'
Bjorn Helgaas [Wed, 4 Jun 2025 15:50:02 +0000 (10:50 -0500)]
Merge branch 'pci/pwrctrl'

- Rename pwrctrl Kconfig symbols from 'PWRCTL' to 'PWRCTRL' to match the
  filename paths.  Retain old deprecated symbols for compatibility, except
  for the pwrctrl slot driver (PCI_PWRCTRL_SLOT) (Johan Hovold)

- When unregistering pwrctrl, cancel outstanding rescan work before
  cleaning up data structures to avoid use-after-free issues (Brian Norris)

* pci/pwrctrl:
  arm64: Kconfig: switch to HAVE_PWRCTRL
  wifi: ath12k: switch to PCI_PWRCTRL_PWRSEQ
  wifi: ath11k: switch to PCI_PWRCTRL_PWRSEQ
  PCI/pwrctrl: Rename pwrctrl Kconfig symbols and slot module
  PCI/pwrctrl: Cancel outstanding rescan work when unregistering

11 days agoMerge branch 'pci/pm'
Bjorn Helgaas [Wed, 4 Jun 2025 15:50:01 +0000 (10:50 -0500)]
Merge branch 'pci/pm'

- Add pm_runtime_put() cleanup helper for use with __free() to
  automatically drop the device usage count when a pointer goes out of
  scope (Alex Williamson)

- Increment PM usage counter when probing reset methods so we don't try to
  read config space of a powered-off device (Alex Williamson)

- Set all devices to D0 during enumeration to ensure ACPI opregion is
  connected via _REG (Mario Limonciello)

* pci/pm:
  PCI: Explicitly put devices into D0 when initializing
  PCI: Increment PM usage counter when probing reset methods
  PM: runtime: Define pm_runtime_put cleanup helper

11 days agoMerge branch 'pci/pci-acpi'
Bjorn Helgaas [Wed, 4 Jun 2025 15:50:00 +0000 (10:50 -0500)]
Merge branch 'pci/pci-acpi'

- Fix pci_acpi_scan_root() memory leak when we fail to create a PCI bus
  (Zhe Qiao)

* pci/pci-acpi:
  PCI/ACPI: Fix allocated memory release on error in pci_acpi_scan_root()

11 days agoMerge branch 'pci/irq'
Bjorn Helgaas [Wed, 4 Jun 2025 15:50:00 +0000 (10:50 -0500)]
Merge branch 'pci/irq'

- Use of_fwnode_handle() so of_node_to_fwnode() can be removed (Jiri Slaby)

* pci/irq:
  irqdomain: pci: Switch to of_fwnode_handle()

11 days agoMerge branch 'pci/hotplug'
Bjorn Helgaas [Wed, 4 Jun 2025 15:49:59 +0000 (10:49 -0500)]
Merge branch 'pci/hotplug'

- Ignore Presence Detect Changed caused by DPC.  pciehp already ignores
  Link Down/Up events caused by DPC, but on slots using in-band presence
  detect, DPC causes a spurious Presence Detect Changed event (Lukas
  Wunner)

- Ignore Link Down/Up caused by Secondary Bus Reset.  On hotplug ports
  using in-band presence detect, the reset causes a Presence Detect Changed
  event, which mistakenly caused teardown and re-enumeration of the device.
  Drivers may need to annotate code that resets their device (Lukas Wunner)

* pci/hotplug:
  PCI: hotplug: Drop superfluous #include directives
  PCI: pciehp: Ignore Link Down/Up caused by Secondary Bus Reset
  PCI: pciehp: Ignore Presence Detect Changed caused by DPC

# Conflicts:
# drivers/pci/pci.h

11 days agoMerge branch 'pci/enumeration'
Bjorn Helgaas [Wed, 4 Jun 2025 15:49:56 +0000 (10:49 -0500)]
Merge branch 'pci/enumeration'

- Remove pci_fixup_cardbus(), which has no users left (Heiner Kallweit)

- Print the actual delay time in pci_bridge_wait_for_secondary_bus()
  instead of assuming it was 1000ms (Wilfred Mallawa)

- Revert 'iommu/amd: Prevent binding other PCI drivers to IOMMU PCI
  devices', which broke resume from system sleep on AMD platforms and has
  been fixed by other commits (Lukas Wunner)

- Restrict visibility of pci_dev.match_driver since it's no longer used
  outside the PCI core (Lukas Wunner)

* pci/enumeration:
  PCI: Limit visibility of match_driver flag to PCI core
  Revert "iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices"
  PCI: Print the actual delay time in pci_bridge_wait_for_secondary_bus()
  PCI: Use PCI_STD_NUM_BARS instead of 6
  PCI: Remove pci_fixup_cardbus()

# Conflicts:
# drivers/pci/pci.h

11 days agoMerge branch 'pci/devres'
Bjorn Helgaas [Wed, 4 Jun 2025 15:49:50 +0000 (10:49 -0500)]
Merge branch 'pci/devres'

- Remove mtip32xx use of pcim_iounmap_regions(), which is deprecated and
  unnecessary (Philipp Stanner)

- Remove pcim_iounmap_regions() and pcim_request_region_exclusive() and
  related flags since all uses have been removed (Philipp Stanner)

- Rework devres 'request' functions so they are no longer 'hybrid', i.e.,
  their behavior no longer depends on whether pcim_enable_device or
  pci_enable_device() was used, and remove related code (Philipp Stanner)

* pci/devres:
  PCI: Remove function pcim_intx() prototype from pci.h
  PCI: Remove hybrid-devres usage warnings from kernel-doc
  PCI: Remove redundant set of request functions
  PCI: Remove exclusive requests flags from _pcim_request_region()
  PCI: Remove pcim_request_region_exclusive()
  Documentation/driver-api: Update pcim_enable_device()
  PCI: Remove hybrid devres nature from request functions
  PCI: Remove pcim_iounmap_regions()
  mtip32xx: Remove unnecessary pcim_iounmap_regions() calls

11 days agoMerge branch 'pci/bwctrl'
Bjorn Helgaas [Wed, 4 Jun 2025 15:49:49 +0000 (10:49 -0500)]
Merge branch 'pci/bwctrl'

- Simplify link bandwidth controller by replacing the count of Link
  Bandwidth Management Status (LBMS) events with a PCI_LINK_LBMS_SEEN flag
  (Ilpo Järvinen)

- Update the Link Speed after retraining, since the Link Speed may have
  changed (Ilpo Järvinen)

* pci/bwctrl:
  PCI: Update Link Speed after retraining
  PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag

11 days agoMerge branch 'pci/aer'
Bjorn Helgaas [Wed, 4 Jun 2025 15:49:49 +0000 (10:49 -0500)]
Merge branch 'pci/aer'

- Initialize struct aer_err_info before using it to avoid depending on
  stack garbage (Bjorn Helgaas)

- Log the DPC Error Source ID only when it's actually valid (when ERR_FATAL
  or ERR_NONFATAL was received from a downstream device) and decode into
  bus/device/function (Bjorn Helgaas)

- Consolidate AER Error Source ID in one place for message consistency
  (Bjorn Helgaas)

- Update statistics and emit trace events early in AER logging paths,
  before any potential ratelimiting (Bjorn Helgaas)

- Determine AER log level once and save it so all related messages use the
  same level (Karolina Stolarek)

- Use KERN_WARNING, not KERN_ERR, when logging PCIe Correctable Errors.

- Ratelimit PCIe Correctable and Non-Fatal error logging, with sysfs
  controls on interval and burst count, to avoid flooding logs and RCU
  stall warnings (Jon Pan-Doh)

* pci/aer:
  PCI/ERR: Remove misleading TODO regarding kernel panic
  PCI/AER: Add sysfs attributes for log ratelimits
  PCI/AER: Add ratelimits to PCI AER Documentation
  PCI/AER: Ratelimit correctable and non-fatal error logging
  PCI/AER: Simplify add_error_device()
  PCI/AER: Convert aer_get_device_error_info(), aer_print_error() to index
  PCI/AER: Rename struct aer_stats to aer_info
  PCI/AER: Reduce pci_print_aer() correctable error level to KERN_WARNING
  PCI/ERR: Add printk level to pcie_print_tlp_log()
  PCI/AER: Check log level once and remember it
  PCI/AER: Trace error event before ratelimiting
  PCI/AER: Update statistics before ratelimiting
  PCI/AER: Simplify pci_print_aer()
  PCI/AER: Initialize aer_err_info before using it
  PCI/AER: Move aer_print_source() earlier in file
  PCI/AER: Rename aer_print_port_info() to aer_print_source()
  PCI/AER: Extract bus/dev/fn in aer_print_port_info() with PCI_BUS_NUM(), etc
  PCI/AER: Consolidate Error Source ID logging in aer_isr_one_error_type()
  PCI/AER: Factor COR/UNCOR error handling out from aer_isr_one_error()
  PCI/DPC: Log Error Source ID only when valid
  PCI/DPC: Initialize aer_err_info before using it

13 days agoPCI: j721e: Fix host/endpoint dependencies
Arnd Bergmann [Wed, 23 Apr 2025 16:25:16 +0000 (18:25 +0200)]
PCI: j721e: Fix host/endpoint dependencies

The j721e driver has a single platform driver that can be built-in or a
loadable module, but it calls two separate backend drivers depending on
whether it is a host or endpoint.

If the two modes are not the same, we can end up with a situation where the
built-in pci-j721e driver tries to call the modular host or endpoint
driver, which causes a link failure:

  ld.lld-21: error: undefined symbol: cdns_pcie_ep_setup
  >>> referenced by pci-j721e.c
  >>>               drivers/pci/controller/cadence/pci-j721e.o:(j721e_pcie_probe) in archive vmlinux.a

  ld.lld-21: error: undefined symbol: cdns_pcie_host_setup
  >>> referenced by pci-j721e.c
  >>>               drivers/pci/controller/cadence/pci-j721e.o:(j721e_pcie_probe) in archive vmlinux.a

Rework the dependencies so that the 'select' is done by the common Kconfig
symbol, based on which of the two are enabled. Effectively this means that
having one built-in makes the other either built-in or disabled, but all
configurations will now build.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://patch.msgid.link/20250423162523.2060405-1-arnd@kernel.org
13 days agoPCI: j721e: Add support to build as a loadable module
Siddharth Vadapalli [Thu, 17 Apr 2025 12:44:08 +0000 (18:14 +0530)]
PCI: j721e: Add support to build as a loadable module

The 'pci-j721e.c' driver is the application/glue/wrapper driver for the
Cadence PCIe Controllers on TI SoCs. Implement support for building it as a
loadable module.

Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://patch.msgid.link/20250417124408.2752248-5-s-vadapalli@ti.com
13 days agoPCI: cadence-ep: Introduce cdns_pcie_ep_disable() helper for cleanup
Siddharth Vadapalli [Thu, 17 Apr 2025 12:44:07 +0000 (18:14 +0530)]
PCI: cadence-ep: Introduce cdns_pcie_ep_disable() helper for cleanup

Introduce the helper function cdns_pcie_ep_disable() which will undo the
configuration performed by cdns_pcie_ep_setup(). Also, export it for use
by the existing callers of cdns_pcie_ep_setup(), thereby allowing them
to cleanup on their exit path.

Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://patch.msgid.link/20250417124408.2752248-4-s-vadapalli@ti.com
13 days agoPCI: cadence-host: Introduce cdns_pcie_host_disable() helper for cleanup
Siddharth Vadapalli [Thu, 17 Apr 2025 12:44:06 +0000 (18:14 +0530)]
PCI: cadence-host: Introduce cdns_pcie_host_disable() helper for cleanup

Introduce the helper function cdns_pcie_host_disable() which will undo
the configuration performed by cdns_pcie_host_setup(). Also, export it
for use by existing callers of cdns_pcie_host_setup(), thereby allowing
them to cleanup on their exit path.

Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://patch.msgid.link/20250417124408.2752248-3-s-vadapalli@ti.com
13 days agoPCI: cadence: Add support to build pcie-cadence library as a kernel module
Kishon Vijay Abraham I [Thu, 17 Apr 2025 12:44:05 +0000 (18:14 +0530)]
PCI: cadence: Add support to build pcie-cadence library as a kernel module

Currently, the Cadence PCIe controller driver can be built as a built-in
module only. Since PCIe functionality is not a necessity for booting, add
support to build the Cadence PCIe driver as a loadable module as well.

Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://patch.msgid.link/20250417124408.2752248-2-s-vadapalli@ti.com
2 weeks agoPCI/ERR: Remove misleading TODO regarding kernel panic
Manivannan Sadhasivam [Thu, 8 May 2025 07:10:30 +0000 (12:40 +0530)]
PCI/ERR: Remove misleading TODO regarding kernel panic

A PCI device is just another peripheral in a system. So failure to
recover it, must not result in a kernel panic. So remove the TODO which
is quite misleading.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Link: https://patch.msgid.link/20250508-pcie-reset-slot-v4-1-7050093e2b50@linaro.org
2 weeks agoPCI: cadence: Remove duplicate message code definitions
Hans Zhang [Tue, 1 Apr 2025 14:50:23 +0000 (22:50 +0800)]
PCI: cadence: Remove duplicate message code definitions

The Cadence PCIe controller driver defines message codes in enum
cdns_pcie_msg_code duplicating the existing PCIE_MSG_CODE_* definitions in
drivers/pci/pci.h. The driver only uses ASSERT_INTA and DEASSERT_INTA codes
from this enum.

Remove the redundant Cadence-specific enum definitions and use the ones
available in drivers/pci/pci.h. This helps in avoiding code duplication,
maintaining consistency with the spec, and simplifying the code
maintenance.

Signed-off-by: Hans Zhang <18255117159@163.com>
[mani: commit message rewording]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250401145023.22948-1-18255117159@163.com
2 weeks agoPCI: endpoint: Align pci_epc_set_msix(), pci_epc_ops::set_msix() nr_irqs encoding
Niklas Cassel [Wed, 14 May 2025 07:43:19 +0000 (09:43 +0200)]
PCI: endpoint: Align pci_epc_set_msix(), pci_epc_ops::set_msix() nr_irqs encoding

The kdoc for pci_epc_set_msix() says:
"Invoke to set the required number of MSI-X interrupts."

The kdoc for the callback pci_epc_ops->set_msix() says:
"ops to set the requested number of MSI-X interrupts in the MSI-X
capability register"

pci_epc_ops::set_msix() does however expect the parameter 'interrupts' to
be in the encoding as defined by the Table Size field. Nowhere in the
kdoc does it say that the number of interrupts should be in Table Size
encoding.

It is very confusing that the API pci_epc_set_msix() and the callback
function pci_epc_ops::set_msix() both take a parameter named 'interrupts',
but they expect completely different encodings.

Clean up the API and the callback function to have the same semantics,
i.e. the parameter represents the number of interrupts, regardless of the
internal encoding of that value.

Also rename the parameter 'interrupts' to 'nr_irqs', in both the wrapper
function and the callback function, such that the name is unambiguous.

[bhelgaas: more specific subject]

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable+noautosel@kernel.org # this is simply a cleanup
Link: https://patch.msgid.link/20250514074313.283156-14-cassel@kernel.org
2 weeks agoPCI: endpoint: Align pci_epc_set_msi(), pci_epc_ops::set_msi() nr_irqs encoding
Niklas Cassel [Wed, 14 May 2025 07:43:18 +0000 (09:43 +0200)]
PCI: endpoint: Align pci_epc_set_msi(), pci_epc_ops::set_msi() nr_irqs encoding

The kdoc for pci_epc_set_msi() says:
"Invoke to set the required number of MSI interrupts."

The kdoc for the callback pci_epc_ops::set_msi() says:
"ops to set the requested number of MSI interrupts in the MSI capability
register"

pci_epc_ops::set_msi() does however expect the parameter 'interrupts' to be
in the encoding as defined by the Multiple Message Capable (MMC) field of
the MSI capability structure. Nowhere in the kdoc does it say that the
number of interrupts should be in MMC encoding.

It is very confusing that the API pci_epc_set_msi() and the callback
function pci_epc_ops::set_msi() both take a parameter named 'interrupts',
but they expect completely different encodings.

Clean up the API and the callback function to have the same semantics,
i.e. the parameter represents the number of interrupts, regardless of the
internal encoding of that value.

Also rename the parameter 'interrupts' to 'nr_irqs', in both the wrapper
function and the callback function, such that the name is unambiguous.

[bhelgaas: more specific subject]

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable+noautosel@kernel.org # this is simply a cleanup
Link: https://patch.msgid.link/20250514074313.283156-13-cassel@kernel.org
2 weeks agoPCI: endpoint: Align pci_epc_get_msix(), pci_epc_ops::get_msix() return value encoding
Niklas Cassel [Wed, 14 May 2025 07:43:17 +0000 (09:43 +0200)]
PCI: endpoint: Align pci_epc_get_msix(), pci_epc_ops::get_msix() return value encoding

The kdoc for pci_epc_get_msix() says:
"Invoke to get the number of MSI-X interrupts allocated by the RC"

The kdoc for the callback pci_epc_ops->get_msix() says:
"ops to get the number of MSI-X interrupts allocated by the RC from the
MSI-X capability register"

pci_epc_ops::get_msix() does however return the number of interrupts in the
encoding as defined by the Table Size field. Nowhere in the kdoc does it
say that the returned number of interrupts is in Table Size encoding.

It is very confusing that the API pci_epc_get_msix() and the callback
function pci_epc_ops::get_msix() don't return the same value.

Clean up the API and the callback function to have the same semantics,
i.e. return the number of interrupts, regardless of the internal encoding
of that value.

[bhelgaas: more specific subject]

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: stable+noautosel@kernel.org # this is simply a cleanup
Link: https://patch.msgid.link/20250514074313.283156-12-cassel@kernel.org
2 weeks agoPCI: endpoint: Align pci_epc_get_msi(), pci_epc_ops::get_msi() return value encoding
Niklas Cassel [Wed, 14 May 2025 07:43:16 +0000 (09:43 +0200)]
PCI: endpoint: Align pci_epc_get_msi(), pci_epc_ops::get_msi() return value encoding

The kdoc for API pci_epc_get_msi() says:
"Invoke to get the number of MSI interrupts allocated by the RC"

The kdoc for the callback pci_epc_ops::get_msi() says:
"ops to get the number of MSI interrupts allocated by the RC from
the MSI capability register"

pci_epc_ops::get_msi() does however return the number of interrupts in the
encoding as defined by the Multiple Message Enable (MME) field of the MSI
Capability structure.

Nowhere in the kdoc does it say that the returned number of interrupts is
in MME encoding. It is very confusing that the API pci_epc_get_msi() and
the callback function pci_epc_ops::get_msi() don't return the same value.

Clean up the API and the callback function to have the same semantics,
i.e. return the number of interrupts, regardless of the internal encoding
of that value.

[bhelgaas: more specific subject]

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: stable+noautosel@kernel.org # this is simply a cleanup
Link: https://patch.msgid.link/20250514074313.283156-11-cassel@kernel.org
2 weeks agoPCI: cadence-ep: Correct PBA offset in .set_msix() callback
Niklas Cassel [Wed, 14 May 2025 07:43:15 +0000 (09:43 +0200)]
PCI: cadence-ep: Correct PBA offset in .set_msix() callback

While cdns_pcie_ep_set_msix() writes the Table Size field correctly (N-1),
the calculation of the PBA offset is wrong because it calculates space for
(N-1) entries instead of N.

This results in the following QEMU error when using PCI passthrough on a
device which relies on the PCI endpoint subsystem:

  failed to add PCI capability 0x11[0x50]@0xb0: table & pba overlap, or they don't fit in BARs, or don't align

Fix the calculation of PBA offset in the MSI-X capability.

[bhelgaas: more specific subject and commit log]

Fixes: 3ef5d16f50f8 ("PCI: cadence: Add MSI-X support to Endpoint driver")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250514074313.283156-10-cassel@kernel.org
2 weeks agoPCI: dwc: ep: Correct PBA offset in .set_msix() callback
Niklas Cassel [Wed, 14 May 2025 07:43:14 +0000 (09:43 +0200)]
PCI: dwc: ep: Correct PBA offset in .set_msix() callback

While dw_pcie_ep_set_msix() writes the Table Size field correctly (N-1),
the calculation of the PBA offset is wrong because it calculates space for
(N-1) entries instead of N.

This results in the following QEMU error when using PCI passthrough on a
device which relies on the PCI endpoint subsystem:

  failed to add PCI capability 0x11[0x50]@0xb0: table & pba overlap, or they don't fit in BARs, or don't align

Fix the calculation of PBA offset in the MSI-X capability.

[bhelgaas: more specific subject and commit log]

Fixes: 83153d9f36e2 ("PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250514074313.283156-9-cassel@kernel.org
2 weeks agoPCI: endpoint: pci-epf-vntb: Simplify ctrl/SPAD space allocation
Jerome Brunet [Thu, 24 Apr 2025 08:34:05 +0000 (10:34 +0200)]
PCI: endpoint: pci-epf-vntb: Simplify ctrl/SPAD space allocation

When allocating the shared ctrl/SPAD space, epf_ntb_config_spad_bar_alloc()
should not try to handle the size quirks for underlying BAR, whether it is
fixed size or alignment. This is already handled by pci_epf_alloc_space().

Also, when handling the alignment, this allocates more space than
necessary. For example, with a SPAD size of 1024B and a ctrl size of 308B,
the space necessary is 1332B. If the alignment is 1MB,
epf_ntb_config_spad_bar_alloc() tries to allocate 2MB where 1MB would have
been more than enough.

Drop the handling of the BAR size quirks and let pci_epf_alloc_space()
handle that. Just make sure the 32bits SPAD register are aligned on 32bits.

Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250424-pci-ep-size-alignment-v5-2-2d4ec2af23f5@baylibre.com
2 weeks agoPCI: endpoint: Retain fixed-size BAR size as well as aligned size
Jerome Brunet [Thu, 24 Apr 2025 08:34:04 +0000 (10:34 +0200)]
PCI: endpoint: Retain fixed-size BAR size as well as aligned size

When allocating space for an endpoint function on a BAR with a fixed size,
the size saved in 'struct pci_epf_bar.size' should be the fixed size as
expected by pci_epc_set_bar().

However, if pci_epf_alloc_space() increased the allocation size to
accommodate iATU alignment requirements, it previously saved the larger
aligned size in .size, which broke pci_epc_set_bar().

To solve this, keep the fixed BAR size in .size and save the aligned size
in a new .aligned_size for use when deallocating it.

Fixes: 2a9a801620ef ("PCI: endpoint: Add support to specify alignment for buffers allocated to BARs")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
[mani: commit message fixup]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: more specific subject, commit log, wrap comment to match file]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250424-pci-ep-size-alignment-v5-1-2d4ec2af23f5@baylibre.com
3 weeks agoarm64: Kconfig: switch to HAVE_PWRCTRL
Johan Hovold [Wed, 2 Apr 2025 13:26:34 +0000 (15:26 +0200)]
arm64: Kconfig: switch to HAVE_PWRCTRL

The HAVE_PWRCTRL symbol has been renamed to reflect the pwrctrl framework
name. Switch to the non-deprecated symbol.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://patch.msgid.link/20250402132634.18065-5-johan+linaro@kernel.org
3 weeks agowifi: ath12k: switch to PCI_PWRCTRL_PWRSEQ
Johan Hovold [Wed, 2 Apr 2025 13:26:33 +0000 (15:26 +0200)]
wifi: ath12k: switch to PCI_PWRCTRL_PWRSEQ

The PCI_PWRCTRL_PWRSEQ and HAVE_PWRCTRL symbols have been renamed to
reflect the pwrctrl framework name. Switch to the non-deprecated symbols.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jeff Johnson <jjohnson@kernel.org> # drivers/net/wireless/ath/...
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://patch.msgid.link/20250402132634.18065-4-johan+linaro@kernel.org
3 weeks agowifi: ath11k: switch to PCI_PWRCTRL_PWRSEQ
Johan Hovold [Wed, 2 Apr 2025 13:26:32 +0000 (15:26 +0200)]
wifi: ath11k: switch to PCI_PWRCTRL_PWRSEQ

The PCI_PWRCTRL_PWRSEQ and HAVE_PWRCTRL symbols have been renamed to
reflect the pwrctrl framework name. Switch to the non-deprecated symbols.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jeff Johnson <jjohnson@kernel.org> # drivers/net/wireless/ath/...
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://patch.msgid.link/20250402132634.18065-3-johan+linaro@kernel.org
3 weeks agoPCI/pwrctrl: Rename pwrctrl Kconfig symbols and slot module
Johan Hovold [Wed, 2 Apr 2025 13:26:31 +0000 (15:26 +0200)]
PCI/pwrctrl: Rename pwrctrl Kconfig symbols and slot module

Commits b88cbaaa6fa1 ("PCI/pwrctrl: Rename pwrctl files to pwrctrl") and
3f925cd62874 ("PCI/pwrctrl: Rename pwrctrl functions and structures")
renamed the "pwrctl" framework to "pwrctrl" for consistency reasons.

Rename also the Kconfig symbols so that they reflect the new name while
adding entries for the deprecated ones. The old symbols can be removed once
everything that depends on them has been updated.

Note that no deprecated symbol is added for the new slot driver to avoid
having to add a user visible option.

Rename the new slot module to reflect the framework name and match the
other pwrctrl modules.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://patch.msgid.link/20250402132634.18065-2-johan+linaro@kernel.org
3 weeks agoPCI/pwrctrl: Cancel outstanding rescan work when unregistering
Brian Norris [Wed, 9 Apr 2025 18:53:13 +0000 (11:53 -0700)]
PCI/pwrctrl: Cancel outstanding rescan work when unregistering

It's possible to trigger use-after-free here by:

  (a) forcing rescan_work_func() to take a long time and
  (b) utilizing a pwrctrl driver that may be unloaded for some reason

Cancel outstanding work to ensure it is finished before we allow our data
structures to be cleaned up.

[bhelgaas: tidy commit log]
Fixes: 8f62819aaace ("PCI/pwrctl: Rescan bus on a separate thread")
Signed-off-by: Brian Norris <briannorris@google.com>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Cc: Konrad Dybcio <konradybcio@kernel.org>
Link: https://patch.msgid.link/20250409115313.1.Ia319526ed4ef06bec3180378c9a008340cec9658@changeid
3 weeks agoPCI/AER: Add sysfs attributes for log ratelimits
Jon Pan-Doh [Thu, 22 May 2025 23:21:26 +0000 (18:21 -0500)]
PCI/AER: Add sysfs attributes for log ratelimits

Allow userspace to read/write log ratelimits per device (including
enable/disable). Create aer/ sysfs directory to store them and any
future AER configs.

The new sysfs files are:

  /sys/bus/pci/devices/*/aer/correctable_ratelimit_burst
  /sys/bus/pci/devices/*/aer/correctable_ratelimit_interval_ms
  /sys/bus/pci/devices/*/aer/nonfatal_ratelimit_burst
  /sys/bus/pci/devices/*/aer/nonfatal_ratelimit_interval_ms

The default values are ratelimit_burst=10, ratelimit_interval_ms=5000, so
if we try to emit more than 10 messages in a 5 second period, some are
suppressed.

Update AER sysfs ABI filename to reflect the broader scope of AER sysfs
attributes (e.g. stats and ratelimits).

  Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats ->
    sysfs-bus-pci-devices-aer

Tested using aer-inject[1]. Configured correctable log ratelimit to 5.
Sent 6 AER errors. Observed 5 errors logged while AER stats
(cat /sys/bus/pci/devices/<dev>/aer_dev_correctable) shows 6.

Disabled ratelimiting and sent 6 more AER errors. Observed all 6 errors
logged and accounted in AER stats (12 total errors).

[1] https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git

[bhelgaas: note fatal errors are not ratelimited, "aer_report" ->
"aer_info", replace ratelimit_log_enable toggle with *_ratelimit_interval_ms]

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Jon Pan-Doh <pandoh@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-21-helgaas@kernel.org
3 weeks agoPCI/AER: Add ratelimits to PCI AER Documentation
Jon Pan-Doh [Thu, 22 May 2025 23:21:25 +0000 (18:21 -0500)]
PCI/AER: Add ratelimits to PCI AER Documentation

Add ratelimits section for rationale and defaults.

[bhelgaas: note fatal errors are not ratelimited]

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Jon Pan-Doh <pandoh@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://patch.msgid.link/20250522232339.1525671-20-helgaas@kernel.org
3 weeks agoPCI/AER: Ratelimit correctable and non-fatal error logging
Jon Pan-Doh [Thu, 22 May 2025 23:21:24 +0000 (18:21 -0500)]
PCI/AER: Ratelimit correctable and non-fatal error logging

Spammy devices can flood kernel logs with AER errors and slow/stall
execution. Add per-device ratelimits for AER correctable and non-fatal
uncorrectable errors that use the kernel defaults (10 per 5s).  Logging of
fatal errors is not ratelimited.

There are two AER logging entry points:

  - aer_print_error() is used by DPC and native AER

  - pci_print_aer() is used by GHES and CXL

The native AER aer_print_error() case includes a loop that may log details
from multiple devices, which are ratelimited individually.  If we log
details for any device, we also log the Error Source ID from the Root Port
or RCEC.

If no such device details are found, we still log the Error Source from the
ERR_* Message, ratelimited by the Root Port or RCEC that received it.

The DPC aer_print_error() case is not ratelimited, since this only happens
for fatal errors.

The CXL pci_print_aer() case is ratelimited by the Error Source device.

The GHES pci_print_aer() case is via aer_recover_work_func(), which
searches for the Error Source device.  If the device is not found, there's
no per-device ratelimit, so we use a system-wide ratelimit that covers all
error types (correctable, non-fatal, and fatal).

Sargun at Meta reported internally that a flood of AER errors causes RCU
CPU stall warnings and CSD-lock warnings.

Tested using aer-inject[1]. Sent 11 AER errors. Observed 10 errors logged
while AER stats (cat /sys/bus/pci/devices/<dev>/aer_dev_correctable) show
true count of 11.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git

[bhelgaas: commit log, factor out trace_aer_event() and aer_print_rp_info()
changes to previous patches, enable Error Source logging if any downstream
detail will be printed, don't ratelimit fatal errors, "aer_report" ->
"aer_info", "cor_log_ratelimit" -> "correctable_ratelimit",
"uncor_log_ratelimit" -> "nonfatal_ratelimit"]

Reported-by: Sargun Dhillon <sargun@meta.com>
Signed-off-by: Jon Pan-Doh <pandoh@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-19-helgaas@kernel.org
3 weeks agoPCI/AER: Simplify add_error_device()
Bjorn Helgaas [Thu, 22 May 2025 23:21:23 +0000 (18:21 -0500)]
PCI/AER: Simplify add_error_device()

Return -ENOSPC error early so the usual path through add_error_device() is
the straightline code.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-18-helgaas@kernel.org
3 weeks agoPCI/AER: Convert aer_get_device_error_info(), aer_print_error() to index
Bjorn Helgaas [Thu, 22 May 2025 23:21:22 +0000 (18:21 -0500)]
PCI/AER: Convert aer_get_device_error_info(), aer_print_error() to index

Previously aer_get_device_error_info() and aer_print_error() took a pointer
to struct aer_err_info and a pointer to a pci_dev.  Typically the pci_dev
was one of the elements of the aer_err_info.dev[] array (DPC was an
exception, where the dev[] array was unused).

Convert aer_get_device_error_info() and aer_print_error() to take an index
into the aer_err_info.dev[] array instead.  A future patch will add
per-device ratelimit information, so the index makes it convenient to find
the ratelimit associated with the device.

To accommodate DPC, set info->dev[0] to the DPC port before using these
interfaces.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-17-helgaas@kernel.org
3 weeks agoPCI/AER: Rename struct aer_stats to aer_info
Karolina Stolarek [Thu, 22 May 2025 23:21:21 +0000 (18:21 -0500)]
PCI/AER: Rename struct aer_stats to aer_info

Update name to reflect the broader definition of structs/variables that are
stored (e.g. ratelimits). This is a preparatory patch for adding rate limit
support.

[bhelgaas: "aer_report" -> "aer_info"]

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-16-helgaas@kernel.org
3 weeks agoPCI/AER: Reduce pci_print_aer() correctable error level to KERN_WARNING
Karolina Stolarek [Thu, 22 May 2025 23:21:20 +0000 (18:21 -0500)]
PCI/AER: Reduce pci_print_aer() correctable error level to KERN_WARNING

Some existing logs in pci_print_aer() log with error severity by default.

Convert them to use KERN_WARNING for correctable errors and KERN_ERR for
uncorrectable errors.

[bhelgaas: commit log]

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-15-helgaas@kernel.org
3 weeks agoPCI/ERR: Add printk level to pcie_print_tlp_log()
Bjorn Helgaas [Thu, 22 May 2025 23:21:19 +0000 (18:21 -0500)]
PCI/ERR: Add printk level to pcie_print_tlp_log()

aer_print_error() produces output at a printk level (KERN_ERR/KERN_WARNING/
etc) that depends on the kind of error, and it calls pcie_print_tlp_log(),
which previously always produced output at KERN_ERR.

Add a "level" parameter so aer_print_error() can control the level of the
pcie_print_tlp_log() output to match.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-14-helgaas@kernel.org
3 weeks agoPCI/AER: Check log level once and remember it
Karolina Stolarek [Thu, 22 May 2025 23:21:18 +0000 (18:21 -0500)]
PCI/AER: Check log level once and remember it

When reporting an AER error, we check its type multiple times to determine
the log level for each message. Do this check only in the top-level
functions (aer_isr_one_error(), pci_print_aer()) and save the level in
struct aer_err_info.

[bhelgaas: save log level in struct aer_err_info instead of passing it
as a parameter]

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-13-helgaas@kernel.org
3 weeks agoPCI/AER: Trace error event before ratelimiting
Bjorn Helgaas [Thu, 22 May 2025 23:21:17 +0000 (18:21 -0500)]
PCI/AER: Trace error event before ratelimiting

As with the AER statistics, we always want to emit trace events, even if
the actual dmesg logging is rate limited.

Call trace_aer_event() immediately after pci_dev_aer_stats_incr() so both
happen before ratelimiting.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-12-helgaas@kernel.org
3 weeks agoPCI/AER: Update statistics before ratelimiting
Bjorn Helgaas [Thu, 22 May 2025 23:21:16 +0000 (18:21 -0500)]
PCI/AER: Update statistics before ratelimiting

There are two AER logging entry points:

  - aer_print_error() is used by DPC (dpc_process_error()) and native AER
    handling (aer_process_err_devices()).

  - pci_print_aer() is used by GHES (aer_recover_work_func()) and CXL
    (cxl_handle_rdport_errors())

Both use __aer_print_error() to print the AER error bits.  Previously
__aer_print_error() also incremented the AER statistics via
pci_dev_aer_stats_incr().

Call pci_dev_aer_stats_incr() early in the entry points instead of in
__aer_print_error() so we update the statistics even if the actual printing
of error bits is rate limited by a future change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-11-helgaas@kernel.org
3 weeks agoPCI/AER: Simplify pci_print_aer()
Bjorn Helgaas [Thu, 22 May 2025 23:21:15 +0000 (18:21 -0500)]
PCI/AER: Simplify pci_print_aer()

Simplify pci_print_aer() by initializing the struct aer_err_info "info"
with a designated initializer list (it was previously initialized with
memset()) and using pci_name().

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-10-helgaas@kernel.org
3 weeks agoPCI/AER: Initialize aer_err_info before using it
Bjorn Helgaas [Thu, 22 May 2025 23:21:14 +0000 (18:21 -0500)]
PCI/AER: Initialize aer_err_info before using it

Previously the struct aer_err_info "e_info" was allocated on the stack
without being initialized, so it contained junk except for the fields we
explicitly set later.

Initialize "e_info" at declaration with a designated initializer list,
which initializes the other members to zero.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-9-helgaas@kernel.org
3 weeks agoPCI/AER: Move aer_print_source() earlier in file
Bjorn Helgaas [Thu, 22 May 2025 23:21:13 +0000 (18:21 -0500)]
PCI/AER: Move aer_print_source() earlier in file

Move aer_print_source() earlier in the file so a future change can use it
from aer_print_error(), where it's easier to rate limit it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-8-helgaas@kernel.org
3 weeks agoPCI/AER: Rename aer_print_port_info() to aer_print_source()
Jon Pan-Doh [Thu, 22 May 2025 23:21:12 +0000 (18:21 -0500)]
PCI/AER: Rename aer_print_port_info() to aer_print_source()

Rename aer_print_port_info() to aer_print_source() to be more descriptive.
This logs the Error Source ID logged by a Root Port or Root Complex Event
Collector when it receives an ERR_COR, ERR_NONFATAL, or ERR_FATAL Message.

[bhelgaas: aer_print_rp_info() -> aer_print_source()]

Signed-off-by: Jon Pan-Doh <pandoh@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-7-helgaas@kernel.org
3 weeks agoPCI/AER: Extract bus/dev/fn in aer_print_port_info() with PCI_BUS_NUM(), etc
Bjorn Helgaas [Thu, 22 May 2025 23:21:11 +0000 (18:21 -0500)]
PCI/AER: Extract bus/dev/fn in aer_print_port_info() with PCI_BUS_NUM(), etc

Use PCI_BUS_NUM(), PCI_SLOT(), PCI_FUNC() to extract the bus number,
device, and function number directly from the Error Source ID.  There's no
need to shift and mask it explicitly.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-6-helgaas@kernel.org
3 weeks agoPCI/AER: Consolidate Error Source ID logging in aer_isr_one_error_type()
Bjorn Helgaas [Thu, 22 May 2025 23:21:10 +0000 (18:21 -0500)]
PCI/AER: Consolidate Error Source ID logging in aer_isr_one_error_type()

Previously we decoded the AER Error Source ID in aer_isr_one_error_type(),
then again in find_source_device() if we didn't find any devices with
errors logged in their AER Capabilities.

Consolidate this so we only decode and log the Error Source ID once in
aer_isr_one_error_type().  Add a "found" parameter so we can add a note
when we didn't find any downstream devices with errors logged in their AER
Capability.

This changes the dmesg logging when we found no devices with errors logged:

  - pci 0000:00:01.0: AER: Correctable error message received from 0000:02:00.0
  - pci 0000:00:01.0: AER: found no error details for 0000:02:00.0
  + pci 0000:00:01.0: AER: Correctable error message received from 0000:02:00.0 (no details found)

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-5-helgaas@kernel.org
3 weeks agoPCI/AER: Factor COR/UNCOR error handling out from aer_isr_one_error()
Bjorn Helgaas [Thu, 22 May 2025 23:21:09 +0000 (18:21 -0500)]
PCI/AER: Factor COR/UNCOR error handling out from aer_isr_one_error()

aer_isr_one_error() duplicates the Error Source ID logging and AER error
processing for Correctable Errors and Uncorrectable Errors.  Factor out the
duplicated code to aer_isr_one_error_type().

aer_isr_one_error() doesn't need the struct aer_rpc pointer, so pass it the
Root Port or RCEC pci_dev pointer instead.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-4-helgaas@kernel.org
3 weeks agoPCI/DPC: Log Error Source ID only when valid
Bjorn Helgaas [Thu, 22 May 2025 23:21:08 +0000 (18:21 -0500)]
PCI/DPC: Log Error Source ID only when valid

DPC Error Source ID is only valid when the DPC Trigger Reason indicates
that DPC was triggered due to reception of an ERR_NONFATAL or ERR_FATAL
Message (PCIe r6.0, sec 7.9.14.5).

When DPC was triggered by ERR_NONFATAL (PCI_EXP_DPC_STATUS_TRIGGER_RSN_NFE)
or ERR_FATAL (PCI_EXP_DPC_STATUS_TRIGGER_RSN_FE) from a downstream device,
log the Error Source ID (decoded into domain/bus/device/function).  Don't
print the source otherwise, since it's not valid.

For DPC trigger due to reception of ERR_NONFATAL or ERR_FATAL, the dmesg
logging changes:

  - pci 0000:00:01.0: DPC: containment event, status:0x000d source:0x0200
  - pci 0000:00:01.0: DPC: ERR_FATAL detected
  + pci 0000:00:01.0: DPC: containment event, status:0x000d, ERR_FATAL received from 0000:02:00.0

and when DPC triggered for other reasons, where DPC Error Source ID is
undefined, e.g., unmasked uncorrectable error:

  - pci 0000:00:01.0: DPC: containment event, status:0x0009 source:0x0200
  - pci 0000:00:01.0: DPC: unmasked uncorrectable error detected
  + pci 0000:00:01.0: DPC: containment event, status:0x0009: unmasked uncorrectable error detected

Previously the "containment event" message was at KERN_INFO and the
"%s detected" message was at KERN_WARNING.  Now the single message is at
KERN_WARNING.

Fixes: 26e515713342 ("PCI: Add Downstream Port Containment driver")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-3-helgaas@kernel.org
3 weeks agoPCI/DPC: Initialize aer_err_info before using it
Bjorn Helgaas [Thu, 22 May 2025 23:21:07 +0000 (18:21 -0500)]
PCI/DPC: Initialize aer_err_info before using it

Previously the struct aer_err_info "info" was allocated on the stack
without being initialized, so it contained junk except for the fields we
explicitly set later.

Initialize "info" at declaration so it starts as all zeros.

Fixes: 8aefa9b0d910 ("PCI/DPC: Print AER status in DPC event handling")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-2-helgaas@kernel.org
3 weeks agoPCI/ACPI: Fix allocated memory release on error in pci_acpi_scan_root()
Zhe Qiao [Wed, 30 Apr 2025 06:06:03 +0000 (14:06 +0800)]
PCI/ACPI: Fix allocated memory release on error in pci_acpi_scan_root()

In the pci_acpi_scan_root() function, when creating a PCI bus fails,
we need to free up the previously allocated memory, which can avoid
invalid memory usage and save resources.

Fixes: 789befdfa389 ("arm64: PCI: Migrate ACPI related functions to pci-acpi.c")
Signed-off-by: Zhe Qiao <qiaozhe@iscas.ac.cn>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20250430060603.381504-1-qiaozhe@iscas.ac.cn
3 weeks agoPCI: Remove function pcim_intx() prototype from pci.h
Philipp Stanner [Thu, 22 May 2025 08:46:27 +0000 (10:46 +0200)]
PCI: Remove function pcim_intx() prototype from pci.h

The subsystem-internal header pci.h still contains the function
prototype of pcim_intx(), which has since been made public in the global
header.

Remove the redundant function prototype.

Signed-off-by: Philipp Stanner <phasta@kernel.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20250522084626.150148-2-phasta@kernel.org
3 weeks agoPCI: Remove hybrid-devres usage warnings from kernel-doc
Philipp Stanner [Mon, 19 May 2025 11:30:00 +0000 (13:30 +0200)]
PCI: Remove hybrid-devres usage warnings from kernel-doc

pci/iomap.c still contains warnings about those functions not behaving
in a managed manner if pcim_enable_device() was called. Since all hybrid
behavior that users could know about has been removed by now, those
explicit warnings are no longer necessary.

Remove the hybrid-devres usage warnings from the kernel-doc.

Signed-off-by: Philipp Stanner <phasta@kernel.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lore.kernel.org/r/20250519112959.25487-8-phasta@kernel.org
3 weeks agoPCI: Remove redundant set of request functions
Philipp Stanner [Mon, 19 May 2025 11:29:59 +0000 (13:29 +0200)]
PCI: Remove redundant set of request functions

When the demangling of the hybrid devres functions within PCI was
implemented, it was necessary to implement several PCI functions a
second time to avoid cyclic calls, since the hybrid functions in pci.c
call the managed functions in devres.c, which in turn can be directly
used outside of PCI and needed request infrastructure, too.

Therefore, __pcim_request_region_range(), __pci_release_region_range()
and wrappers around them were implemented.

The hybrid nature has recently been removed from all functions in pci.c.
Therefore, the functions in devres.c can now directly use their
counterparts in pci.c without causing a call-cycle.

Remove __pcim_request_region_range(), __pcim_request_region_range() and
the wrappers. Use the corresponding request functions from pci.c in
devres.c

Signed-off-by: Philipp Stanner <phasta@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lore.kernel.org/r/20250519112959.25487-7-phasta@kernel.org
3 weeks agoPCI: Remove exclusive requests flags from _pcim_request_region()
Philipp Stanner [Mon, 19 May 2025 11:29:58 +0000 (13:29 +0200)]
PCI: Remove exclusive requests flags from _pcim_request_region()

pcim_request_region_exclusive(), the only user in PCI devres that needed
exclusive region requests, has been removed.

All features related to exclusive requests can, therefore, be removed,
too. Remove them.

Signed-off-by: Philipp Stanner <phasta@kernel.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20250519112959.25487-6-phasta@kernel.org
4 weeks agoPCI: Remove pcim_request_region_exclusive()
Philipp Stanner [Mon, 19 May 2025 11:29:57 +0000 (13:29 +0200)]
PCI: Remove pcim_request_region_exclusive()

pcim_request_region_exclusive() was only needed for redirecting the
relatively exotic exclusive request functions in pci.c in case of them
operating in managed mode.

The managed nature has been removed from those functions and no one else
uses pcim_request_region_exclusive().

Remove pcim_request_region_exclusive().

Signed-off-by: Philipp Stanner <phasta@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lore.kernel.org/r/20250519112959.25487-5-phasta@kernel.org
4 weeks agoDocumentation/driver-api: Update pcim_enable_device()
Philipp Stanner [Mon, 19 May 2025 11:29:56 +0000 (13:29 +0200)]
Documentation/driver-api: Update pcim_enable_device()

pcim_enable_device() is not related anymore to switching the mode of
operation of any functions. It merely sets up a devres callback for
automatically disabling the PCI device on driver detach.

Adjust the function's documentation.

Signed-off-by: Philipp Stanner <phasta@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20250519112959.25487-4-phasta@kernel.org
4 weeks agoPCI: Remove hybrid devres nature from request functions
Philipp Stanner [Mon, 19 May 2025 11:29:55 +0000 (13:29 +0200)]
PCI: Remove hybrid devres nature from request functions

All functions based on __pci_request_region() and its release counter
part support "hybrid mode", where the functions become managed if the
PCI device was enabled with pcim_enable_device().

Removing this undesirable feature requires to remove all users who
activated their device with that function and use one of the affected
request functions.

These users were:
ASoC
alsa
cardreader
cirrus
i2c
mmc
mtd
mtd
mxser
net
spi
vdpa
vmwgfx

all of which have been ported to always-managed pcim_ functions by now.

The hybrid nature can, thus, be removed from the aforementioned PCI
functions.

Remove all function guards and documentation in pci.c related to the
hybrid redirection. Adjust the visibility of pcim_release_region().

Signed-off-by: Philipp Stanner <phasta@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lore.kernel.org/r/20250519112959.25487-3-phasta@kernel.org
4 weeks agoPCI: Update Link Speed after retraining
Ilpo Järvinen [Wed, 14 May 2025 13:28:21 +0000 (16:28 +0300)]
PCI: Update Link Speed after retraining

PCIe Link Retraining can alter Link Speed. pcie_retrain_link() that
performs the Link Training is called from bwctrl and ASPM driver.

While bwctrl listens for Link Bandwidth Management Status (LBMS) to
pick up changes in Link Speed, there is a race between
pcie_reset_lbms() clearing LBMS after the Link Training and
pcie_bwnotif_irq() reading the Link Status register. If LBMS is already
cleared when the irq handler reads the register, the interrupt handler
will return early with IRQ_NONE and won't update the Link Speed.

When Link Speed update originates from bwctrl,
pcie_bwctrl_change_speed() ensures Link Speed is updated after the
retraining. ASPM driver, however, calls pcie_retrain_link() but does
not update the Link Speed after retraining which can result in stale
Link Speed. Also, it is possible to have ASPM support with
CONFIG_PCIEPORTBUS=n in which case bwctrl will not be built in (and
thus won't update the Link Speed at all).

To ensure Link Speed is not left stale after Link Training, move the
call to pcie_update_link_speed() from pcie_bwctrl_change_speed() into
pcie_retrain_link().

Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/linux-pci/aBCjpfyYmlkJ12AZ@wunner.de
Link: https://lore.kernel.org/r/20250514132821.15705-1-ilpo.jarvinen@linux.intel.com
4 weeks agoPCI: Limit visibility of match_driver flag to PCI core
Lukas Wunner [Fri, 25 Apr 2025 09:24:22 +0000 (11:24 +0200)]
PCI: Limit visibility of match_driver flag to PCI core

Since commit 58d9a38f6fac ("PCI: Skip attaching driver in device_add()"),
PCI enumeration is split into two steps:  In the first step, all devices
are published in sysfs with device_add().  In the second step, drivers are
bound to the devices with device_attach().  To delay driver binding until
the second step, a "bool match_driver" in struct pci_dev is used.

Instead of a bool, use a bit in the "unsigned long priv_flags" to shrink
struct pci_dev a little and prevent use of the bool outside the PCI core
(as has happened with commit cbbc00be2ce3 ("iommu/amd: Prevent binding
other PCI drivers to IOMMU PCI devices")).

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://patch.msgid.link/d22a9e5b81d6bd8dd1837607d6156679b3b1199c.1745572340.git.lukas@wunner.de
4 weeks agoRevert "iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices"
Lukas Wunner [Fri, 25 Apr 2025 09:24:21 +0000 (11:24 +0200)]
Revert "iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices"

Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and
pcibios_free_irq()") changed IRQ handling on PCI driver probing.
It inadvertently broke resume from system sleep on AMD platforms:

  https://lore.kernel.org/r/20150926164651.GA3640@pd.tnic/

This was fixed by two independent commits:

8affb487d4a4 ("x86/PCI: Don't alloc pcibios-irq when MSI is enabled")
cbbc00be2ce3 ("iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices")

The breaking change and one of these two fixes were subsequently reverted:

fe25d078874f ("Revert "x86/PCI: Don't alloc pcibios-irq when MSI is enabled"")
6c777e8799a9 ("Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()"")

This rendered the second fix unnecessary, so revert it as well.  It used
the match_driver flag in struct pci_dev, which is internal to the PCI core
and not supposed to be touched by arbitrary drivers.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://patch.msgid.link/9a3ddff5cc49512044f963ba0904347bd404094d.1745572340.git.lukas@wunner.de
4 weeks agoPCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag
Ilpo Järvinen [Tue, 22 Apr 2025 11:55:47 +0000 (14:55 +0300)]
PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag

PCIe BW controller counted LBMS assertions for the purposes of the Target
Speed quirk (pcie_failed_link_retrain()). It was also a plan to expose the
LBMS count through sysfs to allow better diagnosing link related issues.
Lukas Wunner suggested, however, that adding a trace event would be better
for diagnostics purposes, leaving only pcie_failed_link_retrain() as a user
of the lbms_count.

The logic in pcie_failed_link_retrain() does not require keeping count of
LBMS assertions, so replace lbms_count with a simple flag in pci_dev's
priv_flags.  The reduced complexity allows removing pcie_bwctrl_lbms_rwsem.

Since pcie_failed_link_retrain() runs before bwctrl is probed during boot,
the LBMS in Link Status register still has to be checked by the quirk.

The priv_flags numbering is not continuous because hotplug code added a few
flags to fill numbers 4-5 (hotplug and bwctrl changes are routed through in
different branches).

Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: squashed a fix to resolve build failures from
https://lore.kernel.org/all/20250508090036.1528-1-ilpo.jarvinen@linux.intel.com]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Link: https://patch.msgid.link/20250422115548.1483-1-ilpo.jarvinen@linux.intel.com
5 weeks agoPCI: Explicitly put devices into D0 when initializing
Mario Limonciello [Thu, 24 Apr 2025 04:31:32 +0000 (23:31 -0500)]
PCI: Explicitly put devices into D0 when initializing

AMD BIOS team has root caused an issue that NVMe storage failed to come
back from suspend to a lack of a call to _REG when NVMe device was probed.

112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states") added
support for calling _REG when transitioning D-states, but this only works
if the device actually "transitions" D-states.

967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
added support for runtime PM on PCI devices, but never actually
'explicitly' sets the device to D0.

To make sure that devices are in D0 and that platform methods such as
_REG are called, explicitly set all devices into D0 during initialization.

Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Denis Benato <benato.denis96@gmail.com>
Tested-By: Yijun Shen <Yijun_Shen@Dell.com>
Tested-By: David Perry <david.perry@amd.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://patch.msgid.link/20250424043232.1848107-1-superm1@kernel.org
5 weeks agoPCI: Fix lock symmetry in pci_slot_unlock()
Ilpo Järvinen [Mon, 5 May 2025 11:54:12 +0000 (14:54 +0300)]
PCI: Fix lock symmetry in pci_slot_unlock()

The commit a4e772898f8b ("PCI: Add missing bridge lock to pci_bus_lock()")
made the lock function to call depend on dev->subordinate but left
pci_slot_unlock() unmodified creating locking asymmetry compared with
pci_slot_lock().

Because of the asymmetric lock handling, the same bridge device is unlocked
twice. First pci_bus_unlock() unlocks bus->self and then pci_slot_unlock()
will unconditionally unlock the same bridge device.

Move pci_dev_unlock() inside an else branch to match the logic in
pci_slot_lock().

Fixes: a4e772898f8b ("PCI: Add missing bridge lock to pci_bus_lock()")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250505115412.37628-1-ilpo.jarvinen@linux.intel.com
7 weeks agoPCI: Increment PM usage counter when probing reset methods
Alex Williamson [Tue, 22 Apr 2025 23:05:32 +0000 (17:05 -0600)]
PCI: Increment PM usage counter when probing reset methods

We can get different results probing reset methods for a device depending
on its power state.  For example, reading the PM control register of a
device in D3cold will always indicate NoSoftRst+ because we get ~0 data
when the config read fails on PCI, preventing us from correctly probing PM
reset support.

Increment the PM usage counter before any probes and use the cleanup __free
facility to automatically drop the usage counter out of scope.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250422230534.2295291-3-alex.williamson@redhat.com
7 weeks agoPM: runtime: Define pm_runtime_put cleanup helper
Alex Williamson [Tue, 22 Apr 2025 23:05:31 +0000 (17:05 -0600)]
PM: runtime: Define pm_runtime_put cleanup helper

Define a cleanup helper for use with __free to automatically drop the
device usage count when the pointer goes out of scope.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://patch.msgid.link/20250422230534.2295291-2-alex.williamson@redhat.com
7 weeks agoPCI: apple: Add T602x PCIe support
Hector Martin [Tue, 1 Apr 2025 09:17:13 +0000 (10:17 +0100)]
PCI: apple: Add T602x PCIe support

This version of the hardware moved around a bunch of registers, so we
avoid the old compatible for these and introduce register offset
structures to handle the differences.

Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-14-maz@kernel.org
7 weeks agoPCI: apple: Abstract register offsets via a SoC-specific structure
Hector Martin [Tue, 1 Apr 2025 09:17:12 +0000 (10:17 +0100)]
PCI: apple: Abstract register offsets via a SoC-specific structure

Newer versions of the Apple PCIe block have a bunch of small, but
annoying differences.

In order to embrace this diversity of implementations, move the
currently hardcoded offsets into a hw_info structure. Future SoCs
will provide their own structure describing the applicable offsets.

Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
[maz: split from original patch to only address T8103]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-13-maz@kernel.org
7 weeks agoPCI: apple: Use gpiod_set_value_cansleep in probe flow
Hector Martin [Tue, 1 Apr 2025 09:17:11 +0000 (10:17 +0100)]
PCI: apple: Use gpiod_set_value_cansleep in probe flow

We're allowed to sleep here, so tell the GPIO core by using
gpiod_set_value_cansleep instead of gpiod_set_value.

Fixes: 1e33888fbe44 ("PCI: apple: Add initial hardware bring-up")
Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-12-maz@kernel.org
7 weeks agoPCI: apple: Drop poll for CORE_RC_PHYIF_STAT_REFCLK
Hector Martin [Tue, 1 Apr 2025 09:17:10 +0000 (10:17 +0100)]
PCI: apple: Drop poll for CORE_RC_PHYIF_STAT_REFCLK

This is checking a core refclk in per-port setup which doesn't make a
lot of sense, and the bootloader needs to have gone through this anyway.

It doesn't work on T602x, so just drop it across the board.

Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-11-maz@kernel.org
7 weeks agoPCI: apple: Move port PHY registers to their own reg items
Hector Martin [Tue, 1 Apr 2025 09:17:09 +0000 (10:17 +0100)]
PCI: apple: Move port PHY registers to their own reg items

T602x PCIe cores move these registers around. Instead of hardcoding in
another offset, let's move them into their own reg entries. This matches
what Apple does on macOS device trees too.

Maintains backwards compatibility with old DTs by using the old offsets.

Note that we open code devm_platform_ioremap_resource_byname() to avoid
error messages on older platforms with missing resources in the pcie
node. ("pcie-apple 590000000.pcie: invalid resource (null)" on probe)

Co-developed-by: Janne Grunau <j@jannau.net>
Signed-off-by: Janne Grunau <j@jannau.net>
Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-10-maz@kernel.org
7 weeks agoPCI: apple: Fix missing OF node reference in apple_pcie_setup_port
Hector Martin [Tue, 1 Apr 2025 09:17:08 +0000 (10:17 +0100)]
PCI: apple: Fix missing OF node reference in apple_pcie_setup_port

In the success path, we hang onto a reference to the node, so make sure
to grab one. The caller iterator puts our borrowed reference when we
return.

Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-9-maz@kernel.org
7 weeks agoPCI: apple: Move away from INTMSK{SET,CLR} for INTx and private interrupts
Marc Zyngier [Tue, 1 Apr 2025 09:17:07 +0000 (10:17 +0100)]
PCI: apple: Move away from INTMSK{SET,CLR} for INTx and private interrupts

T602x seems to have dropped the rather useful SET/CLR accessors
to the masking register.

Instead, let's use the mask register directly, and wrap it with
a brand new spinlock. No, this isn't moving in the right direction.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-8-maz@kernel.org
7 weeks agoPCI: apple: Dynamically allocate RID-to_SID bitmap
Marc Zyngier [Tue, 1 Apr 2025 09:17:06 +0000 (10:17 +0100)]
PCI: apple: Dynamically allocate RID-to_SID bitmap

As we move towards supporting SoCs with varying RID-to-SID mapping
capabilities, turn the static SID tracking bitmap into a dynamically
allocated one. The current allocation size is still the same, but
that's about to change.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-7-maz@kernel.org
7 weeks agoPCI: apple: Move over to standalone probing
Marc Zyngier [Tue, 1 Apr 2025 09:17:05 +0000 (10:17 +0100)]
PCI: apple: Move over to standalone probing

Now that we have the required infrastructure, split the Apple PCIe
setup into two categories:

- stuff that has to do with PCI setup stays in the .init() callback

- stuff that is just driver gunk (such as MSI setup) goes into a
  probe routine, which will eventually call into the host-common
  code

The result is a far more logical setup process.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-6-maz@kernel.org
7 weeks agoPCI: ecam: Allow cfg->priv to be pre-populated from the root port device
Marc Zyngier [Tue, 1 Apr 2025 09:17:04 +0000 (10:17 +0100)]
PCI: ecam: Allow cfg->priv to be pre-populated from the root port device

In order to decouple ecam config space creation from probing via
pci_host_common_probe(), allow the private pointer to be populated
via the device drvdata pointer.

Crucially, this is set before calling ops->init(), allowing that
particular callback to have access to probe data.

This should have no impact on existing code which ignores the
current value of cfg->priv.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-5-maz@kernel.org
7 weeks agoPCI: host-generic: Extract an ECAM bridge creation helper from pci_host_common_probe()
Marc Zyngier [Tue, 1 Apr 2025 09:17:03 +0000 (10:17 +0100)]
PCI: host-generic: Extract an ECAM bridge creation helper from pci_host_common_probe()

pci_host_common_probe() is an extremely useful helper, as it
abstracts away most of the gunk that a "mostly-ECAM-compliant"
device driver needs.

However, it is structured as a probe function, meaning that a lot
of the driver-specific setup has to happen in a .init() callback,
after the bridge and config space have been instantiated.

This is a bit awkward, and results in a number of convolutions
that could be avoided if the host-common code was more like
a library.

Introduce a pci_host_common_init() helper that does exactly that,
taking the platform device and a struct pci_ecam_op as parameters.

This can then be called from the probe routine, and a lot of the
code that isn't relevant to PCI setup moved away from the .init()
callback. This also removes the dependency on the device match
data, which is an oddity.

Signed-off-by: Marc Zyngier <maz@kernel.org>
[mani: fixed spelling mistakes]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-4-maz@kernel.org
7 weeks agoPCI: cadence: Fix runtime atomic count underflow
Hans Zhang [Sat, 19 Apr 2025 13:30:58 +0000 (21:30 +0800)]
PCI: cadence: Fix runtime atomic count underflow

If the call to pci_host_probe() in cdns_pcie_host_setup() fails, PM
runtime count is decremented in the error path using pm_runtime_put_sync().
But the runtime count is not incremented by this driver, but only by the
callers (cdns_plat_pcie_probe/j721e_pcie_probe). And the callers also
decrement the runtime PM count in their error path. So this leads to the
below warning from the PM core:

"runtime PM usage count underflow!"

So fix it by getting rid of pm_runtime_put_sync() in the error path and
directly return the errno.

Fixes: 49e427e6bdd1 ("Merge branch 'pci/host-probe-refactor'")
Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://patch.msgid.link/20250419133058.162048-1-18255117159@163.com
8 weeks agodt-bindings: pci: apple,pcie: Add t6020 compatible string
Alyssa Rosenzweig [Tue, 1 Apr 2025 09:17:02 +0000 (10:17 +0100)]
dt-bindings: pci: apple,pcie: Add t6020 compatible string

t6020 adds some register ranges compared to t8103, so requires
a new compatible as well as the new PHY registers.

Thanks to Mark and Rob for their helpful suggestions in updating
the binding.

Suggested-by: Mark Kettenis <mark.kettenis@xs4all.nl>
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
[maz: added PHY registers, constraints]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://patch.msgid.link/20250401091713.2765724-3-maz@kernel.org
8 weeks agoPCI: apple: Set only available ports up
Janne Grunau [Tue, 1 Apr 2025 09:17:01 +0000 (10:17 +0100)]
PCI: apple: Set only available ports up

Iterating over disabled ports results in of_irq_parse_raw() parsing
the wrong "interrupt-map" entries, as it takes the status of the node
into account.

This became apparent after disabling unused PCIe ports in the Apple
Silicon device trees instead of deleting them.

Switching from for_each_child_of_node_scoped() to
for_each_available_child_of_node_scoped() solves this issue.

Fixes: 1e33888fbe44 ("PCI: apple: Add initial hardware bring-up")
Fixes: a0189fdfb73d ("arm64: dts: apple: t8103: Disable unused PCIe ports")
Signed-off-by: Janne Grunau <j@jannau.net>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/asahi/20230214-apple_dts_pcie_disable_unused-v1-0-5ea0d3ddcde3@jannau.net/
Link: https://lore.kernel.org/asahi/1ea2107a-bb86-8c22-0bbc-82c453ab08ce@linaro.org/
Link: https://patch.msgid.link/20250401091713.2765724-2-maz@kernel.org
8 weeks agoPCI: Add ACS quirk for Loongson PCIe
Huacai Chen [Thu, 3 Apr 2025 04:07:56 +0000 (12:07 +0800)]
PCI: Add ACS quirk for Loongson PCIe

Loongson PCIe Root Ports don't advertise an ACS capability, but they do not
allow peer-to-peer transactions between Root Ports. Add an ACS quirk so
each Root Port can be in a separate IOMMU group.

Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250403040756.720409-1-chenhuacai@loongson.cn
8 weeks agoPCI: Print the actual delay time in pci_bridge_wait_for_secondary_bus()
Wilfred Mallawa [Mon, 14 Apr 2025 00:15:06 +0000 (10:15 +1000)]
PCI: Print the actual delay time in pci_bridge_wait_for_secondary_bus()

Print the delay amount that pcie_wait_for_link_delay() is invoked with
instead of the hardcoded 1000ms value in the debug info print.

Fixes: 7b3ba09febf4 ("PCI/PM: Shorten pci_bridge_wait_for_secondary_bus() wait time for slow links")
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250414001505.21243-2-wilfred.opensource@gmail.com
8 weeks agoPCI: hotplug: Drop superfluous #include directives
Lukas Wunner [Mon, 14 Apr 2025 14:25:21 +0000 (16:25 +0200)]
PCI: hotplug: Drop superfluous #include directives

In February 2003, historic commit

  https://git.kernel.org/tglx/history/c/280c1c9a0ea4
  ("[PATCH] PCI Hotplug: Replace pcihpfs with sysfs.")

removed all invocations of __get_free_page() and free_page() from the PCI
hotplug core without also removing the #include <linux/pagemap.h>
directive.

It removed all invocations of kern_mount(), mntget() and mntput()
without also removing the #include <linux/mount.h> directive.

It removed all invocations of lookup_hash()
without also removing the #include <linux/namei.h> directive.

It removed all invocations of copy_to_user() and copy_from_user()
without also removing the #include <linux/uaccess.h> directive.

These #include directives are still unnecessary today, so drop them.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/c19e25bf2cefecc14e0822c6a9bb3a7f546258bc.1744640331.git.lukas@wunner.de
8 weeks agoPCI: Use PCI_STD_NUM_BARS instead of 6
Ilpo Järvinen [Wed, 16 Apr 2025 10:02:39 +0000 (13:02 +0300)]
PCI: Use PCI_STD_NUM_BARS instead of 6

pci_read_bases() is given literal 6 that means PCI_STD_NUM_BARS.  Replace
the literal with the define to annotate the code better.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250416100239.6958-1-ilpo.jarvinen@linux.intel.com
2 months agoPCI: pciehp: Ignore Link Down/Up caused by Secondary Bus Reset
Lukas Wunner [Thu, 10 Apr 2025 15:27:12 +0000 (17:27 +0200)]
PCI: pciehp: Ignore Link Down/Up caused by Secondary Bus Reset

When a Secondary Bus Reset is issued at a hotplug port, it causes a Data
Link Layer State Changed event as a side effect.  On hotplug ports using
in-band presence detect, it additionally causes a Presence Detect Changed
event.

These spurious events should not result in teardown and re-enumeration of
the device in the slot.  Hence commit 2e35afaefe64 ("PCI: pciehp: Add
reset_slot() method") masked the Presence Detect Changed Enable bit in the
Slot Control register during a Secondary Bus Reset.  Commit 06a8d89af551
("PCI: pciehp: Disable link notification across slot reset") additionally
masked the Data Link Layer State Changed Enable bit.

However masking those bits only disables interrupt generation (PCIe r6.2
sec 6.7.3.1).  The events are still visible in the Slot Status register
and picked up by the IRQ handler if it runs during a Secondary Bus Reset.
This can happen if the interrupt is shared or if an unmasked hotplug event
occurs, e.g. Attention Button Pressed or Power Fault Detected.

The likelihood of this happening used to be small, so it wasn't much of a
problem in practice.  That has changed with the recent introduction of
bandwidth control in v6.13-rc1 with commit 665745f27487 ("PCI/bwctrl:
Re-add BW notification portdrv as PCIe BW controller"):

Bandwidth control shares the interrupt with PCIe hotplug.  A Secondary Bus
Reset causes a Link Bandwidth Notification, so the hotplug IRQ handler
runs, picks up the masked events and tears down the device in the slot.

As a result, Joel reports VFIO passthrough failure of a GPU, which Ilpo
root-caused to the incorrect handling of masked hotplug events.

Clearly, a more reliable way is needed to ignore spurious hotplug events.

For Downstream Port Containment, a new ignore mechanism was introduced by
commit a97396c6eb13 ("PCI: pciehp: Ignore Link Down/Up caused by DPC").
It has been working reliably for the past four years.

Adapt it for Secondary Bus Resets.

Introduce two helpers to annotate code sections which cause spurious link
changes:  pci_hp_ignore_link_change() and pci_hp_unignore_link_change()
Use those helpers in lieu of masking interrupts in the Slot Control
register.

Introduce a helper to check whether such a code section is executing
concurrently and if so, await it:  pci_hp_spurious_link_change()
Invoke the helper in the hotplug IRQ thread pciehp_ist().  Re-use the
IRQ thread's existing code which ignores DPC-induced link changes unless
the link is unexpectedly down after reset recovery or the device was
replaced during the bus reset.

That code block in pciehp_ist() was previously only executed if a Data
Link Layer State Changed event has occurred.  Additionally execute it for
Presence Detect Changed events.  That's necessary for compatibility with
PCIe r1.0 hotplug ports because Data Link Layer State Changed didn't exist
before PCIe r1.1.  DPC was added with PCIe r3.1 and thus DPC-capable
hotplug ports always support Data Link Layer State Changed events.
But the same cannot be assumed for Secondary Bus Reset, which already
existed in PCIe r1.0.

Secondary Bus Reset is only one of many causes of spurious link changes.
Others include runtime suspend to D3cold, firmware updates or FPGA
reconfiguration.  The new pci_hp_{,un}ignore_link_change() helpers may be
used by all kinds of drivers to annotate such code sections, hence their
declarations are publicly visible in <linux/pci.h>.  A case in point is
the Mellanox Ethernet driver which disables a firmware reset feature if
the Ethernet card is attached to a hotplug port, see commit 3d7a3f2612d7
("net/mlx5: Nack sync reset request when HotPlug is enabled").  Going
forward, PCIe hotplug will be able to cope gracefully with all such use
cases once the code sections are properly annotated.

The new helpers internally use two bits in struct pci_dev's priv_flags as
well as a wait_queue.  This mirrors what was done for DPC by commit
a97396c6eb13 ("PCI: pciehp: Ignore Link Down/Up caused by DPC").  That may
be insufficient if spurious link changes are caused by multiple sources
simultaneously.  An example might be a Secondary Bus Reset issued by AER
during FPGA reconfiguration.  If this turns out to happen in real life,
support for it can easily be added by replacing the PCI_LINK_CHANGING flag
with an atomic_t counter incremented by pci_hp_ignore_link_change() and
decremented by pci_hp_unignore_link_change().  Instead of awaiting a zero
PCI_LINK_CHANGING flag, the pci_hp_spurious_link_change() helper would
then simply await a zero counter.

Fixes: 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
Reported-by: Joel Mathew Thomas <proxy0@tutamail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219765
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Joel Mathew Thomas <proxy0@tutamail.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/d04deaf49d634a2edf42bf3c06ed81b4ca54d17b.1744298239.git.lukas@wunner.de
2 months agoPCI: pciehp: Ignore Presence Detect Changed caused by DPC
Lukas Wunner [Thu, 10 Apr 2025 15:27:11 +0000 (17:27 +0200)]
PCI: pciehp: Ignore Presence Detect Changed caused by DPC

Commit a97396c6eb13 ("PCI: pciehp: Ignore Link Down/Up caused by DPC")
amended PCIe hotplug to not bring down the slot upon Data Link Layer State
Changed events caused by Downstream Port Containment.

However Keith reports off-list that if the slot uses in-band presence
detect (i.e. Presence Detect State is derived from Data Link Layer Link
Active), DPC also causes a spurious Presence Detect Changed event.

This needs to be ignored as well.

Unfortunately there's no register indicating that in-band presence detect
is used.  PCIe r5.0 sec 7.5.3.10 introduced the In-Band PD Disable bit in
the Slot Control Register.  The PCIe hotplug driver sets this bit on
ports supporting it.  But older ports may still use in-band presence
detect.

If in-band presence detect can be disabled, Presence Detect Changed events
occurring during DPC must not be ignored because they signal device
replacement.  On all other ports, device replacement cannot be detected
reliably because the Presence Detect Changed event could be a side effect
of DPC.  On those (older) ports, perform a best-effort device replacement
check by comparing the Vendor ID, Device ID and other data in Config Space
with the values cached in struct pci_dev.  Use the existing helper
pciehp_device_replaced() to accomplish this.  It is currently #ifdef'ed to
CONFIG_PM_SLEEP in pciehp_core.c, so move it to pciehp_hpc.c where most
other functions accessing config space reside.

Reported-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/fa264ff71952915c4e35a53c89eb0cde8455a5c5.1744298239.git.lukas@wunner.de
2 months agoPCI: Remove pci_fixup_cardbus()
Heiner Kallweit [Wed, 9 Apr 2025 20:43:10 +0000 (22:43 +0200)]
PCI: Remove pci_fixup_cardbus()

Since 1c7f4fe86f17 ("powerpc/pci: Remove pcibios_setup_bus_devices()")
there's no architecture left setting pci_fixup_cardbus. Therefore remove
support from PCI core.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/8de7da4c-2b16-4ee1-8c42-0d04f3c821c6@gmail.com
2 months agoPCI: Remove pcim_iounmap_regions()
Philipp Stanner [Thu, 27 Mar 2025 11:07:08 +0000 (12:07 +0100)]
PCI: Remove pcim_iounmap_regions()

All users of the deprecated function pcim_iounmap_regions() have been
ported by now. Remove it.

Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20250327110707.20025-4-phasta@kernel.org
2 months agomtip32xx: Remove unnecessary pcim_iounmap_regions() calls
Philipp Stanner [Thu, 27 Mar 2025 11:07:07 +0000 (12:07 +0100)]
mtip32xx: Remove unnecessary pcim_iounmap_regions() calls

pcim_iounmap_regions() is deprecated. Moreover, it is not necessary to call
it in the driver's remove() function or if probe() fails, since it does
cleanup automatically on driver detach.

Remove all calls to pcim_iounmap_regions().

Signed-off-by: Philipp Stanner <phasta@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20250327110707.20025-3-phasta@kernel.org
2 months agoirqdomain: pci: Switch to of_fwnode_handle()
Jiri Slaby (SUSE) [Wed, 19 Mar 2025 09:29:00 +0000 (10:29 +0100)]
irqdomain: pci: Switch to of_fwnode_handle()

of_node_to_fwnode() is irqdomain's reimplementation of the "officially"
defined of_fwnode_handle(). The former is in the process of being removed,
so use the latter instead.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250319092951.37667-8-jirislaby@kernel.org
2 months agoLinux 6.15-rc1 v6.15-rc1
Linus Torvalds [Sun, 6 Apr 2025 20:11:33 +0000 (13:11 -0700)]
Linux 6.15-rc1

2 months agotools/include: make uapi/linux/types.h usable from assembly
Thomas Weißschuh [Wed, 2 Apr 2025 20:21:57 +0000 (21:21 +0100)]
tools/include: make uapi/linux/types.h usable from assembly

The "real" linux/types.h UAPI header gracefully degrades to a NOOP when
included from assembly code.

Mirror this behaviour in the tools/ variant.

Test for __ASSEMBLER__ over __ASSEMBLY__ as the former is provided by the
toolchain automatically.

Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/lkml/af553c62-ca2f-4956-932c-dd6e3a126f58@sirena.org.uk/
Fixes: c9fbaa879508 ("selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20250321-uapi-consistency-v1-1-439070118dc0@linutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agoMerge tag 'turbostat-2025.05.06' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 6 Apr 2025 19:32:43 +0000 (12:32 -0700)]
Merge tag 'turbostat-2025.05.06' of git://git./linux/kernel/git/lenb/linux

Pull turbostat updates from Len Brown:

 - support up to 8192 processors

 - add cpuidle governor debug telemetry, disabled by default

 - update default output to exclude cpuidle invocation counts

 - bug fixes

* tag 'turbostat-2025.05.06' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: v2025.05.06
  tools/power turbostat: disable "cpuidle" invocation counters, by default
  tools/power turbostat: re-factor sysfs code
  tools/power turbostat: Restore GFX sysfs fflush() call
  tools/power turbostat: Document GNR UncMHz domain convention
  tools/power turbostat: report CoreThr per measurement interval
  tools/power turbostat: Increase CPU_SUBSET_MAXCPUS to 8192
  tools/power turbostat: Add idle governor statistics reporting
  tools/power turbostat: Fix names matching
  tools/power turbostat: Allow Zero return value for some RAPL registers
  tools/power turbostat: Clustered Uncore MHz counters should honor show/hide options

2 months agoMerge tag 'soundwire-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 6 Apr 2025 19:04:53 +0000 (12:04 -0700)]
Merge tag 'soundwire-6.15-rc1-fixes' of git://git./linux/kernel/git/vkoul/soundwire

Pull soundwire fix from Vinod Koul:

 - add missing config symbol CONFIG_SND_HDA_EXT_CORE required for asoc
   driver CONFIG_SND_SOF_SOF_HDA_SDW_BPT

* tag 'soundwire-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  ASoC: SOF: Intel: Let SND_SOF_SOF_HDA_SDW_BPT select SND_HDA_EXT_CORE

2 months agotools/power turbostat: v2025.05.06
Len Brown [Sun, 6 Apr 2025 18:49:20 +0000 (14:49 -0400)]
tools/power turbostat: v2025.05.06

Support up to 8192 processors
Add cpuidle governor debug telemetry, disabled by default
Update default output to exclude cpuidle invocation counts
Bug fixes

Signed-off-by: Len Brown <len.brown@intel.com>
2 months agotools/power turbostat: disable "cpuidle" invocation counters, by default
Len Brown [Sun, 6 Apr 2025 18:29:57 +0000 (14:29 -0400)]
tools/power turbostat: disable "cpuidle" invocation counters, by default

Create "pct_idle" counter group, the sofware notion of residency
so it can now be singled out, independent of other counter groups.

Create "cpuidle" group, the cpuidle invocation counts.
Disable "cpuidle", by default.

Create "swidle" = "cpuidle" + "pct_idle".
Undocument "sysfs", the old name for "swidle", but keep it working
for backwards compatibilty.

Create "hwidle", all the HW idle counters

Modify "idle", enabled by default
"idle" = "hwidle" + "pct_idle" (and now excludes "cpuidle")

Signed-off-by: Len Brown <len.brown@intel.com>