Merge tag 'nf-24-04-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
author David S. Miller <davem@davemloft.net>
Fri, 12 Apr 2024 12:02:13 +0000 (13:02 +0100)
committer David S. Miller <davem@davemloft.net>
Fri, 12 Apr 2024 12:02:13 +0000 (13:02 +0100)
netfilter pull request 24-04-11

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

Patches #1 and #2 add the missing RCU read-side lock when iterating over
the expression and object type lists, which could otherwise race with
module removal.

Patch #3 prevents promiscuous packets from visiting the bridge/input hook,
 amending a recent fix that addresses a conntrack confirmation race
 in br_netfilter and nf_conntrack_bridge.

Patch #4 adds and uses an iterate decorator type to fetch the current
 pipapo set backend data structure view when netlink dumps the
 set elements.

Patch #5 fixes removal of duplicate elements in the pipapo set backend.

Patch #6 makes the flowtable validate the PPPoE header before accessing it.

Patch #7 fixes the flowtable datapath for PPPoE packets; otherwise the
         lookup fails and PPPoE packets follow the classic path.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
408 files changed:
.mailmap
Documentation/admin-guide/hw-vuln/spectre.rst
Documentation/admin-guide/kernel-parameters.txt
Documentation/devicetree/bindings/clock/keystone-gate.txt
Documentation/devicetree/bindings/clock/keystone-pll.txt
Documentation/devicetree/bindings/clock/ti/adpll.txt
Documentation/devicetree/bindings/clock/ti/apll.txt
Documentation/devicetree/bindings/clock/ti/autoidle.txt
Documentation/devicetree/bindings/clock/ti/clockdomain.txt
Documentation/devicetree/bindings/clock/ti/composite.txt
Documentation/devicetree/bindings/clock/ti/divider.txt
Documentation/devicetree/bindings/clock/ti/dpll.txt
Documentation/devicetree/bindings/clock/ti/fapll.txt
Documentation/devicetree/bindings/clock/ti/fixed-factor-clock.txt
Documentation/devicetree/bindings/clock/ti/gate.txt
Documentation/devicetree/bindings/clock/ti/interface.txt
Documentation/devicetree/bindings/clock/ti/mux.txt
Documentation/devicetree/bindings/dts-coding-style.rst
Documentation/devicetree/bindings/remoteproc/ti,davinci-rproc.txt
Documentation/devicetree/bindings/soc/fsl/fsl,layerscape-dcfg.yaml
Documentation/devicetree/bindings/soc/fsl/fsl,layerscape-scfg.yaml
Documentation/devicetree/bindings/timer/arm,arch_timer_mmio.yaml
Documentation/devicetree/bindings/ufs/qcom,ufs.yaml
Documentation/filesystems/bcachefs/index.rst [new file with mode: 0644]
Documentation/filesystems/index.rst
MAINTAINERS
Makefile
arch/arm64/kernel/ptrace.c
arch/loongarch/boot/dts/loongson-2k1000.dtsi
arch/loongarch/boot/dts/loongson-2k2000-ref.dts
arch/loongarch/boot/dts/loongson-2k2000.dtsi
arch/loongarch/include/asm/addrspace.h
arch/loongarch/include/asm/io.h
arch/loongarch/include/asm/kfence.h
arch/loongarch/include/asm/page.h
arch/loongarch/mm/mmap.c
arch/loongarch/mm/pgtable.c
arch/nios2/kernel/prom.c
arch/powerpc/include/asm/vdso/gettimeofday.h
arch/riscv/Makefile
arch/riscv/include/asm/pgtable.h
arch/riscv/include/asm/syscall_wrapper.h
arch/riscv/include/asm/uaccess.h
arch/riscv/include/uapi/asm/auxvec.h
arch/riscv/kernel/compat_vdso/Makefile
arch/riscv/kernel/patch.c
arch/riscv/kernel/process.c
arch/riscv/kernel/signal.c
arch/riscv/kernel/traps.c
arch/riscv/kernel/vdso/Makefile
arch/riscv/mm/tlbflush.c
arch/s390/include/asm/atomic.h
arch/s390/include/asm/atomic_ops.h
arch/s390/include/asm/preempt.h
arch/s390/kernel/entry.S
arch/s390/kernel/perf_pai_crypto.c
arch/s390/kernel/perf_pai_ext.c
arch/s390/mm/fault.c
arch/x86/Kconfig
arch/x86/coco/core.c
arch/x86/entry/common.c
arch/x86/entry/entry_64.S
arch/x86/entry/entry_64_compat.S
arch/x86/entry/syscall_32.c
arch/x86/entry/syscall_64.c
arch/x86/entry/syscall_x32.c
arch/x86/events/intel/ds.c
arch/x86/include/asm/coco.h
arch/x86/include/asm/cpufeature.h
arch/x86/include/asm/cpufeatures.h
arch/x86/include/asm/msr-index.h
arch/x86/include/asm/nospec-branch.h
arch/x86/include/asm/sev.h
arch/x86/include/asm/syscall.h
arch/x86/kernel/cpu/amd.c
arch/x86/kernel/cpu/bugs.c
arch/x86/kernel/cpu/common.c
arch/x86/kernel/cpu/mce/core.c
arch/x86/kernel/cpu/mtrr/generic.c
arch/x86/kernel/cpu/resctrl/internal.h
arch/x86/kernel/cpu/scattered.c
arch/x86/kernel/setup.c
arch/x86/kernel/sev.c
arch/x86/kvm/Kconfig
arch/x86/kvm/reverse_cpuid.h
arch/x86/kvm/svm/sev.c
arch/x86/kvm/vmx/vmenter.S
arch/x86/kvm/x86.c
arch/x86/lib/retpoline.S
arch/x86/mm/numa_32.c
arch/x86/mm/pat/memtype.c
arch/x86/virt/svm/sev.c
block/bdev.c
block/ioctl.c
drivers/acpi/thermal.c
drivers/ata/ahci_st.c
drivers/ata/pata_macio.c
drivers/ata/sata_gemini.c
drivers/ata/sata_mv.c
drivers/ata/sata_sx4.c
drivers/base/core.c
drivers/base/regmap/regcache-maple.c
drivers/block/null_blk/main.c
drivers/crypto/ccp/sev-dev.c
drivers/firewire/ohci.c
drivers/gpio/gpiolib-cdev.c
drivers/gpio/gpiolib.c
drivers/gpu/drm/display/drm_dp_dual_mode_helper.c
drivers/gpu/drm/drm_prime.c
drivers/gpu/drm/i915/Makefile
drivers/gpu/drm/i915/display/intel_display.c
drivers/gpu/drm/i915/display/intel_display_device.h
drivers/gpu/drm/i915/display/intel_display_types.h
drivers/gpu/drm/i915/display/intel_dp.c
drivers/gpu/drm/i915/display/intel_dp_mst.c
drivers/gpu/drm/i915/display/intel_psr.c
drivers/gpu/drm/i915/gt/gen8_ppgtt.c
drivers/gpu/drm/i915/gt/intel_engine_cs.c
drivers/gpu/drm/i915/gt/intel_gt.c
drivers/gpu/drm/i915/gt/intel_gt.h
drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c [new file with mode: 0644]
drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h [new file with mode: 0644]
drivers/gpu/drm/i915/gt/intel_gt_regs.h
drivers/gpu/drm/i915/gt/intel_workarounds.c
drivers/gpu/drm/nouveau/nouveau_uvmm.c
drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.c
drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c
drivers/gpu/drm/panfrost/panfrost_gpu.c
drivers/gpu/drm/xe/xe_device.c
drivers/gpu/drm/xe/xe_device_types.h
drivers/gpu/drm/xe/xe_exec.c
drivers/gpu/drm/xe/xe_exec_queue_types.h
drivers/gpu/drm/xe/xe_gt_pagefault.c
drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
drivers/gpu/drm/xe/xe_gt_types.h
drivers/gpu/drm/xe/xe_preempt_fence.c
drivers/gpu/drm/xe/xe_pt.c
drivers/gpu/drm/xe/xe_ring_ops.c
drivers/gpu/drm/xe/xe_sched_job.c
drivers/gpu/drm/xe/xe_sched_job_types.h
drivers/gpu/drm/xe/xe_vm.c
drivers/gpu/drm/xe/xe_vm.h
drivers/gpu/drm/xe/xe_vm_types.h
drivers/i2c/busses/i2c-pxa.c
drivers/iommu/amd/init.c
drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_fw_vpu.c
drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c
drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_hevc_req_multi_if.c
drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp8_if.c
drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_if.c
drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
drivers/media/platform/mediatek/vcodec/encoder/mtk_vcodec_enc_drv.c
drivers/media/platform/mediatek/vcodec/encoder/mtk_vcodec_enc_drv.h
drivers/media/platform/mediatek/vcodec/encoder/venc_vpu_if.c
drivers/mtd/devices/block2mtd.c
drivers/net/dsa/mt7530.c
drivers/net/dsa/mt7530.h
drivers/net/ethernet/amazon/ena/ena_com.c
drivers/net/ethernet/amazon/ena/ena_netdev.c
drivers/net/ethernet/amazon/ena/ena_xdp.c
drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h
drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
drivers/net/ethernet/mellanox/mlx5/core/en/rqt.c
drivers/net/ethernet/mellanox/mlx5/core/en/rqt.h
drivers/net/ethernet/mellanox/mlx5/core/en/selq.c
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
drivers/net/ethernet/mellanox/mlx5/core/main.c
drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c
drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
drivers/net/ethernet/microchip/sparx5/sparx5_port.c
drivers/net/ethernet/realtek/r8169_main.c
drivers/nvme/host/core.c
drivers/nvme/host/fc.c
drivers/nvme/host/nvme.h
drivers/nvme/host/zns.c
drivers/nvme/target/configfs.c
drivers/nvme/target/core.c
drivers/nvme/target/fc.c
drivers/of/dynamic.c
drivers/of/module.c
drivers/perf/riscv_pmu.c
drivers/platform/chrome/cros_ec_uart.c
drivers/platform/x86/acer-wmi.c
drivers/platform/x86/intel/hid.c
drivers/platform/x86/intel/vbtn.c
drivers/platform/x86/lg-laptop.c
drivers/platform/x86/toshiba_acpi.c
drivers/regulator/tps65132-regulator.c
drivers/s390/net/ism_drv.c
drivers/scsi/hisi_sas/hisi_sas_main.c
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
drivers/scsi/libsas/sas_expander.c
drivers/scsi/myrb.c
drivers/scsi/myrs.c
drivers/scsi/qla2xxx/qla_edif.c
drivers/scsi/sd.c
drivers/scsi/sg.c
drivers/spi/spi-fsl-lpspi.c
drivers/spi/spi-pci1xxxx.c
drivers/spi/spi-s3c64xx.c
drivers/target/target_core_configfs.c
drivers/thermal/gov_power_allocator.c
drivers/ufs/core/ufshcd.c
fs/aio.c
fs/bcachefs/acl.c
fs/bcachefs/bcachefs_format.h
fs/bcachefs/btree_gc.c
fs/bcachefs/btree_iter.h
fs/bcachefs/btree_journal_iter.c
fs/bcachefs/btree_key_cache.c
fs/bcachefs/btree_locking.c
fs/bcachefs/btree_node_scan.c
fs/bcachefs/btree_types.h
fs/bcachefs/btree_update_interior.c
fs/bcachefs/btree_update_interior.h
fs/bcachefs/chardev.c
fs/bcachefs/data_update.c
fs/bcachefs/debug.c
fs/bcachefs/eytzinger.c
fs/bcachefs/eytzinger.h
fs/bcachefs/journal_reclaim.c
fs/bcachefs/journal_types.h
fs/bcachefs/recovery.c
fs/bcachefs/snapshot.c
fs/bcachefs/super-io.c
fs/bcachefs/super.c
fs/bcachefs/sysfs.c
fs/bcachefs/tests.c
fs/bcachefs/util.h
fs/btrfs/delayed-inode.c
fs/btrfs/inode.c
fs/btrfs/ioctl.c
fs/btrfs/qgroup.c
fs/btrfs/root-tree.c
fs/btrfs/root-tree.h
fs/btrfs/transaction.c
fs/cramfs/inode.c
fs/ext4/super.c
fs/f2fs/super.c
fs/jfs/jfs_logmgr.c
fs/nfsd/nfs4state.c
fs/proc/bootconfig.c
fs/reiserfs/journal.c
fs/romfs/super.c
fs/smb/client/cached_dir.c
fs/smb/client/cifs_debug.c
fs/smb/client/cifsfs.c
fs/smb/client/cifsglob.h
fs/smb/client/cifsproto.h
fs/smb/client/cifssmb.c
fs/smb/client/connect.c
fs/smb/client/dfs.c
fs/smb/client/dfs.h
fs/smb/client/dfs_cache.c
fs/smb/client/dir.c
fs/smb/client/file.c
fs/smb/client/fs_context.c
fs/smb/client/fs_context.h
fs/smb/client/fscache.h
fs/smb/client/ioctl.c
fs/smb/client/misc.c
fs/smb/client/smb1ops.c
fs/smb/client/smb2misc.c
fs/smb/client/smb2ops.c
fs/smb/client/smb2pdu.c
fs/smb/client/smb2transport.c
fs/smb/server/ksmbd_netlink.h
fs/smb/server/mgmt/share_config.c
fs/smb/server/smb2ops.c
fs/smb/server/smb2pdu.c
fs/smb/server/transport_ipc.c
fs/super.c
fs/xfs/xfs_buf.c
fs/xfs/xfs_inode.c
fs/xfs/xfs_super.c
include/linux/blkdev.h
include/linux/bootconfig.h
include/linux/cc_platform.h
include/linux/compiler.h
include/linux/device.h
include/linux/energy_model.h
include/linux/fs.h
include/linux/gfp_types.h
include/linux/io_uring_types.h
include/linux/mm.h
include/linux/randomize_kstack.h
include/linux/secretmem.h
include/linux/stackdepot.h
include/linux/timecounter.h
include/linux/timekeeping.h
include/linux/timer.h
include/net/bluetooth/bluetooth.h
include/sound/hdaudio_ext.h
include/sound/tas2781-tlv.h
include/vdso/datapage.h
init/initramfs.c
init/main.c
io_uring/io_uring.c
io_uring/kbuf.c
io_uring/kbuf.h
io_uring/rw.c
kernel/kprobes.c
kernel/time/tick-sched.c
kernel/time/tick-sched.h
kernel/time/timer.c
kernel/time/timer_migration.c
lib/stackdepot.c
lib/test_ubsan.c
mm/memory.c
mm/vmalloc.c
net/9p/client.c
net/9p/trans_fd.c
net/bluetooth/hci_request.c
net/bluetooth/hci_sock.c
net/bluetooth/hci_sync.c
net/bluetooth/iso.c
net/bluetooth/l2cap_core.c
net/bluetooth/l2cap_sock.c
net/bluetooth/rfcomm/sock.c
net/bluetooth/sco.c
net/ipv4/netfilter/arp_tables.c
net/ipv4/netfilter/ip_tables.c
net/ipv6/netfilter/ip6_tables.c
net/sunrpc/svcsock.c
net/unix/garbage.c
scripts/gcc-plugins/stackleak_plugin.c
sound/oss/dmasound/dmasound_paula.c
sound/pci/emu10k1/emu10k1_callback.c
sound/pci/hda/cs35l41_hda_property.c
sound/pci/hda/cs35l56_hda_i2c.c
sound/pci/hda/cs35l56_hda_spi.c
sound/pci/hda/patch_realtek.c
sound/soc/amd/acp/acp-pci.c
sound/soc/codecs/cs-amp-lib.c
sound/soc/codecs/cs42l43.c
sound/soc/codecs/es8326.c
sound/soc/codecs/es8326.h
sound/soc/codecs/rt1316-sdw.c
sound/soc/codecs/rt1318-sdw.c
sound/soc/codecs/rt5682-sdw.c
sound/soc/codecs/rt700.c
sound/soc/codecs/rt711-sdca-sdw.c
sound/soc/codecs/rt711-sdca.c
sound/soc/codecs/rt711-sdw.c
sound/soc/codecs/rt711.c
sound/soc/codecs/rt712-sdca-dmic.c
sound/soc/codecs/rt712-sdca-sdw.c
sound/soc/codecs/rt712-sdca.c
sound/soc/codecs/rt715-sdca-sdw.c
sound/soc/codecs/rt715-sdca.c
sound/soc/codecs/rt715-sdw.c
sound/soc/codecs/rt715.c
sound/soc/codecs/rt722-sdca-sdw.c
sound/soc/codecs/rt722-sdca.c
sound/soc/codecs/wm_adsp.c
sound/soc/intel/avs/boards/da7219.c
sound/soc/intel/avs/boards/dmic.c
sound/soc/intel/avs/boards/es8336.c
sound/soc/intel/avs/boards/i2s_test.c
sound/soc/intel/avs/boards/max98357a.c
sound/soc/intel/avs/boards/max98373.c
sound/soc/intel/avs/boards/max98927.c
sound/soc/intel/avs/boards/nau8825.c
sound/soc/intel/avs/boards/probe.c
sound/soc/intel/avs/boards/rt274.c
sound/soc/intel/avs/boards/rt286.c
sound/soc/intel/avs/boards/rt298.c
sound/soc/intel/avs/boards/rt5514.c
sound/soc/intel/avs/boards/rt5663.c
sound/soc/intel/avs/boards/rt5682.c
sound/soc/intel/avs/boards/ssm4567.c
sound/soc/soc-ops.c
sound/soc/sof/amd/acp.c
sound/soc/sof/core.c
sound/soc/sof/intel/hda-common-ops.c
sound/soc/sof/intel/hda-dai-ops.c
sound/soc/sof/intel/hda-dsp.c
sound/soc/sof/intel/hda-pcm.c
sound/soc/sof/intel/hda-stream.c
sound/soc/sof/intel/hda.h
sound/soc/sof/intel/lnl.c
sound/soc/sof/intel/mtl.c
sound/soc/sof/intel/mtl.h
sound/soc/sof/ipc4-mtrace.c
sound/soc/sof/ipc4-pcm.c
sound/soc/sof/ipc4-priv.h
sound/soc/sof/ipc4-topology.c
sound/soc/sof/ops.h
sound/soc/sof/pcm.c
sound/soc/sof/sof-audio.h
sound/soc/sof/sof-priv.h
sound/usb/line6/driver.c
tools/include/linux/kernel.h
tools/include/linux/mm.h
tools/include/linux/panic.h [new file with mode: 0644]
tools/power/x86/turbostat/turbostat.8
tools/power/x86/turbostat/turbostat.c
tools/testing/selftests/mm/vm_util.h
tools/testing/selftests/turbostat/defcolumns.py [new file with mode: 0755]

index 59c9a841bf71494f3371c9658af76aa5644783a5..8284692f9610715fa2bc04f8771cb45d1b5ebf88 100644 (file)
--- a/.mailmap
+++ b/.mailmap
@@ -20,6 +20,7 @@ Adam Oldham <oldhamca@gmail.com>
 Adam Radford <aradford@gmail.com>
 Adriana Reus <adi.reus@gmail.com> <adriana.reus@intel.com>
 Adrian Bunk <bunk@stusta.de>
+Ajay Kaher <ajay.kaher@broadcom.com> <akaher@vmware.com>
 Akhil P Oommen <quic_akhilpo@quicinc.com> <akhilpo@codeaurora.org>
 Alan Cox <alan@lxorguk.ukuu.org.uk>
 Alan Cox <root@hraefn.swansea.linux.org.uk>
@@ -36,6 +37,7 @@ Alexei Avshalom Lazar <quic_ailizaro@quicinc.com> <ailizaro@codeaurora.org>
 Alexei Starovoitov <ast@kernel.org> <alexei.starovoitov@gmail.com>
 Alexei Starovoitov <ast@kernel.org> <ast@fb.com>
 Alexei Starovoitov <ast@kernel.org> <ast@plumgrid.com>
+Alexey Makhalov <alexey.amakhalov@broadcom.com> <amakhalov@vmware.com>
 Alex Hung <alexhung@gmail.com> <alex.hung@canonical.com>
 Alex Shi <alexs@kernel.org> <alex.shi@intel.com>
 Alex Shi <alexs@kernel.org> <alex.shi@linaro.org>
@@ -110,6 +112,7 @@ Brendan Higgins <brendan.higgins@linux.dev> <brendanhiggins@google.com>
 Brian Avery <b.avery@hp.com>
 Brian King <brking@us.ibm.com>
 Brian Silverman <bsilver16384@gmail.com> <brian.silverman@bluerivertech.com>
+Bryan Tan <bryan-bt.tan@broadcom.com> <bryantan@vmware.com>
 Cai Huoqing <cai.huoqing@linux.dev> <caihuoqing@baidu.com>
 Can Guo <quic_cang@quicinc.com> <cang@codeaurora.org>
 Carl Huang <quic_cjhuang@quicinc.com> <cjhuang@codeaurora.org>
@@ -529,6 +532,7 @@ Rocky Liao <quic_rjliao@quicinc.com> <rjliao@codeaurora.org>
 Roman Gushchin <roman.gushchin@linux.dev> <guro@fb.com>
 Roman Gushchin <roman.gushchin@linux.dev> <guroan@gmail.com>
 Roman Gushchin <roman.gushchin@linux.dev> <klamm@yandex-team.ru>
+Ronak Doshi <ronak.doshi@broadcom.com> <doshir@vmware.com>
 Muchun Song <muchun.song@linux.dev> <songmuchun@bytedance.com>
 Muchun Song <muchun.song@linux.dev> <smuchun@gmail.com>
 Ross Zwisler <zwisler@kernel.org> <ross.zwisler@linux.intel.com>
@@ -651,6 +655,7 @@ Viresh Kumar <vireshk@kernel.org> <viresh.kumar@st.com>
 Viresh Kumar <vireshk@kernel.org> <viresh.linux@gmail.com>
 Viresh Kumar <viresh.kumar@linaro.org> <viresh.kumar@linaro.org>
 Viresh Kumar <viresh.kumar@linaro.org> <viresh.kumar@linaro.com>
+Vishnu Dasa <vishnu.dasa@broadcom.com> <vdasa@vmware.com>
 Vivek Aknurwar <quic_viveka@quicinc.com> <viveka@codeaurora.org>
 Vivien Didelot <vivien.didelot@gmail.com> <vivien.didelot@savoirfairelinux.com>
 Vlad Dogaru <ddvlad@gmail.com> <vlad.dogaru@intel.com>
index cce768afec6bed11a961643dcdc2d1ae97848684..b70b1d8bd8e6572374ae10632f46757269f2fa7e 100644 (file)
@@ -138,11 +138,10 @@ associated with the source address of the indirect branch. Specifically,
 the BHB might be shared across privilege levels even in the presence of
 Enhanced IBRS.
 
-Currently the only known real-world BHB attack vector is via
-unprivileged eBPF. Therefore, it's highly recommended to not enable
-unprivileged eBPF, especially when eIBRS is used (without retpolines).
-For a full mitigation against BHB attacks, it's recommended to use
-retpolines (or eIBRS combined with retpolines).
+Previously the only known real-world BHB attack vector was via unprivileged
+eBPF. Further research has found attacks that don't require unprivileged eBPF.
+For a full mitigation against BHB attacks it is recommended to set BHI_DIS_S or
+use the BHB clearing sequence.
 
 Attack scenarios
 ----------------
@@ -430,6 +429,23 @@ The possible values in this file are:
   'PBRSB-eIBRS: Not affected'  CPU is not affected by PBRSB
   ===========================  =======================================================
 
+  - Branch History Injection (BHI) protection status:
+
+.. list-table::
+
+ * - BHI: Not affected
+   - System is not affected
+ * - BHI: Retpoline
+   - System is protected by retpoline
+ * - BHI: BHI_DIS_S
+   - System is protected by BHI_DIS_S
+ * - BHI: SW loop; KVM SW loop
+   - System is protected by software clearing sequence
+ * - BHI: Syscall hardening
+   - Syscalls are hardened against BHI
+ * - BHI: Syscall hardening; KVM: SW loop
+   - System is protected from userspace attacks by syscall hardening; KVM is protected by software clearing sequence
+
 Full mitigation might require a microcode update from the CPU
 vendor. When the necessary microcode is not available, the kernel will
 report vulnerability.
@@ -484,7 +500,11 @@ Spectre variant 2
 
    Systems which support enhanced IBRS (eIBRS) enable IBRS protection once at
    boot, by setting the IBRS bit, and they're automatically protected against
-   Spectre v2 variant attacks.
+   some Spectre v2 variant attacks. The BHB can still influence the choice of
+   indirect branch predictor entry, and although branch predictor entries are
+   isolated between modes when eIBRS is enabled, the BHB itself is not isolated
+   between modes. Systems which support BHI_DIS_S will set it to protect against
+   BHI attacks.
 
    On Intel's enhanced IBRS systems, this includes cross-thread branch target
    injections on SMT systems (STIBP). In other words, Intel eIBRS enables
@@ -638,6 +658,22 @@ kernel command line.
                spectre_v2=off. Spectre variant 1 mitigations
                cannot be disabled.
 
+       spectre_bhi=
+
+               [X86] Control mitigation of Branch History Injection
+               (BHI) vulnerability. Syscalls are hardened against BHI
+               regardless of this setting. This setting affects the deployment
+               of the HW BHI control and the SW BHB clearing sequence.
+
+               on
+                       unconditionally enable.
+               off
+                       unconditionally disable.
+               auto
+                       enable if hardware mitigation
+                       control (BHI_DIS_S) is available, otherwise
+                       enable alternate mitigation in KVM.
+
 For spectre_v2_user see Documentation/admin-guide/kernel-parameters.txt
 
 Mitigation selection guide
index 623fce7d5fcd0c4392432908e21aaba5134e3aa0..70046a019d42d80b1f56d1e80577dfd084fdd5f8 100644 (file)
        sonypi.*=       [HW] Sony Programmable I/O Control Device driver
                        See Documentation/admin-guide/laptops/sonypi.rst
 
+       spectre_bhi=    [X86] Control mitigation of Branch History Injection
+                       (BHI) vulnerability. Syscalls are hardened against BHI
+                       regardless of this setting. This setting affects the
+                       deployment of the HW BHI control and the SW BHB
+                       clearing sequence.
+
+                       on   - unconditionally enable.
+                       off  - unconditionally disable.
+                       auto - (default) enable hardware mitigation
+                              (BHI_DIS_S) if available, otherwise enable
+                              alternate mitigation in KVM.
+
        spectre_v2=     [X86,EARLY] Control mitigation of Spectre variant 2
                        (indirect branch speculation) vulnerability.
                        The default operation protects the kernel from
index c5aa187026e3a53e1f1638f8808530cc5920df03..43f6fb6c939276dcac480ccbe3e9e30fa58935a3 100644 (file)
@@ -1,5 +1,3 @@
-Status: Unstable - ABI compatibility may be broken in the future
-
 Binding for Keystone gate control driver which uses PSC controller IP.
 
 This binding uses the common clock binding[1].
index 9a3fbc66560652b4fb05033aa7d900b1f8759fe0..69b0eb7c03c9e60d31483305e78095a6ce1c7cb1 100644 (file)
@@ -1,5 +1,3 @@
-Status: Unstable - ABI compatibility may be broken in the future
-
 Binding for keystone PLLs. The main PLL IP typically has a multiplier,
 a divider and a post divider. The additional PLL IPs like ARMPLL, DDRPLL
 and PAPLL are controlled by the memory mapped register where as the Main
index 4c8a2ce2cd70181ead140f3df75472fdc0151bdb..3122360adcf3c0abe3d50f5a2f326427c16c7cfa 100644 (file)
@@ -1,7 +1,5 @@
 Binding for Texas Instruments ADPLL clock.
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1]. It assumes a
 register-mapped ADPLL with two to three selectable input clocks
 and three to four children.
index ade4dd4c30f0e12804a94845b71ee462e30f1d99..bbd505c1199df5b01e9abb2b001c9315c1b2c061 100644 (file)
@@ -1,7 +1,5 @@
 Binding for Texas Instruments APLL clock.
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1].  It assumes a
 register-mapped APLL with usually two selectable input clocks
 (reference clock and bypass clock), with analog phase locked
index 7c735dde9fe971d7ad20f6c6b403581421b2b4c4..05645a10a9e33ce6a6b9bb7b06388442c3a2b85c 100644 (file)
@@ -1,7 +1,5 @@
 Binding for Texas Instruments autoidle clock.
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1]. It assumes a register mapped
 clock which can be put to idle automatically by hardware based on the usage
 and a configuration bit setting. Autoidle clock is never an individual
index 9c6199249ce596cd4c1d4144b08b1b1868b968fa..edf0b5d427682421f4475f09e41d84c2af03ef2c 100644 (file)
@@ -1,7 +1,5 @@
 Binding for Texas Instruments clockdomain.
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1] in consumer role.
 Every clock on TI SoC belongs to one clockdomain, but software
 only needs this information for specific clocks which require
index 33ac7c9ad053c7d13e93d3142f86357a76476ea5..6f7e1331b5466cfa63d25377b64e76354aad5b5d 100644 (file)
@@ -1,7 +1,5 @@
 Binding for TI composite clock.
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1]. It assumes a
 register-mapped composite clock with multiple different sub-types;
 
index 9b13b32974f9926874d1a893fa00be961c5aef4d..4d7c76f0b356950194a5f1184eb1ceb1a1a54ca0 100644 (file)
@@ -1,7 +1,5 @@
 Binding for TI divider clock
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1].  It assumes a
 register-mapped adjustable clock rate divider that does not gate and has
 only one input clock or parent.  By default the value programmed into
index 37a7cb6ad07d873fec2b9be8bad19a4ebea7bcc2..14a1b72c2e712016d97fc7632a7767bc562207e9 100644 (file)
@@ -1,7 +1,5 @@
 Binding for Texas Instruments DPLL clock.
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1].  It assumes a
 register-mapped DPLL with usually two selectable input clocks
 (reference clock and bypass clock), with digital phase locked
index c19b3f253b8cf7fa31ed962ef076ce6e56681f4c..88986ef39ddd245f637328155e3e6958487652c7 100644 (file)
@@ -1,7 +1,5 @@
 Binding for Texas Instruments FAPLL clock.
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1]. It assumes a
 register-mapped FAPLL with usually two selectable input clocks
 (reference clock and bypass clock), and one or more child
index 518e3c1422762cfd32676fd9aee8a2645b0dec6d..dc69477b6e98eb8e1a37f7d488f1ab5bd58b7971 100644 (file)
@@ -1,7 +1,5 @@
 Binding for TI fixed factor rate clock sources.
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1], and also uses the autoidle
 support from TI autoidle clock [2].
 
index 4982615c01b9cb7fd187828e2c1d03a6c1d59255..a8e0335b006a07b1733a116c7d1ccea34af69891 100644 (file)
@@ -1,7 +1,5 @@
 Binding for Texas Instruments gate clock.
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1]. This clock is
 quite much similar to the basic gate-clock [2], however,
 it supports a number of additional features. If no register
index d3eb5ca92a7fe6e349f974a97f9eeb4c721e5304..85fb1f2d2d286b95b2bdabb6c0b421cdaa3d33c7 100644 (file)
@@ -1,7 +1,5 @@
 Binding for Texas Instruments interface clock.
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1]. This clock is
 quite much similar to the basic gate-clock [2], however,
 it supports a number of additional features, including
index b33f641f104321ff1e7d5f6ef5fc66a6a79f9d76..cd56d3c1c09f3bf8ff6d9aa0c0fc859a2bad76af 100644 (file)
@@ -1,7 +1,5 @@
 Binding for TI mux clock.
 
-Binding status: Unstable - ABI compatibility may be broken in the future
-
 This binding uses the common clock binding[1].  It assumes a
 register-mapped multiplexer with multiple input clock signals or
 parents, one of which can be selected as output.  This clock does not
index a9bdd2b59dcab62b3cf6eac23dbefb44a71d648d..8a68331075a098ab8f0a1fece9525c7a2f7d6ddc 100644 (file)
@@ -144,6 +144,8 @@ Example::
                #dma-cells = <1>;
                clocks = <&clock_controller 0>, <&clock_controller 1>;
                clock-names = "bus", "host";
+               #address-cells = <1>;
+               #size-cells = <1>;
                vendor,custom-property = <2>;
                status = "disabled";
 
index 25f8658e216ff03c46b244fcd7679d0e65cf6797..48a49c516b62cb03a0293464fa09c763d4ec6ad2 100644 (file)
@@ -1,9 +1,6 @@
 TI Davinci DSP devices
 =======================
 
-Binding status: Unstable - Subject to changes for DT representation of clocks
-                          and resets
-
 The TI Davinci family of SoCs usually contains a TI DSP Core sub-system that
 is used to offload some of the processor-intensive tasks or algorithms, for
 achieving various system level goals.
index 397f75909b20588506fdff7d7fdc2d8f07d49ba9..ce1a6505eb5149dedc4ecf5ec975ad2a612663eb 100644 (file)
@@ -51,7 +51,7 @@ properties:
   ranges: true
 
 patternProperties:
-  "^clock-controller@[0-9a-z]+$":
+  "^clock-controller@[0-9a-f]+$":
     $ref: /schemas/clock/fsl,flexspi-clock.yaml#
 
 required:
index 8d088b5fe8236b667c9aa3d7e5e7341bc44b38d1..a6a511b00a1281a36b452ed595a1d376c6531eea 100644 (file)
@@ -41,7 +41,7 @@ properties:
   ranges: true
 
 patternProperties:
-  "^interrupt-controller@[a-z0-9]+$":
+  "^interrupt-controller@[a-f0-9]+$":
     $ref: /schemas/interrupt-controller/fsl,ls-extirq.yaml#
 
 required:
index 7a4a6ab85970d6ebad1b45d7d57468b8de2f6b55..ab8f28993139e5443817207a830d99bfcde25c48 100644 (file)
@@ -60,7 +60,7 @@ properties:
       be implemented in an always-on power domain."
 
 patternProperties:
-  '^frame@[0-9a-z]*$':
+  '^frame@[0-9a-f]+$':
     type: object
     additionalProperties: false
     description: A timer node has up to 8 frame sub-nodes, each with the following properties.
index 10c146424baa1edd24c3e316625c07a35816f7f6..cd3680dc002f961f0bb95164b98e08279a755a41 100644 (file)
@@ -27,10 +27,13 @@ properties:
           - qcom,msm8996-ufshc
           - qcom,msm8998-ufshc
           - qcom,sa8775p-ufshc
+          - qcom,sc7180-ufshc
           - qcom,sc7280-ufshc
+          - qcom,sc8180x-ufshc
           - qcom,sc8280xp-ufshc
           - qcom,sdm845-ufshc
           - qcom,sm6115-ufshc
+          - qcom,sm6125-ufshc
           - qcom,sm6350-ufshc
           - qcom,sm8150-ufshc
           - qcom,sm8250-ufshc
@@ -42,11 +45,11 @@ properties:
       - const: jedec,ufs-2.0
 
   clocks:
-    minItems: 8
+    minItems: 7
     maxItems: 11
 
   clock-names:
-    minItems: 8
+    minItems: 7
     maxItems: 11
 
   dma-coherent: true
@@ -112,6 +115,31 @@ required:
 allOf:
   - $ref: ufs-common.yaml
 
+  - if:
+      properties:
+        compatible:
+          contains:
+            enum:
+              - qcom,sc7180-ufshc
+    then:
+      properties:
+        clocks:
+          minItems: 7
+          maxItems: 7
+        clock-names:
+          items:
+            - const: core_clk
+            - const: bus_aggr_clk
+            - const: iface_clk
+            - const: core_clk_unipro
+            - const: ref_clk
+            - const: tx_lane0_sync_clk
+            - const: rx_lane0_sync_clk
+        reg:
+          maxItems: 1
+        reg-names:
+          maxItems: 1
+
   - if:
       properties:
         compatible:
@@ -120,6 +148,7 @@ allOf:
               - qcom,msm8998-ufshc
               - qcom,sa8775p-ufshc
               - qcom,sc7280-ufshc
+              - qcom,sc8180x-ufshc
               - qcom,sc8280xp-ufshc
               - qcom,sm8250-ufshc
               - qcom,sm8350-ufshc
@@ -215,6 +244,7 @@ allOf:
           contains:
             enum:
               - qcom,sm6115-ufshc
+              - qcom,sm6125-ufshc
     then:
       properties:
         clocks:
@@ -248,7 +278,7 @@ allOf:
         reg:
           maxItems: 1
         clocks:
-          minItems: 8
+          minItems: 7
           maxItems: 8
     else:
       properties:
@@ -256,7 +286,7 @@ allOf:
           minItems: 1
           maxItems: 2
         clocks:
-          minItems: 8
+          minItems: 7
           maxItems: 11
 
 unevaluatedProperties: false
diff --git a/Documentation/filesystems/bcachefs/index.rst b/Documentation/filesystems/bcachefs/index.rst
new file mode 100644 (file)
index 0000000..e2bd61c
--- /dev/null
@@ -0,0 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================
+bcachefs Documentation
+======================
+
+.. toctree::
+   :maxdepth: 2
+   :numbered:
+
+   errorcodes
index 0ea1e44fa02823ffd51f4739a3a9aab635a35bbe..1f9b4c905a6a7c0646fca9764829151582eb6e7c 100644 (file)
@@ -69,6 +69,7 @@ Documentation for filesystem implementations.
    afs
    autofs
    autofs-mount-control
+   bcachefs/index
    befs
    bfs
    btrfs
index f389bf6d78a3cb312b0ac87d880acacb868c27e3..b5b89687680b98eace9cb453ec2690bad735a6f2 100644 (file)
@@ -3572,6 +3572,7 @@ S:        Supported
 C:     irc://irc.oftc.net/bcache
 T:     git https://evilpiepirate.org/git/bcachefs.git
 F:     fs/bcachefs/
+F:     Documentation/filesystems/bcachefs/
 
 BDISP ST MEDIA DRIVER
 M:     Fabien Dessenne <fabien.dessenne@foss.st.com>
@@ -16728,9 +16729,9 @@ F:      include/uapi/linux/ppdev.h
 
 PARAVIRT_OPS INTERFACE
 M:     Juergen Gross <jgross@suse.com>
-R:     Ajay Kaher <akaher@vmware.com>
-R:     Alexey Makhalov <amakhalov@vmware.com>
-R:     VMware PV-Drivers Reviewers <pv-drivers@vmware.com>
+R:     Ajay Kaher <ajay.kaher@broadcom.com>
+R:     Alexey Makhalov <alexey.amakhalov@broadcom.com>
+R:     Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
 L:     virtualization@lists.linux.dev
 L:     x86@kernel.org
 S:     Supported
@@ -22425,6 +22426,7 @@ S:      Maintained
 W:     https://kernsec.org/wiki/index.php/Linux_Kernel_Integrity
 Q:     https://patchwork.kernel.org/project/linux-integrity/list/
 T:     git git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd.git
+F:     Documentation/devicetree/bindings/tpm/
 F:     drivers/char/tpm/
 
 TPS546D24 DRIVER
@@ -22571,6 +22573,7 @@ Q:      https://patchwork.kernel.org/project/linux-pm/list/
 B:     https://bugzilla.kernel.org
 T:     git git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git turbostat
 F:     tools/power/x86/turbostat/
+F:     tools/testing/selftests/turbostat/
 
 TW5864 VIDEO4LINUX DRIVER
 M:     Bluecherry Maintainers <maintainers@bluecherrydvr.com>
@@ -23649,9 +23652,9 @@ S:      Supported
 F:     drivers/misc/vmw_balloon.c
 
 VMWARE HYPERVISOR INTERFACE
-M:     Ajay Kaher <akaher@vmware.com>
-M:     Alexey Makhalov <amakhalov@vmware.com>
-R:     VMware PV-Drivers Reviewers <pv-drivers@vmware.com>
+M:     Ajay Kaher <ajay.kaher@broadcom.com>
+M:     Alexey Makhalov <alexey.amakhalov@broadcom.com>
+R:     Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
 L:     virtualization@lists.linux.dev
 L:     x86@kernel.org
 S:     Supported
@@ -23660,33 +23663,34 @@ F:    arch/x86/include/asm/vmware.h
 F:     arch/x86/kernel/cpu/vmware.c
 
 VMWARE PVRDMA DRIVER
-M:     Bryan Tan <bryantan@vmware.com>
-M:     Vishnu Dasa <vdasa@vmware.com>
-R:     VMware PV-Drivers Reviewers <pv-drivers@vmware.com>
+M:     Bryan Tan <bryan-bt.tan@broadcom.com>
+M:     Vishnu Dasa <vishnu.dasa@broadcom.com>
+R:     Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
 L:     linux-rdma@vger.kernel.org
 S:     Supported
 F:     drivers/infiniband/hw/vmw_pvrdma/
 
 VMWARE PVSCSI DRIVER
-M:     Vishal Bhakta <vbhakta@vmware.com>
-R:     VMware PV-Drivers Reviewers <pv-drivers@vmware.com>
+M:     Vishal Bhakta <vishal.bhakta@broadcom.com>
+R:     Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
 L:     linux-scsi@vger.kernel.org
 S:     Supported
 F:     drivers/scsi/vmw_pvscsi.c
 F:     drivers/scsi/vmw_pvscsi.h
 
 VMWARE VIRTUAL PTP CLOCK DRIVER
-R:     Ajay Kaher <akaher@vmware.com>
-R:     Alexey Makhalov <amakhalov@vmware.com>
-R:     VMware PV-Drivers Reviewers <pv-drivers@vmware.com>
+M:     Nick Shi <nick.shi@broadcom.com>
+R:     Ajay Kaher <ajay.kaher@broadcom.com>
+R:     Alexey Makhalov <alexey.amakhalov@broadcom.com>
+R:     Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
 L:     netdev@vger.kernel.org
 S:     Supported
 F:     drivers/ptp/ptp_vmw.c
 
 VMWARE VMCI DRIVER
-M:     Bryan Tan <bryantan@vmware.com>
-M:     Vishnu Dasa <vdasa@vmware.com>
-R:     VMware PV-Drivers Reviewers <pv-drivers@vmware.com>
+M:     Bryan Tan <bryan-bt.tan@broadcom.com>
+M:     Vishnu Dasa <vishnu.dasa@broadcom.com>
+R:     Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
 L:     linux-kernel@vger.kernel.org
 S:     Supported
 F:     drivers/misc/vmw_vmci/
@@ -23701,16 +23705,16 @@ F:    drivers/input/mouse/vmmouse.c
 F:     drivers/input/mouse/vmmouse.h
 
 VMWARE VMXNET3 ETHERNET DRIVER
-M:     Ronak Doshi <doshir@vmware.com>
-R:     VMware PV-Drivers Reviewers <pv-drivers@vmware.com>
+M:     Ronak Doshi <ronak.doshi@broadcom.com>
+R:     Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
 L:     netdev@vger.kernel.org
 S:     Supported
 F:     drivers/net/vmxnet3/
 
 VMWARE VSOCK VMCI TRANSPORT DRIVER
-M:     Bryan Tan <bryantan@vmware.com>
-M:     Vishnu Dasa <vdasa@vmware.com>
-R:     VMware PV-Drivers Reviewers <pv-drivers@vmware.com>
+M:     Bryan Tan <bryan-bt.tan@broadcom.com>
+M:     Vishnu Dasa <vishnu.dasa@broadcom.com>
+R:     Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
 L:     linux-kernel@vger.kernel.org
 S:     Supported
 F:     net/vmw_vsock/vmci_transport*
index 4bef6323c47decb8805896b3a017213ef461c44e..e1bf12891cb0e4a7471d60cf4b2eb0050d600d2f 100644 (file)
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,7 @@
 VERSION = 6
 PATCHLEVEL = 9
 SUBLEVEL = 0
-EXTRAVERSION = -rc2
+EXTRAVERSION = -rc3
 NAME = Hurr durr I'ma ninja sloth
 
 # *DOCUMENTATION*
index 162b030ab9da33fd089bcdf269dde014ead7e712..0d022599eb61b38183ffd34eae2bfa9c2f643a56 100644 (file)
@@ -761,7 +761,6 @@ static void sve_init_header_from_task(struct user_sve_header *header,
 {
        unsigned int vq;
        bool active;
-       bool fpsimd_only;
        enum vec_type task_type;
 
        memset(header, 0, sizeof(*header));
@@ -777,12 +776,10 @@ static void sve_init_header_from_task(struct user_sve_header *header,
        case ARM64_VEC_SVE:
                if (test_tsk_thread_flag(target, TIF_SVE_VL_INHERIT))
                        header->flags |= SVE_PT_VL_INHERIT;
-               fpsimd_only = !test_tsk_thread_flag(target, TIF_SVE);
                break;
        case ARM64_VEC_SME:
                if (test_tsk_thread_flag(target, TIF_SME_VL_INHERIT))
                        header->flags |= SVE_PT_VL_INHERIT;
-               fpsimd_only = false;
                break;
        default:
                WARN_ON_ONCE(1);
@@ -790,7 +787,7 @@ static void sve_init_header_from_task(struct user_sve_header *header,
        }
 
        if (active) {
-               if (fpsimd_only) {
+               if (target->thread.fp_type == FP_STATE_FPSIMD) {
                        header->flags |= SVE_PT_REGS_FPSIMD;
                } else {
                        header->flags |= SVE_PT_REGS_SVE;
index 49a70f8c3cab22b758dd290a9fab2374a62abae9..b6aeb1f70e2a038ac2eb3bfe6c402bd37b4dcd6a 100644 (file)
                #size-cells = <2>;
                dma-coherent;
 
+               isa@18000000 {
+                       compatible = "isa";
+                       #size-cells = <1>;
+                       #address-cells = <2>;
+                       ranges = <1 0x0 0x0 0x18000000 0x4000>;
+               };
+
                liointc0: interrupt-controller@1fe01400 {
                        compatible = "loongson,liointc-2.0";
                        reg = <0x0 0x1fe01400 0x0 0x40>,
index dca91caf895e3cd9e428e75b91da9392bfb49d82..74b99bd234cc38df9a087915280e86ddb5bd56d4 100644 (file)
 
 &gmac0 {
        status = "okay";
+
+       phy-mode = "gmii";
+       phy-handle = <&phy0>;
+       mdio {
+               compatible = "snps,dwmac-mdio";
+               #address-cells = <1>;
+               #size-cells = <0>;
+               phy0: ethernet-phy@0 {
+                       reg = <2>;
+               };
+       };
 };
 
 &gmac1 {
        status = "okay";
+
+       phy-mode = "gmii";
+       phy-handle = <&phy1>;
+       mdio {
+               compatible = "snps,dwmac-mdio";
+               #address-cells = <1>;
+               #size-cells = <0>;
+               phy1: ethernet-phy@1 {
+                       reg = <2>;
+               };
+       };
 };
 
 &gmac2 {
        status = "okay";
+
+       phy-mode = "rgmii";
+       phy-handle = <&phy2>;
+       mdio {
+               compatible = "snps,dwmac-mdio";
+               #address-cells = <1>;
+               #size-cells = <0>;
+               phy2: ethernet-phy@2 {
+                       reg = <0>;
+               };
+       };
 };
index a231949b5f553a3814f48f6875e65ac2ed73d09a..9eab2d02cbe8bff12a26ce11dd7ac1543b7c1f82 100644 (file)
                #address-cells = <2>;
                #size-cells = <2>;
 
+               isa@18400000 {
+                       compatible = "isa";
+                       #size-cells = <1>;
+                       #address-cells = <2>;
+                       ranges = <1 0x0 0x0 0x18400000 0x4000>;
+               };
+
                pmc: power-management@100d0000 {
                        compatible = "loongson,ls2k2000-pmc", "loongson,ls2k0500-pmc", "syscon";
                        reg = <0x0 0x100d0000 0x0 0x58>;
                msi: msi-controller@1fe01140 {
                        compatible = "loongson,pch-msi-1.0";
                        reg = <0x0 0x1fe01140 0x0 0x8>;
+                       interrupt-controller;
+                       #interrupt-cells = <1>;
                        msi-controller;
                        loongson,msi-base-vec = <64>;
                        loongson,msi-num-vecs = <192>;
                        #address-cells = <3>;
                        #size-cells = <2>;
                        device_type = "pci";
+                       msi-parent = <&msi>;
                        bus-range = <0x0 0xff>;
-                       ranges = <0x01000000 0x0 0x00008000 0x0 0x18400000 0x0 0x00008000>,
+                       ranges = <0x01000000 0x0 0x00008000 0x0 0x18408000 0x0 0x00008000>,
                                 <0x02000000 0x0 0x60000000 0x0 0x60000000 0x0 0x20000000>;
 
                        gmac0: ethernet@3,0 {
                                reg = <0x1800 0x0 0x0 0x0 0x0>;
-                               interrupts = <12 IRQ_TYPE_LEVEL_HIGH>;
+                               interrupts = <12 IRQ_TYPE_LEVEL_HIGH>,
+                                            <13 IRQ_TYPE_LEVEL_HIGH>;
+                               interrupt-names = "macirq", "eth_lpi";
                                interrupt-parent = <&pic>;
                                status = "disabled";
                        };
 
                        gmac1: ethernet@3,1 {
                                reg = <0x1900 0x0 0x0 0x0 0x0>;
-                               interrupts = <14 IRQ_TYPE_LEVEL_HIGH>;
+                               interrupts = <14 IRQ_TYPE_LEVEL_HIGH>,
+                                            <15 IRQ_TYPE_LEVEL_HIGH>;
+                               interrupt-names = "macirq", "eth_lpi";
                                interrupt-parent = <&pic>;
                                status = "disabled";
                        };
 
                        gmac2: ethernet@3,2 {
                                reg = <0x1a00 0x0 0x0 0x0 0x0>;
-                               interrupts = <17 IRQ_TYPE_LEVEL_HIGH>;
+                               interrupts = <17 IRQ_TYPE_LEVEL_HIGH>,
+                                            <18 IRQ_TYPE_LEVEL_HIGH>;
+                               interrupt-names = "macirq", "eth_lpi";
                                interrupt-parent = <&pic>;
                                status = "disabled";
                        };
index b24437e28c6eda457b2be003b51ad3809600f7cc..7bd47d65bf7a048fda5183ed6844d5dbc129b232 100644 (file)
@@ -11,6 +11,7 @@
 #define _ASM_ADDRSPACE_H
 
 #include <linux/const.h>
+#include <linux/sizes.h>
 
 #include <asm/loongarch.h>
 
index 4a8adcca329b81e4f289dd7825fb15dbf2f4f7a9..c2f9979b2979e5e92e791e3f8304975db9e929c9 100644 (file)
 #include <asm/pgtable-bits.h>
 #include <asm/string.h>
 
-/*
- * Change "struct page" to physical address.
- */
-#define page_to_phys(page)     ((phys_addr_t)page_to_pfn(page) << PAGE_SHIFT)
-
 extern void __init __iomem *early_ioremap(u64 phys_addr, unsigned long size);
 extern void __init early_iounmap(void __iomem *addr, unsigned long size);
 
@@ -73,6 +68,21 @@ extern void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t
 
 #define __io_aw() mmiowb()
 
+#ifdef CONFIG_KFENCE
+#define virt_to_phys(kaddr)                                                            \
+({                                                                                     \
+       (likely((unsigned long)kaddr < vm_map_base)) ? __pa((unsigned long)kaddr) :     \
+       page_to_phys(tlb_virt_to_page((unsigned long)kaddr)) + offset_in_page((unsigned long)kaddr);\
+})
+
+#define phys_to_virt(paddr)                                                            \
+({                                                                                     \
+       extern char *__kfence_pool;                                                     \
+       (unlikely(__kfence_pool == NULL)) ? __va((unsigned long)paddr) :                \
+       page_address(phys_to_page((unsigned long)paddr)) + offset_in_page((unsigned long)paddr);\
+})
+#endif
+
 #include <asm-generic/io.h>
 
 #define ARCH_HAS_VALID_PHYS_ADDR_RANGE
index 6c82aea1c99398c46484a77cc28da1316799affb..a6a5760da3a3323641e3fa422f3da87cdb4b66f8 100644 (file)
@@ -16,6 +16,7 @@
 static inline bool arch_kfence_init_pool(void)
 {
        int err;
+       char *kaddr, *vaddr;
        char *kfence_pool = __kfence_pool;
        struct vm_struct *area;
 
@@ -35,6 +36,14 @@ static inline bool arch_kfence_init_pool(void)
                return false;
        }
 
+       kaddr = kfence_pool;
+       vaddr = __kfence_pool;
+       while (kaddr < kfence_pool + KFENCE_POOL_SIZE) {
+               set_page_address(virt_to_page(kaddr), vaddr);
+               kaddr += PAGE_SIZE;
+               vaddr += PAGE_SIZE;
+       }
+
        return true;
 }
 
index 44027060c54a28bd34a80f538135491e3ebc758a..e85df33f11c77212c2e8ec8e6b3f1dbb955bc622 100644 (file)
@@ -78,7 +78,26 @@ typedef struct { unsigned long pgprot; } pgprot_t;
 struct page *dmw_virt_to_page(unsigned long kaddr);
 struct page *tlb_virt_to_page(unsigned long kaddr);
 
-#define virt_to_pfn(kaddr)     PFN_DOWN(PHYSADDR(kaddr))
+#define pfn_to_phys(pfn)       __pfn_to_phys(pfn)
+#define phys_to_pfn(paddr)     __phys_to_pfn(paddr)
+
+#define page_to_phys(page)     pfn_to_phys(page_to_pfn(page))
+#define phys_to_page(paddr)    pfn_to_page(phys_to_pfn(paddr))
+
+#ifndef CONFIG_KFENCE
+
+#define page_to_virt(page)     __va(page_to_phys(page))
+#define virt_to_page(kaddr)    phys_to_page(__pa(kaddr))
+
+#else
+
+#define WANT_PAGE_VIRTUAL
+
+#define page_to_virt(page)                                                             \
+({                                                                                     \
+       extern char *__kfence_pool;                                                     \
+       (__kfence_pool == NULL) ? __va(page_to_phys(page)) : page_address(page);        \
+})
 
 #define virt_to_page(kaddr)                                                            \
 ({                                                                                     \
@@ -86,6 +105,11 @@ struct page *tlb_virt_to_page(unsigned long kaddr);
        dmw_virt_to_page((unsigned long)kaddr) : tlb_virt_to_page((unsigned long)kaddr);\
 })
 
+#endif
+
+#define pfn_to_virt(pfn)       page_to_virt(pfn_to_page(pfn))
+#define virt_to_pfn(kaddr)     page_to_pfn(virt_to_page(kaddr))
+
 extern int __virt_addr_valid(volatile void *kaddr);
 #define virt_addr_valid(kaddr) __virt_addr_valid((volatile void *)(kaddr))
 
index a9630a81b38abbfc575ea4174af049ccd5a9a888..89af7c12e8c08d4faab2919cf22034b5ab0f5a6b 100644 (file)
@@ -4,6 +4,7 @@
  */
 #include <linux/export.h>
 #include <linux/io.h>
+#include <linux/kfence.h>
 #include <linux/memblock.h>
 #include <linux/mm.h>
 #include <linux/mman.h>
@@ -111,6 +112,9 @@ int __virt_addr_valid(volatile void *kaddr)
 {
        unsigned long vaddr = (unsigned long)kaddr;
 
+       if (is_kfence_address((void *)kaddr))
+               return 1;
+
        if ((vaddr < PAGE_OFFSET) || (vaddr >= vm_map_base))
                return 0;
 
index 2aae72e638713a658475e6fb82fc73eae0fc3469..bda018150000e66b906420ea7e3a5f79472ca352 100644 (file)
 
 struct page *dmw_virt_to_page(unsigned long kaddr)
 {
-       return pfn_to_page(virt_to_pfn(kaddr));
+       return phys_to_page(__pa(kaddr));
 }
 EXPORT_SYMBOL(dmw_virt_to_page);
 
 struct page *tlb_virt_to_page(unsigned long kaddr)
 {
-       return pfn_to_page(pte_pfn(*virt_to_kpte(kaddr)));
+       return phys_to_page(pfn_to_phys(pte_pfn(*virt_to_kpte(kaddr))));
 }
 EXPORT_SYMBOL(tlb_virt_to_page);
 
index 8d98af5c7201bb34570d97876d53d663e9068964..9a8393e6b4a85ecdb22691720c6a266eb5d7aa2d 100644 (file)
@@ -21,7 +21,8 @@
 
 void __init early_init_devtree(void *params)
 {
-       __be32 *dtb = (u32 *)__dtb_start;
+       __be32 __maybe_unused *dtb = (u32 *)__dtb_start;
+
 #if defined(CONFIG_NIOS2_DTB_AT_PHYS_ADDR)
        if (be32_to_cpup((__be32 *)CONFIG_NIOS2_DTB_PHYS_ADDR) ==
                 OF_DT_HEADER) {
@@ -30,8 +31,11 @@ void __init early_init_devtree(void *params)
                return;
        }
 #endif
+
+#ifdef CONFIG_NIOS2_DTB_SOURCE_BOOL
        if (be32_to_cpu((__be32) *dtb) == OF_DT_HEADER)
                params = (void *)__dtb_start;
+#endif
 
        early_init_dt_scan(params);
 }
index f0a4cf01e85c0312ee4b1350c0024f975a0b8120..78302f6c258006471bb4ba3dfb1f186c0137ef66 100644 (file)
@@ -4,7 +4,6 @@
 
 #ifndef __ASSEMBLY__
 
-#include <asm/page.h>
 #include <asm/vdso/timebase.h>
 #include <asm/barrier.h>
 #include <asm/unistd.h>
@@ -95,7 +94,7 @@ const struct vdso_data *__arch_get_vdso_data(void);
 static __always_inline
 const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd)
 {
-       return (void *)vd + PAGE_SIZE;
+       return (void *)vd + (1U << CONFIG_PAGE_SHIFT);
 }
 #endif
 
index 252d63942f34ebe08a3087d12bee3a1c4833f15a..5b3115a198522684cbcba474953e07d4e76e9ff5 100644 (file)
@@ -151,7 +151,7 @@ endif
 endif
 
 vdso-install-y                 += arch/riscv/kernel/vdso/vdso.so.dbg
-vdso-install-$(CONFIG_COMPAT)  += arch/riscv/kernel/compat_vdso/compat_vdso.so.dbg:../compat_vdso/compat_vdso.so
+vdso-install-$(CONFIG_COMPAT)  += arch/riscv/kernel/compat_vdso/compat_vdso.so.dbg
 
 ifneq ($(CONFIG_XIP_KERNEL),y)
 ifeq ($(CONFIG_RISCV_M_MODE)$(CONFIG_ARCH_CANAAN),yy)
index 97fcde30e2477d55f8046d844191a1e4b0ba5e1a..9f8ea0e33eb10424c5a05eb55849eacce627c3c3 100644 (file)
@@ -593,6 +593,12 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
        return ptep_test_and_clear_young(vma, address, ptep);
 }
 
+#define pgprot_nx pgprot_nx
+static inline pgprot_t pgprot_nx(pgprot_t _prot)
+{
+       return __pgprot(pgprot_val(_prot) & ~_PAGE_EXEC);
+}
+
 #define pgprot_noncached pgprot_noncached
 static inline pgprot_t pgprot_noncached(pgprot_t _prot)
 {
index 980094c2e9761d19b562d4dd8f9f707fd13480ae..ac80216549ffa6fce76ffe1759ad4e0da4609f9f 100644 (file)
@@ -36,7 +36,8 @@ asmlinkage long __riscv_sys_ni_syscall(const struct pt_regs *);
                                        ulong)                                          \
                        __attribute__((alias(__stringify(___se_##prefix##name))));      \
        __diag_pop();                                                                   \
-       static long noinline ___se_##prefix##name(__MAP(x,__SC_LONG,__VA_ARGS__));      \
+       static long noinline ___se_##prefix##name(__MAP(x,__SC_LONG,__VA_ARGS__))       \
+                       __used;                                                         \
        static long ___se_##prefix##name(__MAP(x,__SC_LONG,__VA_ARGS__))
 
 #define SC_RISCV_REGS_TO_ARGS(x, ...) \
index ec0cab9fbddd0da98cb415af2732a4ede083886b..72ec1d9bd3f312ec05c6dc5f2342f06e24c58468 100644 (file)
@@ -319,7 +319,7 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n)
 
 #define __get_kernel_nofault(dst, src, type, err_label)                        \
 do {                                                                   \
-       long __kr_err;                                                  \
+       long __kr_err = 0;                                              \
                                                                        \
        __get_user_nocheck(*((type *)(dst)), (type *)(src), __kr_err);  \
        if (unlikely(__kr_err))                                         \
@@ -328,7 +328,7 @@ do {                                                                        \
 
 #define __put_kernel_nofault(dst, src, type, err_label)                        \
 do {                                                                   \
-       long __kr_err;                                                  \
+       long __kr_err = 0;                                              \
                                                                        \
        __put_user_nocheck(*((type *)(src)), (type *)(dst), __kr_err);  \
        if (unlikely(__kr_err))                                         \
index 10aaa83db89ef74a6441f5782698dc82d7e0ee5c..95050ebe9ad00bce67e4a8e42611624a40734c41 100644 (file)
@@ -34,7 +34,7 @@
 #define AT_L3_CACHEGEOMETRY    47
 
 /* entries in ARCH_DLINFO */
-#define AT_VECTOR_SIZE_ARCH    9
+#define AT_VECTOR_SIZE_ARCH    10
 #define AT_MINSIGSTKSZ         51
 
 #endif /* _UAPI_ASM_RISCV_AUXVEC_H */
index 62fa393b2eb2ead77a85ce54a9c4c8b32d528f6b..3df4cb788c1fa459d629d629ef64f99c81891f6b 100644 (file)
@@ -74,5 +74,5 @@ quiet_cmd_compat_vdsold = VDSOLD  $@
                    rm $@.tmp
 
 # actual build commands
-quiet_cmd_compat_vdsoas = VDSOAS $@
+quiet_cmd_compat_vdsoas = VDSOAS  $@
       cmd_compat_vdsoas = $(COMPAT_CC) $(a_flags) $(COMPAT_CC_FLAGS) -c -o $@ $<
index 37e87fdcf6a00057663ffd636c50e85e865be3c4..30e12b310cab7397f91d622f8d4c20117e5e8c5f 100644 (file)
@@ -80,6 +80,8 @@ static int __patch_insn_set(void *addr, u8 c, size_t len)
         */
        lockdep_assert_held(&text_mutex);
 
+       preempt_disable();
+
        if (across_pages)
                patch_map(addr + PAGE_SIZE, FIX_TEXT_POKE1);
 
@@ -92,6 +94,8 @@ static int __patch_insn_set(void *addr, u8 c, size_t len)
        if (across_pages)
                patch_unmap(FIX_TEXT_POKE1);
 
+       preempt_enable();
+
        return 0;
 }
 NOKPROBE_SYMBOL(__patch_insn_set);
@@ -122,6 +126,8 @@ static int __patch_insn_write(void *addr, const void *insn, size_t len)
        if (!riscv_patch_in_stop_machine)
                lockdep_assert_held(&text_mutex);
 
+       preempt_disable();
+
        if (across_pages)
                patch_map(addr + PAGE_SIZE, FIX_TEXT_POKE1);
 
@@ -134,6 +140,8 @@ static int __patch_insn_write(void *addr, const void *insn, size_t len)
        if (across_pages)
                patch_unmap(FIX_TEXT_POKE1);
 
+       preempt_enable();
+
        return ret;
 }
 NOKPROBE_SYMBOL(__patch_insn_write);
index 92922dbd5b5c1f9b5d57643ecbd7a1599c5ac4c3..e4bc61c4e58af9c6c3914692c240021d053d72d8 100644 (file)
@@ -27,8 +27,6 @@
 #include <asm/vector.h>
 #include <asm/cpufeature.h>
 
-register unsigned long gp_in_global __asm__("gp");
-
 #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
 #include <linux/stackprotector.h>
 unsigned long __stack_chk_guard __read_mostly;
@@ -37,7 +35,7 @@ EXPORT_SYMBOL(__stack_chk_guard);
 
 extern asmlinkage void ret_from_fork(void);
 
-void arch_cpu_idle(void)
+void noinstr arch_cpu_idle(void)
 {
        cpu_do_idle();
 }
@@ -207,7 +205,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
        if (unlikely(args->fn)) {
                /* Kernel thread */
                memset(childregs, 0, sizeof(struct pt_regs));
-               childregs->gp = gp_in_global;
                /* Supervisor/Machine, irqs on: */
                childregs->status = SR_PP | SR_PIE;
 
index 501e66debf69721d53db2515cea4df970a6b2784..5a2edd7f027e5d12e682349c3f1b54b51cd3b735 100644 (file)
@@ -119,6 +119,13 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec)
        struct __sc_riscv_v_state __user *state = sc_vec;
        void __user *datap;
 
+       /*
+        * Mark the vstate as clean prior to performing the actual copy,
+        * to avoid getting the vstate incorrectly clobbered by the
+        * discarded vector state.
+        */
+       riscv_v_vstate_set_restore(current, regs);
+
        /* Copy everything of __sc_riscv_v_state except datap. */
        err = __copy_from_user(&current->thread.vstate, &state->v_state,
                               offsetof(struct __riscv_v_ext_state, datap));
@@ -133,13 +140,7 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec)
         * Copy the whole vector content from user space datap. Use
         * copy_from_user to prevent information leak.
         */
-       err = copy_from_user(current->thread.vstate.datap, datap, riscv_v_vsize);
-       if (unlikely(err))
-               return err;
-
-       riscv_v_vstate_set_restore(current, regs);
-
-       return err;
+       return copy_from_user(current->thread.vstate.datap, datap, riscv_v_vsize);
 }
 #else
 #define save_v_state(task, regs) (0)
index 868d6280cf667e655de2d5003c2fd57d129b3127..05a16b1f0aee858f3abf7a28c647ad1146410da0 100644 (file)
@@ -122,7 +122,7 @@ void do_trap(struct pt_regs *regs, int signo, int code, unsigned long addr)
                print_vma_addr(KERN_CONT " in ", instruction_pointer(regs));
                pr_cont("\n");
                __show_regs(regs);
-               dump_instr(KERN_EMERG, regs);
+               dump_instr(KERN_INFO, regs);
        }
 
        force_sig_fault(signo, code, (void __user *)addr);
index 9b517fe1b8a8ecfddfae487dc9e829cc622334f2..272c431ac5b9f82c8181b673afcf236d85641feb 100644 (file)
@@ -37,6 +37,7 @@ endif
 
 # Disable -pg to prevent insert call site
 CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS)
+CFLAGS_REMOVE_hwprobe.o = $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS)
 
 # Disable profiling and instrumentation for VDSO code
 GCOV_PROFILE := n
index 893566e004b73fcf9a8dbc94f766e59cd00f1bb1..07d743f87b3f69f2e88e716fc7b7d4b064fe9c3e 100644 (file)
@@ -99,7 +99,7 @@ static void __ipi_flush_tlb_range_asid(void *info)
        local_flush_tlb_range_asid(d->start, d->size, d->stride, d->asid);
 }
 
-static void __flush_tlb_range(struct cpumask *cmask, unsigned long asid,
+static void __flush_tlb_range(const struct cpumask *cmask, unsigned long asid,
                              unsigned long start, unsigned long size,
                              unsigned long stride)
 {
@@ -200,7 +200,7 @@ void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
 
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
-       __flush_tlb_range((struct cpumask *)cpu_online_mask, FLUSH_TLB_NO_ASID,
+       __flush_tlb_range(cpu_online_mask, FLUSH_TLB_NO_ASID,
                          start, end - start, PAGE_SIZE);
 }
 
index 7138d189cc420a2b4ca87b780503e7f4d53c9d7a..0c4cad7d5a5b1199c900f339e7872dbd9196d1ab 100644 (file)
 #include <asm/barrier.h>
 #include <asm/cmpxchg.h>
 
-static inline int arch_atomic_read(const atomic_t *v)
+static __always_inline int arch_atomic_read(const atomic_t *v)
 {
        return __atomic_read(v);
 }
 #define arch_atomic_read arch_atomic_read
 
-static inline void arch_atomic_set(atomic_t *v, int i)
+static __always_inline void arch_atomic_set(atomic_t *v, int i)
 {
        __atomic_set(v, i);
 }
 #define arch_atomic_set arch_atomic_set
 
-static inline int arch_atomic_add_return(int i, atomic_t *v)
+static __always_inline int arch_atomic_add_return(int i, atomic_t *v)
 {
        return __atomic_add_barrier(i, &v->counter) + i;
 }
 #define arch_atomic_add_return arch_atomic_add_return
 
-static inline int arch_atomic_fetch_add(int i, atomic_t *v)
+static __always_inline int arch_atomic_fetch_add(int i, atomic_t *v)
 {
        return __atomic_add_barrier(i, &v->counter);
 }
 #define arch_atomic_fetch_add arch_atomic_fetch_add
 
-static inline void arch_atomic_add(int i, atomic_t *v)
+static __always_inline void arch_atomic_add(int i, atomic_t *v)
 {
        __atomic_add(i, &v->counter);
 }
@@ -50,11 +50,11 @@ static inline void arch_atomic_add(int i, atomic_t *v)
 #define arch_atomic_fetch_sub(_i, _v)  arch_atomic_fetch_add(-(int)(_i), _v)
 
 #define ATOMIC_OPS(op)                                                 \
-static inline void arch_atomic_##op(int i, atomic_t *v)                        \
+static __always_inline void arch_atomic_##op(int i, atomic_t *v)       \
 {                                                                      \
        __atomic_##op(i, &v->counter);                                  \
 }                                                                      \
-static inline int arch_atomic_fetch_##op(int i, atomic_t *v)           \
+static __always_inline int arch_atomic_fetch_##op(int i, atomic_t *v)  \
 {                                                                      \
        return __atomic_##op##_barrier(i, &v->counter);                 \
 }
@@ -74,7 +74,7 @@ ATOMIC_OPS(xor)
 
 #define arch_atomic_xchg(v, new)       (arch_xchg(&((v)->counter), new))
 
-static inline int arch_atomic_cmpxchg(atomic_t *v, int old, int new)
+static __always_inline int arch_atomic_cmpxchg(atomic_t *v, int old, int new)
 {
        return __atomic_cmpxchg(&v->counter, old, new);
 }
@@ -82,31 +82,31 @@ static inline int arch_atomic_cmpxchg(atomic_t *v, int old, int new)
 
 #define ATOMIC64_INIT(i)  { (i) }
 
-static inline s64 arch_atomic64_read(const atomic64_t *v)
+static __always_inline s64 arch_atomic64_read(const atomic64_t *v)
 {
        return __atomic64_read(v);
 }
 #define arch_atomic64_read arch_atomic64_read
 
-static inline void arch_atomic64_set(atomic64_t *v, s64 i)
+static __always_inline void arch_atomic64_set(atomic64_t *v, s64 i)
 {
        __atomic64_set(v, i);
 }
 #define arch_atomic64_set arch_atomic64_set
 
-static inline s64 arch_atomic64_add_return(s64 i, atomic64_t *v)
+static __always_inline s64 arch_atomic64_add_return(s64 i, atomic64_t *v)
 {
        return __atomic64_add_barrier(i, (long *)&v->counter) + i;
 }
 #define arch_atomic64_add_return arch_atomic64_add_return
 
-static inline s64 arch_atomic64_fetch_add(s64 i, atomic64_t *v)
+static __always_inline s64 arch_atomic64_fetch_add(s64 i, atomic64_t *v)
 {
        return __atomic64_add_barrier(i, (long *)&v->counter);
 }
 #define arch_atomic64_fetch_add arch_atomic64_fetch_add
 
-static inline void arch_atomic64_add(s64 i, atomic64_t *v)
+static __always_inline void arch_atomic64_add(s64 i, atomic64_t *v)
 {
        __atomic64_add(i, (long *)&v->counter);
 }
@@ -114,20 +114,20 @@ static inline void arch_atomic64_add(s64 i, atomic64_t *v)
 
 #define arch_atomic64_xchg(v, new)     (arch_xchg(&((v)->counter), new))
 
-static inline s64 arch_atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new)
+static __always_inline s64 arch_atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new)
 {
        return __atomic64_cmpxchg((long *)&v->counter, old, new);
 }
 #define arch_atomic64_cmpxchg arch_atomic64_cmpxchg
 
-#define ATOMIC64_OPS(op)                                               \
-static inline void arch_atomic64_##op(s64 i, atomic64_t *v)            \
-{                                                                      \
-       __atomic64_##op(i, (long *)&v->counter);                        \
-}                                                                      \
-static inline long arch_atomic64_fetch_##op(s64 i, atomic64_t *v)      \
-{                                                                      \
-       return __atomic64_##op##_barrier(i, (long *)&v->counter);       \
+#define ATOMIC64_OPS(op)                                                       \
+static __always_inline void arch_atomic64_##op(s64 i, atomic64_t *v)           \
+{                                                                              \
+       __atomic64_##op(i, (long *)&v->counter);                                \
+}                                                                              \
+static __always_inline long arch_atomic64_fetch_##op(s64 i, atomic64_t *v)     \
+{                                                                              \
+       return __atomic64_##op##_barrier(i, (long *)&v->counter);               \
 }
 
 ATOMIC64_OPS(and)
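The `ATOMIC64_OPS()` generator above stamps out a void variant and a value-returning `fetch` variant per operation. A minimal portable sketch of the same macro-expansion pattern, using GCC `__atomic` builtins in place of the s390 inline assembly (all `sketch_` names are illustrative, not kernel API):

```c
#include <stdint.h>

/*
 * Sketch of the ATOMIC64_OPS() pattern: one macro generates both the
 * void "op" variant and the value-returning "fetch_op" variant.
 * The real s390 code uses LAN/LAO/LAX-style instructions; here the
 * GCC __atomic builtins stand in for them.
 */
#define ATOMIC64_OPS_SKETCH(op, builtin)                                \
static inline void sketch_atomic64_##op(int64_t i, int64_t *v)          \
{                                                                       \
        builtin(v, i, __ATOMIC_RELAXED);        /* result discarded */  \
}                                                                       \
static inline int64_t sketch_atomic64_fetch_##op(int64_t i, int64_t *v) \
{                                                                       \
        return builtin(v, i, __ATOMIC_SEQ_CST); /* returns old value */ \
}

ATOMIC64_OPS_SKETCH(and, __atomic_fetch_and)
ATOMIC64_OPS_SKETCH(or,  __atomic_fetch_or)
ATOMIC64_OPS_SKETCH(xor, __atomic_fetch_xor)
```

The `__always_inline` change in the hunk above matters precisely because these helpers must never become out-of-line calls from `noinstr` code.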
index 50510e08b893b557dc0f7c8211093a7bf5bb91d5..7fa5f96a553a4720c5f5d41119d10c7cc954ce9e 100644 (file)
@@ -8,7 +8,7 @@
 #ifndef __ARCH_S390_ATOMIC_OPS__
 #define __ARCH_S390_ATOMIC_OPS__
 
-static inline int __atomic_read(const atomic_t *v)
+static __always_inline int __atomic_read(const atomic_t *v)
 {
        int c;
 
@@ -18,14 +18,14 @@ static inline int __atomic_read(const atomic_t *v)
        return c;
 }
 
-static inline void __atomic_set(atomic_t *v, int i)
+static __always_inline void __atomic_set(atomic_t *v, int i)
 {
        asm volatile(
                "       st      %1,%0\n"
                : "=R" (v->counter) : "d" (i));
 }
 
-static inline s64 __atomic64_read(const atomic64_t *v)
+static __always_inline s64 __atomic64_read(const atomic64_t *v)
 {
        s64 c;
 
@@ -35,7 +35,7 @@ static inline s64 __atomic64_read(const atomic64_t *v)
        return c;
 }
 
-static inline void __atomic64_set(atomic64_t *v, s64 i)
+static __always_inline void __atomic64_set(atomic64_t *v, s64 i)
 {
        asm volatile(
                "       stg     %1,%0\n"
@@ -45,7 +45,7 @@ static inline void __atomic64_set(atomic64_t *v, s64 i)
 #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES
 
 #define __ATOMIC_OP(op_name, op_type, op_string, op_barrier)           \
-static inline op_type op_name(op_type val, op_type *ptr)               \
+static __always_inline op_type op_name(op_type val, op_type *ptr)      \
 {                                                                      \
        op_type old;                                                    \
                                                                        \
@@ -96,7 +96,7 @@ __ATOMIC_CONST_OPS(__atomic64_add_const, long, "agsi")
 #else /* CONFIG_HAVE_MARCH_Z196_FEATURES */
 
 #define __ATOMIC_OP(op_name, op_string)                                        \
-static inline int op_name(int val, int *ptr)                           \
+static __always_inline int op_name(int val, int *ptr)                  \
 {                                                                      \
        int old, new;                                                   \
                                                                        \
@@ -122,7 +122,7 @@ __ATOMIC_OPS(__atomic_xor, "xr")
 #undef __ATOMIC_OPS
 
 #define __ATOMIC64_OP(op_name, op_string)                              \
-static inline long op_name(long val, long *ptr)                                \
+static __always_inline long op_name(long val, long *ptr)               \
 {                                                                      \
        long old, new;                                                  \
                                                                        \
@@ -154,7 +154,7 @@ __ATOMIC64_OPS(__atomic64_xor, "xgr")
 
 #endif /* CONFIG_HAVE_MARCH_Z196_FEATURES */
 
-static inline int __atomic_cmpxchg(int *ptr, int old, int new)
+static __always_inline int __atomic_cmpxchg(int *ptr, int old, int new)
 {
        asm volatile(
                "       cs      %[old],%[new],%[ptr]"
@@ -164,7 +164,7 @@ static inline int __atomic_cmpxchg(int *ptr, int old, int new)
        return old;
 }
 
-static inline bool __atomic_cmpxchg_bool(int *ptr, int old, int new)
+static __always_inline bool __atomic_cmpxchg_bool(int *ptr, int old, int new)
 {
        int old_expected = old;
 
@@ -176,7 +176,7 @@ static inline bool __atomic_cmpxchg_bool(int *ptr, int old, int new)
        return old == old_expected;
 }
 
-static inline long __atomic64_cmpxchg(long *ptr, long old, long new)
+static __always_inline long __atomic64_cmpxchg(long *ptr, long old, long new)
 {
        asm volatile(
                "       csg     %[old],%[new],%[ptr]"
@@ -186,7 +186,7 @@ static inline long __atomic64_cmpxchg(long *ptr, long old, long new)
        return old;
 }
 
-static inline bool __atomic64_cmpxchg_bool(long *ptr, long old, long new)
+static __always_inline bool __atomic64_cmpxchg_bool(long *ptr, long old, long new)
 {
        long old_expected = old;
 
index bf15da0fedbca5ed6cc0d24a6af9b348b2776fcd..0e3da500e98c19109676f690385bb6da44bf971c 100644 (file)
 #define PREEMPT_NEED_RESCHED   0x80000000
 #define PREEMPT_ENABLED        (0 + PREEMPT_NEED_RESCHED)
 
-static inline int preempt_count(void)
+static __always_inline int preempt_count(void)
 {
        return READ_ONCE(S390_lowcore.preempt_count) & ~PREEMPT_NEED_RESCHED;
 }
 
-static inline void preempt_count_set(int pc)
+static __always_inline void preempt_count_set(int pc)
 {
        int old, new;
 
@@ -29,22 +29,22 @@ static inline void preempt_count_set(int pc)
                                  old, new) != old);
 }
 
-static inline void set_preempt_need_resched(void)
+static __always_inline void set_preempt_need_resched(void)
 {
        __atomic_and(~PREEMPT_NEED_RESCHED, &S390_lowcore.preempt_count);
 }
 
-static inline void clear_preempt_need_resched(void)
+static __always_inline void clear_preempt_need_resched(void)
 {
        __atomic_or(PREEMPT_NEED_RESCHED, &S390_lowcore.preempt_count);
 }
 
-static inline bool test_preempt_need_resched(void)
+static __always_inline bool test_preempt_need_resched(void)
 {
        return !(READ_ONCE(S390_lowcore.preempt_count) & PREEMPT_NEED_RESCHED);
 }
 
-static inline void __preempt_count_add(int val)
+static __always_inline void __preempt_count_add(int val)
 {
        /*
         * With some obscure config options and CONFIG_PROFILE_ALL_BRANCHES
@@ -59,17 +59,17 @@ static inline void __preempt_count_add(int val)
        __atomic_add(val, &S390_lowcore.preempt_count);
 }
 
-static inline void __preempt_count_sub(int val)
+static __always_inline void __preempt_count_sub(int val)
 {
        __preempt_count_add(-val);
 }
 
-static inline bool __preempt_count_dec_and_test(void)
+static __always_inline bool __preempt_count_dec_and_test(void)
 {
        return __atomic_add(-1, &S390_lowcore.preempt_count) == 1;
 }
 
-static inline bool should_resched(int preempt_offset)
+static __always_inline bool should_resched(int preempt_offset)
 {
        return unlikely(READ_ONCE(S390_lowcore.preempt_count) ==
                        preempt_offset);
@@ -79,45 +79,45 @@ static inline bool should_resched(int preempt_offset)
 
 #define PREEMPT_ENABLED        (0)
 
-static inline int preempt_count(void)
+static __always_inline int preempt_count(void)
 {
        return READ_ONCE(S390_lowcore.preempt_count);
 }
 
-static inline void preempt_count_set(int pc)
+static __always_inline void preempt_count_set(int pc)
 {
        S390_lowcore.preempt_count = pc;
 }
 
-static inline void set_preempt_need_resched(void)
+static __always_inline void set_preempt_need_resched(void)
 {
 }
 
-static inline void clear_preempt_need_resched(void)
+static __always_inline void clear_preempt_need_resched(void)
 {
 }
 
-static inline bool test_preempt_need_resched(void)
+static __always_inline bool test_preempt_need_resched(void)
 {
        return false;
 }
 
-static inline void __preempt_count_add(int val)
+static __always_inline void __preempt_count_add(int val)
 {
        S390_lowcore.preempt_count += val;
 }
 
-static inline void __preempt_count_sub(int val)
+static __always_inline void __preempt_count_sub(int val)
 {
        S390_lowcore.preempt_count -= val;
 }
 
-static inline bool __preempt_count_dec_and_test(void)
+static __always_inline bool __preempt_count_dec_and_test(void)
 {
        return !--S390_lowcore.preempt_count && tif_need_resched();
 }
 
-static inline bool should_resched(int preempt_offset)
+static __always_inline bool should_resched(int preempt_offset)
 {
        return unlikely(preempt_count() == preempt_offset &&
                        tif_need_resched());
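The preempt.h hunks above rely on s390's inverted `PREEMPT_NEED_RESCHED` encoding: the flag occupies the top bit of the preempt count and is *cleared* when a reschedule is needed, so "count is zero and resched needed" collapses into a single compare against zero. A simplified, single-threaded sketch of that trick (non-atomic on purpose; the real code uses `__atomic_and`/`__atomic_or` on the lowcore field):

```c
#include <stdbool.h>

#define SKETCH_NEED_RESCHED 0x80000000u

/* Bit set = no resched needed; preemption enabled, nothing pending. */
static unsigned int sketch_count = SKETCH_NEED_RESCHED;

static inline int sketch_preempt_count(void)
{
        return (int)(sketch_count & ~SKETCH_NEED_RESCHED);
}

static inline void sketch_set_need_resched(void)
{
        sketch_count &= ~SKETCH_NEED_RESCHED;   /* clearing the bit = pending */
}

/* One comparison covers "count == 0 AND resched pending". */
static inline bool sketch_should_resched(void)
{
        return sketch_count == 0;
}
```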
index 787394978bc0f86400ce30214c2e1b8eb3e82675..3dc85638bc63b7d96eb3ebc25c558cff967395c6 100644 (file)
@@ -635,6 +635,7 @@ SYM_DATA_START_LOCAL(daton_psw)
 SYM_DATA_END(daton_psw)
 
        .section .rodata, "a"
+       .balign 8
 #define SYSCALL(esame,emu)     .quad __s390x_ ## esame
 SYM_DATA_START(sys_call_table)
 #include "asm/syscall_table.h"
index 823d652e3917f8653fe71bb5c67a72c2653cf6c3..4ad472d130a3c075cda96949a605e080ef8d3e1a 100644 (file)
@@ -90,7 +90,6 @@ static void paicrypt_event_destroy(struct perf_event *event)
                                                 event->cpu);
        struct paicrypt_map *cpump = mp->mapptr;
 
-       cpump->event = NULL;
        static_branch_dec(&pai_key);
        mutex_lock(&pai_reserve_mutex);
        debug_sprintf_event(cfm_dbg, 5, "%s event %#llx cpu %d users %d"
@@ -356,10 +355,15 @@ static int paicrypt_add(struct perf_event *event, int flags)
 
 static void paicrypt_stop(struct perf_event *event, int flags)
 {
-       if (!event->attr.sample_period) /* Counting */
+       struct paicrypt_mapptr *mp = this_cpu_ptr(paicrypt_root.mapptr);
+       struct paicrypt_map *cpump = mp->mapptr;
+
+       if (!event->attr.sample_period) {       /* Counting */
                paicrypt_read(event);
-       else                            /* Sampling */
+       } else {                                /* Sampling */
                perf_sched_cb_dec(event->pmu);
+               cpump->event = NULL;
+       }
        event->hw.state = PERF_HES_STOPPED;
 }
 
index 616a25606cd63dcda97a0b781c88b55dc86f0032..a6da7e0cc7a66dac02e9524feb802b0bfee8e0e8 100644 (file)
@@ -122,7 +122,6 @@ static void paiext_event_destroy(struct perf_event *event)
 
        free_page(PAI_SAVE_AREA(event));
        mutex_lock(&paiext_reserve_mutex);
-       cpump->event = NULL;
        if (refcount_dec_and_test(&cpump->refcnt))      /* Last reference gone */
                paiext_free(mp);
        paiext_root_free();
@@ -362,10 +361,15 @@ static int paiext_add(struct perf_event *event, int flags)
 
 static void paiext_stop(struct perf_event *event, int flags)
 {
-       if (!event->attr.sample_period) /* Counting */
+       struct paiext_mapptr *mp = this_cpu_ptr(paiext_root.mapptr);
+       struct paiext_map *cpump = mp->mapptr;
+
+       if (!event->attr.sample_period) {       /* Counting */
                paiext_read(event);
-       else                            /* Sampling */
+       } else {                                /* Sampling */
                perf_sched_cb_dec(event->pmu);
+               cpump->event = NULL;
+       }
        event->hw.state = PERF_HES_STOPPED;
 }
 
index c421dd44ffbe0346ab31028d433e5b3b7a2df626..0c66b32e0f9f1b54b4959a51d8cfc03984b4d52d 100644 (file)
@@ -75,7 +75,7 @@ static enum fault_type get_fault_type(struct pt_regs *regs)
                if (!IS_ENABLED(CONFIG_PGSTE))
                        return KERNEL_FAULT;
                gmap = (struct gmap *)S390_lowcore.gmap;
-               if (regs->cr1 == gmap->asce)
+               if (gmap && gmap->asce == regs->cr1)
                        return GMAP_FAULT;
                return KERNEL_FAULT;
        }
index 4fff6ed46e902cfbe723cf5ed5ce517e2d131891..10a6251f58f3e0789cf5507322b92d92a87bc2eb 100644 (file)
@@ -2633,6 +2633,32 @@ config MITIGATION_RFDS
          stored in floating point, vector and integer registers.
          See also <file:Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst>
 
+choice
+       prompt "Clear branch history"
+       depends on CPU_SUP_INTEL
+       default SPECTRE_BHI_ON
+       help
+         Enable BHI mitigations. BHI attacks are a form of Spectre V2 attacks
+         where the branch history buffer is poisoned to speculatively steer
+         indirect branches.
+         See <file:Documentation/admin-guide/hw-vuln/spectre.rst>
+
+config SPECTRE_BHI_ON
+       bool "on"
+       help
+         Equivalent to setting spectre_bhi=on command line parameter.
+config SPECTRE_BHI_OFF
+       bool "off"
+       help
+         Equivalent to setting spectre_bhi=off command line parameter.
+config SPECTRE_BHI_AUTO
+       bool "auto"
+       depends on BROKEN
+       help
+         Equivalent to setting spectre_bhi=auto command line parameter.
+
+endchoice
+
 endif
 
 config ARCH_HAS_ADD_PAGES
index d07be9d05cd03781072798e5802a5ef34d7a057a..b31ef2424d194b96d07b601d4eeac4b23d637d27 100644 (file)
@@ -3,19 +3,28 @@
  * Confidential Computing Platform Capability checks
  *
  * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ * Copyright (C) 2024 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
  *
  * Author: Tom Lendacky <thomas.lendacky@amd.com>
  */
 
 #include <linux/export.h>
 #include <linux/cc_platform.h>
+#include <linux/string.h>
+#include <linux/random.h>
 
+#include <asm/archrandom.h>
 #include <asm/coco.h>
 #include <asm/processor.h>
 
 enum cc_vendor cc_vendor __ro_after_init = CC_VENDOR_NONE;
 u64 cc_mask __ro_after_init;
 
+static struct cc_attr_flags {
+       __u64 host_sev_snp      : 1,
+             __resv            : 63;
+} cc_flags;
+
 static bool noinstr intel_cc_platform_has(enum cc_attr attr)
 {
        switch (attr) {
@@ -89,6 +98,9 @@ static bool noinstr amd_cc_platform_has(enum cc_attr attr)
        case CC_ATTR_GUEST_SEV_SNP:
                return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
 
+       case CC_ATTR_HOST_SEV_SNP:
+               return cc_flags.host_sev_snp;
+
        default:
                return false;
        }
@@ -148,3 +160,84 @@ u64 cc_mkdec(u64 val)
        }
 }
 EXPORT_SYMBOL_GPL(cc_mkdec);
+
+static void amd_cc_platform_clear(enum cc_attr attr)
+{
+       switch (attr) {
+       case CC_ATTR_HOST_SEV_SNP:
+               cc_flags.host_sev_snp = 0;
+               break;
+       default:
+               break;
+       }
+}
+
+void cc_platform_clear(enum cc_attr attr)
+{
+       switch (cc_vendor) {
+       case CC_VENDOR_AMD:
+               amd_cc_platform_clear(attr);
+               break;
+       default:
+               break;
+       }
+}
+
+static void amd_cc_platform_set(enum cc_attr attr)
+{
+       switch (attr) {
+       case CC_ATTR_HOST_SEV_SNP:
+               cc_flags.host_sev_snp = 1;
+               break;
+       default:
+               break;
+       }
+}
+
+void cc_platform_set(enum cc_attr attr)
+{
+       switch (cc_vendor) {
+       case CC_VENDOR_AMD:
+               amd_cc_platform_set(attr);
+               break;
+       default:
+               break;
+       }
+}
+
+__init void cc_random_init(void)
+{
+       /*
+        * The seed is 32 bytes (in units of longs), which is 256 bits, which
+        * is the security level that the RNG is targeting.
+        */
+       unsigned long rng_seed[32 / sizeof(long)];
+       size_t i, longs;
+
+       if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
+               return;
+
+       /*
+        * Since the CoCo threat model includes the host, the only reliable
+        * source of entropy that can be neither observed nor manipulated is
+        * RDRAND. Usually, RDRAND failure is considered tolerable, but since
+        * CoCo guests have no other unobservable source of entropy, it's
+        * important to at least ensure the RNG gets some initial random seeds.
+        */
+       for (i = 0; i < ARRAY_SIZE(rng_seed); i += longs) {
+               longs = arch_get_random_longs(&rng_seed[i], ARRAY_SIZE(rng_seed) - i);
+
+               /*
+                * A zero return value means that the guest doesn't have RDRAND
+                * or the CPU is physically broken, and in both cases that
+                * means most crypto inside of the CoCo instance will be
+                * broken, defeating the purpose of CoCo in the first place. So
+                * just panic here because it's absolutely unsafe to continue
+                * executing.
+                */
+               if (longs == 0)
+                       panic("RDRAND is defective.");
+       }
+       add_device_randomness(rng_seed, sizeof(rng_seed));
+       memzero_explicit(rng_seed, sizeof(rng_seed));
+}
index 6356060caaf311af8370ccaeb69aab85847b62d1..6de50b80702e61087c432faf580a0ab063d22507 100644 (file)
@@ -49,7 +49,7 @@ static __always_inline bool do_syscall_x64(struct pt_regs *regs, int nr)
 
        if (likely(unr < NR_syscalls)) {
                unr = array_index_nospec(unr, NR_syscalls);
-               regs->ax = sys_call_table[unr](regs);
+               regs->ax = x64_sys_call(regs, unr);
                return true;
        }
        return false;
@@ -66,7 +66,7 @@ static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
 
        if (IS_ENABLED(CONFIG_X86_X32_ABI) && likely(xnr < X32_NR_syscalls)) {
                xnr = array_index_nospec(xnr, X32_NR_syscalls);
-               regs->ax = x32_sys_call_table[xnr](regs);
+               regs->ax = x32_sys_call(regs, xnr);
                return true;
        }
        return false;
@@ -162,7 +162,7 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs, int nr)
 
        if (likely(unr < IA32_NR_syscalls)) {
                unr = array_index_nospec(unr, IA32_NR_syscalls);
-               regs->ax = ia32_sys_call_table[unr](regs);
+               regs->ax = ia32_sys_call(regs, unr);
        } else if (nr != -1) {
                regs->ax = __ia32_sys_ni_syscall(regs);
        }
@@ -189,7 +189,7 @@ static __always_inline bool int80_is_external(void)
 }
 
 /**
- * int80_emulation - 32-bit legacy syscall entry
+ * do_int80_emulation - 32-bit legacy syscall C entry from asm
  *
  * This entry point can be used by 32-bit and 64-bit programs to perform
  * 32-bit system calls.  Instances of INT $0x80 can be found inline in
@@ -207,7 +207,7 @@ static __always_inline bool int80_is_external(void)
  *   eax:                              system call number
  *   ebx, ecx, edx, esi, edi, ebp:     arg1 - arg 6
  */
-DEFINE_IDTENTRY_RAW(int80_emulation)
+__visible noinstr void do_int80_emulation(struct pt_regs *regs)
 {
        int nr;
 
index 8af2a26b24f6a9783f9bb348cd67c15e1c3799c8..1b5be07f86698a3b634a0d83b6578775781d1739 100644 (file)
@@ -116,6 +116,7 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
        /* clobbers %rax, make sure it is after saving the syscall nr */
        IBRS_ENTER
        UNTRAIN_RET
+       CLEAR_BRANCH_HISTORY
 
        call    do_syscall_64           /* returns with IRQs disabled */
 
@@ -1491,3 +1492,63 @@ SYM_CODE_START_NOALIGN(rewind_stack_and_make_dead)
        call    make_task_dead
 SYM_CODE_END(rewind_stack_and_make_dead)
 .popsection
+
+/*
+ * This sequence executes branches in order to remove user branch information
+ * from the branch history tracker in the Branch Predictor, therefore removing
+ * user influence on subsequent BTB lookups.
+ *
+ * It should be used on parts prior to Alder Lake. Newer parts should use the
+ * BHI_DIS_S hardware control instead. If a pre-Alder Lake part is being
+ * virtualized on newer hardware the VMM should protect against BHI attacks by
+ * setting BHI_DIS_S for the guests.
+ *
+ * CALLs/RETs are necessary to prevent the Loop Stream Detector (LSD) from engaging
+ * and not clearing the branch history. The call tree looks like:
+ *
+ * call 1
+ *    call 2
+ *      call 2
+ *        call 2
+ *          call 2
+ *           call 2
+ *           ret
+ *         ret
+ *        ret
+ *      ret
+ *    ret
+ * ret
+ *
+ * This means that the stack is non-constant and ORC can't unwind it with %rsp
+ * alone.  Therefore we unconditionally set up the frame pointer, which allows
+ * ORC to unwind properly.
+ *
+ * The alignment is for performance and not for safety, and may be safely
+ * refactored in the future if needed.
+ */
+SYM_FUNC_START(clear_bhb_loop)
+       push    %rbp
+       mov     %rsp, %rbp
+       movl    $5, %ecx
+       ANNOTATE_INTRA_FUNCTION_CALL
+       call    1f
+       jmp     5f
+       .align 64, 0xcc
+       ANNOTATE_INTRA_FUNCTION_CALL
+1:     call    2f
+       RET
+       .align 64, 0xcc
+2:     movl    $5, %eax
+3:     jmp     4f
+       nop
+4:     sub     $1, %eax
+       jnz     3b
+       sub     $1, %ecx
+       jnz     1b
+       RET
+5:     lfence
+       pop     %rbp
+       RET
+SYM_FUNC_END(clear_bhb_loop)
+EXPORT_SYMBOL_GPL(clear_bhb_loop)
+STACK_FRAME_NON_STANDARD(clear_bhb_loop)
index eabf48c4d4b4c30367792f5d9a0b158a9ecf8a04..c779046cc3fe792658a984648328000535812dea 100644 (file)
@@ -92,6 +92,7 @@ SYM_INNER_LABEL(entry_SYSENTER_compat_after_hwframe, SYM_L_GLOBAL)
 
        IBRS_ENTER
        UNTRAIN_RET
+       CLEAR_BRANCH_HISTORY
 
        /*
         * SYSENTER doesn't filter flags, so we need to clear NT and AC
@@ -206,6 +207,7 @@ SYM_INNER_LABEL(entry_SYSCALL_compat_after_hwframe, SYM_L_GLOBAL)
 
        IBRS_ENTER
        UNTRAIN_RET
+       CLEAR_BRANCH_HISTORY
 
        movq    %rsp, %rdi
        call    do_fast_syscall_32
@@ -276,3 +278,17 @@ SYM_INNER_LABEL(entry_SYSRETL_compat_end, SYM_L_GLOBAL)
        ANNOTATE_NOENDBR
        int3
 SYM_CODE_END(entry_SYSCALL_compat)
+
+/*
+ * int 0x80 is used by 32-bit mode as a system call entry. Normally IDT entries
+ * point to C routines; however, since this is a system call interface the branch
+ * history needs to be scrubbed to protect against BHI attacks, and that
+ * scrubbing needs to take place in assembly code prior to entering any C
+ * routines.
+ */
+SYM_CODE_START(int80_emulation)
+       ANNOTATE_NOENDBR
+       UNWIND_HINT_FUNC
+       CLEAR_BRANCH_HISTORY
+       jmp do_int80_emulation
+SYM_CODE_END(int80_emulation)
index 8cfc9bc73e7f8b21f748367256a78df3dc5e5b4a..c2235bae17ef665098342c323a24e4b388c169cb 100644 (file)
 #include <asm/syscalls_32.h>
 #undef __SYSCALL
 
+/*
+ * The sys_call_table[] is no longer used for system calls, but
+ * kernel/trace/trace_syscalls.c still wants to know the system
+ * call address.
+ */
+#ifdef CONFIG_X86_32
 #define __SYSCALL(nr, sym) __ia32_##sym,
-
-__visible const sys_call_ptr_t ia32_sys_call_table[] = {
+const sys_call_ptr_t sys_call_table[] = {
 #include <asm/syscalls_32.h>
 };
+#undef __SYSCALL
+#endif
+
+#define __SYSCALL(nr, sym) case nr: return __ia32_##sym(regs);
+
+long ia32_sys_call(const struct pt_regs *regs, unsigned int nr)
+{
+       switch (nr) {
+       #include <asm/syscalls_32.h>
+       default: return __ia32_sys_ni_syscall(regs);
+       }
+};
index be120eec1fc9f95c69c23074bcd3fbc355b90d47..33b3f09e6f151e11faca1c9d13f0eb4917f3392b 100644 (file)
 #include <asm/syscalls_64.h>
 #undef __SYSCALL
 
+/*
+ * The sys_call_table[] is no longer used for system calls, but
+ * kernel/trace/trace_syscalls.c still wants to know the system
+ * call address.
+ */
 #define __SYSCALL(nr, sym) __x64_##sym,
-
-asmlinkage const sys_call_ptr_t sys_call_table[] = {
+const sys_call_ptr_t sys_call_table[] = {
 #include <asm/syscalls_64.h>
 };
+#undef __SYSCALL
+
+#define __SYSCALL(nr, sym) case nr: return __x64_##sym(regs);
+
+long x64_sys_call(const struct pt_regs *regs, unsigned int nr)
+{
+       switch (nr) {
+       #include <asm/syscalls_64.h>
+       default: return __x64_sys_ni_syscall(regs);
+       }
+};
index bdd0e03a1265d23e474c5c45e1bd64e7b14b7b79..03de4a93213182c6fa5809b077a54ea51be411ea 100644 (file)
 #include <asm/syscalls_x32.h>
 #undef __SYSCALL
 
-#define __SYSCALL(nr, sym) __x64_##sym,
+#define __SYSCALL(nr, sym) case nr: return __x64_##sym(regs);
 
-asmlinkage const sys_call_ptr_t x32_sys_call_table[] = {
-#include <asm/syscalls_x32.h>
+long x32_sys_call(const struct pt_regs *regs, unsigned int nr)
+{
+       switch (nr) {
+       #include <asm/syscalls_x32.h>
+       default: return __x64_sys_ni_syscall(regs);
+       }
 };
index 2641ba620f12a51d4c5d71ceba3bd28557926bfb..e010bfed84170570ddc74d307fd0620c7bc6a3c9 100644 (file)
@@ -1237,11 +1237,11 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc,
        struct pmu *pmu = event->pmu;
 
        /*
-        * Make sure we get updated with the first PEBS
-        * event. It will trigger also during removal, but
-        * that does not hurt:
+        * Make sure we get updated with the first PEBS event.
+        * During removal, ->pebs_data_cfg is still valid for
+        * the last PEBS event. Don't clear it.
         */
-       if (cpuc->n_pebs == 1)
+       if ((cpuc->n_pebs == 1) && add)
                cpuc->pebs_data_cfg = PEBS_UPDATE_DS_SW;
 
        if (needed_cb != pebs_needs_sched_cb(cpuc)) {
index fb7388bbc212f9b1435b47206ae586cf84505846..c086699b0d0c59fc62834bb457963f1b81b541d3 100644 (file)
@@ -22,6 +22,7 @@ static inline void cc_set_mask(u64 mask)
 
 u64 cc_mkenc(u64 val);
 u64 cc_mkdec(u64 val);
+void cc_random_init(void);
 #else
 #define cc_vendor (CC_VENDOR_NONE)
 
@@ -34,6 +35,7 @@ static inline u64 cc_mkdec(u64 val)
 {
        return val;
 }
+static inline void cc_random_init(void) { }
 #endif
 
 #endif /* _ASM_X86_COCO_H */
index 42157ddcc09d436fef9684a813b3e9b9e666e9da..686e92d2663eeeacd90a46568ae37b3db76b9e00 100644 (file)
@@ -33,6 +33,8 @@ enum cpuid_leafs
        CPUID_7_EDX,
        CPUID_8000_001F_EAX,
        CPUID_8000_0021_EAX,
+       CPUID_LNX_5,
+       NR_CPUID_WORDS,
 };
 
 #define X86_CAP_FMT_NUM "%d:%d"
index a38f8f9ba65729125234814c08547498e4e3b8bc..3c7434329661c66e7c34283f0a3f2c59a87f8044 100644 (file)
 
 /*
  * Extended auxiliary flags: Linux defined - for features scattered in various
- * CPUID levels like 0x80000022, etc.
+ * CPUID levels like 0x80000022, etc., and Linux-defined features.
  *
  * Reuse free bits when adding new feature flags!
  */
 #define X86_FEATURE_AMD_LBR_PMC_FREEZE (21*32+ 0) /* AMD LBR and PMC Freeze */
+#define X86_FEATURE_CLEAR_BHB_LOOP     (21*32+ 1) /* "" Clear branch history at syscall entry using SW loop */
+#define X86_FEATURE_BHI_CTRL           (21*32+ 2) /* "" BHI_DIS_S HW control available */
+#define X86_FEATURE_CLEAR_BHB_HW       (21*32+ 3) /* "" BHI_DIS_S HW control enabled */
+#define X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT (21*32+ 4) /* "" Clear branch history at vmexit using SW loop */
 
 /*
  * BUG word(s)
 #define X86_BUG_SRSO                   X86_BUG(1*32 + 0) /* AMD SRSO bug */
 #define X86_BUG_DIV0                   X86_BUG(1*32 + 1) /* AMD DIV0 speculation bug */
 #define X86_BUG_RFDS                   X86_BUG(1*32 + 2) /* CPU is vulnerable to Register File Data Sampling */
+#define X86_BUG_BHI                    X86_BUG(1*32 + 3) /* CPU is affected by Branch History Injection */
 #endif /* _ASM_X86_CPUFEATURES_H */
index 05956bd8bacf50e35f463c13720a38735fe8b1b5..e72c2b87295799af9d44eb84f59d095f4f90acfd 100644 (file)
 #define SPEC_CTRL_SSBD                 BIT(SPEC_CTRL_SSBD_SHIFT)       /* Speculative Store Bypass Disable */
 #define SPEC_CTRL_RRSBA_DIS_S_SHIFT    6          /* Disable RRSBA behavior */
 #define SPEC_CTRL_RRSBA_DIS_S          BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
+#define SPEC_CTRL_BHI_DIS_S_SHIFT      10         /* Disable Branch History Injection behavior */
+#define SPEC_CTRL_BHI_DIS_S            BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)
 
 /* A mask for bits which the kernel toggles when controlling mitigations */
 #define SPEC_CTRL_MITIGATIONS_MASK     (SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \
-                                                       | SPEC_CTRL_RRSBA_DIS_S)
+                                                       | SPEC_CTRL_RRSBA_DIS_S \
+                                                       | SPEC_CTRL_BHI_DIS_S)
 
 #define MSR_IA32_PRED_CMD              0x00000049 /* Prediction Command */
 #define PRED_CMD_IBPB                  BIT(0)     /* Indirect Branch Prediction Barrier */
                                                 * are restricted to targets in
                                                 * kernel.
                                                 */
+#define ARCH_CAP_BHI_NO                        BIT(20) /*
+                                                * CPU is not affected by Branch
+                                                * History Injection.
+                                                */
 #define ARCH_CAP_PBRSB_NO              BIT(24) /*
                                                 * Not susceptible to Post-Barrier
                                                 * Return Stack Buffer Predictions.
index 170c89ed22fcd3a27106d166a9e7f5a5d1fadf80..ff5f1ecc7d1e6512fcc34f4a6e5df5976e9087f0 100644 (file)
        ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
 .endm
 
+#ifdef CONFIG_X86_64
+.macro CLEAR_BRANCH_HISTORY
+       ALTERNATIVE "", "call clear_bhb_loop", X86_FEATURE_CLEAR_BHB_LOOP
+.endm
+
+.macro CLEAR_BRANCH_HISTORY_VMEXIT
+       ALTERNATIVE "", "call clear_bhb_loop", X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT
+.endm
+#else
+#define CLEAR_BRANCH_HISTORY
+#define CLEAR_BRANCH_HISTORY_VMEXIT
+#endif
+
 #else /* __ASSEMBLY__ */
 
 #define ANNOTATE_RETPOLINE_SAFE                                        \
@@ -368,6 +381,10 @@ extern void srso_alias_return_thunk(void);
 extern void entry_untrain_ret(void);
 extern void entry_ibpb(void);
 
+#ifdef CONFIG_X86_64
+extern void clear_bhb_loop(void);
+#endif
+
 extern void (*x86_return_thunk)(void);
 
 extern void __warn_thunk(void);
index 07e125f32528394df24c06de44e3232eaf15e4cd..7f57382afee41754beb8164244f199e4ac30a148 100644 (file)
@@ -228,7 +228,6 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct sn
 void snp_accept_memory(phys_addr_t start, phys_addr_t end);
 u64 snp_get_unsupported_features(u64 status);
 u64 sev_get_status(void);
-void kdump_sev_callback(void);
 void sev_show_status(void);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
@@ -258,7 +257,6 @@ static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *in
 static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) { }
 static inline u64 snp_get_unsupported_features(u64 status) { return 0; }
 static inline u64 sev_get_status(void) { return 0; }
-static inline void kdump_sev_callback(void) { }
 static inline void sev_show_status(void) { }
 #endif
 
@@ -270,6 +268,7 @@ int psmash(u64 pfn);
 int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, u32 asid, bool immutable);
 int rmp_make_shared(u64 pfn, enum pg_level level);
 void snp_leak_pages(u64 pfn, unsigned int npages);
+void kdump_sev_callback(void);
 #else
 static inline bool snp_probe_rmptable_info(void) { return false; }
 static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
@@ -282,6 +281,7 @@ static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, u32 as
 }
 static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV; }
 static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
+static inline void kdump_sev_callback(void) { }
 #endif
 
 #endif
index f44e2f9ab65d779f35bac9c5e58dd8b694778efc..2fc7bc3863ff6f7a932ac2ee05682a2ba71f3308 100644 (file)
 #include <asm/thread_info.h>   /* for TS_COMPAT */
 #include <asm/unistd.h>
 
+/* This is used purely for kernel/trace/trace_syscalls.c */
 typedef long (*sys_call_ptr_t)(const struct pt_regs *);
 extern const sys_call_ptr_t sys_call_table[];
 
-#if defined(CONFIG_X86_32)
-#define ia32_sys_call_table sys_call_table
-#else
 /*
  * These may not exist, but still put the prototypes in so we
  * can use IS_ENABLED().
  */
-extern const sys_call_ptr_t ia32_sys_call_table[];
-extern const sys_call_ptr_t x32_sys_call_table[];
-#endif
+extern long ia32_sys_call(const struct pt_regs *, unsigned int nr);
+extern long x32_sys_call(const struct pt_regs *, unsigned int nr);
+extern long x64_sys_call(const struct pt_regs *, unsigned int nr);
 
 /*
  * Only the low 32 bits of orig_ax are meaningful, so we return int.
@@ -127,6 +125,7 @@ static inline int syscall_get_arch(struct task_struct *task)
 }
 
 bool do_syscall_64(struct pt_regs *regs, int nr);
+void do_int80_emulation(struct pt_regs *regs);
 
 #endif /* CONFIG_X86_32 */
 
index 6d8677e80ddbb17c94ec7fcda9e5bae502c0dcb2..9bf17c9c29dad2e3f3c38c07253accc667cadea3 100644 (file)
@@ -345,6 +345,28 @@ static void srat_detect_node(struct cpuinfo_x86 *c)
 #endif
 }
 
+static void bsp_determine_snp(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
+       cc_vendor = CC_VENDOR_AMD;
+
+       if (cpu_has(c, X86_FEATURE_SEV_SNP)) {
+               /*
+                * RMP table entry format is not architectural and is defined by the
+                * per-processor PPR. Restrict SNP support on the known CPU models
+                * for which the RMP table entry format is currently defined.
+                */
+               if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
+                   c->x86 >= 0x19 && snp_probe_rmptable_info()) {
+                       cc_platform_set(CC_ATTR_HOST_SEV_SNP);
+               } else {
+                       setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
+                       cc_platform_clear(CC_ATTR_HOST_SEV_SNP);
+               }
+       }
+#endif
+}
+
 static void bsp_init_amd(struct cpuinfo_x86 *c)
 {
        if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
@@ -452,21 +474,7 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
                break;
        }
 
-       if (cpu_has(c, X86_FEATURE_SEV_SNP)) {
-               /*
-                * RMP table entry format is not architectural and it can vary by processor
-                * and is defined by the per-processor PPR. Restrict SNP support on the
-                * known CPU model and family for which the RMP table entry format is
-                * currently defined for.
-                */
-               if (!boot_cpu_has(X86_FEATURE_ZEN3) &&
-                   !boot_cpu_has(X86_FEATURE_ZEN4) &&
-                   !boot_cpu_has(X86_FEATURE_ZEN5))
-                       setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
-               else if (!snp_probe_rmptable_info())
-                       setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
-       }
-
+       bsp_determine_snp(c);
        return;
 
 warn:
index e7ba936d798b8198f5837118d5bb33d40389ccc7..295463707e68181cb536f8f4bd763bf045936202 100644 (file)
@@ -1607,6 +1607,79 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
        dump_stack();
 }
 
+/*
+ * Set BHI_DIS_S to prevent indirect branches in the kernel from being influenced
+ * branch history in userspace. Not needed if BHI_NO is set.
+ */
+static bool __init spec_ctrl_bhi_dis(void)
+{
+       if (!boot_cpu_has(X86_FEATURE_BHI_CTRL))
+               return false;
+
+       x86_spec_ctrl_base |= SPEC_CTRL_BHI_DIS_S;
+       update_spec_ctrl(x86_spec_ctrl_base);
+       setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_HW);
+
+       return true;
+}
+
+enum bhi_mitigations {
+       BHI_MITIGATION_OFF,
+       BHI_MITIGATION_ON,
+       BHI_MITIGATION_AUTO,
+};
+
+static enum bhi_mitigations bhi_mitigation __ro_after_init =
+       IS_ENABLED(CONFIG_SPECTRE_BHI_ON)  ? BHI_MITIGATION_ON  :
+       IS_ENABLED(CONFIG_SPECTRE_BHI_OFF) ? BHI_MITIGATION_OFF :
+                                            BHI_MITIGATION_AUTO;
+
+static int __init spectre_bhi_parse_cmdline(char *str)
+{
+       if (!str)
+               return -EINVAL;
+
+       if (!strcmp(str, "off"))
+               bhi_mitigation = BHI_MITIGATION_OFF;
+       else if (!strcmp(str, "on"))
+               bhi_mitigation = BHI_MITIGATION_ON;
+       else if (!strcmp(str, "auto"))
+               bhi_mitigation = BHI_MITIGATION_AUTO;
+       else
+               pr_err("Ignoring unknown spectre_bhi option (%s)", str);
+
+       return 0;
+}
+early_param("spectre_bhi", spectre_bhi_parse_cmdline);
+
+static void __init bhi_select_mitigation(void)
+{
+       if (bhi_mitigation == BHI_MITIGATION_OFF)
+               return;
+
+       /* Retpoline mitigates BHI unless the CPU has RRSBA behavior */
+       if (cpu_feature_enabled(X86_FEATURE_RETPOLINE) &&
+           !(x86_read_arch_cap_msr() & ARCH_CAP_RRSBA))
+               return;
+
+       if (spec_ctrl_bhi_dis())
+               return;
+
+       if (!IS_ENABLED(CONFIG_X86_64))
+               return;
+
+       /* Mitigate KVM by default */
+       setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
+       pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit\n");
+
+       if (bhi_mitigation == BHI_MITIGATION_AUTO)
+               return;
+
+       /* Mitigate syscalls when the mitigation is forced =on */
+       setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP);
+       pr_info("Spectre BHI mitigation: SW BHB clearing on syscall\n");
+}
+
 static void __init spectre_v2_select_mitigation(void)
 {
        enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
@@ -1718,6 +1791,9 @@ static void __init spectre_v2_select_mitigation(void)
            mode == SPECTRE_V2_RETPOLINE)
                spec_ctrl_disable_kernel_rrsba();
 
+       if (boot_cpu_has(X86_BUG_BHI))
+               bhi_select_mitigation();
+
        spectre_v2_enabled = mode;
        pr_info("%s\n", spectre_v2_strings[mode]);
 
@@ -2695,15 +2771,15 @@ static char *stibp_state(void)
 
        switch (spectre_v2_user_stibp) {
        case SPECTRE_V2_USER_NONE:
-               return ", STIBP: disabled";
+               return "; STIBP: disabled";
        case SPECTRE_V2_USER_STRICT:
-               return ", STIBP: forced";
+               return "; STIBP: forced";
        case SPECTRE_V2_USER_STRICT_PREFERRED:
-               return ", STIBP: always-on";
+               return "; STIBP: always-on";
        case SPECTRE_V2_USER_PRCTL:
        case SPECTRE_V2_USER_SECCOMP:
                if (static_key_enabled(&switch_to_cond_stibp))
-                       return ", STIBP: conditional";
+                       return "; STIBP: conditional";
        }
        return "";
 }
@@ -2712,10 +2788,10 @@ static char *ibpb_state(void)
 {
        if (boot_cpu_has(X86_FEATURE_IBPB)) {
                if (static_key_enabled(&switch_mm_always_ibpb))
-                       return ", IBPB: always-on";
+                       return "; IBPB: always-on";
                if (static_key_enabled(&switch_mm_cond_ibpb))
-                       return ", IBPB: conditional";
-               return ", IBPB: disabled";
+                       return "; IBPB: conditional";
+               return "; IBPB: disabled";
        }
        return "";
 }
@@ -2725,14 +2801,31 @@ static char *pbrsb_eibrs_state(void)
        if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
                if (boot_cpu_has(X86_FEATURE_RSB_VMEXIT_LITE) ||
                    boot_cpu_has(X86_FEATURE_RSB_VMEXIT))
-                       return ", PBRSB-eIBRS: SW sequence";
+                       return "; PBRSB-eIBRS: SW sequence";
                else
-                       return ", PBRSB-eIBRS: Vulnerable";
+                       return "; PBRSB-eIBRS: Vulnerable";
        } else {
-               return ", PBRSB-eIBRS: Not affected";
+               return "; PBRSB-eIBRS: Not affected";
        }
 }
 
+static const char *spectre_bhi_state(void)
+{
+       if (!boot_cpu_has_bug(X86_BUG_BHI))
+               return "; BHI: Not affected";
+       else if (boot_cpu_has(X86_FEATURE_CLEAR_BHB_HW))
+               return "; BHI: BHI_DIS_S";
+       else if (boot_cpu_has(X86_FEATURE_CLEAR_BHB_LOOP))
+               return "; BHI: SW loop, KVM: SW loop";
+       else if (boot_cpu_has(X86_FEATURE_RETPOLINE) &&
+                !(x86_read_arch_cap_msr() & ARCH_CAP_RRSBA))
+               return "; BHI: Retpoline";
+       else if (boot_cpu_has(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT))
+               return "; BHI: Syscall hardening, KVM: SW loop";
+
+       return "; BHI: Vulnerable (Syscall hardening enabled)";
+}
+
 static ssize_t spectre_v2_show_state(char *buf)
 {
        if (spectre_v2_enabled == SPECTRE_V2_LFENCE)
@@ -2745,13 +2838,15 @@ static ssize_t spectre_v2_show_state(char *buf)
            spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE)
                return sysfs_emit(buf, "Vulnerable: eIBRS+LFENCE with unprivileged eBPF and SMT\n");
 
-       return sysfs_emit(buf, "%s%s%s%s%s%s%s\n",
+       return sysfs_emit(buf, "%s%s%s%s%s%s%s%s\n",
                          spectre_v2_strings[spectre_v2_enabled],
                          ibpb_state(),
-                         boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
+                         boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? "; IBRS_FW" : "",
                          stibp_state(),
-                         boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
+                         boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? "; RSB filling" : "",
                          pbrsb_eibrs_state(),
+                         spectre_bhi_state(),
+                         /* this should always be at the end */
                          spectre_v2_module_string());
 }
 
index 5c1e6d6be267af3e7b489e9f71937e7be6b25448..754d91857d634a2c6055ed50afa659ac45749086 100644 (file)
@@ -1120,6 +1120,7 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
 #define NO_SPECTRE_V2          BIT(8)
 #define NO_MMIO                        BIT(9)
 #define NO_EIBRS_PBRSB         BIT(10)
+#define NO_BHI                 BIT(11)
 
 #define VULNWL(vendor, family, model, whitelist)       \
        X86_MATCH_VENDOR_FAM_MODEL(vendor, family, model, whitelist)
@@ -1182,18 +1183,18 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
        VULNWL_INTEL(ATOM_TREMONT_D,            NO_ITLB_MULTIHIT | NO_EIBRS_PBRSB),
 
        /* AMD Family 0xf - 0x12 */
-       VULNWL_AMD(0x0f,        NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
-       VULNWL_AMD(0x10,        NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
-       VULNWL_AMD(0x11,        NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
-       VULNWL_AMD(0x12,        NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
+       VULNWL_AMD(0x0f,        NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
+       VULNWL_AMD(0x10,        NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
+       VULNWL_AMD(0x11,        NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
+       VULNWL_AMD(0x12,        NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
 
        /* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
-       VULNWL_AMD(X86_FAMILY_ANY,      NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB),
-       VULNWL_HYGON(X86_FAMILY_ANY,    NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB),
+       VULNWL_AMD(X86_FAMILY_ANY,      NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),
+       VULNWL_HYGON(X86_FAMILY_ANY,    NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),
 
        /* Zhaoxin Family 7 */
-       VULNWL(CENTAUR, 7, X86_MODEL_ANY,       NO_SPECTRE_V2 | NO_SWAPGS | NO_MMIO),
-       VULNWL(ZHAOXIN, 7, X86_MODEL_ANY,       NO_SPECTRE_V2 | NO_SWAPGS | NO_MMIO),
+       VULNWL(CENTAUR, 7, X86_MODEL_ANY,       NO_SPECTRE_V2 | NO_SWAPGS | NO_MMIO | NO_BHI),
+       VULNWL(ZHAOXIN, 7, X86_MODEL_ANY,       NO_SPECTRE_V2 | NO_SWAPGS | NO_MMIO | NO_BHI),
        {}
 };
 
@@ -1435,6 +1436,13 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
        if (vulnerable_to_rfds(ia32_cap))
                setup_force_cpu_bug(X86_BUG_RFDS);
 
+       /* When virtualized, eIBRS could be hidden, assume vulnerable */
+       if (!(ia32_cap & ARCH_CAP_BHI_NO) &&
+           !cpu_matches(cpu_vuln_whitelist, NO_BHI) &&
+           (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED) ||
+            boot_cpu_has(X86_FEATURE_HYPERVISOR)))
+               setup_force_cpu_bug(X86_BUG_BHI);
+
        if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
                return;
 
index b5cc557cfc3736708d96be6372f34c9ad0be85e4..84d41be6d06ba4e79f49ae069f5f1d5ae20b00de 100644 (file)
@@ -2500,12 +2500,14 @@ static ssize_t set_bank(struct device *s, struct device_attribute *attr,
                return -EINVAL;
 
        b = &per_cpu(mce_banks_array, s->id)[bank];
-
        if (!b->init)
                return -ENODEV;
 
        b->ctl = new;
+
+       mutex_lock(&mce_sysfs_mutex);
        mce_restart();
+       mutex_unlock(&mce_sysfs_mutex);
 
        return size;
 }
index 422a4ddc2ab7c9408f1d2d21433fea7f320c6f85..7b29ebda024f4e69bc9a9326f9cecd8a86ee2abb 100644 (file)
@@ -108,7 +108,7 @@ static inline void k8_check_syscfg_dram_mod_en(void)
              (boot_cpu_data.x86 >= 0x0f)))
                return;
 
-       if (cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+       if (cc_platform_has(CC_ATTR_HOST_SEV_SNP))
                return;
 
        rdmsr(MSR_AMD64_SYSCFG, lo, hi);
index c99f26ebe7a6537a7cd43274701ac4f489648081..1a8687f8073a89f3335038c8b3ba7e2ee45aa6d4 100644 (file)
@@ -78,7 +78,8 @@ cpumask_any_housekeeping(const struct cpumask *mask, int exclude_cpu)
        else
                cpu = cpumask_any_but(mask, exclude_cpu);
 
-       if (!IS_ENABLED(CONFIG_NO_HZ_FULL))
+       /* Only continue if tick_nohz_full_mask has been initialized. */
+       if (!tick_nohz_full_enabled())
                return cpu;
 
        /* If the CPU picked isn't marked nohz_full nothing more needs doing. */
index a515328d9d7d88b802f588bf678d098e0ba53b86..af5aa2c754c22226080870967d6c410067c86447 100644 (file)
@@ -28,6 +28,7 @@ static const struct cpuid_bit cpuid_bits[] = {
        { X86_FEATURE_EPB,              CPUID_ECX,  3, 0x00000006, 0 },
        { X86_FEATURE_INTEL_PPIN,       CPUID_EBX,  0, 0x00000007, 1 },
        { X86_FEATURE_RRSBA_CTRL,       CPUID_EDX,  2, 0x00000007, 2 },
+       { X86_FEATURE_BHI_CTRL,         CPUID_EDX,  4, 0x00000007, 2 },
        { X86_FEATURE_CQM_LLC,          CPUID_EDX,  1, 0x0000000f, 0 },
        { X86_FEATURE_CQM_OCCUP_LLC,    CPUID_EDX,  0, 0x0000000f, 1 },
        { X86_FEATURE_CQM_MBM_TOTAL,    CPUID_EDX,  1, 0x0000000f, 1 },
index 0109e6c510e02fcced0e2bbbfedc6af8e48ba90e..e125e059e2c45d3e6657716777426b926d950aee 100644 (file)
@@ -35,6 +35,7 @@
 #include <asm/bios_ebda.h>
 #include <asm/bugs.h>
 #include <asm/cacheinfo.h>
+#include <asm/coco.h>
 #include <asm/cpu.h>
 #include <asm/efi.h>
 #include <asm/gart.h>
@@ -991,6 +992,7 @@ void __init setup_arch(char **cmdline_p)
         * memory size.
         */
        mem_encrypt_setup_arch();
+       cc_random_init();
 
        efi_fake_memmap();
        efi_find_mirror();
index 7e1e63cc48e67dcc3f3abc84362ba805fef5a93d..38ad066179d81fe7efc2b2469fbe583c12586e06 100644 (file)
@@ -2284,16 +2284,6 @@ static int __init snp_init_platform_device(void)
 }
 device_initcall(snp_init_platform_device);
 
-void kdump_sev_callback(void)
-{
-       /*
-        * Do wbinvd() on remote CPUs when SNP is enabled in order to
-        * safely do SNP_SHUTDOWN on the local CPU.
-        */
-       if (cpu_feature_enabled(X86_FEATURE_SEV_SNP))
-               wbinvd();
-}
-
 void sev_show_status(void)
 {
        int i;
index 3aaf7e86a859a2f680625b4d13b1c27d43fa5629..0ebdd088f28b852261786cb5f90d285b2af42ebc 100644 (file)
@@ -122,6 +122,7 @@ config KVM_AMD_SEV
        default y
        depends on KVM_AMD && X86_64
        depends on CRYPTO_DEV_SP_PSP && !(KVM_AMD=y && CRYPTO_DEV_CCP_DD=m)
+       select ARCH_HAS_CC_PLATFORM
        help
          Provides support for launching Encrypted VMs (SEV) and Encrypted VMs
          with Encrypted State (SEV-ES) on AMD processors.
index aadefcaa9561d0a31e589784da7e871e4a0de2e0..2f4e155080badc5efdbcc93fbc909c5bbcf70094 100644 (file)
@@ -52,7 +52,7 @@ enum kvm_only_cpuid_leafs {
 #define X86_FEATURE_IPRED_CTRL         KVM_X86_FEATURE(CPUID_7_2_EDX, 1)
 #define KVM_X86_FEATURE_RRSBA_CTRL     KVM_X86_FEATURE(CPUID_7_2_EDX, 2)
 #define X86_FEATURE_DDPD_U             KVM_X86_FEATURE(CPUID_7_2_EDX, 3)
-#define X86_FEATURE_BHI_CTRL           KVM_X86_FEATURE(CPUID_7_2_EDX, 4)
+#define KVM_X86_FEATURE_BHI_CTRL       KVM_X86_FEATURE(CPUID_7_2_EDX, 4)
 #define X86_FEATURE_MCDT_NO            KVM_X86_FEATURE(CPUID_7_2_EDX, 5)
 
 /* CPUID level 0x80000007 (EDX). */
@@ -102,10 +102,12 @@ static const struct cpuid_reg reverse_cpuid[] = {
  */
 static __always_inline void reverse_cpuid_check(unsigned int x86_leaf)
 {
+       BUILD_BUG_ON(NR_CPUID_WORDS != NCAPINTS);
        BUILD_BUG_ON(x86_leaf == CPUID_LNX_1);
        BUILD_BUG_ON(x86_leaf == CPUID_LNX_2);
        BUILD_BUG_ON(x86_leaf == CPUID_LNX_3);
        BUILD_BUG_ON(x86_leaf == CPUID_LNX_4);
+       BUILD_BUG_ON(x86_leaf == CPUID_LNX_5);
        BUILD_BUG_ON(x86_leaf >= ARRAY_SIZE(reverse_cpuid));
        BUILD_BUG_ON(reverse_cpuid[x86_leaf].function == 0);
 }
@@ -126,6 +128,7 @@ static __always_inline u32 __feature_translate(int x86_feature)
        KVM_X86_TRANSLATE_FEATURE(CONSTANT_TSC);
        KVM_X86_TRANSLATE_FEATURE(PERFMON_V2);
        KVM_X86_TRANSLATE_FEATURE(RRSBA_CTRL);
+       KVM_X86_TRANSLATE_FEATURE(BHI_CTRL);
        default:
                return x86_feature;
        }
index e5a4d9b0e79fd23e2dfc8224aa7a89d3b243c103..61a7531d41b019a7f263b9c4e02ccfdf960dd09f 100644 (file)
@@ -3184,7 +3184,7 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
        unsigned long pfn;
        struct page *p;
 
-       if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+       if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
                return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
 
        /*
index 2bfbf758d06110f49c71a22c1f54da9d9499669a..f6986dee6f8c7c52622857f131adf766d1528121 100644 (file)
@@ -275,6 +275,8 @@ SYM_INNER_LABEL_ALIGN(vmx_vmexit, SYM_L_GLOBAL)
 
        call vmx_spec_ctrl_restore_host
 
+       CLEAR_BRANCH_HISTORY_VMEXIT
+
        /* Put return value in AX */
        mov %_ASM_BX, %_ASM_AX
 
index 47d9f03b7778373393b9853fe32b153dadd9de29..984ea2089efc3132154527508976879a4e11cb10 100644 (file)
@@ -1621,7 +1621,7 @@ static bool kvm_is_immutable_feature_msr(u32 msr)
         ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
         ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
         ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO | \
-        ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR)
+        ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR | ARCH_CAP_BHI_NO)
 
 static u64 kvm_get_arch_capabilities(void)
 {
index 0795b3464058b0e515cb71d7bf84fa9908e6fc07..e674ccf720b9f6befe6ffb0fccec192fb1aa9a89 100644 (file)
@@ -229,6 +229,7 @@ SYM_CODE_END(srso_return_thunk)
 /* Dummy for the alternative in CALL_UNTRAIN_RET. */
 SYM_CODE_START(srso_alias_untrain_ret)
        ANNOTATE_UNRET_SAFE
+       ANNOTATE_NOENDBR
        ret
        int3
 SYM_FUNC_END(srso_alias_untrain_ret)
index 104544359d69cd20ef1e37449d32a687ee1f3433..025fd7ea5d69f5bfba07a712a14ef85671d68af5 100644 (file)
@@ -24,6 +24,7 @@
 
 #include <linux/memblock.h>
 #include <linux/init.h>
+#include <asm/pgtable_areas.h>
 
 #include "numa_internal.h"
 
index 0d72183b5dd028ad83b98b38183af643ad3b21b5..36b603d0cddefc7b208873e8d435cdf1ddbe20d9 100644 (file)
@@ -947,6 +947,38 @@ static void free_pfn_range(u64 paddr, unsigned long size)
                memtype_free(paddr, paddr + size);
 }
 
+static int get_pat_info(struct vm_area_struct *vma, resource_size_t *paddr,
+               pgprot_t *pgprot)
+{
+       unsigned long prot;
+
+       VM_WARN_ON_ONCE(!(vma->vm_flags & VM_PAT));
+
+       /*
+        * We need the starting PFN and cachemode used for track_pfn_remap()
+        * that covered the whole VMA. For most mappings, we can obtain that
+        * information from the page tables. For COW mappings, we might now
+        * suddenly have anon folios mapped and follow_phys() will fail.
+        *
+        * Fall back to using vma->vm_pgoff, see remap_pfn_range_notrack(), to
+        * detect the PFN. If we need the cachemode as well, we're out of luck
+        * for now and have to fail fork().
+        */
+       if (!follow_phys(vma, vma->vm_start, 0, &prot, paddr)) {
+               if (pgprot)
+                       *pgprot = __pgprot(prot);
+               return 0;
+       }
+       if (is_cow_mapping(vma->vm_flags)) {
+               if (pgprot)
+                       return -EINVAL;
+               *paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
+               return 0;
+       }
+       WARN_ON_ONCE(1);
+       return -EINVAL;
+}
+
 /*
  * track_pfn_copy is called when vma that is covering the pfnmap gets
  * copied through copy_page_range().
@@ -957,20 +989,13 @@ static void free_pfn_range(u64 paddr, unsigned long size)
 int track_pfn_copy(struct vm_area_struct *vma)
 {
        resource_size_t paddr;
-       unsigned long prot;
        unsigned long vma_size = vma->vm_end - vma->vm_start;
        pgprot_t pgprot;
 
        if (vma->vm_flags & VM_PAT) {
-               /*
-                * reserve the whole chunk covered by vma. We need the
-                * starting address and protection from pte.
-                */
-               if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) {
-                       WARN_ON_ONCE(1);
+               if (get_pat_info(vma, &paddr, &pgprot))
                        return -EINVAL;
-               }
-               pgprot = __pgprot(prot);
+               /* reserve the whole chunk covered by vma. */
                return reserve_pfn_range(paddr, vma_size, &pgprot, 1);
        }
 
@@ -1045,7 +1070,6 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
                 unsigned long size, bool mm_wr_locked)
 {
        resource_size_t paddr;
-       unsigned long prot;
 
        if (vma && !(vma->vm_flags & VM_PAT))
                return;
@@ -1053,11 +1077,8 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
        /* free the chunk starting from pfn or the whole chunk */
        paddr = (resource_size_t)pfn << PAGE_SHIFT;
        if (!paddr && !size) {
-               if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) {
-                       WARN_ON_ONCE(1);
+               if (get_pat_info(vma, &paddr, NULL))
                        return;
-               }
-
                size = vma->vm_end - vma->vm_start;
        }
        free_pfn_range(paddr, size);
index cffe1157a90acfcf741b31ac216d6fd3a9ed4fd2..ab0e8448bb6eb2bfbc4fab29321cd0ddbe876f7e 100644 (file)
@@ -77,7 +77,7 @@ static int __mfd_enable(unsigned int cpu)
 {
        u64 val;
 
-       if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+       if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
                return 0;
 
        rdmsrl(MSR_AMD64_SYSCFG, val);
@@ -98,7 +98,7 @@ static int __snp_enable(unsigned int cpu)
 {
        u64 val;
 
-       if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+       if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
                return 0;
 
        rdmsrl(MSR_AMD64_SYSCFG, val);
@@ -174,11 +174,11 @@ static int __init snp_rmptable_init(void)
        u64 rmptable_size;
        u64 val;
 
-       if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+       if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
                return 0;
 
        if (!amd_iommu_snp_en)
-               return 0;
+               goto nosnp;
 
        if (!probed_rmp_size)
                goto nosnp;
@@ -225,7 +225,7 @@ skip_enable:
        return 0;
 
 nosnp:
-       setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
+       cc_platform_clear(CC_ATTR_HOST_SEV_SNP);
        return -ENOSYS;
 }
 
@@ -246,7 +246,7 @@ static struct rmpentry *__snp_lookup_rmpentry(u64 pfn, int *level)
 {
        struct rmpentry *large_entry, *entry;
 
-       if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+       if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
                return ERR_PTR(-ENODEV);
 
        entry = get_rmpentry(pfn);
@@ -363,7 +363,7 @@ int psmash(u64 pfn)
        unsigned long paddr = pfn << PAGE_SHIFT;
        int ret;
 
-       if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+       if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
                return -ENODEV;
 
        if (!pfn_valid(pfn))
@@ -472,7 +472,7 @@ static int rmpupdate(u64 pfn, struct rmp_state *state)
        unsigned long paddr = pfn << PAGE_SHIFT;
        int ret, level;
 
-       if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+       if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
                return -ENODEV;
 
        level = RMP_TO_PG_LEVEL(state->pagesize);
@@ -558,3 +558,13 @@ void snp_leak_pages(u64 pfn, unsigned int npages)
        spin_unlock(&snp_leaked_pages_list_lock);
 }
 EXPORT_SYMBOL_GPL(snp_leak_pages);
+
+void kdump_sev_callback(void)
+{
+       /*
+        * Do wbinvd() on remote CPUs when SNP is enabled in order to
+        * safely do SNP_SHUTDOWN on the local CPU.
+        */
+       if (cc_platform_has(CC_ATTR_HOST_SEV_SNP))
+               wbinvd();
+}
index 7a5f611c3d2e3e83eb00be12b9131f49d8348f5e..b8e32d933a6369aebbffe36e2ce3165cc2288bef 100644 (file)
@@ -583,9 +583,6 @@ static void bd_finish_claiming(struct block_device *bdev, void *holder,
        mutex_unlock(&bdev->bd_holder_lock);
        bd_clear_claiming(whole, holder);
        mutex_unlock(&bdev_lock);
-
-       if (hops && hops->get_holder)
-               hops->get_holder(holder);
 }
 
 /**
@@ -608,7 +605,6 @@ EXPORT_SYMBOL(bd_abort_claiming);
 static void bd_end_claim(struct block_device *bdev, void *holder)
 {
        struct block_device *whole = bdev_whole(bdev);
-       const struct blk_holder_ops *hops = bdev->bd_holder_ops;
        bool unblock = false;
 
        /*
@@ -631,9 +627,6 @@ static void bd_end_claim(struct block_device *bdev, void *holder)
                whole->bd_holder = NULL;
        mutex_unlock(&bdev_lock);
 
-       if (hops && hops->put_holder)
-               hops->put_holder(holder);
-
        /*
         * If this was the last claim, remove holder link and unblock evpoll if
         * it was a write holder.
@@ -776,17 +769,17 @@ void blkdev_put_no_open(struct block_device *bdev)
 
 static bool bdev_writes_blocked(struct block_device *bdev)
 {
-       return bdev->bd_writers == -1;
+       return bdev->bd_writers < 0;
 }
 
 static void bdev_block_writes(struct block_device *bdev)
 {
-       bdev->bd_writers = -1;
+       bdev->bd_writers--;
 }
 
 static void bdev_unblock_writes(struct block_device *bdev)
 {
-       bdev->bd_writers = 0;
+       bdev->bd_writers++;
 }
 
 static bool bdev_may_open(struct block_device *bdev, blk_mode_t mode)
@@ -813,6 +806,11 @@ static void bdev_claim_write_access(struct block_device *bdev, blk_mode_t mode)
                bdev->bd_writers++;
 }
 
+static inline bool bdev_unclaimed(const struct file *bdev_file)
+{
+       return bdev_file->private_data == BDEV_I(bdev_file->f_mapping->host);
+}
+
 static void bdev_yield_write_access(struct file *bdev_file)
 {
        struct block_device *bdev;
@@ -820,14 +818,15 @@ static void bdev_yield_write_access(struct file *bdev_file)
        if (bdev_allow_write_mounted)
                return;
 
+       if (bdev_unclaimed(bdev_file))
+               return;
+
        bdev = file_bdev(bdev_file);
-       /* Yield exclusive or shared write access. */
-       if (bdev_file->f_mode & FMODE_WRITE) {
-               if (bdev_writes_blocked(bdev))
-                       bdev_unblock_writes(bdev);
-               else
-                       bdev->bd_writers--;
-       }
+
+       if (bdev_file->f_mode & FMODE_WRITE_RESTRICTED)
+               bdev_unblock_writes(bdev);
+       else if (bdev_file->f_mode & FMODE_WRITE)
+               bdev->bd_writers--;
 }
 
 /**
@@ -907,6 +906,8 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
        bdev_file->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
        if (bdev_nowait(bdev))
                bdev_file->f_mode |= FMODE_NOWAIT;
+       if (mode & BLK_OPEN_RESTRICT_WRITES)
+               bdev_file->f_mode |= FMODE_WRITE_RESTRICTED;
        bdev_file->f_mapping = bdev->bd_inode->i_mapping;
        bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
        bdev_file->private_data = holder;
@@ -1012,6 +1013,20 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
 }
 EXPORT_SYMBOL(bdev_file_open_by_path);
 
+static inline void bd_yield_claim(struct file *bdev_file)
+{
+       struct block_device *bdev = file_bdev(bdev_file);
+       void *holder = bdev_file->private_data;
+
+       lockdep_assert_held(&bdev->bd_disk->open_mutex);
+
+       if (WARN_ON_ONCE(IS_ERR_OR_NULL(holder)))
+               return;
+
+       if (!bdev_unclaimed(bdev_file))
+               bd_end_claim(bdev, holder);
+}
+
 void bdev_release(struct file *bdev_file)
 {
        struct block_device *bdev = file_bdev(bdev_file);
@@ -1036,7 +1051,7 @@ void bdev_release(struct file *bdev_file)
        bdev_yield_write_access(bdev_file);
 
        if (holder)
-               bd_end_claim(bdev, holder);
+               bd_yield_claim(bdev_file);
 
        /*
         * Trigger event checking and tell drivers to flush MEDIA_CHANGE
@@ -1056,6 +1071,39 @@ put_no_open:
        blkdev_put_no_open(bdev);
 }
 
+/**
+ * bdev_fput - yield claim to the block device and put the file
+ * @bdev_file: open block device
+ *
+ * Yield claim on the block device and put the file. Ensure that the
+ * block device can be reclaimed before the file is closed, which is a
+ * deferred operation.
+ */
+void bdev_fput(struct file *bdev_file)
+{
+       if (WARN_ON_ONCE(bdev_file->f_op != &def_blk_fops))
+               return;
+
+       if (bdev_file->private_data) {
+               struct block_device *bdev = file_bdev(bdev_file);
+               struct gendisk *disk = bdev->bd_disk;
+
+               mutex_lock(&disk->open_mutex);
+               bdev_yield_write_access(bdev_file);
+               bd_yield_claim(bdev_file);
+               /*
+                * Tell release we already gave up our hold on the
+                * device and if write restrictions are available that
+                * we already gave up write access to the device.
+                */
+               bdev_file->private_data = BDEV_I(bdev_file->f_mapping->host);
+               mutex_unlock(&disk->open_mutex);
+       }
+
+       fput(bdev_file);
+}
+EXPORT_SYMBOL(bdev_fput);
+
 /**
  * lookup_bdev() - Look up a struct block_device by name.
  * @pathname: Name of the block device in the filesystem.
index 0c76137adcaaa5b9d212d789291d681c23c064f6..a9028a2c2db57b0881b7faf7cee2243148d1626c 100644 (file)
@@ -96,7 +96,7 @@ static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
                unsigned long arg)
 {
        uint64_t range[2];
-       uint64_t start, len;
+       uint64_t start, len, end;
        struct inode *inode = bdev->bd_inode;
        int err;
 
@@ -117,7 +117,8 @@ static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
        if (len & 511)
                return -EINVAL;
 
-       if (start + len > bdev_nr_bytes(bdev))
+       if (check_add_overflow(start, len, &end) ||
+           end > bdev_nr_bytes(bdev))
                return -EINVAL;
 
        filemap_invalidate_lock(inode->i_mapping);
index 302dce0b2b5044e20489f4b34bb8f4fde189e597..d67881b50bca28a1e08bb494b00c2bf0ee44957b 100644 (file)
@@ -662,14 +662,15 @@ static int acpi_thermal_register_thermal_zone(struct acpi_thermal *tz,
 {
        int result;
 
-       tz->thermal_zone = thermal_zone_device_register_with_trips("acpitz",
-                                                                  trip_table,
-                                                                  trip_count,
-                                                                  tz,
-                                                                  &acpi_thermal_zone_ops,
-                                                                  NULL,
-                                                                  passive_delay,
-                                                                  tz->polling_frequency * 100);
+       if (trip_count)
+               tz->thermal_zone = thermal_zone_device_register_with_trips(
+                                       "acpitz", trip_table, trip_count, tz,
+                                       &acpi_thermal_zone_ops, NULL, passive_delay,
+                                       tz->polling_frequency * 100);
+       else
+               tz->thermal_zone = thermal_tripless_zone_device_register(
+                                       "acpitz", tz, &acpi_thermal_zone_ops, NULL);
+
        if (IS_ERR(tz->thermal_zone))
                return PTR_ERR(tz->thermal_zone);
 
@@ -901,11 +902,8 @@ static int acpi_thermal_add(struct acpi_device *device)
                trip++;
        }
 
-       if (trip == trip_table) {
+       if (trip == trip_table)
                pr_warn(FW_BUG "No valid trip points!\n");
-               result = -ENODEV;
-               goto free_memory;
-       }
 
        result = acpi_thermal_register_thermal_zone(tz, trip_table,
                                                    trip - trip_table,
index d4a626f87963ba123a4f07a366c28681db4714fa..79a8b0aa37bf37fa8eb44e2dcba7181dfb2222b0 100644 (file)
@@ -30,7 +30,6 @@
 #define ST_AHCI_OOBR_CIMAX_SHIFT       0
 
 struct st_ahci_drv_data {
-       struct platform_device *ahci;
        struct reset_control *pwr;
        struct reset_control *sw_rst;
        struct reset_control *pwr_rst;
index 4ac854f6b05777c669d7de39ab006d963b74bd48..88b2e9817f49dfd200a0f58835a9344cab1e2818 100644 (file)
@@ -1371,9 +1371,6 @@ static struct pci_driver pata_macio_pci_driver = {
        .suspend        = pata_macio_pci_suspend,
        .resume         = pata_macio_pci_resume,
 #endif
-       .driver = {
-               .owner          = THIS_MODULE,
-       },
 };
 MODULE_DEVICE_TABLE(pci, pata_macio_pci_match);
 
index 400b22ee99c33affba7b25ae46de0a2014bfd71f..4c270999ba3ccd9dd70175b02886998cc47e99a9 100644 (file)
@@ -200,7 +200,10 @@ int gemini_sata_start_bridge(struct sata_gemini *sg, unsigned int bridge)
                pclk = sg->sata0_pclk;
        else
                pclk = sg->sata1_pclk;
-       clk_enable(pclk);
+       ret = clk_enable(pclk);
+       if (ret)
+               return ret;
+
        msleep(10);
 
        /* Do not keep clocking a bridge that is not online */
index e82786c63fbd73decc4af68d1a3aff1113411a27..9bec0aee92e04c412fec0abe4ac30173950890fb 100644 (file)
@@ -787,37 +787,6 @@ static const struct ata_port_info mv_port_info[] = {
        },
 };
 
-static const struct pci_device_id mv_pci_tbl[] = {
-       { PCI_VDEVICE(MARVELL, 0x5040), chip_504x },
-       { PCI_VDEVICE(MARVELL, 0x5041), chip_504x },
-       { PCI_VDEVICE(MARVELL, 0x5080), chip_5080 },
-       { PCI_VDEVICE(MARVELL, 0x5081), chip_508x },
-       /* RocketRAID 1720/174x have different identifiers */
-       { PCI_VDEVICE(TTI, 0x1720), chip_6042 },
-       { PCI_VDEVICE(TTI, 0x1740), chip_6042 },
-       { PCI_VDEVICE(TTI, 0x1742), chip_6042 },
-
-       { PCI_VDEVICE(MARVELL, 0x6040), chip_604x },
-       { PCI_VDEVICE(MARVELL, 0x6041), chip_604x },
-       { PCI_VDEVICE(MARVELL, 0x6042), chip_6042 },
-       { PCI_VDEVICE(MARVELL, 0x6080), chip_608x },
-       { PCI_VDEVICE(MARVELL, 0x6081), chip_608x },
-
-       { PCI_VDEVICE(ADAPTEC2, 0x0241), chip_604x },
-
-       /* Adaptec 1430SA */
-       { PCI_VDEVICE(ADAPTEC2, 0x0243), chip_7042 },
-
-       /* Marvell 7042 support */
-       { PCI_VDEVICE(MARVELL, 0x7042), chip_7042 },
-
-       /* Highpoint RocketRAID PCIe series */
-       { PCI_VDEVICE(TTI, 0x2300), chip_7042 },
-       { PCI_VDEVICE(TTI, 0x2310), chip_7042 },
-
-       { }                     /* terminate list */
-};
-
 static const struct mv_hw_ops mv5xxx_ops = {
        .phy_errata             = mv5_phy_errata,
        .enable_leds            = mv5_enable_leds,
@@ -4303,6 +4272,36 @@ static int mv_pci_init_one(struct pci_dev *pdev,
 static int mv_pci_device_resume(struct pci_dev *pdev);
 #endif
 
+static const struct pci_device_id mv_pci_tbl[] = {
+       { PCI_VDEVICE(MARVELL, 0x5040), chip_504x },
+       { PCI_VDEVICE(MARVELL, 0x5041), chip_504x },
+       { PCI_VDEVICE(MARVELL, 0x5080), chip_5080 },
+       { PCI_VDEVICE(MARVELL, 0x5081), chip_508x },
+       /* RocketRAID 1720/174x have different identifiers */
+       { PCI_VDEVICE(TTI, 0x1720), chip_6042 },
+       { PCI_VDEVICE(TTI, 0x1740), chip_6042 },
+       { PCI_VDEVICE(TTI, 0x1742), chip_6042 },
+
+       { PCI_VDEVICE(MARVELL, 0x6040), chip_604x },
+       { PCI_VDEVICE(MARVELL, 0x6041), chip_604x },
+       { PCI_VDEVICE(MARVELL, 0x6042), chip_6042 },
+       { PCI_VDEVICE(MARVELL, 0x6080), chip_608x },
+       { PCI_VDEVICE(MARVELL, 0x6081), chip_608x },
+
+       { PCI_VDEVICE(ADAPTEC2, 0x0241), chip_604x },
+
+       /* Adaptec 1430SA */
+       { PCI_VDEVICE(ADAPTEC2, 0x0243), chip_7042 },
+
+       /* Marvell 7042 support */
+       { PCI_VDEVICE(MARVELL, 0x7042), chip_7042 },
+
+       /* Highpoint RocketRAID PCIe series */
+       { PCI_VDEVICE(TTI, 0x2300), chip_7042 },
+       { PCI_VDEVICE(TTI, 0x2310), chip_7042 },
+
+       { }                     /* terminate list */
+};
 
 static struct pci_driver mv_pci_driver = {
        .name                   = DRV_NAME,
@@ -4315,6 +4314,7 @@ static struct pci_driver mv_pci_driver = {
 #endif
 
 };
+MODULE_DEVICE_TABLE(pci, mv_pci_tbl);
 
 /**
  *      mv_print_info - Dump key info to kernel log for perusal.
@@ -4487,7 +4487,6 @@ static void __exit mv_exit(void)
 MODULE_AUTHOR("Brett Russ");
 MODULE_DESCRIPTION("SCSI low-level driver for Marvell SATA controllers");
 MODULE_LICENSE("GPL v2");
-MODULE_DEVICE_TABLE(pci, mv_pci_tbl);
 MODULE_VERSION(DRV_VERSION);
 MODULE_ALIAS("platform:" DRV_NAME);
 
index b51d7a9d0d90ce0a6c72fe841aad222708c046e1..a482741eb181ffca923519ba7d8ab5e73da1e176 100644 (file)
@@ -957,8 +957,7 @@ static void pdc20621_get_from_dimm(struct ata_host *host, void *psource,
 
        offset -= (idx * window_size);
        idx++;
-       dist = ((long) (window_size - (offset + size))) >= 0 ? size :
-               (long) (window_size - offset);
+       dist = min(size, window_size - offset);
        memcpy_fromio(psource, dimm_mmio + offset / 4, dist);
 
        psource += dist;
@@ -1005,8 +1004,7 @@ static void pdc20621_put_to_dimm(struct ata_host *host, void *psource,
        readl(mmio + PDC_DIMM_WINDOW_CTLR);
        offset -= (idx * window_size);
        idx++;
-       dist = ((long)(s32)(window_size - (offset + size))) >= 0 ? size :
-               (long) (window_size - offset);
+       dist = min(size, window_size - offset);
        memcpy_toio(dimm_mmio + offset / 4, psource, dist);
        writel(0x01, mmio + PDC_GENERAL_CTLR);
        readl(mmio + PDC_GENERAL_CTLR);
index b93f3c5716aeeaa001651962f0af2c73bdfd5685..5f4e03336e68ef459f0df6c20348f8e1996956ba 100644 (file)
@@ -44,6 +44,7 @@ static bool fw_devlink_is_permissive(void);
 static void __fw_devlink_link_to_consumers(struct device *dev);
 static bool fw_devlink_drv_reg_done;
 static bool fw_devlink_best_effort;
+static struct workqueue_struct *device_link_wq;
 
 /**
  * __fwnode_link_add - Create a link between two fwnode_handles.
@@ -533,12 +534,26 @@ static void devlink_dev_release(struct device *dev)
        /*
         * It may take a while to complete this work because of the SRCU
         * synchronization in device_link_release_fn() and if the consumer or
-        * supplier devices get deleted when it runs, so put it into the "long"
-        * workqueue.
+        * supplier devices get deleted when it runs, so put it into the
+        * dedicated workqueue.
         */
-       queue_work(system_long_wq, &link->rm_work);
+       queue_work(device_link_wq, &link->rm_work);
 }
 
+/**
+ * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
+ */
+void device_link_wait_removal(void)
+{
+       /*
+        * devlink removal jobs are queued in the dedicated work queue.
+        * To be sure that all removal jobs are terminated, ensure that any
+        * scheduled work has run to completion.
+        */
+       flush_workqueue(device_link_wq);
+}
+EXPORT_SYMBOL_GPL(device_link_wait_removal);
+
 static struct class devlink_class = {
        .name = "devlink",
        .dev_groups = devlink_groups,
@@ -4164,9 +4179,14 @@ int __init devices_init(void)
        sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj);
        if (!sysfs_dev_char_kobj)
                goto char_kobj_err;
+       device_link_wq = alloc_workqueue("device_link_wq", 0, 0);
+       if (!device_link_wq)
+               goto wq_err;
 
        return 0;
 
+ wq_err:
+       kobject_put(sysfs_dev_char_kobj);
  char_kobj_err:
        kobject_put(sysfs_dev_block_kobj);
  block_kobj_err:
index 41edd6a430eb457a76db36e8b7e9c58758bf2fe4..55999a50ccc0b85bb688be0b36ae9af5384b2965 100644 (file)
@@ -112,7 +112,7 @@ static int regcache_maple_drop(struct regmap *map, unsigned int min,
        unsigned long *entry, *lower, *upper;
        unsigned long lower_index, lower_last;
        unsigned long upper_index, upper_last;
-       int ret;
+       int ret = 0;
 
        lower = NULL;
        upper = NULL;
@@ -145,7 +145,7 @@ static int regcache_maple_drop(struct regmap *map, unsigned int min,
                        upper_index = max + 1;
                        upper_last = mas.last;
 
-                       upper = kmemdup(&entry[max + 1],
+                       upper = kmemdup(&entry[max - mas.index + 1],
                                        ((mas.last - max) *
                                         sizeof(unsigned long)),
                                        map->alloc_flags);
@@ -244,7 +244,7 @@ static int regcache_maple_sync(struct regmap *map, unsigned int min,
        unsigned long lmin = min;
        unsigned long lmax = max;
        unsigned int r, v, sync_start;
-       int ret;
+       int ret = 0;
        bool sync_needed = false;
 
        map->cache_bypass = true;
index 71c39bcd872c7ecaabc67e91f35aa2fb267d6826..ed33cf7192d21672fb389a93c20fbbb887796337 100644 (file)
@@ -1965,10 +1965,10 @@ static int null_add_dev(struct nullb_device *dev)
 
 out_ida_free:
        ida_free(&nullb_indexes, nullb->index);
-out_cleanup_zone:
-       null_free_zoned_dev(dev);
 out_cleanup_disk:
        put_disk(nullb->disk);
+out_cleanup_zone:
+       null_free_zoned_dev(dev);
 out_cleanup_tags:
        if (nullb->tag_set == &nullb->__tag_set)
                blk_mq_free_tag_set(nullb->tag_set);
index f44efbb89c346a8e0b72c3d262ec9213c95aab18..2102377f727b1eecac8d28423e10a3e313b38683 100644 (file)
@@ -1090,7 +1090,7 @@ static int __sev_snp_init_locked(int *error)
        void *arg = &data;
        int cmd, rc = 0;
 
-       if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+       if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
                return -ENODEV;
 
        sev = psp->sev_data;
index 7bc71f4be64a07510507e1c9b7d0f1a61de30e3b..38d19410a2be68cab9f382d48ab7f15493c42af0 100644 (file)
@@ -2060,6 +2060,8 @@ static void bus_reset_work(struct work_struct *work)
 
        ohci->generation = generation;
        reg_write(ohci, OHCI1394_IntEventClear, OHCI1394_busReset);
+       if (param_debug & OHCI_PARAM_DEBUG_BUSRESETS)
+               reg_write(ohci, OHCI1394_IntMaskSet, OHCI1394_busReset);
 
        if (ohci->quirks & QUIRK_RESET_PACKET)
                ohci->request_generation = generation;
@@ -2125,12 +2127,14 @@ static irqreturn_t irq_handler(int irq, void *data)
                return IRQ_NONE;
 
        /*
-        * busReset and postedWriteErr must not be cleared yet
+        * busReset and postedWriteErr events must not be cleared yet
         * (OHCI 1.1 clauses 7.2.3.2 and 13.2.8.1)
         */
        reg_write(ohci, OHCI1394_IntEventClear,
                  event & ~(OHCI1394_busReset | OHCI1394_postedWriteErr));
        log_irqs(ohci, event);
+       if (event & OHCI1394_busReset)
+               reg_write(ohci, OHCI1394_IntMaskClear, OHCI1394_busReset);
 
        if (event & OHCI1394_selfIDComplete)
                queue_work(selfid_workqueue, &ohci->bus_reset_work);
index fa96356102510967cb30538375c9348d122c3910..d09c7d72836551ab510031179a9b95340cd3fb36 100644 (file)
@@ -728,6 +728,25 @@ static u32 line_event_id(int level)
                       GPIO_V2_LINE_EVENT_FALLING_EDGE;
 }
 
+static inline char *make_irq_label(const char *orig)
+{
+       char *new;
+
+       if (!orig)
+               return NULL;
+
+       new = kstrdup_and_replace(orig, '/', ':', GFP_KERNEL);
+       if (!new)
+               return ERR_PTR(-ENOMEM);
+
+       return new;
+}
+
+static inline void free_irq_label(const char *label)
+{
+       kfree(label);
+}
+
 #ifdef CONFIG_HTE
 
 static enum hte_return process_hw_ts_thread(void *p)
@@ -1015,6 +1034,7 @@ static int debounce_setup(struct line *line, unsigned int debounce_period_us)
 {
        unsigned long irqflags;
        int ret, level, irq;
+       char *label;
 
        /* try hardware */
        ret = gpiod_set_debounce(line->desc, debounce_period_us);
@@ -1037,11 +1057,17 @@ static int debounce_setup(struct line *line, unsigned int debounce_period_us)
                        if (irq < 0)
                                return -ENXIO;
 
+                       label = make_irq_label(line->req->label);
+                       if (IS_ERR(label))
+                               return -ENOMEM;
+
                        irqflags = IRQF_TRIGGER_FALLING | IRQF_TRIGGER_RISING;
                        ret = request_irq(irq, debounce_irq_handler, irqflags,
-                                         line->req->label, line);
-                       if (ret)
+                                         label, line);
+                       if (ret) {
+                               free_irq_label(label);
                                return ret;
+                       }
                        line->irq = irq;
                } else {
                        ret = hte_edge_setup(line, GPIO_V2_LINE_FLAG_EDGE_BOTH);
@@ -1083,16 +1109,6 @@ static u32 gpio_v2_line_config_debounce_period(struct gpio_v2_line_config *lc,
        return 0;
 }
 
-static inline char *make_irq_label(const char *orig)
-{
-       return kstrdup_and_replace(orig, '/', ':', GFP_KERNEL);
-}
-
-static inline void free_irq_label(const char *label)
-{
-       kfree(label);
-}
-
 static void edge_detector_stop(struct line *line)
 {
        if (line->irq) {
@@ -1158,8 +1174,8 @@ static int edge_detector_setup(struct line *line,
        irqflags |= IRQF_ONESHOT;
 
        label = make_irq_label(line->req->label);
-       if (!label)
-               return -ENOMEM;
+       if (IS_ERR(label))
+               return PTR_ERR(label);
 
        /* Request a thread to read the events */
        ret = request_threaded_irq(irq, edge_irq_handler, edge_irq_thread,
@@ -2217,8 +2233,8 @@ static int lineevent_create(struct gpio_device *gdev, void __user *ip)
                goto out_free_le;
 
        label = make_irq_label(le->label);
-       if (!label) {
-               ret = -ENOMEM;
+       if (IS_ERR(label)) {
+               ret = PTR_ERR(label);
                goto out_free_le;
        }
 
index 59ccf9a3e1539c93e705139fb2c4a1f7db166001..94903fc1c1459f9fd26eba62628037492e202620 100644 (file)
@@ -1175,6 +1175,9 @@ struct gpio_device *gpio_device_find(const void *data,
 
        list_for_each_entry_srcu(gdev, &gpio_devices, list,
                                 srcu_read_lock_held(&gpio_devices_srcu)) {
+               if (!device_is_registered(&gdev->dev))
+                       continue;
+
                guard(srcu)(&gdev->srcu);
 
                gc = srcu_dereference(gdev->chip, &gdev->srcu);
index bd61e20770a5be20b8978be47a3ba2eaae0c3289..14a2a8473682b00a84e5a0e3907969e719fa5019 100644 (file)
@@ -52,7 +52,7 @@
  * @adapter: I2C adapter for the DDC bus
  * @offset: register offset
  * @buffer: buffer for return data
- * @size: sizo of the buffer
+ * @size: size of the buffer
  *
  * Reads @size bytes from the DP dual mode adaptor registers
  * starting at @offset.
@@ -116,7 +116,7 @@ EXPORT_SYMBOL(drm_dp_dual_mode_read);
  * @adapter: I2C adapter for the DDC bus
  * @offset: register offset
  * @buffer: buffer for write data
- * @size: sizo of the buffer
+ * @size: size of the buffer
  *
  * Writes @size bytes to the DP dual mode adaptor registers
  * starting at @offset.
index 7352bde299d54767fecb34232cb5941a01d6ea88..03bd3c7bd0dc2cf833decec93ce2186cf955a9bd 100644 (file)
@@ -582,7 +582,12 @@ int drm_gem_map_attach(struct dma_buf *dma_buf,
 {
        struct drm_gem_object *obj = dma_buf->priv;
 
-       if (!obj->funcs->get_sg_table)
+       /*
+        * drm_gem_map_dma_buf() requires obj->get_sg_table(), but drivers
+        * that implement their own ->map_dma_buf() do not.
+        */
+       if (dma_buf->ops->map_dma_buf == drm_gem_map_dma_buf &&
+           !obj->funcs->get_sg_table)
                return -ENOSYS;
 
        return drm_gem_pin(obj);
index 4c2f85632391a669c35012439094250e5e9c6dc5..fba73c38e23569fa521e387484b96eadfb988d80 100644 (file)
@@ -118,6 +118,7 @@ gt-y += \
        gt/intel_ggtt_fencing.o \
        gt/intel_gt.o \
        gt/intel_gt_buffer_pool.o \
+       gt/intel_gt_ccs_mode.o \
        gt/intel_gt_clock_utils.o \
        gt/intel_gt_debugfs.o \
        gt/intel_gt_engines_debugfs.o \
index ab2f52d21bad8bad22c184cce6aefac8b0ce5a29..8af9e6128277af050fb95b2902615c4ce9678592 100644 (file)
@@ -2709,15 +2709,6 @@ static void intel_set_pipe_src_size(const struct intel_crtc_state *crtc_state)
         */
        intel_de_write(dev_priv, PIPESRC(pipe),
                       PIPESRC_WIDTH(width - 1) | PIPESRC_HEIGHT(height - 1));
-
-       if (!crtc_state->enable_psr2_su_region_et)
-               return;
-
-       width = drm_rect_width(&crtc_state->psr2_su_area);
-       height = drm_rect_height(&crtc_state->psr2_su_area);
-
-       intel_de_write(dev_priv, PIPE_SRCSZ_ERLY_TPT(pipe),
-                      PIPESRC_WIDTH(width - 1) | PIPESRC_HEIGHT(height - 1));
 }
 
 static bool intel_pipe_is_interlaced(const struct intel_crtc_state *crtc_state)
index fe42688137863ca8dc95057aa3fffe5ada026211..9b1bce2624b9ea1a8e45bcc5b76e4621703d073e 100644 (file)
@@ -47,6 +47,7 @@ struct drm_printer;
 #define HAS_DPT(i915)                  (DISPLAY_VER(i915) >= 13)
 #define HAS_DSB(i915)                  (DISPLAY_INFO(i915)->has_dsb)
 #define HAS_DSC(__i915)                        (DISPLAY_RUNTIME_INFO(__i915)->has_dsc)
+#define HAS_DSC_MST(__i915)            (DISPLAY_VER(__i915) >= 12 && HAS_DSC(__i915))
 #define HAS_FBC(i915)                  (DISPLAY_RUNTIME_INFO(i915)->fbc_mask != 0)
 #define HAS_FPGA_DBG_UNCLAIMED(i915)   (DISPLAY_INFO(i915)->has_fpga_dbg)
 #define HAS_FW_BLC(i915)               (DISPLAY_VER(i915) >= 3)
index 9104f18753b484fde2b439f85494fc77a3d27c87..bf3f942e19c3d38a314d2e5c5065dbb73b36682f 100644 (file)
@@ -1423,6 +1423,8 @@ struct intel_crtc_state {
 
        u32 psr2_man_track_ctl;
 
+       u32 pipe_srcsz_early_tpt;
+
        struct drm_rect psr2_su_area;
 
        /* Variable Refresh Rate state */
index f98ef4b42a448f57d5dfaf0459cba23a00946870..abd62bebc46d0e58d5bc78d8f4500ddcbc6098f1 100644 (file)
@@ -499,7 +499,7 @@ intel_dp_set_source_rates(struct intel_dp *intel_dp)
        /* The values must be in increasing order */
        static const int mtl_rates[] = {
                162000, 216000, 243000, 270000, 324000, 432000, 540000, 675000,
-               810000, 1000000, 1350000, 2000000,
+               810000, 1000000, 2000000,
        };
        static const int icl_rates[] = {
                162000, 216000, 270000, 324000, 432000, 540000, 648000, 810000,
@@ -1422,7 +1422,8 @@ static bool intel_dp_source_supports_fec(struct intel_dp *intel_dp,
        if (DISPLAY_VER(dev_priv) >= 12)
                return true;
 
-       if (DISPLAY_VER(dev_priv) == 11 && encoder->port != PORT_A)
+       if (DISPLAY_VER(dev_priv) == 11 && encoder->port != PORT_A &&
+           !intel_crtc_has_type(pipe_config, INTEL_OUTPUT_DP_MST))
                return true;
 
        return false;
@@ -1917,8 +1918,9 @@ icl_dsc_compute_link_config(struct intel_dp *intel_dp,
        dsc_max_bpp = min(dsc_max_bpp, pipe_bpp - 1);
 
        for (i = 0; i < ARRAY_SIZE(valid_dsc_bpp); i++) {
-               if (valid_dsc_bpp[i] < dsc_min_bpp ||
-                   valid_dsc_bpp[i] > dsc_max_bpp)
+               if (valid_dsc_bpp[i] < dsc_min_bpp)
+                       continue;
+               if (valid_dsc_bpp[i] > dsc_max_bpp)
                        break;
 
                ret = dsc_compute_link_config(intel_dp,
@@ -6557,6 +6559,7 @@ intel_dp_init_connector(struct intel_digital_port *dig_port,
                intel_connector->get_hw_state = intel_ddi_connector_get_hw_state;
        else
                intel_connector->get_hw_state = intel_connector_get_hw_state;
+       intel_connector->sync_state = intel_dp_connector_sync_state;
 
        if (!intel_edp_init_connector(intel_dp, intel_connector)) {
                intel_dp_aux_fini(intel_dp);
index 53aec023ce92fae91e653adf9278b6a81eae3040..b651c990af85f70b17510effdfdba35235dbf51f 100644 (file)
@@ -1355,7 +1355,7 @@ intel_dp_mst_mode_valid_ctx(struct drm_connector *connector,
                return 0;
        }
 
-       if (DISPLAY_VER(dev_priv) >= 10 &&
+       if (HAS_DSC_MST(dev_priv) &&
            drm_dp_sink_supports_dsc(intel_connector->dp.dsc_dpcd)) {
                /*
                 * TBD pass the connector BPC,
index 6927785fd6ff2fed2406a6ca1889cdf455f548e7..b6e539f1342c29ad97f5f46de8b51d9a358375bb 100644 (file)
@@ -1994,6 +1994,7 @@ static void psr_force_hw_tracking_exit(struct intel_dp *intel_dp)
 
 void intel_psr2_program_trans_man_trk_ctl(const struct intel_crtc_state *crtc_state)
 {
+       struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
        struct drm_i915_private *dev_priv = to_i915(crtc_state->uapi.crtc->dev);
        enum transcoder cpu_transcoder = crtc_state->cpu_transcoder;
        struct intel_encoder *encoder;
@@ -2013,6 +2014,12 @@ void intel_psr2_program_trans_man_trk_ctl(const struct intel_crtc_state *crtc_st
 
        intel_de_write(dev_priv, PSR2_MAN_TRK_CTL(cpu_transcoder),
                       crtc_state->psr2_man_track_ctl);
+
+       if (!crtc_state->enable_psr2_su_region_et)
+               return;
+
+       intel_de_write(dev_priv, PIPE_SRCSZ_ERLY_TPT(crtc->pipe),
+                      crtc_state->pipe_srcsz_early_tpt);
 }
 
 static void psr2_man_trk_ctl_calc(struct intel_crtc_state *crtc_state,
@@ -2051,6 +2058,20 @@ exit:
        crtc_state->psr2_man_track_ctl = val;
 }
 
+static u32 psr2_pipe_srcsz_early_tpt_calc(struct intel_crtc_state *crtc_state,
+                                         bool full_update)
+{
+       int width, height;
+
+       if (!crtc_state->enable_psr2_su_region_et || full_update)
+               return 0;
+
+       width = drm_rect_width(&crtc_state->psr2_su_area);
+       height = drm_rect_height(&crtc_state->psr2_su_area);
+
+       return PIPESRC_WIDTH(width - 1) | PIPESRC_HEIGHT(height - 1);
+}
+
 static void clip_area_update(struct drm_rect *overlap_damage_area,
                             struct drm_rect *damage_area,
                             struct drm_rect *pipe_src)
@@ -2095,21 +2116,36 @@ static void intel_psr2_sel_fetch_pipe_alignment(struct intel_crtc_state *crtc_st
  * cursor fully when cursor is in SU area.
  */
 static void
-intel_psr2_sel_fetch_et_alignment(struct intel_crtc_state *crtc_state,
-                                 struct intel_plane_state *cursor_state)
+intel_psr2_sel_fetch_et_alignment(struct intel_atomic_state *state,
+                                 struct intel_crtc *crtc)
 {
-       struct drm_rect inter;
+       struct intel_crtc_state *crtc_state = intel_atomic_get_new_crtc_state(state, crtc);
+       struct intel_plane_state *new_plane_state;
+       struct intel_plane *plane;
+       int i;
 
-       if (!crtc_state->enable_psr2_su_region_et ||
-           !cursor_state->uapi.visible)
+       if (!crtc_state->enable_psr2_su_region_et)
                return;
 
-       inter = crtc_state->psr2_su_area;
-       if (!drm_rect_intersect(&inter, &cursor_state->uapi.dst))
-               return;
+       for_each_new_intel_plane_in_state(state, plane, new_plane_state, i) {
+               struct drm_rect inter;
 
-       clip_area_update(&crtc_state->psr2_su_area, &cursor_state->uapi.dst,
-                        &crtc_state->pipe_src);
+               if (new_plane_state->uapi.crtc != crtc_state->uapi.crtc)
+                       continue;
+
+               if (plane->id != PLANE_CURSOR)
+                       continue;
+
+               if (!new_plane_state->uapi.visible)
+                       continue;
+
+               inter = crtc_state->psr2_su_area;
+               if (!drm_rect_intersect(&inter, &new_plane_state->uapi.dst))
+                       continue;
+
+               clip_area_update(&crtc_state->psr2_su_area, &new_plane_state->uapi.dst,
+                                &crtc_state->pipe_src);
+       }
 }
 
 /*
@@ -2152,8 +2188,7 @@ int intel_psr2_sel_fetch_update(struct intel_atomic_state *state,
 {
        struct drm_i915_private *dev_priv = to_i915(state->base.dev);
        struct intel_crtc_state *crtc_state = intel_atomic_get_new_crtc_state(state, crtc);
-       struct intel_plane_state *new_plane_state, *old_plane_state,
-               *cursor_plane_state = NULL;
+       struct intel_plane_state *new_plane_state, *old_plane_state;
        struct intel_plane *plane;
        bool full_update = false;
        int i, ret;
@@ -2238,13 +2273,6 @@ int intel_psr2_sel_fetch_update(struct intel_atomic_state *state,
                damaged_area.x2 += new_plane_state->uapi.dst.x1 - src.x1;
 
                clip_area_update(&crtc_state->psr2_su_area, &damaged_area, &crtc_state->pipe_src);
-
-               /*
-                * Cursor plane new state is stored to adjust su area to cover
-                * cursor are fully.
-                */
-               if (plane->id == PLANE_CURSOR)
-                       cursor_plane_state = new_plane_state;
        }
 
        /*
@@ -2273,9 +2301,13 @@ int intel_psr2_sel_fetch_update(struct intel_atomic_state *state,
        if (ret)
                return ret;
 
-       /* Adjust su area to cover cursor fully as necessary */
-       if (cursor_plane_state)
-               intel_psr2_sel_fetch_et_alignment(crtc_state, cursor_plane_state);
+       /*
+        * Adjust su area to cover cursor fully as necessary (early
+        * transport). This needs to be done after
+        * drm_atomic_add_affected_planes to ensure visible cursor is added into
+        * affected planes even when cursor is not updated by itself.
+        */
+       intel_psr2_sel_fetch_et_alignment(state, crtc);
 
        intel_psr2_sel_fetch_pipe_alignment(crtc_state);
 
@@ -2338,6 +2370,8 @@ int intel_psr2_sel_fetch_update(struct intel_atomic_state *state,
 
 skip_sel_fetch_set_loop:
        psr2_man_trk_ctl_calc(crtc_state, full_update);
+       crtc_state->pipe_srcsz_early_tpt =
+               psr2_pipe_srcsz_early_tpt_calc(crtc_state, full_update);
        return 0;
 }
 
index fa46d2308b0ed3b0d6bd5054f7ffbf4f5701128a..81bf2216371be6a5e16fe15a1bc23ef6c0b5b46c 100644 (file)
@@ -961,6 +961,9 @@ static int gen8_init_rsvd(struct i915_address_space *vm)
        struct i915_vma *vma;
        int ret;
 
+       if (!intel_gt_needs_wa_16018031267(vm->gt))
+               return 0;
+
        /* The memory will be used only by GPU. */
        obj = i915_gem_object_create_lmem(i915, PAGE_SIZE,
                                          I915_BO_ALLOC_VOLATILE |
index 1ade568ffbfa43409129228881abe60d965e8d10..7a6dc371c384eb3d1f2639d5a767072a3bc554f4 100644 (file)
@@ -908,6 +908,23 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
                info->engine_mask &= ~BIT(GSC0);
        }
 
+       /*
+        * Do not create the command streamer for CCS slices beyond the first.
+        * All the workload submitted to the first engine will be shared among
+        * all the slices.
+        *
+        * Once the user will be allowed to customize the CCS mode, then this
+        * check needs to be removed.
+        */
+       if (IS_DG2(gt->i915)) {
+               u8 first_ccs = __ffs(CCS_MASK(gt));
+
+               /* Mask off all the CCS engines */
+               info->engine_mask &= ~GENMASK(CCS3, CCS0);
+               /* Put back in the first CCS engine */
+               info->engine_mask |= BIT(_CCS(first_ccs));
+       }
+
        return info->engine_mask;
 }
 
index a425db5ed3a22c38af996ce2183d6fa030ed60b2..6a2c2718bcc38e645903031ce00cd667c1ee5411 100644 (file)
@@ -1024,6 +1024,12 @@ enum i915_map_type intel_gt_coherent_map_type(struct intel_gt *gt,
                return I915_MAP_WC;
 }
 
+bool intel_gt_needs_wa_16018031267(struct intel_gt *gt)
+{
+       /* Wa_16018031267, Wa_16018063123 */
+       return IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 55), IP_VER(12, 71));
+}
+
 bool intel_gt_needs_wa_22016122933(struct intel_gt *gt)
 {
        return MEDIA_VER_FULL(gt->i915) == IP_VER(13, 0) && gt->type == GT_MEDIA;
index 608f5c87292857c6b2777bbd809c5bd87a48238c..003eb93b826fd06fa122650b99b7cdbb09fe3161 100644 (file)
@@ -82,17 +82,18 @@ struct drm_printer;
                  ##__VA_ARGS__);                                       \
 } while (0)
 
-#define NEEDS_FASTCOLOR_BLT_WABB(engine) ( \
-       IS_GFX_GT_IP_RANGE(engine->gt, IP_VER(12, 55), IP_VER(12, 71)) && \
-       engine->class == COPY_ENGINE_CLASS && engine->instance == 0)
-
 static inline bool gt_is_root(struct intel_gt *gt)
 {
        return !gt->info.id;
 }
 
+bool intel_gt_needs_wa_16018031267(struct intel_gt *gt);
 bool intel_gt_needs_wa_22016122933(struct intel_gt *gt);
 
+#define NEEDS_FASTCOLOR_BLT_WABB(engine) ( \
+       intel_gt_needs_wa_16018031267(engine->gt) && \
+       engine->class == COPY_ENGINE_CLASS && engine->instance == 0)
+
 static inline struct intel_gt *uc_to_gt(struct intel_uc *uc)
 {
        return container_of(uc, struct intel_gt, uc);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
new file mode 100644 (file)
index 0000000..044219c
--- /dev/null
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "intel_gt.h"
+#include "intel_gt_ccs_mode.h"
+#include "intel_gt_regs.h"
+
+void intel_gt_apply_ccs_mode(struct intel_gt *gt)
+{
+       int cslice;
+       u32 mode = 0;
+       int first_ccs = __ffs(CCS_MASK(gt));
+
+       if (!IS_DG2(gt->i915))
+               return;
+
+       /* Build the value for the fixed CCS load balancing */
+       for (cslice = 0; cslice < I915_MAX_CCS; cslice++) {
+               if (CCS_MASK(gt) & BIT(cslice))
+                       /*
+                        * If available, assign the cslice
+                        * to the first available engine...
+                        */
+                       mode |= XEHP_CCS_MODE_CSLICE(cslice, first_ccs);
+
+               else
+                       /*
+                        * ... otherwise, mark the cslice as
+                        * unavailable if no CCS dispatches here
+                        */
+                       mode |= XEHP_CCS_MODE_CSLICE(cslice,
+                                                    XEHP_CCS_MODE_CSLICE_MASK);
+       }
+
+       intel_uncore_write(gt->uncore, XEHP_CCS_MODE, mode);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
new file mode 100644 (file)
index 0000000..9e5549c
--- /dev/null
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+#ifndef __INTEL_GT_CCS_MODE_H__
+#define __INTEL_GT_CCS_MODE_H__
+
+struct intel_gt;
+
+void intel_gt_apply_ccs_mode(struct intel_gt *gt);
+
+#endif /* __INTEL_GT_CCS_MODE_H__ */
index 50962cfd1353ae4673b27a9bb2437d47633b5651..743fe35667227451436205f9e44514df1c4e809b 100644 (file)
 #define   ECOBITS_PPGTT_CACHE4B                        (0 << 8)
 
 #define GEN12_RCU_MODE                         _MMIO(0x14800)
+#define   XEHP_RCU_MODE_FIXED_SLICE_CCS_MODE   REG_BIT(1)
 #define   GEN12_RCU_MODE_CCS_ENABLE            REG_BIT(0)
 
+#define XEHP_CCS_MODE                          _MMIO(0x14804)
+#define   XEHP_CCS_MODE_CSLICE_MASK            REG_GENMASK(2, 0) /* CCS0-3 + rsvd */
+#define   XEHP_CCS_MODE_CSLICE_WIDTH           ilog2(XEHP_CCS_MODE_CSLICE_MASK + 1)
+#define   XEHP_CCS_MODE_CSLICE(cslice, ccs)    (ccs << (cslice * XEHP_CCS_MODE_CSLICE_WIDTH))
+
 #define CHV_FUSE_GT                            _MMIO(VLV_GUNIT_BASE + 0x2168)
 #define   CHV_FGT_DISABLE_SS0                  (1 << 10)
 #define   CHV_FGT_DISABLE_SS1                  (1 << 11)
index 25413809b9dc99734409210259a9f51f1fffad88..6ec3582c97357780f823865cf0a9a9581b50d288 100644 (file)
@@ -10,6 +10,7 @@
 #include "intel_engine_regs.h"
 #include "intel_gpu_commands.h"
 #include "intel_gt.h"
+#include "intel_gt_ccs_mode.h"
 #include "intel_gt_mcr.h"
 #include "intel_gt_print.h"
 #include "intel_gt_regs.h"
@@ -51,7 +52,8 @@
  *   registers belonging to BCS, VCS or VECS should be implemented in
  *   xcs_engine_wa_init(). Workarounds for registers not belonging to a specific
 *   engine's MMIO range but that are part of the common RCS/CCS reset domain
- *   should be implemented in general_render_compute_wa_init().
+ *   should be implemented in general_render_compute_wa_init(). Settings
+ *   for CCS load balancing should be added in ccs_engine_wa_mode().
  *
  * - GT workarounds: the list of these WAs is applied whenever these registers
  *   revert to their default values: on GPU reset, suspend/resume [1]_, etc.
@@ -2854,6 +2856,28 @@ add_render_compute_tuning_settings(struct intel_gt *gt,
                wa_write_clr(wal, GEN8_GARBCNTL, GEN12_BUS_HASH_CTL_BIT_EXC);
 }
 
+static void ccs_engine_wa_mode(struct intel_engine_cs *engine, struct i915_wa_list *wal)
+{
+       struct intel_gt *gt = engine->gt;
+
+       if (!IS_DG2(gt->i915))
+               return;
+
+       /*
+        * Wa_14019159160: This workaround, along with others, leads to
+        * significant challenges in utilizing load balancing among the
+        * CCS slices. Consequently, an architectural decision has been
+        * made to completely disable automatic CCS load balancing.
+        */
+       wa_masked_en(wal, GEN12_RCU_MODE, XEHP_RCU_MODE_FIXED_SLICE_CCS_MODE);
+
+       /*
+        * After disabling automatic load balancing, we need to assign all
+        * slices to a single CCS. We will call it CCS mode 1.
+        */
+       intel_gt_apply_ccs_mode(gt);
+}
+
 /*
  * The workarounds in this function apply to shared registers in
  * the general render reset domain that aren't tied to a
@@ -3004,8 +3028,10 @@ engine_init_workarounds(struct intel_engine_cs *engine, struct i915_wa_list *wal
         * to a single RCS/CCS engine's workaround list since
         * they're reset as part of the general render domain reset.
         */
-       if (engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE)
+       if (engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE) {
                general_render_compute_wa_init(engine, wal);
+               ccs_engine_wa_mode(engine, wal);
+       }
 
        if (engine->class == COMPUTE_CLASS)
                ccs_engine_wa_init(engine, wal);
index 0a0a11dc9ec03eeba855f47ca57c1ad1c5669f54..ee02cd833c5e4345abdc3fb83968769999ac4340 100644 (file)
@@ -812,15 +812,15 @@ op_remap(struct drm_gpuva_op_remap *r,
        struct drm_gpuva_op_unmap *u = r->unmap;
        struct nouveau_uvma *uvma = uvma_from_va(u->va);
        u64 addr = uvma->va.va.addr;
-       u64 range = uvma->va.va.range;
+       u64 end = uvma->va.va.addr + uvma->va.va.range;
 
        if (r->prev)
                addr = r->prev->va.addr + r->prev->va.range;
 
        if (r->next)
-               range = r->next->va.addr - addr;
+               end = r->next->va.addr;
 
-       op_unmap_range(u, addr, range);
+       op_unmap_range(u, addr, end - addr);
 }
 
 static int
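
The op_remap change above replaces the stored range with an explicit end, so that trimming the start by r->prev can no longer push the unmap window past the original VA span. A userspace sketch of the same clamping; the struct and helper names are simplified stand-ins for the nouveau/drm_gpuva types:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the VA span fields used by op_remap() above. */
struct span { uint64_t addr, range; };

/* The region to unmap is the original span, clamped on the left by an
 * optional kept head (prev) and on the right by an optional kept tail
 * (next). Computing 'end' once keeps the window inside the original VA.
 */
static struct span remap_unmap_window(struct span va,
				      const struct span *prev,
				      const struct span *next)
{
	uint64_t addr = va.addr;
	uint64_t end = va.addr + va.range;

	if (prev)
		addr = prev->addr + prev->range; /* skip the kept head */
	if (next)
		end = next->addr;                /* stop at the kept tail */

	return (struct span){ addr, end - addr };
}
```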
index 986e8d547c94246a5f7bd058e6ddf555ffc651a4..060c74a80eb14b916db3c441e44b137dd15b7336 100644 (file)
@@ -420,7 +420,7 @@ gf100_gr_chan_new(struct nvkm_gr *base, struct nvkm_chan *fifoch,
                        return ret;
        } else {
                ret = nvkm_memory_map(gr->attrib_cb, 0, chan->vmm, chan->attrib_cb,
-                                     &args, sizeof(args));;
+                                     &args, sizeof(args));
                if (ret)
                        return ret;
        }
index 7bcbc4895ec22196acecfd46d0b29490d2c93ee2..271bfa038f5bc90974acd1ed2709d5cbae51ed94 100644 (file)
@@ -25,6 +25,7 @@
 
 #include <subdev/bios.h>
 #include <subdev/bios/init.h>
+#include <subdev/gsp.h>
 
 void
 gm107_devinit_disable(struct nvkm_devinit *init)
@@ -33,10 +34,13 @@ gm107_devinit_disable(struct nvkm_devinit *init)
        u32 r021c00 = nvkm_rd32(device, 0x021c00);
        u32 r021c04 = nvkm_rd32(device, 0x021c04);
 
-       if (r021c00 & 0x00000001)
-               nvkm_subdev_disable(device, NVKM_ENGINE_CE, 0);
-       if (r021c00 & 0x00000004)
-               nvkm_subdev_disable(device, NVKM_ENGINE_CE, 2);
+       /* gsp only wants to enable/disable display */
+       if (!nvkm_gsp_rm(device->gsp)) {
+               if (r021c00 & 0x00000001)
+                       nvkm_subdev_disable(device, NVKM_ENGINE_CE, 0);
+               if (r021c00 & 0x00000004)
+                       nvkm_subdev_disable(device, NVKM_ENGINE_CE, 2);
+       }
        if (r021c04 & 0x00000001)
                nvkm_subdev_disable(device, NVKM_ENGINE_DISP, 0);
 }
index 11b4c9c274a1a597cb3592019d873345c241d1cd..666eb93b1742ca5435cf0567e28e1664122bad8b 100644 (file)
@@ -41,6 +41,7 @@ r535_devinit_new(const struct nvkm_devinit_func *hw,
 
        rm->dtor = r535_devinit_dtor;
        rm->post = hw->post;
+       rm->disable = hw->disable;
 
        ret = nv50_devinit_new_(rm, device, type, inst, pdevinit);
        if (ret)
index 9063ce2546422fd93eb0c0b847cab68aac0ee753..fd8e44992184fa2e63a11e1810ea8e79f9e929a4 100644 (file)
@@ -441,19 +441,19 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
 
        gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
        ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
-                                        val, !val, 1, 1000);
+                                        val, !val, 1, 2000);
        if (ret)
                dev_err(pfdev->dev, "shader power transition timeout");
 
        gpu_write(pfdev, TILER_PWROFF_LO, pfdev->features.tiler_present);
        ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_PWRTRANS_LO,
-                                        val, !val, 1, 1000);
+                                        val, !val, 1, 2000);
        if (ret)
                dev_err(pfdev->dev, "tiler power transition timeout");
 
        gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
        ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
-                                val, !val, 0, 1000);
+                                val, !val, 0, 2000);
        if (ret)
                dev_err(pfdev->dev, "l2 power transition timeout");
 }
index ca85e81fdb44383ffdafdb48a98a843cb1884b71..d32ff3857e65838d460d507440d576601fa02f03 100644 (file)
@@ -193,6 +193,9 @@ static void xe_device_destroy(struct drm_device *dev, void *dummy)
 {
        struct xe_device *xe = to_xe_device(dev);
 
+       if (xe->preempt_fence_wq)
+               destroy_workqueue(xe->preempt_fence_wq);
+
        if (xe->ordered_wq)
                destroy_workqueue(xe->ordered_wq);
 
@@ -258,9 +261,15 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
        INIT_LIST_HEAD(&xe->pinned.external_vram);
        INIT_LIST_HEAD(&xe->pinned.evicted);
 
+       xe->preempt_fence_wq = alloc_ordered_workqueue("xe-preempt-fence-wq", 0);
        xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);
        xe->unordered_wq = alloc_workqueue("xe-unordered-wq", 0, 0);
-       if (!xe->ordered_wq || !xe->unordered_wq) {
+       if (!xe->ordered_wq || !xe->unordered_wq ||
+           !xe->preempt_fence_wq) {
+               /*
+                * Cleanup is done in xe_device_destroy via the
+                * drmm_add_action_or_reset registered above.
+                */
                drm_err(&xe->drm, "Failed to allocate xe workqueues\n");
                err = -ENOMEM;
                goto err;
index 9785eef2e5a4e6566c452e1fa8c45c447fe00b76..8e3a222b41cf0a4dda7286b10566e6def0d97ad4 100644 (file)
@@ -363,6 +363,9 @@ struct xe_device {
        /** @ufence_wq: user fence wait queue */
        wait_queue_head_t ufence_wq;
 
+       /** @preempt_fence_wq: used to serialize preempt fences */
+       struct workqueue_struct *preempt_fence_wq;
+
        /** @ordered_wq: used to serialize compute mode resume */
        struct workqueue_struct *ordered_wq;
 
index 826c8b389672502dfebd6e89c6c1997bf8f0c9a2..cc5e0f75de3c7350770323aeea9570ddd89d48bb 100644 (file)
  *     Unlock all
  */
 
+/*
+ * Add validation and rebinding to the drm_exec locking loop, since both can
+ * trigger eviction which may require sleeping dma_resv locks.
+ */
 static int xe_exec_fn(struct drm_gpuvm_exec *vm_exec)
 {
        struct xe_vm *vm = container_of(vm_exec->vm, struct xe_vm, gpuvm);
-       struct drm_gem_object *obj;
-       unsigned long index;
-       int num_fences;
-       int ret;
-
-       ret = drm_gpuvm_validate(vm_exec->vm, &vm_exec->exec);
-       if (ret)
-               return ret;
-
-       /*
-        * 1 fence slot for the final submit, and 1 more for every per-tile for
-        * GPU bind and 1 extra for CPU bind. Note that there are potentially
-        * many vma per object/dma-resv, however the fence slot will just be
-        * re-used, since they are largely the same timeline and the seqno
-        * should be in order. In the case of CPU bind there is dummy fence used
-        * for all CPU binds, so no need to have a per-tile slot for that.
-        */
-       num_fences = 1 + 1 + vm->xe->info.tile_count;
 
-       /*
-        * We don't know upfront exactly how many fence slots we will need at
-        * the start of the exec, since the TTM bo_validate above can consume
-        * numerous fence slots. Also due to how the dma_resv_reserve_fences()
-        * works it only ensures that at least that many fence slots are
-        * available i.e if there are already 10 slots available and we reserve
-        * two more, it can just noop without reserving anything.  With this it
-        * is quite possible that TTM steals some of the fence slots and then
-        * when it comes time to do the vma binding and final exec stage we are
-        * lacking enough fence slots, leading to some nasty BUG_ON() when
-        * adding the fences. Hence just add our own fences here, after the
-        * validate stage.
-        */
-       drm_exec_for_each_locked_object(&vm_exec->exec, index, obj) {
-               ret = dma_resv_reserve_fences(obj->resv, num_fences);
-               if (ret)
-                       return ret;
-       }
-
-       return 0;
+       /* The fence slot added here is intended for the exec sched job. */
+       return xe_vm_validate_rebind(vm, &vm_exec->exec, 1);
 }
 
 int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
@@ -152,7 +120,6 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
        struct drm_exec *exec = &vm_exec.exec;
        u32 i, num_syncs = 0, num_ufence = 0;
        struct xe_sched_job *job;
-       struct dma_fence *rebind_fence;
        struct xe_vm *vm;
        bool write_locked, skip_retry = false;
        ktime_t end = 0;
@@ -290,39 +257,7 @@ retry:
                goto err_exec;
        }
 
-       /*
-        * Rebind any invalidated userptr or evicted BOs in the VM, non-compute
-        * VM mode only.
-        */
-       rebind_fence = xe_vm_rebind(vm, false);
-       if (IS_ERR(rebind_fence)) {
-               err = PTR_ERR(rebind_fence);
-               goto err_put_job;
-       }
-
-       /*
-        * We store the rebind_fence in the VM so subsequent execs don't get
-        * scheduled before the rebinds of userptrs / evicted BOs is complete.
-        */
-       if (rebind_fence) {
-               dma_fence_put(vm->rebind_fence);
-               vm->rebind_fence = rebind_fence;
-       }
-       if (vm->rebind_fence) {
-               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-                            &vm->rebind_fence->flags)) {
-                       dma_fence_put(vm->rebind_fence);
-                       vm->rebind_fence = NULL;
-               } else {
-                       dma_fence_get(vm->rebind_fence);
-                       err = drm_sched_job_add_dependency(&job->drm,
-                                                          vm->rebind_fence);
-                       if (err)
-                               goto err_put_job;
-               }
-       }
-
-       /* Wait behind munmap style rebinds */
+       /* Wait behind rebinds */
        if (!xe_vm_in_lr_mode(vm)) {
                err = drm_sched_job_add_resv_dependencies(&job->drm,
                                                          xe_vm_resv(vm),
index 62b3d9d1d7cdd4f2d65c55db414a00b7bd7fbd06..462b331950320c0e49901fb09c32a8cdcffc1745 100644 (file)
@@ -148,6 +148,11 @@ struct xe_exec_queue {
        const struct xe_ring_ops *ring_ops;
        /** @entity: DRM sched entity for this exec queue (1 to 1 relationship) */
        struct drm_sched_entity *entity;
+       /**
+        * @tlb_flush_seqno: The seqno of the last rebind TLB flush performed.
+        * Protected by @vm's resv. Unused if @vm == NULL.
+        */
+       u64 tlb_flush_seqno;
        /** @lrc: logical ring context for this exec queue */
        struct xe_lrc lrc[];
 };
index 241c294270d9167f25d1898f8f590c7aabb06ca0..fa9e9853c53ba605e0e35870bed69e7d09d25934 100644 (file)
@@ -100,10 +100,9 @@ static int xe_pf_begin(struct drm_exec *exec, struct xe_vma *vma,
 {
        struct xe_bo *bo = xe_vma_bo(vma);
        struct xe_vm *vm = xe_vma_vm(vma);
-       unsigned int num_shared = 2; /* slots for bind + move */
        int err;
 
-       err = xe_vm_prepare_vma(exec, vma, num_shared);
+       err = xe_vm_lock_vma(exec, vma);
        if (err)
                return err;
 
index f03e077f81a04fcb9344f8c634856acab516c6f1..e598a4363d0190504d9ca8d826d7d996f0d2dfaf 100644 (file)
@@ -61,7 +61,6 @@ int xe_gt_tlb_invalidation_init(struct xe_gt *gt)
        INIT_LIST_HEAD(&gt->tlb_invalidation.pending_fences);
        spin_lock_init(&gt->tlb_invalidation.pending_lock);
        spin_lock_init(&gt->tlb_invalidation.lock);
-       gt->tlb_invalidation.fence_context = dma_fence_context_alloc(1);
        INIT_DELAYED_WORK(&gt->tlb_invalidation.fence_tdr,
                          xe_gt_tlb_fence_timeout);
 
index 70c615dd14986599324a2fb68f766889761c7eb1..07b2f724ec45685feaa4b5ab86b6f2011f65198e 100644 (file)
@@ -177,13 +177,6 @@ struct xe_gt {
                 * xe_gt_tlb_fence_timeout after the timeout interval is over.
                 */
                struct delayed_work fence_tdr;
-               /** @tlb_invalidation.fence_context: context for TLB invalidation fences */
-               u64 fence_context;
-               /**
-                * @tlb_invalidation.fence_seqno: seqno to TLB invalidation fences, protected by
-                * tlb_invalidation.lock
-                */
-               u32 fence_seqno;
                /** @tlb_invalidation.lock: protects TLB invalidation fences */
                spinlock_t lock;
        } tlb_invalidation;
index 7bce2a332603c086bf4bed63c212cdff311f6bbf..7d50c6e89d8e7dc0ba718b9439ef86858e1f3992 100644 (file)
@@ -49,7 +49,7 @@ static bool preempt_fence_enable_signaling(struct dma_fence *fence)
        struct xe_exec_queue *q = pfence->q;
 
        pfence->error = q->ops->suspend(q);
-       queue_work(system_unbound_wq, &pfence->preempt_work);
+       queue_work(q->vm->xe->preempt_fence_wq, &pfence->preempt_work);
        return true;
 }
 
index 7f54bc3e389d58f8023f3a1092aa47d3e852a16b..4efc8c1a3d7a99e00107aeb88c803db26cb62881 100644 (file)
@@ -1135,8 +1135,7 @@ static int invalidation_fence_init(struct xe_gt *gt,
        spin_lock_irq(&gt->tlb_invalidation.lock);
        dma_fence_init(&ifence->base.base, &invalidation_fence_ops,
                       &gt->tlb_invalidation.lock,
-                      gt->tlb_invalidation.fence_context,
-                      ++gt->tlb_invalidation.fence_seqno);
+                      dma_fence_context_alloc(1), 1);
        spin_unlock_irq(&gt->tlb_invalidation.lock);
 
        INIT_LIST_HEAD(&ifence->base.link);
@@ -1236,6 +1235,13 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queue
        err = xe_pt_prepare_bind(tile, vma, entries, &num_entries);
        if (err)
                goto err;
+
+       err = dma_resv_reserve_fences(xe_vm_resv(vm), 1);
+       if (!err && !xe_vma_has_no_bo(vma) && !xe_vma_bo(vma)->vm)
+               err = dma_resv_reserve_fences(xe_vma_bo(vma)->ttm.base.resv, 1);
+       if (err)
+               goto err;
+
        xe_tile_assert(tile, num_entries <= ARRAY_SIZE(entries));
 
        xe_vm_dbg_print_entries(tile_to_xe(tile), entries, num_entries);
@@ -1254,11 +1260,13 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queue
         * non-faulting LR, in particular on user-space batch buffer chaining,
         * it needs to be done here.
         */
-       if ((rebind && !xe_vm_in_lr_mode(vm) && !vm->batch_invalidate_tlb) ||
-           (!rebind && xe_vm_has_scratch(vm) && xe_vm_in_preempt_fence_mode(vm))) {
+       if (!rebind && xe_vm_has_scratch(vm) && xe_vm_in_preempt_fence_mode(vm)) {
                ifence = kzalloc(sizeof(*ifence), GFP_KERNEL);
                if (!ifence)
                        return ERR_PTR(-ENOMEM);
+       } else if (rebind && !xe_vm_in_lr_mode(vm)) {
+               /* We also bump if batch_invalidate_tlb is true */
+               vm->tlb_flush_seqno++;
        }
 
        rfence = kzalloc(sizeof(*rfence), GFP_KERNEL);
@@ -1297,7 +1305,7 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queue
                }
 
                /* add shared fence now for pagetable delayed destroy */
-               dma_resv_add_fence(xe_vm_resv(vm), fence, !rebind &&
+               dma_resv_add_fence(xe_vm_resv(vm), fence, rebind ||
                                   last_munmap_rebind ?
                                   DMA_RESV_USAGE_KERNEL :
                                   DMA_RESV_USAGE_BOOKKEEP);
@@ -1576,6 +1584,7 @@ __xe_pt_unbind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queu
        struct dma_fence *fence = NULL;
        struct invalidation_fence *ifence;
        struct xe_range_fence *rfence;
+       int err;
 
        LLIST_HEAD(deferred);
 
@@ -1593,6 +1602,12 @@ __xe_pt_unbind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queu
        xe_pt_calc_rfence_interval(vma, &unbind_pt_update, entries,
                                   num_entries);
 
+       err = dma_resv_reserve_fences(xe_vm_resv(vm), 1);
+       if (!err && !xe_vma_has_no_bo(vma) && !xe_vma_bo(vma)->vm)
+               err = dma_resv_reserve_fences(xe_vma_bo(vma)->ttm.base.resv, 1);
+       if (err)
+               return ERR_PTR(err);
+
        ifence = kzalloc(sizeof(*ifence), GFP_KERNEL);
        if (!ifence)
                return ERR_PTR(-ENOMEM);
index c4edffcd4a320666d576d950ab15dc614545a053..5b2b37b598130ac464a2c344bad52b731e778e28 100644 (file)
@@ -219,10 +219,9 @@ static void __emit_job_gen12_simple(struct xe_sched_job *job, struct xe_lrc *lrc
 {
        u32 dw[MAX_JOB_SIZE_DW], i = 0;
        u32 ppgtt_flag = get_ppgtt_flag(job);
-       struct xe_vm *vm = job->q->vm;
        struct xe_gt *gt = job->q->gt;
 
-       if (vm && vm->batch_invalidate_tlb) {
+       if (job->ring_ops_flush_tlb) {
                dw[i++] = preparser_disable(true);
                i = emit_flush_imm_ggtt(xe_lrc_start_seqno_ggtt_addr(lrc),
                                        seqno, true, dw, i);
@@ -270,7 +269,6 @@ static void __emit_job_gen12_video(struct xe_sched_job *job, struct xe_lrc *lrc,
        struct xe_gt *gt = job->q->gt;
        struct xe_device *xe = gt_to_xe(gt);
        bool decode = job->q->class == XE_ENGINE_CLASS_VIDEO_DECODE;
-       struct xe_vm *vm = job->q->vm;
 
        dw[i++] = preparser_disable(true);
 
@@ -282,13 +280,13 @@ static void __emit_job_gen12_video(struct xe_sched_job *job, struct xe_lrc *lrc,
                        i = emit_aux_table_inv(gt, VE0_AUX_INV, dw, i);
        }
 
-       if (vm && vm->batch_invalidate_tlb)
+       if (job->ring_ops_flush_tlb)
                i = emit_flush_imm_ggtt(xe_lrc_start_seqno_ggtt_addr(lrc),
                                        seqno, true, dw, i);
 
        dw[i++] = preparser_disable(false);
 
-       if (!vm || !vm->batch_invalidate_tlb)
+       if (!job->ring_ops_flush_tlb)
                i = emit_store_imm_ggtt(xe_lrc_start_seqno_ggtt_addr(lrc),
                                        seqno, dw, i);
 
@@ -317,7 +315,6 @@ static void __emit_job_gen12_render_compute(struct xe_sched_job *job,
        struct xe_gt *gt = job->q->gt;
        struct xe_device *xe = gt_to_xe(gt);
        bool lacks_render = !(gt->info.engine_mask & XE_HW_ENGINE_RCS_MASK);
-       struct xe_vm *vm = job->q->vm;
        u32 mask_flags = 0;
 
        dw[i++] = preparser_disable(true);
@@ -327,7 +324,7 @@ static void __emit_job_gen12_render_compute(struct xe_sched_job *job,
                mask_flags = PIPE_CONTROL_3D_ENGINE_FLAGS;
 
        /* See __xe_pt_bind_vma() for a discussion on TLB invalidations. */
-       i = emit_pipe_invalidate(mask_flags, vm && vm->batch_invalidate_tlb, dw, i);
+       i = emit_pipe_invalidate(mask_flags, job->ring_ops_flush_tlb, dw, i);
 
        /* hsdes: 1809175790 */
        if (has_aux_ccs(xe))
index 8151ddafb940756d87dbca45e6d3407354535ce4..b0c7fa4693cfe4a999b93b3878cb72c6150ebcbd 100644 (file)
@@ -250,6 +250,16 @@ bool xe_sched_job_completed(struct xe_sched_job *job)
 
 void xe_sched_job_arm(struct xe_sched_job *job)
 {
+       struct xe_exec_queue *q = job->q;
+       struct xe_vm *vm = q->vm;
+
+       if (vm && !xe_sched_job_is_migration(q) && !xe_vm_in_lr_mode(vm) &&
+           (vm->batch_invalidate_tlb || vm->tlb_flush_seqno != q->tlb_flush_seqno)) {
+               xe_vm_assert_held(vm);
+               q->tlb_flush_seqno = vm->tlb_flush_seqno;
+               job->ring_ops_flush_tlb = true;
+       }
+
        drm_sched_job_arm(&job->drm);
 }
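
The tlb_flush_seqno handshake added across xe_vm, xe_exec_queue and xe_sched_job_arm() above reduces to a few lines: the VM bumps a counter whenever a rebind requires a TLB flush, and each queue records the last value it flushed for, so only the first job armed per queue after the bump carries ring_ops_flush_tlb. A minimal stand-alone sketch, with simplified stand-in structs rather than the real xe types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the fields this patch adds to xe_vm and xe_exec_queue. */
struct vm    { uint64_t tlb_flush_seqno; bool batch_invalidate_tlb; };
struct queue { uint64_t tlb_flush_seqno; };

/* Mirror of the check in xe_sched_job_arm(): returns the value that
 * would be assigned to job->ring_ops_flush_tlb. On a seqno mismatch the
 * queue consumes the request, so later jobs on this queue skip the flush.
 */
static bool arm_job(struct vm *vm, struct queue *q)
{
	if (vm->batch_invalidate_tlb ||
	    vm->tlb_flush_seqno != q->tlb_flush_seqno) {
		q->tlb_flush_seqno = vm->tlb_flush_seqno;
		return true;
	}
	return false;
}
```

One bump, one flush per queue: a second job armed on the same queue sees matching seqnos and skips the flush, while batch_invalidate_tlb forces it on every job.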
 
index b1d83da50a53da59b6d72af1bbd21c8d98ca3517..5e12724219fdd485f2b770bd4b31e78aa2ab42af 100644 (file)
@@ -39,6 +39,8 @@ struct xe_sched_job {
        } user_fence;
        /** @migrate_flush_flags: Additional flush flags for migration jobs */
        u32 migrate_flush_flags;
+       /** @ring_ops_flush_tlb: The ring ops need to flush TLB before payload. */
+       bool ring_ops_flush_tlb;
        /** @batch_addr: batch buffer address of job */
        u64 batch_addr[];
 };
index f88faef4142bde018f336d33d3e2eed726a4bc29..62d1ef8867a84351ae7444d63113d8867dfbb0c5 100644 (file)
@@ -482,17 +482,53 @@ static int xe_gpuvm_validate(struct drm_gpuvm_bo *vm_bo, struct drm_exec *exec)
        return 0;
 }
 
+/**
+ * xe_vm_validate_rebind() - Validate buffer objects and rebind vmas
+ * @vm: The vm for which we are rebinding.
+ * @exec: The struct drm_exec with the locked GEM objects.
+ * @num_fences: The number of fences to reserve for the operation, not
+ * including rebinds and validations.
+ *
+ * Validates all evicted gem objects and rebinds their vmas. Note that
+ * rebindings may cause evictions and hence the validation-rebind
+ * sequence is rerun until there are no more objects to validate.
+ *
+ * Return: 0 on success, negative error code on error. In particular,
+ * may return -EINTR or -ERESTARTSYS if interrupted, and -EDEADLK if
+ * the drm_exec transaction needs to be restarted.
+ */
+int xe_vm_validate_rebind(struct xe_vm *vm, struct drm_exec *exec,
+                         unsigned int num_fences)
+{
+       struct drm_gem_object *obj;
+       unsigned long index;
+       int ret;
+
+       do {
+               ret = drm_gpuvm_validate(&vm->gpuvm, exec);
+               if (ret)
+                       return ret;
+
+               ret = xe_vm_rebind(vm, false);
+               if (ret)
+                       return ret;
+       } while (!list_empty(&vm->gpuvm.evict.list));
+
+       drm_exec_for_each_locked_object(exec, index, obj) {
+               ret = dma_resv_reserve_fences(obj->resv, num_fences);
+               if (ret)
+                       return ret;
+       }
+
+       return 0;
+}
+
 static int xe_preempt_work_begin(struct drm_exec *exec, struct xe_vm *vm,
                                 bool *done)
 {
        int err;
 
-       /*
-        * 1 fence for each preempt fence plus a fence for each tile from a
-        * possible rebind
-        */
-       err = drm_gpuvm_prepare_vm(&vm->gpuvm, exec, vm->preempt.num_exec_queues +
-                                  vm->xe->info.tile_count);
+       err = drm_gpuvm_prepare_vm(&vm->gpuvm, exec, 0);
        if (err)
                return err;
 
@@ -507,7 +543,7 @@ static int xe_preempt_work_begin(struct drm_exec *exec, struct xe_vm *vm,
                return 0;
        }
 
-       err = drm_gpuvm_prepare_objects(&vm->gpuvm, exec, vm->preempt.num_exec_queues);
+       err = drm_gpuvm_prepare_objects(&vm->gpuvm, exec, 0);
        if (err)
                return err;
 
@@ -515,14 +551,19 @@ static int xe_preempt_work_begin(struct drm_exec *exec, struct xe_vm *vm,
        if (err)
                return err;
 
-       return drm_gpuvm_validate(&vm->gpuvm, exec);
+       /*
+        * Add validation and rebinding to the locking loop since both can
+        * cause evictions which may require blocking dma_resv locks.
+        * The fence reservation here is intended for the new preempt fences
+        * we attach at the end of the rebind work.
+        */
+       return xe_vm_validate_rebind(vm, exec, vm->preempt.num_exec_queues);
 }
 
 static void preempt_rebind_work_func(struct work_struct *w)
 {
        struct xe_vm *vm = container_of(w, struct xe_vm, preempt.rebind_work);
        struct drm_exec exec;
-       struct dma_fence *rebind_fence;
        unsigned int fence_count = 0;
        LIST_HEAD(preempt_fences);
        ktime_t end = 0;
@@ -568,18 +609,11 @@ retry:
        if (err)
                goto out_unlock;
 
-       rebind_fence = xe_vm_rebind(vm, true);
-       if (IS_ERR(rebind_fence)) {
-               err = PTR_ERR(rebind_fence);
+       err = xe_vm_rebind(vm, true);
+       if (err)
                goto out_unlock;
-       }
 
-       if (rebind_fence) {
-               dma_fence_wait(rebind_fence, false);
-               dma_fence_put(rebind_fence);
-       }
-
-       /* Wait on munmap style VM unbinds */
+       /* Wait on rebinds and munmap style VM unbinds */
        wait = dma_resv_wait_timeout(xe_vm_resv(vm),
                                     DMA_RESV_USAGE_KERNEL,
                                     false, MAX_SCHEDULE_TIMEOUT);
@@ -773,14 +807,14 @@ xe_vm_bind_vma(struct xe_vma *vma, struct xe_exec_queue *q,
               struct xe_sync_entry *syncs, u32 num_syncs,
               bool first_op, bool last_op);
 
-struct dma_fence *xe_vm_rebind(struct xe_vm *vm, bool rebind_worker)
+int xe_vm_rebind(struct xe_vm *vm, bool rebind_worker)
 {
-       struct dma_fence *fence = NULL;
+       struct dma_fence *fence;
        struct xe_vma *vma, *next;
 
        lockdep_assert_held(&vm->lock);
        if (xe_vm_in_lr_mode(vm) && !rebind_worker)
-               return NULL;
+               return 0;
 
        xe_vm_assert_held(vm);
        list_for_each_entry_safe(vma, next, &vm->rebind_list,
@@ -788,17 +822,17 @@ struct dma_fence *xe_vm_rebind(struct xe_vm *vm, bool rebind_worker)
                xe_assert(vm->xe, vma->tile_present);
 
                list_del_init(&vma->combined_links.rebind);
-               dma_fence_put(fence);
                if (rebind_worker)
                        trace_xe_vma_rebind_worker(vma);
                else
                        trace_xe_vma_rebind_exec(vma);
                fence = xe_vm_bind_vma(vma, NULL, NULL, 0, false, false);
                if (IS_ERR(fence))
-                       return fence;
+                       return PTR_ERR(fence);
+               dma_fence_put(fence);
        }
 
-       return fence;
+       return 0;
 }
 
 static void xe_vma_free(struct xe_vma *vma)
@@ -1004,35 +1038,26 @@ static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
 }
 
 /**
- * xe_vm_prepare_vma() - drm_exec utility to lock a vma
+ * xe_vm_lock_vma() - drm_exec utility to lock a vma
  * @exec: The drm_exec object we're currently locking for.
 * @vma: The vma for which we want to lock the vm resv and any attached
  * object's resv.
- * @num_shared: The number of dma-fence slots to pre-allocate in the
- * objects' reservation objects.
  *
  * Return: 0 on success, negative error code on error. In particular
  * may return -EDEADLK on WW transaction contention and -EINTR if
  * an interruptible wait is terminated by a signal.
  */
-int xe_vm_prepare_vma(struct drm_exec *exec, struct xe_vma *vma,
-                     unsigned int num_shared)
+int xe_vm_lock_vma(struct drm_exec *exec, struct xe_vma *vma)
 {
        struct xe_vm *vm = xe_vma_vm(vma);
        struct xe_bo *bo = xe_vma_bo(vma);
        int err;
 
        XE_WARN_ON(!vm);
-       if (num_shared)
-               err = drm_exec_prepare_obj(exec, xe_vm_obj(vm), num_shared);
-       else
-               err = drm_exec_lock_obj(exec, xe_vm_obj(vm));
-       if (!err && bo && !bo->vm) {
-               if (num_shared)
-                       err = drm_exec_prepare_obj(exec, &bo->ttm.base, num_shared);
-               else
-                       err = drm_exec_lock_obj(exec, &bo->ttm.base);
-       }
+
+       err = drm_exec_lock_obj(exec, xe_vm_obj(vm));
+       if (!err && bo && !bo->vm)
+               err = drm_exec_lock_obj(exec, &bo->ttm.base);
 
        return err;
 }
@@ -1044,7 +1069,7 @@ static void xe_vma_destroy_unlocked(struct xe_vma *vma)
 
        drm_exec_init(&exec, 0, 0);
        drm_exec_until_all_locked(&exec) {
-               err = xe_vm_prepare_vma(&exec, vma, 0);
+               err = xe_vm_lock_vma(&exec, vma);
                drm_exec_retry_on_contention(&exec);
                if (XE_WARN_ON(err))
                        break;
@@ -1589,7 +1614,6 @@ static void vm_destroy_work_func(struct work_struct *w)
                XE_WARN_ON(vm->pt_root[id]);
 
        trace_xe_vm_free(vm);
-       dma_fence_put(vm->rebind_fence);
        kfree(vm);
 }
 
@@ -2512,7 +2536,7 @@ static int op_execute(struct drm_exec *exec, struct xe_vm *vm,
 
        lockdep_assert_held_write(&vm->lock);
 
-       err = xe_vm_prepare_vma(exec, vma, 1);
+       err = xe_vm_lock_vma(exec, vma);
        if (err)
                return err;
 
index 6df1f1c7f85d98a2b948ba41ec9f1ed5a287faf0..306cd0934a190ba0d5580787522c59e762b3b163 100644 (file)
@@ -207,7 +207,7 @@ int __xe_vm_userptr_needs_repin(struct xe_vm *vm);
 
 int xe_vm_userptr_check_repin(struct xe_vm *vm);
 
-struct dma_fence *xe_vm_rebind(struct xe_vm *vm, bool rebind_worker);
+int xe_vm_rebind(struct xe_vm *vm, bool rebind_worker);
 
 int xe_vm_invalidate_vma(struct xe_vma *vma);
 
@@ -242,8 +242,10 @@ bool xe_vm_validate_should_retry(struct drm_exec *exec, int err, ktime_t *end);
 
 int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id);
 
-int xe_vm_prepare_vma(struct drm_exec *exec, struct xe_vma *vma,
-                     unsigned int num_shared);
+int xe_vm_lock_vma(struct drm_exec *exec, struct xe_vma *vma);
+
+int xe_vm_validate_rebind(struct xe_vm *vm, struct drm_exec *exec,
+                         unsigned int num_fences);
 
 /**
  * xe_vm_resv() - Returns the vm's reservation object
index ae5fb565f6bf48d52e29c811a8333793e4e128fd..badf3945083d56723cc477b3074929a4db316753 100644 (file)
@@ -177,9 +177,6 @@ struct xe_vm {
         */
        struct list_head rebind_list;
 
-       /** @rebind_fence: rebind fence from execbuf */
-       struct dma_fence *rebind_fence;
-
        /**
         * @destroy_work: worker to destroy VM, needed as a dma_fence signaling
         * from an irq context can be last put and the destroy needs to be able
@@ -264,6 +261,11 @@ struct xe_vm {
                bool capture_once;
        } error_capture;
 
+       /**
+        * @tlb_flush_seqno: Required TLB flush seqno for the next exec,
+        * protected by the vm resv.
+        */
+       u64 tlb_flush_seqno;
        /** @batch_invalidate_tlb: Always invalidate TLB before batch start */
        bool batch_invalidate_tlb;
        /** @xef: XE file handle for tracking this VM's drm client */
index 76f79b68cef84548b86def688b6ba95f4aa46335..888ca636f3f3b009ca542747ddca2119a79daa61 100644 (file)
@@ -324,6 +324,7 @@ static void decode_ISR(unsigned int val)
        decode_bits(KERN_DEBUG "ISR", isr_bits, ARRAY_SIZE(isr_bits), val);
 }
 
+#ifdef CONFIG_I2C_PXA_SLAVE
 static const struct bits icr_bits[] = {
        PXA_BIT(ICR_START,  "START",    NULL),
        PXA_BIT(ICR_STOP,   "STOP",     NULL),
@@ -342,7 +343,6 @@ static const struct bits icr_bits[] = {
        PXA_BIT(ICR_UR,     "UR",               "ur"),
 };
 
-#ifdef CONFIG_I2C_PXA_SLAVE
 static void decode_ICR(unsigned int val)
 {
        decode_bits(KERN_DEBUG "ICR", icr_bits, ARRAY_SIZE(icr_bits), val);
index e7a44929f0daf71f017ec8fe0d1b56243ab47ba5..33228c1c8980f32a5e8af323587601a4783b5b7f 100644 (file)
@@ -3228,7 +3228,7 @@ out:
 static void iommu_snp_enable(void)
 {
 #ifdef CONFIG_KVM_AMD_SEV
-       if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+       if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
                return;
        /*
         * The SNP support requires that IOMMU must be enabled, and is
@@ -3236,12 +3236,14 @@ static void iommu_snp_enable(void)
         */
        if (no_iommu || iommu_default_passthrough()) {
                pr_err("SNP: IOMMU disabled or configured in passthrough mode, SNP cannot be supported.\n");
+               cc_platform_clear(CC_ATTR_HOST_SEV_SNP);
                return;
        }
 
        amd_iommu_snp_en = check_feature(FEATURE_SNP);
        if (!amd_iommu_snp_en) {
                pr_err("SNP: IOMMU SNP feature not enabled, SNP cannot be supported.\n");
+               cc_platform_clear(CC_ATTR_HOST_SEV_SNP);
                return;
        }
 
index 4c34344dc7dcb876e29d66358bcfcc79e1e77705..d7027d600208fc2f7233c5ca01ab7d590ef33042 100644 (file)
@@ -50,12 +50,12 @@ static void mtk_vcodec_vpu_reset_dec_handler(void *priv)
 
        dev_err(&dev->plat_dev->dev, "Watchdog timeout!!");
 
-       mutex_lock(&dev->dev_mutex);
+       mutex_lock(&dev->dev_ctx_lock);
        list_for_each_entry(ctx, &dev->ctx_list, list) {
                ctx->state = MTK_STATE_ABORT;
                mtk_v4l2_vdec_dbg(0, ctx, "[%d] Change to state MTK_STATE_ABORT", ctx->id);
        }
-       mutex_unlock(&dev->dev_mutex);
+       mutex_unlock(&dev->dev_ctx_lock);
 }
 
 static void mtk_vcodec_vpu_reset_enc_handler(void *priv)
@@ -65,12 +65,12 @@ static void mtk_vcodec_vpu_reset_enc_handler(void *priv)
 
        dev_err(&dev->plat_dev->dev, "Watchdog timeout!!");
 
-       mutex_lock(&dev->dev_mutex);
+       mutex_lock(&dev->dev_ctx_lock);
        list_for_each_entry(ctx, &dev->ctx_list, list) {
                ctx->state = MTK_STATE_ABORT;
                mtk_v4l2_vdec_dbg(0, ctx, "[%d] Change to state MTK_STATE_ABORT", ctx->id);
        }
-       mutex_unlock(&dev->dev_mutex);
+       mutex_unlock(&dev->dev_ctx_lock);
 }
 
 static const struct mtk_vcodec_fw_ops mtk_vcodec_vpu_msg = {
index f47c98faf068b6250de0c46a45efbca641a0e0ad..2073781ccadb156116b1cbe86c49b3e06b7a93f3 100644 (file)
@@ -268,7 +268,9 @@ static int fops_vcodec_open(struct file *file)
 
        ctx->dev->vdec_pdata->init_vdec_params(ctx);
 
+       mutex_lock(&dev->dev_ctx_lock);
        list_add(&ctx->list, &dev->ctx_list);
+       mutex_unlock(&dev->dev_ctx_lock);
        mtk_vcodec_dbgfs_create(ctx);
 
        mutex_unlock(&dev->dev_mutex);
@@ -311,7 +313,9 @@ static int fops_vcodec_release(struct file *file)
        v4l2_ctrl_handler_free(&ctx->ctrl_hdl);
 
        mtk_vcodec_dbgfs_remove(dev, ctx->id);
+       mutex_lock(&dev->dev_ctx_lock);
        list_del_init(&ctx->list);
+       mutex_unlock(&dev->dev_ctx_lock);
        kfree(ctx);
        mutex_unlock(&dev->dev_mutex);
        return 0;
@@ -404,6 +408,7 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
        for (i = 0; i < MTK_VDEC_HW_MAX; i++)
                mutex_init(&dev->dec_mutex[i]);
        mutex_init(&dev->dev_mutex);
+       mutex_init(&dev->dev_ctx_lock);
        spin_lock_init(&dev->irqlock);
 
        snprintf(dev->v4l2_dev.name, sizeof(dev->v4l2_dev.name), "%s",
index 849b89dd205c21d686d7fcfc3624df79f99e4449..85b2c0d3d8bcdd3a59027ddccd1efeb4371292c9 100644 (file)
@@ -241,6 +241,7 @@ struct mtk_vcodec_dec_ctx {
  *
  * @dec_mutex: decoder hardware lock
  * @dev_mutex: video_device lock
+ * @dev_ctx_lock: lock protecting the context list
  * @decode_workqueue: decode work queue
  *
  * @irqlock: protect data access by irq handler and work thread
@@ -282,6 +283,7 @@ struct mtk_vcodec_dec_dev {
        /* decoder hardware mutex lock */
        struct mutex dec_mutex[MTK_VDEC_HW_MAX];
        struct mutex dev_mutex;
+       struct mutex dev_ctx_lock;
        struct workqueue_struct *decode_workqueue;
 
        spinlock_t irqlock;
index 06ed47df693bfd049fe5537abb6b994c1b740b85..21836dd6ef85a36f4bfc7e781f0a5b57f6c1962d 100644 (file)
@@ -869,7 +869,6 @@ static int vdec_hevc_slice_init(struct mtk_vcodec_dec_ctx *ctx)
        inst->vpu.codec_type = ctx->current_codec;
        inst->vpu.capture_type = ctx->capture_fourcc;
 
-       ctx->drv_handle = inst;
        err = vpu_dec_init(&inst->vpu);
        if (err) {
                mtk_vdec_err(ctx, "vdec_hevc init err=%d", err);
@@ -898,6 +897,7 @@ static int vdec_hevc_slice_init(struct mtk_vcodec_dec_ctx *ctx)
        mtk_vdec_debug(ctx, "lat hevc instance >> %p, codec_type = 0x%x",
                       inst, inst->vpu.codec_type);
 
+       ctx->drv_handle = inst;
        return 0;
 error_free_inst:
        kfree(inst);
index 19407f9bc773c34445613ed8311fb86b1b565d38..987b3d71b662ac98495604e535f6ece7b733b8dd 100644 (file)
@@ -449,7 +449,7 @@ static int vdec_vp8_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
                       inst->frm_cnt, y_fb_dma, c_fb_dma, fb);
 
        inst->cur_fb = fb;
-       dec->bs_dma = (unsigned long)bs->dma_addr;
+       dec->bs_dma = (uint64_t)bs->dma_addr;
        dec->bs_sz = bs->size;
        dec->cur_y_fb_dma = y_fb_dma;
        dec->cur_c_fb_dma = c_fb_dma;
index 55355fa7009083cacba971e0e3f0981e09f80300..039082f600c813f8e703fd283843ee1bddbe31c8 100644 (file)
@@ -16,6 +16,7 @@
 #include "../vdec_drv_base.h"
 #include "../vdec_vpu_if.h"
 
+#define VP9_MAX_SUPER_FRAMES_NUM 8
 #define VP9_SUPER_FRAME_BS_SZ 64
 #define MAX_VP9_DPB_SIZE       9
 
@@ -133,11 +134,11 @@ struct vp9_sf_ref_fb {
  */
 struct vdec_vp9_vsi {
        unsigned char sf_bs_buf[VP9_SUPER_FRAME_BS_SZ];
-       struct vp9_sf_ref_fb sf_ref_fb[VP9_MAX_FRM_BUF_NUM-1];
+       struct vp9_sf_ref_fb sf_ref_fb[VP9_MAX_SUPER_FRAMES_NUM];
        int sf_next_ref_fb_idx;
        unsigned int sf_frm_cnt;
-       unsigned int sf_frm_offset[VP9_MAX_FRM_BUF_NUM-1];
-       unsigned int sf_frm_sz[VP9_MAX_FRM_BUF_NUM-1];
+       unsigned int sf_frm_offset[VP9_MAX_SUPER_FRAMES_NUM];
+       unsigned int sf_frm_sz[VP9_MAX_SUPER_FRAMES_NUM];
        unsigned int sf_frm_idx;
        unsigned int sf_init;
        struct vdec_fb fb;
@@ -526,7 +527,7 @@ static void vp9_swap_frm_bufs(struct vdec_vp9_inst *inst)
        /* if this is a super frame and not the last sub-frame, get next fb for
         * sub-frame decode
         */
-       if (vsi->sf_frm_cnt > 0 && vsi->sf_frm_idx != vsi->sf_frm_cnt - 1)
+       if (vsi->sf_frm_cnt > 0 && vsi->sf_frm_idx != vsi->sf_frm_cnt)
                vsi->sf_next_ref_fb_idx = vp9_get_sf_ref_fb(inst);
 }
 
@@ -735,7 +736,7 @@ static void get_free_fb(struct vdec_vp9_inst *inst, struct vdec_fb **out_fb)
 
 static int validate_vsi_array_indexes(struct vdec_vp9_inst *inst,
                struct vdec_vp9_vsi *vsi) {
-       if (vsi->sf_frm_idx >= VP9_MAX_FRM_BUF_NUM - 1) {
+       if (vsi->sf_frm_idx > VP9_MAX_SUPER_FRAMES_NUM) {
                mtk_vdec_err(inst->ctx, "Invalid vsi->sf_frm_idx=%u.", vsi->sf_frm_idx);
                return -EIO;
        }
index cf48d09b78d7a156440e1343448af946342d26e9..eea709d93820919d33d13184af7281fe9f0035fc 100644 (file)
@@ -1074,7 +1074,7 @@ static int vdec_vp9_slice_setup_tile_buffer(struct vdec_vp9_slice_instance *inst
        unsigned int mi_row;
        unsigned int mi_col;
        unsigned int offset;
-       unsigned int pa;
+       dma_addr_t pa;
        unsigned int size;
        struct vdec_vp9_slice_tiles *tiles;
        unsigned char *pos;
@@ -1109,7 +1109,7 @@ static int vdec_vp9_slice_setup_tile_buffer(struct vdec_vp9_slice_instance *inst
        pos = va + offset;
        end = va + bs->size;
        /* truncated */
-       pa = (unsigned int)bs->dma_addr + offset;
+       pa = bs->dma_addr + offset;
        tb = instance->tile.va;
        for (i = 0; i < rows; i++) {
                for (j = 0; j < cols; j++) {
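The hunk above drops a cast of `bs->dma_addr` to `unsigned int`, which silently truncated DMA addresses above 4 GiB. A minimal user-space sketch of the failure mode, with plain integer types standing in for `dma_addr_t`:

```c
/* Sketch of the truncation bug: uint64_t stands in for dma_addr_t,
 * offset is the tile offset as in the tile-buffer setup above. */
#include <assert.h>
#include <stdint.h>

static uint64_t pa_truncated(uint64_t dma_addr, unsigned int offset)
{
	return (unsigned int)dma_addr + offset;	/* old, buggy cast */
}

static uint64_t pa_full(uint64_t dma_addr, unsigned int offset)
{
	return dma_addr + offset;		/* fixed: keep full width */
}
```

Any buffer mapped above the 4 GiB boundary loses its upper address bits through the old cast, which is exactly what the 36-bit address support needs to avoid.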
index 82e57ae983d55777463b4d7b08ac6fc18f3ec675..da6be556727bb18a458e1e59235615dc9b42c05f 100644 (file)
@@ -77,12 +77,14 @@ static bool vpu_dec_check_ap_inst(struct mtk_vcodec_dec_dev *dec_dev, struct vde
        struct mtk_vcodec_dec_ctx *ctx;
        int ret = false;
 
+       mutex_lock(&dec_dev->dev_ctx_lock);
        list_for_each_entry(ctx, &dec_dev->ctx_list, list) {
                if (!IS_ERR_OR_NULL(ctx) && ctx->vpu_inst == vpu) {
                        ret = true;
                        break;
                }
        }
+       mutex_unlock(&dec_dev->dev_ctx_lock);
 
        return ret;
 }
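The dev_ctx_lock changes above all follow one pattern: every traversal or mutation of `ctx_list` takes a dedicated mutex, so a `list_del` in the release path cannot race the lookup in vpu_dec_check_ap_inst(). A user-space sketch of that pattern, with pthreads and a minimal singly linked list standing in for the kernel primitives (names are illustrative):

```c
/* Sketch of the dev_ctx_lock pattern: a dedicated mutex guards every
 * add, delete, and search on the context list. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct ctx { int id; struct ctx *next; };

static pthread_mutex_t dev_ctx_lock = PTHREAD_MUTEX_INITIALIZER;
static struct ctx *ctx_list;

static void ctx_add(struct ctx *c)
{
	pthread_mutex_lock(&dev_ctx_lock);
	c->next = ctx_list;
	ctx_list = c;
	pthread_mutex_unlock(&dev_ctx_lock);
}

static void ctx_del(struct ctx *c)
{
	pthread_mutex_lock(&dev_ctx_lock);
	for (struct ctx **p = &ctx_list; *p; p = &(*p)->next) {
		if (*p == c) {
			*p = c->next;
			break;
		}
	}
	pthread_mutex_unlock(&dev_ctx_lock);
}

/* Mirrors vpu_dec_check_ap_inst(): search only under the lock. */
static bool ctx_present(int id)
{
	bool found = false;

	pthread_mutex_lock(&dev_ctx_lock);
	for (struct ctx *c = ctx_list; c; c = c->next)
		if (c->id == id) {
			found = true;
			break;
		}
	pthread_mutex_unlock(&dev_ctx_lock);
	return found;
}
```

The key point from the patch series is that the existing dev_mutex could not be reused here because the lookup runs from the VPU message handler, so a narrower lock dedicated to the list is introduced instead.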
index 6319f24bc714b5eb3a7018f1e612afcf2dadf25e..3cb8a16222220e2d5480b48b48879112a68fc11f 100644 (file)
@@ -177,7 +177,9 @@ static int fops_vcodec_open(struct file *file)
        mtk_v4l2_venc_dbg(2, ctx, "Create instance [%d]@%p m2m_ctx=%p ",
                          ctx->id, ctx, ctx->m2m_ctx);
 
+       mutex_lock(&dev->dev_ctx_lock);
        list_add(&ctx->list, &dev->ctx_list);
+       mutex_unlock(&dev->dev_ctx_lock);
 
        mutex_unlock(&dev->dev_mutex);
        mtk_v4l2_venc_dbg(0, ctx, "%s encoder [%d]", dev_name(&dev->plat_dev->dev),
@@ -212,7 +214,9 @@ static int fops_vcodec_release(struct file *file)
        v4l2_fh_exit(&ctx->fh);
        v4l2_ctrl_handler_free(&ctx->ctrl_hdl);
 
+       mutex_lock(&dev->dev_ctx_lock);
        list_del_init(&ctx->list);
+       mutex_unlock(&dev->dev_ctx_lock);
        kfree(ctx);
        mutex_unlock(&dev->dev_mutex);
        return 0;
@@ -294,6 +298,7 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 
        mutex_init(&dev->enc_mutex);
        mutex_init(&dev->dev_mutex);
+       mutex_init(&dev->dev_ctx_lock);
        spin_lock_init(&dev->irqlock);
 
        snprintf(dev->v4l2_dev.name, sizeof(dev->v4l2_dev.name), "%s",
index a042f607ed8d1645a9dc3cf199b89e4280bc8337..0bd85d0fb379acbba3ac07c01e780cf57bef0305 100644 (file)
@@ -178,6 +178,7 @@ struct mtk_vcodec_enc_ctx {
  *
  * @enc_mutex: encoder hardware lock.
  * @dev_mutex: video_device lock
+ * @dev_ctx_lock: lock protecting the context list
  * @encode_workqueue: encode work queue
  *
  * @enc_irq: h264 encoder irq resource
@@ -205,6 +206,7 @@ struct mtk_vcodec_enc_dev {
        /* encoder hardware mutex lock */
        struct mutex enc_mutex;
        struct mutex dev_mutex;
+       struct mutex dev_ctx_lock;
        struct workqueue_struct *encode_workqueue;
 
        int enc_irq;
index 84ad1cc6ad171ef2ea2767653d60e6d779e5604e..51bb7ee141b9e58ac98f940f5e419d9ef4df37ca 100644 (file)
@@ -47,12 +47,14 @@ static bool vpu_enc_check_ap_inst(struct mtk_vcodec_enc_dev *enc_dev, struct ven
        struct mtk_vcodec_enc_ctx *ctx;
        int ret = false;
 
+       mutex_lock(&enc_dev->dev_ctx_lock);
        list_for_each_entry(ctx, &enc_dev->ctx_list, list) {
                if (!IS_ERR_OR_NULL(ctx) && ctx->vpu_inst == vpu) {
                        ret = true;
                        break;
                }
        }
+       mutex_unlock(&enc_dev->dev_ctx_lock);
 
        return ret;
 }
index 97a00ec9a4d48944a8233b49c5fa0106493abb47..caacdc0a3819458fbb47faae23432f5950fe5869 100644 (file)
@@ -209,7 +209,7 @@ static void block2mtd_free_device(struct block2mtd_dev *dev)
 
        if (dev->bdev_file) {
                invalidate_mapping_pages(dev->bdev_file->f_mapping, 0, -1);
-               fput(dev->bdev_file);
+               bdev_fput(dev->bdev_file);
        }
 
        kfree(dev);
index 1035820c2377af7d73d401824734e949b7679c8d..c0d0bce0b5942d67a139091f08441983f30dc074 100644 (file)
@@ -950,20 +950,173 @@ static void mt7530_setup_port5(struct dsa_switch *ds, phy_interface_t interface)
        mutex_unlock(&priv->reg_mutex);
 }
 
-/* On page 205, section "8.6.3 Frame filtering" of the active standard, IEEE Std
- * 802.1Q™-2022, it is stated that frames with 01:80:C2:00:00:00-0F as MAC DA
- * must only be propagated to C-VLAN and MAC Bridge components. That means
- * VLAN-aware and VLAN-unaware bridges. On the switch designs with CPU ports,
- * these frames are supposed to be processed by the CPU (software). So we make
- * the switch only forward them to the CPU port. And if received from a CPU
- * port, forward to a single port. The software is responsible of making the
- * switch conform to the latter by setting a single port as destination port on
- * the special tag.
+/* In Clause 5 of IEEE Std 802-2014, two sublayers of the data link layer (DLL)
+ * of the Open Systems Interconnection basic reference model (OSI/RM) are
+ * described; the medium access control (MAC) and logical link control (LLC)
+ * sublayers. The MAC sublayer is the one facing the physical layer.
  *
- * This switch intellectual property cannot conform to this part of the standard
- * fully. Whilst the REV_UN frame tag covers the remaining :04-0D and :0F MAC
- * DAs, it also includes :22-FF which the scope of propagation is not supposed
- * to be restricted for these MAC DAs.
+ * In 8.2 of IEEE Std 802.1Q-2022, the Bridge architecture is described. A
+ * Bridge component comprises a MAC Relay Entity for interconnecting the Ports
+ * of the Bridge, at least two Ports, and higher layer entities with at least a
+ * Spanning Tree Protocol Entity included.
+ *
+ * Each Bridge Port also functions as an end station and shall provide the MAC
+ * Service to an LLC Entity. Each instance of the MAC Service is provided to a
+ * distinct LLC Entity that supports protocol identification, multiplexing, and
+ * demultiplexing, for protocol data unit (PDU) transmission and reception by
+ * one or more higher layer entities.
+ *
+ * It is described in 8.13.9 of IEEE Std 802.1Q-2022 that in a Bridge, the LLC
+ * Entity associated with each Bridge Port is modeled as being directly
+ * connected to the attached Local Area Network (LAN).
+ *
+ * On the switch with CPU port architecture, CPU port functions as Management
+ * Port, and the Management Port functionality is provided by software which
+ * functions as an end station. Software is connected to an IEEE 802 LAN that is
+ * wholly contained within the system that incorporates the Bridge. Software
+ * provides access to the LLC Entity associated with each Bridge Port by the
+ * value of the source port field on the special tag on the frame received by
+ * software.
+ *
+ * We call frames that carry control information to determine the active
+ * topology and current extent of each Virtual Local Area Network (VLAN), i.e.,
+ * spanning tree or Shortest Path Bridging (SPB) and Multiple VLAN Registration
+ * Protocol Data Units (MVRPDUs), and frames from other link constrained
+ * protocols, such as Extensible Authentication Protocol over LAN (EAPOL) and
+ * Link Layer Discovery Protocol (LLDP), link-local frames. They are not
+ * forwarded by a Bridge. Permanently configured entries in the filtering
+ * database (FDB) ensure that such frames are discarded by the Forwarding
+ * Process. In 8.6.3 of IEEE Std 802.1Q-2022, this is described in detail:
+ *
+ * Each of the reserved MAC addresses specified in Table 8-1
+ * (01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]) shall be
+ * permanently configured in the FDB in C-VLAN components and ERs.
+ *
+ * Each of the reserved MAC addresses specified in Table 8-2
+ * (01-80-C2-00-00-[01,02,03,04,05,06,07,08,09,0A,0E]) shall be permanently
+ * configured in the FDB in S-VLAN components.
+ *
+ * Each of the reserved MAC addresses specified in Table 8-3
+ * (01-80-C2-00-00-[01,02,04,0E]) shall be permanently configured in the FDB in
+ * TPMR components.
+ *
+ * The FDB entries for reserved MAC addresses shall specify filtering for all
+ * Bridge Ports and all VIDs. Management shall not provide the capability to
+ * modify or remove entries for reserved MAC addresses.
+ *
+ * The addresses in Table 8-1, Table 8-2, and Table 8-3 determine the scope of
+ * propagation of PDUs within a Bridged Network, as follows:
+ *
+ *   The Nearest Bridge group address (01-80-C2-00-00-0E) is an address that no
+ *   conformant Two-Port MAC Relay (TPMR) component, Service VLAN (S-VLAN)
+ *   component, Customer VLAN (C-VLAN) component, or MAC Bridge can forward.
+ *   PDUs transmitted using this destination address, or any other addresses
+ *   that appear in Table 8-1, Table 8-2, and Table 8-3
+ *   (01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]), can
+ *   therefore travel no further than those stations that can be reached via a
+ *   single individual LAN from the originating station.
+ *
+ *   The Nearest non-TPMR Bridge group address (01-80-C2-00-00-03), is an
+ *   address that no conformant S-VLAN component, C-VLAN component, or MAC
+ *   Bridge can forward; however, this address is relayed by a TPMR component.
+ *   PDUs using this destination address, or any of the other addresses that
+ *   appear in both Table 8-1 and Table 8-2 but not in Table 8-3
+ *   (01-80-C2-00-00-[00,03,05,06,07,08,09,0A,0B,0C,0D,0F]), will be relayed by
+ *   any TPMRs but will propagate no further than the nearest S-VLAN component,
+ *   C-VLAN component, or MAC Bridge.
+ *
+ *   The Nearest Customer Bridge group address (01-80-C2-00-00-00) is an address
+ * that no conformant C-VLAN component or MAC Bridge can forward; however, it is
+ *   relayed by TPMR components and S-VLAN components. PDUs using this
+ *   destination address, or any of the other addresses that appear in Table 8-1
+ *   but not in either Table 8-2 or Table 8-3 (01-80-C2-00-00-[00,0B,0C,0D,0F]),
+ *   will be relayed by TPMR components and S-VLAN components but will propagate
+ *   no further than the nearest C-VLAN component or MAC Bridge.
+ *
+ * Because the LLC Entity associated with each Bridge Port is provided via CPU
+ * port, we must not filter these frames but forward them to CPU port.
+ *
+ * In a Bridge, the transmission Port is mainly determined by ingress and egress
+ * rules, FDB, and spanning tree Port State functions of the Forwarding Process.
+ * For link-local frames, only CPU port should be designated as destination port
+ * in the FDB, and the other functions of the Forwarding Process must not
+ * interfere with the decision of the transmission Port. We call this process
+ * trapping frames to CPU port.
+ *
+ * Therefore, on the switch with CPU port architecture, link-local frames must
+ * be trapped to CPU port, and certain link-local frames received by a Port of a
+ * Bridge comprising a TPMR component or an S-VLAN component must be excluded
+ * from it.
+ *
+ * A Bridge of the switch with CPU port architecture cannot comprise a Two-Port
+ * MAC Relay (TPMR) component as a TPMR component supports only a subset of the
+ * functionality of a MAC Bridge. A Bridge comprising two Ports (Management Port
+ * doesn't count) of this architecture will either function as a standard MAC
+ * Bridge or a standard VLAN Bridge.
+ *
+ * Therefore, a Bridge of this architecture can only comprise S-VLAN components,
+ * C-VLAN components, or MAC Bridge components. Since there's no TPMR component,
+ * we don't need to relay PDUs using the destination addresses specified on the
+ * Nearest non-TPMR section, and the portion of the Nearest Customer Bridge
+ * section where they must be relayed by TPMR components.
+ *
+ * One option to trap link-local frames to CPU port is to add static FDB entries
+ * with CPU port designated as destination port. However, because
+ * Independent VLAN Learning (IVL) is used on every VID, each entry only
+ * applies to a single VLAN Identifier (VID). For a Bridge comprising a MAC
+ * Bridge component or a C-VLAN component, there would have to be 16 times 4096
+ * entries. This switch intellectual property can only hold a maximum of 2048
+ * entries. Using this option, there also isn't a mechanism to prevent
+ * link-local frames from being discarded when the spanning tree Port State of
+ * the reception Port is discarding.
+ *
+ * The remaining option is to utilise the BPC, RGAC1, RGAC2, RGAC3, and RGAC4
+ * registers. Whilst this applies to every VID, it doesn't contain all of the
+ * reserved MAC addresses without affecting the remaining Standard Group MAC
+ * Addresses. The REV_UN frame tag utilised using the RGAC4 register covers the
+ * remaining 01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F] destination
+ * addresses. It also includes the 01-80-C2-00-00-22 to 01-80-C2-00-00-FF
+ * destination addresses which may be relayed by MAC Bridges or VLAN Bridges.
+ * The latter option provides better but not complete conformance.
+ *
+ * This switch intellectual property also does not provide a mechanism to trap
+ * link-local frames with specific destination addresses to CPU port by Bridge,
+ * to conform to the filtering rules for the distinct Bridge components.
+ *
+ * Therefore, regardless of the type of the Bridge component, link-local frames
+ * with these destination addresses will be trapped to CPU port:
+ *
+ * 01-80-C2-00-00-[00,01,02,03,0E]
+ *
+ * In a Bridge comprising a MAC Bridge component or a C-VLAN component:
+ *
+ *   Link-local frames with these destination addresses won't be trapped to CPU
+ *   port which won't conform to IEEE Std 802.1Q-2022:
+ *
+ *   01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F]
+ *
+ * In a Bridge comprising an S-VLAN component:
+ *
+ *   Link-local frames with these destination addresses will be trapped to CPU
+ *   port which won't conform to IEEE Std 802.1Q-2022:
+ *
+ *   01-80-C2-00-00-00
+ *
+ *   Link-local frames with these destination addresses won't be trapped to CPU
+ *   port which won't conform to IEEE Std 802.1Q-2022:
+ *
+ *   01-80-C2-00-00-[04,05,06,07,08,09,0A]
+ *
+ * To trap link-local frames to CPU port as conformant as this switch
+ * intellectual property can allow, link-local frames are made to be regarded as
+ * Bridge Protocol Data Units (BPDUs). This is because this switch intellectual
+ * property only lets the frames regarded as BPDUs bypass the spanning tree Port
+ * State function of the Forwarding Process.
+ *
+ * The only remaining interference is the ingress rules. When the reception Port
+ * has no PVID assigned on software, VLAN-untagged frames won't be allowed in.
+ * There doesn't seem to be a mechanism on the switch intellectual property to
+ * have link-local frames bypass this function of the Forwarding Process.
  */
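The propagation rules spelled out in the comment above reduce to set membership on the last octet of 01-80-C2-00-00-XX against Tables 8-1, 8-2, and 8-3. An illustrative user-space encoding of those rules (helper names are made up for the sketch, not part of the driver):

```c
/* Illustrative encoding of the IEEE Std 802.1Q-2022 Table 8-1/8-2/8-3
 * propagation scopes described above, keyed on the last octet of the
 * reserved 01-80-C2-00-00-XX destination address. */
#include <assert.h>
#include <stdbool.h>

/* Table 8-3: filtered even by TPMR components. */
static bool in_table_8_3(unsigned int last)
{
	return last == 0x01 || last == 0x02 || last == 0x04 || last == 0x0E;
}

/* Table 8-2: filtered by S-VLAN components (01..0A plus 0E). */
static bool in_table_8_2(unsigned int last)
{
	return (last >= 0x01 && last <= 0x0A) || last == 0x0E;
}

/* Table 8-1: filtered by C-VLAN components and MAC Bridges (00..0F). */
static bool in_table_8_1(unsigned int last)
{
	return last <= 0x0F;
}

/* 0: travels a single LAN only; 1: relayed by TPMRs only;
 * 2: relayed by TPMRs and S-VLAN components; -1: not reserved. */
static int propagation_scope(unsigned int last)
{
	if (in_table_8_3(last))
		return 0;
	if (in_table_8_2(last))
		return 1;
	return in_table_8_1(last) ? 2 : -1;
}
```

This mirrors why the driver must trap the whole reserved block to the CPU port: the scope depends on which Bridge component receives the frame, and the hardware cannot distinguish components per Bridge.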
 static void
 mt753x_trap_frames(struct mt7530_priv *priv)
@@ -971,35 +1124,43 @@ mt753x_trap_frames(struct mt7530_priv *priv)
        /* Trap 802.1X PAE frames and BPDUs to the CPU port(s) and egress them
         * VLAN-untagged.
         */
-       mt7530_rmw(priv, MT753X_BPC, MT753X_PAE_EG_TAG_MASK |
-                  MT753X_PAE_PORT_FW_MASK | MT753X_BPDU_EG_TAG_MASK |
-                  MT753X_BPDU_PORT_FW_MASK,
-                  MT753X_PAE_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
-                  MT753X_PAE_PORT_FW(MT753X_BPDU_CPU_ONLY) |
-                  MT753X_BPDU_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
-                  MT753X_BPDU_CPU_ONLY);
+       mt7530_rmw(priv, MT753X_BPC,
+                  MT753X_PAE_BPDU_FR | MT753X_PAE_EG_TAG_MASK |
+                          MT753X_PAE_PORT_FW_MASK | MT753X_BPDU_EG_TAG_MASK |
+                          MT753X_BPDU_PORT_FW_MASK,
+                  MT753X_PAE_BPDU_FR |
+                          MT753X_PAE_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+                          MT753X_PAE_PORT_FW(MT753X_BPDU_CPU_ONLY) |
+                          MT753X_BPDU_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+                          MT753X_BPDU_CPU_ONLY);
 
        /* Trap frames with :01 and :02 MAC DAs to the CPU port(s) and egress
         * them VLAN-untagged.
         */
-       mt7530_rmw(priv, MT753X_RGAC1, MT753X_R02_EG_TAG_MASK |
-                  MT753X_R02_PORT_FW_MASK | MT753X_R01_EG_TAG_MASK |
-                  MT753X_R01_PORT_FW_MASK,
-                  MT753X_R02_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
-                  MT753X_R02_PORT_FW(MT753X_BPDU_CPU_ONLY) |
-                  MT753X_R01_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
-                  MT753X_BPDU_CPU_ONLY);
+       mt7530_rmw(priv, MT753X_RGAC1,
+                  MT753X_R02_BPDU_FR | MT753X_R02_EG_TAG_MASK |
+                          MT753X_R02_PORT_FW_MASK | MT753X_R01_BPDU_FR |
+                          MT753X_R01_EG_TAG_MASK | MT753X_R01_PORT_FW_MASK,
+                  MT753X_R02_BPDU_FR |
+                          MT753X_R02_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+                          MT753X_R02_PORT_FW(MT753X_BPDU_CPU_ONLY) |
+                          MT753X_R01_BPDU_FR |
+                          MT753X_R01_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+                          MT753X_BPDU_CPU_ONLY);
 
        /* Trap frames with :03 and :0E MAC DAs to the CPU port(s) and egress
         * them VLAN-untagged.
         */
-       mt7530_rmw(priv, MT753X_RGAC2, MT753X_R0E_EG_TAG_MASK |
-                  MT753X_R0E_PORT_FW_MASK | MT753X_R03_EG_TAG_MASK |
-                  MT753X_R03_PORT_FW_MASK,
-                  MT753X_R0E_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
-                  MT753X_R0E_PORT_FW(MT753X_BPDU_CPU_ONLY) |
-                  MT753X_R03_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
-                  MT753X_BPDU_CPU_ONLY);
+       mt7530_rmw(priv, MT753X_RGAC2,
+                  MT753X_R0E_BPDU_FR | MT753X_R0E_EG_TAG_MASK |
+                          MT753X_R0E_PORT_FW_MASK | MT753X_R03_BPDU_FR |
+                          MT753X_R03_EG_TAG_MASK | MT753X_R03_PORT_FW_MASK,
+                  MT753X_R0E_BPDU_FR |
+                          MT753X_R0E_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+                          MT753X_R0E_PORT_FW(MT753X_BPDU_CPU_ONLY) |
+                          MT753X_R03_BPDU_FR |
+                          MT753X_R03_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+                          MT753X_BPDU_CPU_ONLY);
 }
 
 static void
@@ -2505,18 +2666,25 @@ mt7531_setup(struct dsa_switch *ds)
        mt7530_rmw(priv, MT7531_GPIO_MODE0, MT7531_GPIO0_MASK,
                   MT7531_GPIO0_INTERRUPT);
 
-       /* Enable PHY core PLL, since phy_device has not yet been created
-        * provided for phy_[read,write]_mmd_indirect is called, we provide
-        * our own mt7531_ind_mmd_phy_[read,write] to complete this
-        * function.
+       /* Enable Energy-Efficient Ethernet (EEE) and the PHY core PLL. Since
+        * no phy_device has been created yet for phy_[read,write]_mmd_indirect
+        * to use, we provide our own mt7531_ind_mmd_phy_[read,write] helpers
+        * to complete this function.
         */
        val = mt7531_ind_c45_phy_read(priv, MT753X_CTRL_PHY_ADDR,
                                      MDIO_MMD_VEND2, CORE_PLL_GROUP4);
-       val |= MT7531_PHY_PLL_BYPASS_MODE;
+       val |= MT7531_RG_SYSPLL_DMY2 | MT7531_PHY_PLL_BYPASS_MODE;
        val &= ~MT7531_PHY_PLL_OFF;
        mt7531_ind_c45_phy_write(priv, MT753X_CTRL_PHY_ADDR, MDIO_MMD_VEND2,
                                 CORE_PLL_GROUP4, val);
 
+       /* Disable EEE advertisement on the switch PHYs. */
+       for (i = MT753X_CTRL_PHY_ADDR;
+            i < MT753X_CTRL_PHY_ADDR + MT7530_NUM_PHYS; i++) {
+               mt7531_ind_c45_phy_write(priv, i, MDIO_MMD_AN, MDIO_AN_EEE_ADV,
+                                        0);
+       }
+
        mt7531_setup_common(ds);
 
        /* Setup VLAN ID 0 for VLAN-unaware bridges */
index d17b318e6ee4882ed8b6f6668eaa57a99a38d184..585db03c054878f85ba69e6546cfc885491d8f4e 100644 (file)
@@ -65,6 +65,7 @@ enum mt753x_id {
 
 /* Registers for BPDU and PAE frame control */
 #define MT753X_BPC                     0x24
+#define  MT753X_PAE_BPDU_FR            BIT(25)
 #define  MT753X_PAE_EG_TAG_MASK                GENMASK(24, 22)
 #define  MT753X_PAE_EG_TAG(x)          FIELD_PREP(MT753X_PAE_EG_TAG_MASK, x)
 #define  MT753X_PAE_PORT_FW_MASK       GENMASK(18, 16)
@@ -75,20 +76,24 @@ enum mt753x_id {
 
 /* Register for :01 and :02 MAC DA frame control */
 #define MT753X_RGAC1                   0x28
+#define  MT753X_R02_BPDU_FR            BIT(25)
 #define  MT753X_R02_EG_TAG_MASK                GENMASK(24, 22)
 #define  MT753X_R02_EG_TAG(x)          FIELD_PREP(MT753X_R02_EG_TAG_MASK, x)
 #define  MT753X_R02_PORT_FW_MASK       GENMASK(18, 16)
 #define  MT753X_R02_PORT_FW(x)         FIELD_PREP(MT753X_R02_PORT_FW_MASK, x)
+#define  MT753X_R01_BPDU_FR            BIT(9)
 #define  MT753X_R01_EG_TAG_MASK                GENMASK(8, 6)
 #define  MT753X_R01_EG_TAG(x)          FIELD_PREP(MT753X_R01_EG_TAG_MASK, x)
 #define  MT753X_R01_PORT_FW_MASK       GENMASK(2, 0)
 
 /* Register for :03 and :0E MAC DA frame control */
 #define MT753X_RGAC2                   0x2c
+#define  MT753X_R0E_BPDU_FR            BIT(25)
 #define  MT753X_R0E_EG_TAG_MASK                GENMASK(24, 22)
 #define  MT753X_R0E_EG_TAG(x)          FIELD_PREP(MT753X_R0E_EG_TAG_MASK, x)
 #define  MT753X_R0E_PORT_FW_MASK       GENMASK(18, 16)
 #define  MT753X_R0E_PORT_FW(x)         FIELD_PREP(MT753X_R0E_PORT_FW_MASK, x)
+#define  MT753X_R03_BPDU_FR            BIT(9)
 #define  MT753X_R03_EG_TAG_MASK                GENMASK(8, 6)
 #define  MT753X_R03_EG_TAG(x)          FIELD_PREP(MT753X_R03_EG_TAG_MASK, x)
 #define  MT753X_R03_PORT_FW_MASK       GENMASK(2, 0)
@@ -616,6 +621,7 @@ enum mt7531_clk_skew {
 #define  RG_SYSPLL_DDSFBK_EN           BIT(12)
 #define  RG_SYSPLL_BIAS_EN             BIT(11)
 #define  RG_SYSPLL_BIAS_LPF_EN         BIT(10)
+#define  MT7531_RG_SYSPLL_DMY2         BIT(6)
 #define  MT7531_PHY_PLL_OFF            BIT(5)
 #define  MT7531_PHY_PLL_BYPASS_MODE    BIT(4)
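The register definitions above build field masks with GENMASK() and pack values with FIELD_PREP(), and mt7530_rmw() applies them as clear-mask-then-OR-value. A minimal user-space re-creation of that machinery (32-bit only, as a sketch rather than the kernel's actual bitfield.h implementation):

```c
/* User-space stand-ins for GENMASK()/FIELD_PREP() and the
 * read-modify-write shape of mt7530_rmw(). */
#include <assert.h>
#include <stdint.h>

/* Bits h..l inclusive, h >= l, for 32-bit registers. */
#define GENMASK32(h, l) (((~0u) >> (31 - (h))) & ((~0u) << (l)))

/* Shift x into the field described by mask (mask must be contiguous). */
#define FIELD_PREP32(mask, x) \
	(((uint32_t)(x) << __builtin_ctz(mask)) & (mask))

/* mt7530_rmw()-style update: clear the masked field, OR in the value. */
static uint32_t rmw32(uint32_t reg, uint32_t mask, uint32_t val)
{
	return (reg & ~mask) | val;
}
```

This is why each mt7530_rmw() call in the trap_frames hunks passes both the union of the field masks and the union of the prepared values: unrelated bits in BPC/RGAC1/RGAC2 are left untouched.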
 
index 9e9e4a03f1a8c9bd4c8d68cb20340157c4547e80..2d8a66ea82fab7f0a023ab469ccc33321d0b4ba3 100644 (file)
@@ -351,7 +351,7 @@ static int ena_com_init_io_sq(struct ena_com_dev *ena_dev,
                        ENA_COM_BOUNCE_BUFFER_CNTRL_CNT;
                io_sq->bounce_buf_ctrl.next_to_use = 0;
 
-               size = io_sq->bounce_buf_ctrl.buffer_size *
+               size = (size_t)io_sq->bounce_buf_ctrl.buffer_size *
                        io_sq->bounce_buf_ctrl.buffers_num;
 
                dev_node = dev_to_node(ena_dev->dmadev);
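The one-character change above is an integer-overflow fix: `buffer_size` and `buffers_num` are 32-bit fields, so their product was previously computed modulo 2^32 before being assigned to `size`. A minimal standalone sketch (hypothetical values, not the driver's real configuration) of why the widening cast must come before the multiply:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit operands: 2^20 * 2^13 = 2^33, which does not
 * fit in a u32. */
static uint32_t mul_wrapped(uint32_t a, uint32_t b)
{
	return a * b;		/* product wraps modulo 2^32 */
}

static size_t mul_widened(uint32_t a, uint32_t b)
{
	return (size_t)a * b;	/* multiply performed in size_t */
}
```

On LP64 targets `size_t` is 64-bit, so casting one operand is enough: the usual arithmetic conversions widen the other operand before the multiplication, matching the fixed driver line.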
index 09e7da1a69c9f0c8141e03be445c2589e9d3a999..be5acfa41ee0ce4d80605e0bcdc6dc743c421f42 100644 (file)
@@ -718,8 +718,11 @@ void ena_unmap_tx_buff(struct ena_ring *tx_ring,
 static void ena_free_tx_bufs(struct ena_ring *tx_ring)
 {
        bool print_once = true;
+       bool is_xdp_ring;
        u32 i;
 
+       is_xdp_ring = ENA_IS_XDP_INDEX(tx_ring->adapter, tx_ring->qid);
+
        for (i = 0; i < tx_ring->ring_size; i++) {
                struct ena_tx_buffer *tx_info = &tx_ring->tx_buffer_info[i];
 
@@ -739,10 +742,15 @@ static void ena_free_tx_bufs(struct ena_ring *tx_ring)
 
                ena_unmap_tx_buff(tx_ring, tx_info);
 
-               dev_kfree_skb_any(tx_info->skb);
+               if (is_xdp_ring)
+                       xdp_return_frame(tx_info->xdpf);
+               else
+                       dev_kfree_skb_any(tx_info->skb);
        }
-       netdev_tx_reset_queue(netdev_get_tx_queue(tx_ring->netdev,
-                                                 tx_ring->qid));
+
+       if (!is_xdp_ring)
+               netdev_tx_reset_queue(netdev_get_tx_queue(tx_ring->netdev,
+                                                         tx_ring->qid));
 }
 
 static void ena_free_all_tx_bufs(struct ena_adapter *adapter)
@@ -3481,10 +3489,11 @@ static void check_for_missing_completions(struct ena_adapter *adapter)
 {
        struct ena_ring *tx_ring;
        struct ena_ring *rx_ring;
-       int i, budget, rc;
+       int qid, budget, rc;
        int io_queue_count;
 
        io_queue_count = adapter->xdp_num_queues + adapter->num_io_queues;
+
        /* Make sure the driver doesn't turn the device in other process */
        smp_rmb();
 
@@ -3497,27 +3506,29 @@ static void check_for_missing_completions(struct ena_adapter *adapter)
        if (adapter->missing_tx_completion_to == ENA_HW_HINTS_NO_TIMEOUT)
                return;
 
-       budget = ENA_MONITORED_TX_QUEUES;
+       budget = min_t(u32, io_queue_count, ENA_MONITORED_TX_QUEUES);
 
-       for (i = adapter->last_monitored_tx_qid; i < io_queue_count; i++) {
-               tx_ring = &adapter->tx_ring[i];
-               rx_ring = &adapter->rx_ring[i];
+       qid = adapter->last_monitored_tx_qid;
+
+       while (budget) {
+               qid = (qid + 1) % io_queue_count;
+
+               tx_ring = &adapter->tx_ring[qid];
+               rx_ring = &adapter->rx_ring[qid];
 
                rc = check_missing_comp_in_tx_queue(adapter, tx_ring);
                if (unlikely(rc))
                        return;
 
-               rc =  !ENA_IS_XDP_INDEX(adapter, i) ?
+               rc =  !ENA_IS_XDP_INDEX(adapter, qid) ?
                        check_for_rx_interrupt_queue(adapter, rx_ring) : 0;
                if (unlikely(rc))
                        return;
 
                budget--;
-               if (!budget)
-                       break;
        }
 
-       adapter->last_monitored_tx_qid = i % io_queue_count;
+       adapter->last_monitored_tx_qid = qid;
 }
 
 /* trigger napi schedule after 2 consecutive detections */
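The loop rewrite in `check_for_missing_completions()` above turns a linear scan that restarted from `last_monitored_tx_qid` into a true round-robin: the budget is clamped to the queue count and `qid` always advances modulo `io_queue_count`, so every queue is eventually visited regardless of where the previous call stopped. A minimal sketch of the iteration pattern (the `order` output array is an illustration aid, not driver code):

```c
/* Visit up to `budget` queues after `last`, wrapping modulo `count`;
 * record the visit order and return the last qid visited so the next
 * call resumes where this one stopped. */
static int scan_round_robin(int last, int count, int budget, int *order)
{
	int qid = last;
	int i = 0;

	while (budget--) {
		qid = (qid + 1) % count;
		order[i++] = qid;
	}
	return qid;
}
```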
index 337c435d3ce998b1b8f69a86f8be7997e1ff99c8..5b175e7e92a10ba19917b9c5e63d89bc1f2a8dd5 100644 (file)
@@ -89,7 +89,7 @@ int ena_xdp_xmit_frame(struct ena_ring *tx_ring,
 
        rc = ena_xdp_tx_map_frame(tx_ring, tx_info, xdpf, &ena_tx_ctx);
        if (unlikely(rc))
-               return rc;
+               goto err;
 
        ena_tx_ctx.req_id = req_id;
 
@@ -112,7 +112,9 @@ int ena_xdp_xmit_frame(struct ena_ring *tx_ring,
 
 error_unmap_dma:
        ena_unmap_tx_buff(tx_ring, tx_info);
+err:
        tx_info->xdpf = NULL;
+
        return rc;
 }
 
index 86f1854698b4e80816b1b55e1d4f4d31fdf0737f..883c044852f1df39852b50b50a80ab31c7bfb091 100644 (file)
@@ -95,9 +95,15 @@ static inline void mlx5e_ptp_metadata_fifo_push(struct mlx5e_ptp_metadata_fifo *
 }
 
 static inline u8
+mlx5e_ptp_metadata_fifo_peek(struct mlx5e_ptp_metadata_fifo *fifo)
+{
+       return fifo->data[fifo->mask & fifo->cc];
+}
+
+static inline void
 mlx5e_ptp_metadata_fifo_pop(struct mlx5e_ptp_metadata_fifo *fifo)
 {
-       return fifo->data[fifo->mask & fifo->cc++];
+       fifo->cc++;
 }
 
 static inline void
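Splitting the destructive `pop` into a non-destructive `peek` plus an explicit `pop` lets the transmit path read the next PTP metadata index while building the descriptor and consume it only once the WQE is actually committed; on an error path nothing was consumed, so no push-back is needed (this pairs with the later `en/tx.c` hunks that move the pop and delete the error-path push). A minimal ring-buffer sketch of the pattern (u8 slots and power-of-two masking are simplifications of the driver's fifo):

```c
#include <stdint.h>

struct meta_fifo {
	uint8_t data[8];
	uint32_t mask;		/* size - 1, size a power of two */
	uint32_t pc;		/* producer counter */
	uint32_t cc;		/* consumer counter */
};

static void fifo_push(struct meta_fifo *f, uint8_t v)
{
	f->data[f->mask & f->pc++] = v;
}

/* Read the head without consuming it: safe to call before an
 * operation that may still fail. */
static uint8_t fifo_peek(struct meta_fifo *f)
{
	return f->data[f->mask & f->cc];
}

/* Commit the consumption only once failure is no longer possible. */
static void fifo_pop(struct meta_fifo *f)
{
	f->cc++;
}
```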
index e87e26f2c669c2e39f59a9f656e643fce2b48aae..6743806b8480602a8d0a3d02cddcf2d2c1c82199 100644 (file)
@@ -83,24 +83,25 @@ int mlx5e_open_qos_sq(struct mlx5e_priv *priv, struct mlx5e_channels *chs,
 
        txq_ix = mlx5e_qid_from_qos(chs, node_qid);
 
-       WARN_ON(node_qid > priv->htb_max_qos_sqs);
-       if (node_qid == priv->htb_max_qos_sqs) {
-               struct mlx5e_sq_stats *stats, **stats_list = NULL;
-
-               if (priv->htb_max_qos_sqs == 0) {
-                       stats_list = kvcalloc(mlx5e_qos_max_leaf_nodes(priv->mdev),
-                                             sizeof(*stats_list),
-                                             GFP_KERNEL);
-                       if (!stats_list)
-                               return -ENOMEM;
-               }
+       WARN_ON(node_qid >= mlx5e_htb_cur_leaf_nodes(priv->htb));
+       if (!priv->htb_qos_sq_stats) {
+               struct mlx5e_sq_stats **stats_list;
+
+               stats_list = kvcalloc(mlx5e_qos_max_leaf_nodes(priv->mdev),
+                                     sizeof(*stats_list), GFP_KERNEL);
+               if (!stats_list)
+                       return -ENOMEM;
+
+               WRITE_ONCE(priv->htb_qos_sq_stats, stats_list);
+       }
+
+       if (!priv->htb_qos_sq_stats[node_qid]) {
+               struct mlx5e_sq_stats *stats;
+
                stats = kzalloc(sizeof(*stats), GFP_KERNEL);
-               if (!stats) {
-                       kvfree(stats_list);
+               if (!stats)
                        return -ENOMEM;
-               }
-               if (stats_list)
-                       WRITE_ONCE(priv->htb_qos_sq_stats, stats_list);
+
                WRITE_ONCE(priv->htb_qos_sq_stats[node_qid], stats);
                /* Order htb_max_qos_sqs increment after writing the array pointer.
                 * Pairs with smp_load_acquire in en_stats.c.
index bcafb4bf94154ff01969fc851852d4234f5b0c04..8d9a3b5ec973b39aaa1addc8f6c5e3a568a7ab1a 100644 (file)
@@ -179,6 +179,13 @@ u32 mlx5e_rqt_size(struct mlx5_core_dev *mdev, unsigned int num_channels)
        return min_t(u32, rqt_size, max_cap_rqt_size);
 }
 
+#define MLX5E_MAX_RQT_SIZE_ALLOWED_WITH_XOR8_HASH 256
+
+unsigned int mlx5e_rqt_max_num_channels_allowed_for_xor8(void)
+{
+       return MLX5E_MAX_RQT_SIZE_ALLOWED_WITH_XOR8_HASH / MLX5E_UNIFORM_SPREAD_RQT_FACTOR;
+}
+
 void mlx5e_rqt_destroy(struct mlx5e_rqt *rqt)
 {
        mlx5_core_destroy_rqt(rqt->mdev, rqt->rqtn);
index e0bc30308c77000038d151a7caa206c516a6fe9a..2f9e04a8418f143fbf3d01423721d85b2d5b5a2a 100644 (file)
@@ -38,6 +38,7 @@ static inline u32 mlx5e_rqt_get_rqtn(struct mlx5e_rqt *rqt)
 }
 
 u32 mlx5e_rqt_size(struct mlx5_core_dev *mdev, unsigned int num_channels);
+unsigned int mlx5e_rqt_max_num_channels_allowed_for_xor8(void);
 int mlx5e_rqt_redirect_direct(struct mlx5e_rqt *rqt, u32 rqn, u32 *vhca_id);
 int mlx5e_rqt_redirect_indir(struct mlx5e_rqt *rqt, u32 *rqns, u32 *vhca_ids,
                             unsigned int num_rqns,
index f675b1926340f9ca4218aa47febac7c5139ab0e9..f66bbc8464645efabc08ebf923fada5e1f79c5fe 100644 (file)
@@ -57,6 +57,7 @@ int mlx5e_selq_init(struct mlx5e_selq *selq, struct mutex *state_lock)
 
 void mlx5e_selq_cleanup(struct mlx5e_selq *selq)
 {
+       mutex_lock(selq->state_lock);
        WARN_ON_ONCE(selq->is_prepared);
 
        kvfree(selq->standby);
@@ -67,6 +68,7 @@ void mlx5e_selq_cleanup(struct mlx5e_selq *selq)
 
        kvfree(selq->standby);
        selq->standby = NULL;
+       mutex_unlock(selq->state_lock);
 }
 
 void mlx5e_selq_prepare_params(struct mlx5e_selq *selq, struct mlx5e_params *params)
index cc51ce16df14abe530910e063b9072c9e23ff49c..8f101181648c6a294332b6229946b1afae4ee554 100644 (file)
@@ -451,6 +451,34 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
 
        mutex_lock(&priv->state_lock);
 
+       if (mlx5e_rx_res_get_current_hash(priv->rx_res).hfunc == ETH_RSS_HASH_XOR) {
+               unsigned int xor8_max_channels = mlx5e_rqt_max_num_channels_allowed_for_xor8();
+
+               if (count > xor8_max_channels) {
+                       err = -EINVAL;
+                       netdev_err(priv->netdev, "%s: Requested number of channels (%d) exceeds the maximum allowed by the XOR8 RSS hfunc (%d)\n",
+                                  __func__, count, xor8_max_channels);
+                       goto out;
+               }
+       }
+
+       /* If RXFH is configured, changing the channels number is allowed only if
+        * it does not require resizing the RSS table. This is because the previous
+        * configuration may no longer be compatible with the new RSS table.
+        */
+       if (netif_is_rxfh_configured(priv->netdev)) {
+               int cur_rqt_size = mlx5e_rqt_size(priv->mdev, cur_params->num_channels);
+               int new_rqt_size = mlx5e_rqt_size(priv->mdev, count);
+
+               if (new_rqt_size != cur_rqt_size) {
+                       err = -EINVAL;
+                       netdev_err(priv->netdev,
+                                  "%s: RXFH is configured, block changing channels number that affects RSS table size (new: %d, current: %d)\n",
+                                  __func__, new_rqt_size, cur_rqt_size);
+                       goto out;
+               }
+       }
+
        /* Don't allow changing the number of channels if HTB offload is active,
         * because the numeration of the QoS SQs will change, while per-queue
         * qdiscs are attached.
@@ -1281,17 +1309,30 @@ int mlx5e_set_rxfh(struct net_device *dev, struct ethtool_rxfh_param *rxfh,
        struct mlx5e_priv *priv = netdev_priv(dev);
        u32 *rss_context = &rxfh->rss_context;
        u8 hfunc = rxfh->hfunc;
+       unsigned int count;
        int err;
 
        mutex_lock(&priv->state_lock);
+
+       count = priv->channels.params.num_channels;
+
+       if (hfunc == ETH_RSS_HASH_XOR) {
+               unsigned int xor8_max_channels = mlx5e_rqt_max_num_channels_allowed_for_xor8();
+
+               if (count > xor8_max_channels) {
+                       err = -EINVAL;
+                       netdev_err(priv->netdev, "%s: Cannot set RSS hash function to XOR, current number of channels (%d) exceeds the maximum allowed for XOR8 RSS hfunc (%d)\n",
+                                  __func__, count, xor8_max_channels);
+                       goto unlock;
+               }
+       }
+
        if (*rss_context && rxfh->rss_delete) {
                err = mlx5e_rx_res_rss_destroy(priv->rx_res, *rss_context);
                goto unlock;
        }
 
        if (*rss_context == ETH_RXFH_CONTEXT_ALLOC) {
-               unsigned int count = priv->channels.params.num_channels;
-
                err = mlx5e_rx_res_rss_init(priv->rx_res, rss_context, count);
                if (err)
                        goto unlock;
index 91848eae45655fd57d7dcc3365f8ad5094f61f3d..b375ef268671ab01215e370149d1a3b9cb86b756 100644 (file)
@@ -5726,9 +5726,7 @@ void mlx5e_priv_cleanup(struct mlx5e_priv *priv)
        kfree(priv->tx_rates);
        kfree(priv->txq2sq);
        destroy_workqueue(priv->wq);
-       mutex_lock(&priv->state_lock);
        mlx5e_selq_cleanup(&priv->selq);
-       mutex_unlock(&priv->state_lock);
        free_cpumask_var(priv->scratchpad.cpumask);
 
        for (i = 0; i < priv->htb_max_qos_sqs; i++)
index 2fa076b23fbead06bceb6697e0ebb0238bb5be7e..e21a3b4128ce880478795b023e1ff314e9336dd0 100644 (file)
@@ -398,6 +398,8 @@ mlx5e_txwqe_complete(struct mlx5e_txqsq *sq, struct sk_buff *skb,
                     (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))) {
                u8 metadata_index = be32_to_cpu(eseg->flow_table_metadata);
 
+               mlx5e_ptp_metadata_fifo_pop(&sq->ptpsq->metadata_freelist);
+
                mlx5e_skb_cb_hwtstamp_init(skb);
                mlx5e_ptp_metadata_map_put(&sq->ptpsq->metadata_map, skb,
                                           metadata_index);
@@ -496,9 +498,6 @@ mlx5e_sq_xmit_wqe(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 
 err_drop:
        stats->dropped++;
-       if (unlikely(sq->ptpsq && (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)))
-               mlx5e_ptp_metadata_fifo_push(&sq->ptpsq->metadata_freelist,
-                                            be32_to_cpu(eseg->flow_table_metadata));
        dev_kfree_skb_any(skb);
        mlx5e_tx_flush(sq);
 }
@@ -657,7 +656,7 @@ static void mlx5e_cqe_ts_id_eseg(struct mlx5e_ptpsq *ptpsq, struct sk_buff *skb,
 {
        if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))
                eseg->flow_table_metadata =
-                       cpu_to_be32(mlx5e_ptp_metadata_fifo_pop(&ptpsq->metadata_freelist));
+                       cpu_to_be32(mlx5e_ptp_metadata_fifo_peek(&ptpsq->metadata_freelist));
 }
 
 static void mlx5e_txwqe_build_eseg(struct mlx5e_priv *priv, struct mlx5e_txqsq *sq,
index 3047d7015c5256726338904432ce56845c59c39c..1789800faaeb62841387ed69b0a82aab3283bf46 100644 (file)
@@ -1868,6 +1868,7 @@ int mlx5_eswitch_init(struct mlx5_core_dev *dev)
        if (err)
                goto abort;
 
+       dev->priv.eswitch = esw;
        err = esw_offloads_init(esw);
        if (err)
                goto reps_err;
@@ -1892,11 +1893,6 @@ int mlx5_eswitch_init(struct mlx5_core_dev *dev)
                esw->offloads.encap = DEVLINK_ESWITCH_ENCAP_MODE_BASIC;
        else
                esw->offloads.encap = DEVLINK_ESWITCH_ENCAP_MODE_NONE;
-       if (MLX5_ESWITCH_MANAGER(dev) &&
-           mlx5_esw_vport_match_metadata_supported(esw))
-               esw->flags |= MLX5_ESWITCH_VPORT_MATCH_METADATA;
-
-       dev->priv.eswitch = esw;
        BLOCKING_INIT_NOTIFIER_HEAD(&esw->n_head);
 
        esw_info(dev,
@@ -1908,6 +1904,7 @@ int mlx5_eswitch_init(struct mlx5_core_dev *dev)
 
 reps_err:
        mlx5_esw_vports_cleanup(esw);
+       dev->priv.eswitch = NULL;
 abort:
        if (esw->work_queue)
                destroy_workqueue(esw->work_queue);
@@ -1926,7 +1923,6 @@ void mlx5_eswitch_cleanup(struct mlx5_eswitch *esw)
 
        esw_info(esw->dev, "cleanup\n");
 
-       esw->dev->priv.eswitch = NULL;
        destroy_workqueue(esw->work_queue);
        WARN_ON(refcount_read(&esw->qos.refcnt));
        mutex_destroy(&esw->state_lock);
@@ -1937,6 +1933,7 @@ void mlx5_eswitch_cleanup(struct mlx5_eswitch *esw)
        mutex_destroy(&esw->offloads.encap_tbl_lock);
        mutex_destroy(&esw->offloads.decap_tbl_lock);
        esw_offloads_cleanup(esw);
+       esw->dev->priv.eswitch = NULL;
        mlx5_esw_vports_cleanup(esw);
        debugfs_remove_recursive(esw->debugfs_root);
        devl_params_unregister(priv_to_devlink(esw->dev), mlx5_eswitch_params,
index baaae628b0a0f6510e2c350cbab0b6309b32da52..1f60954c12f7257cacf0c7e46539c949ac6eaf7b 100644 (file)
@@ -43,6 +43,7 @@
 #include "rdma.h"
 #include "en.h"
 #include "fs_core.h"
+#include "lib/mlx5.h"
 #include "lib/devcom.h"
 #include "lib/eq.h"
 #include "lib/fs_chains.h"
@@ -2476,6 +2477,10 @@ int esw_offloads_init(struct mlx5_eswitch *esw)
        if (err)
                return err;
 
+       if (MLX5_ESWITCH_MANAGER(esw->dev) &&
+           mlx5_esw_vport_match_metadata_supported(esw))
+               esw->flags |= MLX5_ESWITCH_VPORT_MATCH_METADATA;
+
        err = devl_params_register(priv_to_devlink(esw->dev),
                                   esw_devlink_params,
                                   ARRAY_SIZE(esw_devlink_params));
@@ -3707,6 +3712,12 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
        if (esw_mode_from_devlink(mode, &mlx5_mode))
                return -EINVAL;
 
+       if (mode == DEVLINK_ESWITCH_MODE_SWITCHDEV && mlx5_get_sd(esw->dev)) {
+               NL_SET_ERR_MSG_MOD(extack,
+                                  "Can't change E-Switch mode to switchdev when multi-PF netdev (Socket Direct) is configured.");
+               return -EPERM;
+       }
+
        mlx5_lag_disable_change(esw->dev);
        err = mlx5_esw_try_lock(esw);
        if (err < 0) {
index e6bfa7e4f146caf5b05506beaa6c9aabc6c4f74d..cf085a478e3e4c69ffdd4ee9bb24f0036e27c66d 100644 (file)
@@ -1664,6 +1664,16 @@ static int create_auto_flow_group(struct mlx5_flow_table *ft,
        return err;
 }
 
+static bool mlx5_pkt_reformat_cmp(struct mlx5_pkt_reformat *p1,
+                                 struct mlx5_pkt_reformat *p2)
+{
+       return p1->owner == p2->owner &&
+               (p1->owner == MLX5_FLOW_RESOURCE_OWNER_FW ?
+                p1->id == p2->id :
+                mlx5_fs_dr_action_get_pkt_reformat_id(p1) ==
+                mlx5_fs_dr_action_get_pkt_reformat_id(p2));
+}
+
 static bool mlx5_flow_dests_cmp(struct mlx5_flow_destination *d1,
                                struct mlx5_flow_destination *d2)
 {
@@ -1675,8 +1685,8 @@ static bool mlx5_flow_dests_cmp(struct mlx5_flow_destination *d1,
                     ((d1->vport.flags & MLX5_FLOW_DEST_VPORT_VHCA_ID) ?
                      (d1->vport.vhca_id == d2->vport.vhca_id) : true) &&
                     ((d1->vport.flags & MLX5_FLOW_DEST_VPORT_REFORMAT_ID) ?
-                     (d1->vport.pkt_reformat->id ==
-                      d2->vport.pkt_reformat->id) : true)) ||
+                     mlx5_pkt_reformat_cmp(d1->vport.pkt_reformat,
+                                           d2->vport.pkt_reformat) : true)) ||
                    (d1->type == MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE &&
                     d1->ft == d2->ft) ||
                    (d1->type == MLX5_FLOW_DESTINATION_TYPE_TIR &&
@@ -1808,8 +1818,9 @@ static struct mlx5_flow_handle *add_rule_fg(struct mlx5_flow_group *fg,
        }
        trace_mlx5_fs_set_fte(fte, false);
 
+       /* Link newly added rules into the tree. */
        for (i = 0; i < handle->num_rules; i++) {
-               if (refcount_read(&handle->rule[i]->node.refcount) == 1) {
+               if (!handle->rule[i]->node.parent) {
                        tree_add_node(&handle->rule[i]->node, &fte->node);
                        trace_mlx5_fs_add_rule(handle->rule[i]);
                }
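Switching the "already linked" test from `refcount == 1` to `!node.parent` makes the check structural rather than inferential: a newly added rule whose refcount was already elevated by another path would have been skipped by the refcount heuristic even though it was never attached to the tree. A small sketch of the idiom (simplified node layout, not the fs_core types):

```c
#include <stddef.h>

struct tnode {
	struct tnode *parent;
	int refcount;
};

/* Attach `child` under `parent` only if it is not yet in the tree.
 * The parent pointer is authoritative; the refcount is not, since
 * other code paths may hold references to a still-unlinked node. */
static int tree_add_if_unlinked(struct tnode *child, struct tnode *parent)
{
	if (child->parent)
		return 0;		/* already linked */
	child->parent = parent;
	child->refcount++;
	return 1;
}
```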
index c2593625c09ad6a9150e03baeda0ae41a1a010be..59806553889e907f7ff2938af2063b1e3b73ca1e 100644 (file)
@@ -1480,6 +1480,14 @@ int mlx5_init_one_devl_locked(struct mlx5_core_dev *dev)
        if (err)
                goto err_register;
 
+       err = mlx5_crdump_enable(dev);
+       if (err)
+               mlx5_core_err(dev, "mlx5_crdump_enable failed with error code %d\n", err);
+
+       err = mlx5_hwmon_dev_register(dev);
+       if (err)
+               mlx5_core_err(dev, "mlx5_hwmon_dev_register failed with error code %d\n", err);
+
        mutex_unlock(&dev->intf_state_mutex);
        return 0;
 
@@ -1505,7 +1513,10 @@ int mlx5_init_one(struct mlx5_core_dev *dev)
        int err;
 
        devl_lock(devlink);
+       devl_register(devlink);
        err = mlx5_init_one_devl_locked(dev);
+       if (err)
+               devl_unregister(devlink);
        devl_unlock(devlink);
        return err;
 }
@@ -1517,6 +1528,8 @@ void mlx5_uninit_one(struct mlx5_core_dev *dev)
        devl_lock(devlink);
        mutex_lock(&dev->intf_state_mutex);
 
+       mlx5_hwmon_dev_unregister(dev);
+       mlx5_crdump_disable(dev);
        mlx5_unregister_device(dev);
 
        if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state)) {
@@ -1534,6 +1547,7 @@ void mlx5_uninit_one(struct mlx5_core_dev *dev)
        mlx5_function_teardown(dev, true);
 out:
        mutex_unlock(&dev->intf_state_mutex);
+       devl_unregister(devlink);
        devl_unlock(devlink);
 }
 
@@ -1680,16 +1694,20 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
        }
 
        devl_lock(devlink);
+       devl_register(devlink);
+
        err = mlx5_devlink_params_register(priv_to_devlink(dev));
-       devl_unlock(devlink);
        if (err) {
                mlx5_core_warn(dev, "mlx5_devlink_param_reg err = %d\n", err);
                goto query_hca_caps_err;
        }
 
+       devl_unlock(devlink);
        return 0;
 
 query_hca_caps_err:
+       devl_unregister(devlink);
+       devl_unlock(devlink);
        mlx5_function_disable(dev, true);
 out:
        dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
@@ -1702,6 +1720,7 @@ void mlx5_uninit_one_light(struct mlx5_core_dev *dev)
 
        devl_lock(devlink);
        mlx5_devlink_params_unregister(priv_to_devlink(dev));
+       devl_unregister(devlink);
        devl_unlock(devlink);
        if (dev->state != MLX5_DEVICE_STATE_UP)
                return;
@@ -1943,16 +1962,7 @@ static int probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
                goto err_init_one;
        }
 
-       err = mlx5_crdump_enable(dev);
-       if (err)
-               dev_err(&pdev->dev, "mlx5_crdump_enable failed with error code %d\n", err);
-
-       err = mlx5_hwmon_dev_register(dev);
-       if (err)
-               mlx5_core_err(dev, "mlx5_hwmon_dev_register failed with error code %d\n", err);
-
        pci_save_state(pdev);
-       devlink_register(devlink);
        return 0;
 
 err_init_one:
@@ -1973,16 +1983,9 @@ static void remove_one(struct pci_dev *pdev)
        struct devlink *devlink = priv_to_devlink(dev);
 
        set_bit(MLX5_BREAK_FW_WAIT, &dev->intf_state);
-       /* mlx5_drain_fw_reset() and mlx5_drain_health_wq() are using
-        * devlink notify APIs.
-        * Hence, we must drain them before unregistering the devlink.
-        */
        mlx5_drain_fw_reset(dev);
        mlx5_drain_health_wq(dev);
-       devlink_unregister(devlink);
        mlx5_sriov_disable(pdev, false);
-       mlx5_hwmon_dev_unregister(dev);
-       mlx5_crdump_disable(dev);
        mlx5_uninit_one(dev);
        mlx5_pci_close(dev);
        mlx5_mdev_uninit(dev);
index 4dcf995cb1a2042c39938ee2f166a6c3d3e6ef24..6bac8ad70ba60bf9982a110f7e115183858e0497 100644 (file)
@@ -19,6 +19,7 @@
 #define MLX5_IRQ_CTRL_SF_MAX 8
 /* min num of vectors for SFs to be enabled */
 #define MLX5_IRQ_VEC_COMP_BASE_SF 2
+#define MLX5_IRQ_VEC_COMP_BASE 1
 
 #define MLX5_EQ_SHARE_IRQ_MAX_COMP (8)
 #define MLX5_EQ_SHARE_IRQ_MAX_CTRL (UINT_MAX)
@@ -246,6 +247,7 @@ static void irq_set_name(struct mlx5_irq_pool *pool, char *name, int vecidx)
                return;
        }
 
+       vecidx -= MLX5_IRQ_VEC_COMP_BASE;
        snprintf(name, MLX5_MAX_IRQ_NAME, "mlx5_comp%d", vecidx);
 }
 
@@ -585,7 +587,7 @@ struct mlx5_irq *mlx5_irq_request_vector(struct mlx5_core_dev *dev, u16 cpu,
        struct mlx5_irq_table *table = mlx5_irq_table_get(dev);
        struct mlx5_irq_pool *pool = table->pcif_pool;
        struct irq_affinity_desc af_desc;
-       int offset = 1;
+       int offset = MLX5_IRQ_VEC_COMP_BASE;
 
        if (!pool->xa_num_irqs.max)
                offset = 0;
index bc863e1f062e6bd316f6b54f87850e11123bbfea..e3bf8c7e4baa62e336415e495a8a385e1edb0654 100644 (file)
@@ -101,7 +101,6 @@ static void mlx5_sf_dev_remove(struct auxiliary_device *adev)
        devlink = priv_to_devlink(mdev);
        set_bit(MLX5_BREAK_FW_WAIT, &mdev->intf_state);
        mlx5_drain_health_wq(mdev);
-       devlink_unregister(devlink);
        if (mlx5_dev_is_lightweight(mdev))
                mlx5_uninit_one_light(mdev);
        else
index 64f4cc284aea41715abecb1167439efe401951f8..030a5776c937406540645462b5950cd209c37974 100644 (file)
@@ -205,12 +205,11 @@ dr_dump_hex_print(char hex[DR_HEX_SIZE], char *src, u32 size)
 }
 
 static int
-dr_dump_rule_action_mem(struct seq_file *file, const u64 rule_id,
+dr_dump_rule_action_mem(struct seq_file *file, char *buff, const u64 rule_id,
                        struct mlx5dr_rule_action_member *action_mem)
 {
        struct mlx5dr_action *action = action_mem->action;
        const u64 action_id = DR_DBG_PTR_TO_ID(action);
-       char buff[MLX5DR_DEBUG_DUMP_BUFF_LENGTH];
        u64 hit_tbl_ptr, miss_tbl_ptr;
        u32 hit_tbl_id, miss_tbl_id;
        int ret;
@@ -488,10 +487,9 @@ dr_dump_rule_action_mem(struct seq_file *file, const u64 rule_id,
 }
 
 static int
-dr_dump_rule_mem(struct seq_file *file, struct mlx5dr_ste *ste,
+dr_dump_rule_mem(struct seq_file *file, char *buff, struct mlx5dr_ste *ste,
                 bool is_rx, const u64 rule_id, u8 format_ver)
 {
-       char buff[MLX5DR_DEBUG_DUMP_BUFF_LENGTH];
        char hw_ste_dump[DR_HEX_SIZE];
        u32 mem_rec_type;
        int ret;
@@ -522,7 +520,8 @@ dr_dump_rule_mem(struct seq_file *file, struct mlx5dr_ste *ste,
 }
 
 static int
-dr_dump_rule_rx_tx(struct seq_file *file, struct mlx5dr_rule_rx_tx *rule_rx_tx,
+dr_dump_rule_rx_tx(struct seq_file *file, char *buff,
+                  struct mlx5dr_rule_rx_tx *rule_rx_tx,
                   bool is_rx, const u64 rule_id, u8 format_ver)
 {
        struct mlx5dr_ste *ste_arr[DR_RULE_MAX_STES + DR_ACTION_MAX_STES];
@@ -533,7 +532,7 @@ dr_dump_rule_rx_tx(struct seq_file *file, struct mlx5dr_rule_rx_tx *rule_rx_tx,
                return 0;
 
        while (i--) {
-               ret = dr_dump_rule_mem(file, ste_arr[i], is_rx, rule_id,
+               ret = dr_dump_rule_mem(file, buff, ste_arr[i], is_rx, rule_id,
                                       format_ver);
                if (ret < 0)
                        return ret;
@@ -542,7 +541,8 @@ dr_dump_rule_rx_tx(struct seq_file *file, struct mlx5dr_rule_rx_tx *rule_rx_tx,
        return 0;
 }
 
-static int dr_dump_rule(struct seq_file *file, struct mlx5dr_rule *rule)
+static noinline_for_stack int
+dr_dump_rule(struct seq_file *file, struct mlx5dr_rule *rule)
 {
        struct mlx5dr_rule_action_member *action_mem;
        const u64 rule_id = DR_DBG_PTR_TO_ID(rule);
@@ -565,19 +565,19 @@ static int dr_dump_rule(struct seq_file *file, struct mlx5dr_rule *rule)
                return ret;
 
        if (rx->nic_matcher) {
-               ret = dr_dump_rule_rx_tx(file, rx, true, rule_id, format_ver);
+               ret = dr_dump_rule_rx_tx(file, buff, rx, true, rule_id, format_ver);
                if (ret < 0)
                        return ret;
        }
 
        if (tx->nic_matcher) {
-               ret = dr_dump_rule_rx_tx(file, tx, false, rule_id, format_ver);
+               ret = dr_dump_rule_rx_tx(file, buff, tx, false, rule_id, format_ver);
                if (ret < 0)
                        return ret;
        }
 
        list_for_each_entry(action_mem, &rule->rule_actions_list, list) {
-               ret = dr_dump_rule_action_mem(file, rule_id, action_mem);
+               ret = dr_dump_rule_action_mem(file, buff, rule_id, action_mem);
                if (ret < 0)
                        return ret;
        }
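The dr_dbg changes in this file all follow one pattern: each helper used to declare its own `char buff[MLX5DR_DEBUG_DUMP_BUFF_LENGTH]` on the stack, so a dump traversal stacked one such array per call level; threading a single caller-owned buffer through the chain (and marking the top-level entry points `noinline_for_stack`) bounds the cost to one buffer regardless of depth. A standalone sketch of the refactor (toy record format and hypothetical buffer length):

```c
#include <stdio.h>
#include <string.h>

#define BUFF_LEN 64		/* stand-in for the dump buffer length */

/* Leaf formatter writes into the caller's buffer rather than
 * declaring its own BUFF_LEN array on the stack. */
static int dump_leaf(char *buff, unsigned long id)
{
	int ret = snprintf(buff, BUFF_LEN, "leaf,0x%lx\n", id);

	return (ret < 0 || ret >= BUFF_LEN) ? -1 : 0;
}

/* The top-level walker owns the single scratch buffer for the whole
 * traversal, so stack usage stays flat however deep the dump call
 * chain goes. */
static int dump_all(char *out, size_t outlen, const unsigned long *ids, int n)
{
	char buff[BUFF_LEN];
	int i;

	out[0] = '\0';
	for (i = 0; i < n; i++) {
		if (dump_leaf(buff, ids[i]))
			return -1;
		if (strlen(out) + strlen(buff) + 1 > outlen)
			return -1;
		strcat(out, buff);
	}
	return 0;
}
```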
@@ -586,10 +586,10 @@ static int dr_dump_rule(struct seq_file *file, struct mlx5dr_rule *rule)
 }
 
 static int
-dr_dump_matcher_mask(struct seq_file *file, struct mlx5dr_match_param *mask,
+dr_dump_matcher_mask(struct seq_file *file, char *buff,
+                    struct mlx5dr_match_param *mask,
                     u8 criteria, const u64 matcher_id)
 {
-       char buff[MLX5DR_DEBUG_DUMP_BUFF_LENGTH];
        char dump[DR_HEX_SIZE];
        int ret;
 
@@ -681,10 +681,10 @@ dr_dump_matcher_mask(struct seq_file *file, struct mlx5dr_match_param *mask,
 }
 
 static int
-dr_dump_matcher_builder(struct seq_file *file, struct mlx5dr_ste_build *builder,
+dr_dump_matcher_builder(struct seq_file *file, char *buff,
+                       struct mlx5dr_ste_build *builder,
                        u32 index, bool is_rx, const u64 matcher_id)
 {
-       char buff[MLX5DR_DEBUG_DUMP_BUFF_LENGTH];
        int ret;
 
        ret = snprintf(buff, MLX5DR_DEBUG_DUMP_BUFF_LENGTH,
@@ -702,11 +702,10 @@ dr_dump_matcher_builder(struct seq_file *file, struct mlx5dr_ste_build *builder,
 }
 
 static int
-dr_dump_matcher_rx_tx(struct seq_file *file, bool is_rx,
+dr_dump_matcher_rx_tx(struct seq_file *file, char *buff, bool is_rx,
                      struct mlx5dr_matcher_rx_tx *matcher_rx_tx,
                      const u64 matcher_id)
 {
-       char buff[MLX5DR_DEBUG_DUMP_BUFF_LENGTH];
        enum dr_dump_rec_type rec_type;
        u64 s_icm_addr, e_icm_addr;
        int i, ret;
@@ -731,7 +730,7 @@ dr_dump_matcher_rx_tx(struct seq_file *file, bool is_rx,
                return ret;
 
        for (i = 0; i < matcher_rx_tx->num_of_builders; i++) {
-               ret = dr_dump_matcher_builder(file,
+               ret = dr_dump_matcher_builder(file, buff,
                                              &matcher_rx_tx->ste_builder[i],
                                              i, is_rx, matcher_id);
                if (ret < 0)
@@ -741,7 +740,7 @@ dr_dump_matcher_rx_tx(struct seq_file *file, bool is_rx,
        return 0;
 }
 
-static int
+static noinline_for_stack int
 dr_dump_matcher(struct seq_file *file, struct mlx5dr_matcher *matcher)
 {
        struct mlx5dr_matcher_rx_tx *rx = &matcher->rx;
@@ -763,19 +762,19 @@ dr_dump_matcher(struct seq_file *file, struct mlx5dr_matcher *matcher)
        if (ret)
                return ret;
 
-       ret = dr_dump_matcher_mask(file, &matcher->mask,
+       ret = dr_dump_matcher_mask(file, buff, &matcher->mask,
                                   matcher->match_criteria, matcher_id);
        if (ret < 0)
                return ret;
 
        if (rx->nic_tbl) {
-               ret = dr_dump_matcher_rx_tx(file, true, rx, matcher_id);
+               ret = dr_dump_matcher_rx_tx(file, buff, true, rx, matcher_id);
                if (ret < 0)
                        return ret;
        }
 
        if (tx->nic_tbl) {
-               ret = dr_dump_matcher_rx_tx(file, false, tx, matcher_id);
+               ret = dr_dump_matcher_rx_tx(file, buff, false, tx, matcher_id);
                if (ret < 0)
                        return ret;
        }
@@ -803,11 +802,10 @@ dr_dump_matcher_all(struct seq_file *file, struct mlx5dr_matcher *matcher)
 }
 
 static int
-dr_dump_table_rx_tx(struct seq_file *file, bool is_rx,
+dr_dump_table_rx_tx(struct seq_file *file, char *buff, bool is_rx,
                    struct mlx5dr_table_rx_tx *table_rx_tx,
                    const u64 table_id)
 {
-       char buff[MLX5DR_DEBUG_DUMP_BUFF_LENGTH];
        enum dr_dump_rec_type rec_type;
        u64 s_icm_addr;
        int ret;
@@ -829,7 +827,8 @@ dr_dump_table_rx_tx(struct seq_file *file, bool is_rx,
        return 0;
 }
 
-static int dr_dump_table(struct seq_file *file, struct mlx5dr_table *table)
+static noinline_for_stack int
+dr_dump_table(struct seq_file *file, struct mlx5dr_table *table)
 {
        struct mlx5dr_table_rx_tx *rx = &table->rx;
        struct mlx5dr_table_rx_tx *tx = &table->tx;
@@ -848,14 +847,14 @@ static int dr_dump_table(struct seq_file *file, struct mlx5dr_table *table)
                return ret;
 
        if (rx->nic_dmn) {
-               ret = dr_dump_table_rx_tx(file, true, rx,
+               ret = dr_dump_table_rx_tx(file, buff, true, rx,
                                          DR_DBG_PTR_TO_ID(table));
                if (ret < 0)
                        return ret;
        }
 
        if (tx->nic_dmn) {
-               ret = dr_dump_table_rx_tx(file, false, tx,
+               ret = dr_dump_table_rx_tx(file, buff, false, tx,
                                          DR_DBG_PTR_TO_ID(table));
                if (ret < 0)
                        return ret;
@@ -881,10 +880,10 @@ static int dr_dump_table_all(struct seq_file *file, struct mlx5dr_table *tbl)
 }
 
 static int
-dr_dump_send_ring(struct seq_file *file, struct mlx5dr_send_ring *ring,
+dr_dump_send_ring(struct seq_file *file, char *buff,
+                 struct mlx5dr_send_ring *ring,
                  const u64 domain_id)
 {
-       char buff[MLX5DR_DEBUG_DUMP_BUFF_LENGTH];
        int ret;
 
        ret = snprintf(buff, MLX5DR_DEBUG_DUMP_BUFF_LENGTH,
@@ -902,13 +901,13 @@ dr_dump_send_ring(struct seq_file *file, struct mlx5dr_send_ring *ring,
        return 0;
 }
 
-static noinline_for_stack int
+static int
 dr_dump_domain_info_flex_parser(struct seq_file *file,
+                               char *buff,
                                const char *flex_parser_name,
                                const u8 flex_parser_value,
                                const u64 domain_id)
 {
-       char buff[MLX5DR_DEBUG_DUMP_BUFF_LENGTH];
        int ret;
 
        ret = snprintf(buff, MLX5DR_DEBUG_DUMP_BUFF_LENGTH,
@@ -925,11 +924,11 @@ dr_dump_domain_info_flex_parser(struct seq_file *file,
        return 0;
 }
 
-static noinline_for_stack int
-dr_dump_domain_info_caps(struct seq_file *file, struct mlx5dr_cmd_caps *caps,
+static int
+dr_dump_domain_info_caps(struct seq_file *file, char *buff,
+                        struct mlx5dr_cmd_caps *caps,
                         const u64 domain_id)
 {
-       char buff[MLX5DR_DEBUG_DUMP_BUFF_LENGTH];
        struct mlx5dr_cmd_vport_cap *vport_caps;
        unsigned long i, vports_num;
        int ret;
@@ -969,34 +968,35 @@ dr_dump_domain_info_caps(struct seq_file *file, struct mlx5dr_cmd_caps *caps,
 }
 
 static int
-dr_dump_domain_info(struct seq_file *file, struct mlx5dr_domain_info *info,
+dr_dump_domain_info(struct seq_file *file, char *buff,
+                   struct mlx5dr_domain_info *info,
                    const u64 domain_id)
 {
        int ret;
 
-       ret = dr_dump_domain_info_caps(file, &info->caps, domain_id);
+       ret = dr_dump_domain_info_caps(file, buff, &info->caps, domain_id);
        if (ret < 0)
                return ret;
 
-       ret = dr_dump_domain_info_flex_parser(file, "icmp_dw0",
+       ret = dr_dump_domain_info_flex_parser(file, buff, "icmp_dw0",
                                              info->caps.flex_parser_id_icmp_dw0,
                                              domain_id);
        if (ret < 0)
                return ret;
 
-       ret = dr_dump_domain_info_flex_parser(file, "icmp_dw1",
+       ret = dr_dump_domain_info_flex_parser(file, buff, "icmp_dw1",
                                              info->caps.flex_parser_id_icmp_dw1,
                                              domain_id);
        if (ret < 0)
                return ret;
 
-       ret = dr_dump_domain_info_flex_parser(file, "icmpv6_dw0",
+       ret = dr_dump_domain_info_flex_parser(file, buff, "icmpv6_dw0",
                                              info->caps.flex_parser_id_icmpv6_dw0,
                                              domain_id);
        if (ret < 0)
                return ret;
 
-       ret = dr_dump_domain_info_flex_parser(file, "icmpv6_dw1",
+       ret = dr_dump_domain_info_flex_parser(file, buff, "icmpv6_dw1",
                                              info->caps.flex_parser_id_icmpv6_dw1,
                                              domain_id);
        if (ret < 0)
@@ -1032,12 +1032,12 @@ dr_dump_domain(struct seq_file *file, struct mlx5dr_domain *dmn)
        if (ret)
                return ret;
 
-       ret = dr_dump_domain_info(file, &dmn->info, domain_id);
+       ret = dr_dump_domain_info(file, buff, &dmn->info, domain_id);
        if (ret < 0)
                return ret;
 
        if (dmn->info.supp_sw_steering) {
-               ret = dr_dump_send_ring(file, dmn->send_ring, domain_id);
+               ret = dr_dump_send_ring(file, buff, dmn->send_ring, domain_id);
                if (ret < 0)
                        return ret;
        }
index 3a1b1a1f5a1951069f9c3e5ee5e3a10c1be55eb6..60dd2fd603a8554f02f5d649e8d290dc074b5d72 100644 (file)
@@ -731,7 +731,7 @@ static int sparx5_port_pcs_low_set(struct sparx5 *sparx5,
        bool sgmii = false, inband_aneg = false;
        int err;
 
-       if (port->conf.inband) {
+       if (conf->inband) {
                if (conf->portmode == PHY_INTERFACE_MODE_SGMII ||
                    conf->portmode == PHY_INTERFACE_MODE_QSGMII)
                        inband_aneg = true; /* Cisco-SGMII in-band-aneg */
@@ -948,7 +948,7 @@ int sparx5_port_pcs_set(struct sparx5 *sparx5,
        if (err)
                return -EINVAL;
 
-       if (port->conf.inband) {
+       if (conf->inband) {
                /* Enable/disable 1G counters in ASM */
                spx5_rmw(ASM_PORT_CFG_CSC_STAT_DIS_SET(high_speed_dev),
                         ASM_PORT_CFG_CSC_STAT_DIS,
index 8a27328eae34b7ef10b134d354e7ac0ccc032b28..0fc5fe564ae50be28bc6568f90d339d840a4b8d1 100644 (file)
@@ -5046,7 +5046,8 @@ static void rtl_remove_one(struct pci_dev *pdev)
 
        cancel_work_sync(&tp->wk.work);
 
-       r8169_remove_leds(tp->leds);
+       if (IS_ENABLED(CONFIG_R8169_LEDS))
+               r8169_remove_leds(tp->leds);
 
        unregister_netdev(tp->dev);
 
index 943d72bdd794ca5e6258cb02841447ca38898251..27281a9a8951dbd53f30a27a14e0ac0be9a35c5d 100644 (file)
@@ -2076,6 +2076,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
        bool vwc = ns->ctrl->vwc & NVME_CTRL_VWC_PRESENT;
        struct queue_limits lim;
        struct nvme_id_ns_nvm *nvm = NULL;
+       struct nvme_zone_info zi = {};
        struct nvme_id_ns *id;
        sector_t capacity;
        unsigned lbaf;
@@ -2088,9 +2089,10 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
        if (id->ncap == 0) {
                /* namespace not allocated or attached */
                info->is_removed = true;
-               ret = -ENODEV;
+               ret = -ENXIO;
                goto out;
        }
+       lbaf = nvme_lbaf_index(id->flbas);
 
        if (ns->ctrl->ctratt & NVME_CTRL_ATTR_ELBAS) {
                ret = nvme_identify_ns_nvm(ns->ctrl, info->nsid, &nvm);
@@ -2098,8 +2100,14 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
                        goto out;
        }
 
+       if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
+           ns->head->ids.csi == NVME_CSI_ZNS) {
+               ret = nvme_query_zone_info(ns, lbaf, &zi);
+               if (ret < 0)
+                       goto out;
+       }
+
        blk_mq_freeze_queue(ns->disk->queue);
-       lbaf = nvme_lbaf_index(id->flbas);
        ns->head->lba_shift = id->lbaf[lbaf].ds;
        ns->head->nuse = le64_to_cpu(id->nuse);
        capacity = nvme_lba_to_sect(ns->head, le64_to_cpu(id->nsze));
@@ -2112,13 +2120,8 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
                capacity = 0;
        nvme_config_discard(ns, &lim);
        if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
-           ns->head->ids.csi == NVME_CSI_ZNS) {
-               ret = nvme_update_zone_info(ns, lbaf, &lim);
-               if (ret) {
-                       blk_mq_unfreeze_queue(ns->disk->queue);
-                       goto out;
-               }
-       }
+           ns->head->ids.csi == NVME_CSI_ZNS)
+               nvme_update_zone_info(ns, &lim, &zi);
        ret = queue_limits_commit_update(ns->disk->queue, &lim);
        if (ret) {
                blk_mq_unfreeze_queue(ns->disk->queue);
@@ -2201,6 +2204,7 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
        }
 
        if (!ret && nvme_ns_head_multipath(ns->head)) {
+               struct queue_limits *ns_lim = &ns->disk->queue->limits;
                struct queue_limits lim;
 
                blk_mq_freeze_queue(ns->head->disk->queue);
@@ -2212,7 +2216,26 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
                set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
                nvme_mpath_revalidate_paths(ns);
 
+               /*
+                * queue_limits mixes values that are the hardware limitations
+                * for bio splitting with what is the device configuration.
+                *
+                * For NVMe the device configuration can change after e.g. a
+                * Format command, and we really want to pick up the new format
+                * value here.  But we must still stack the queue limits to the
+                * least common denominator for multipathing to split the bios
+                * properly.
+                *
+                * To work around this, we explicitly set the device
+                * configuration to those that we just queried, but only stack
+                * the splitting limits in to make sure we still obey possibly
+                * lower limitations of other controllers.
+                */
                lim = queue_limits_start_update(ns->head->disk->queue);
+               lim.logical_block_size = ns_lim->logical_block_size;
+               lim.physical_block_size = ns_lim->physical_block_size;
+               lim.io_min = ns_lim->io_min;
+               lim.io_opt = ns_lim->io_opt;
                queue_limits_stack_bdev(&lim, ns->disk->part0, 0,
                                        ns->head->disk->disk_name);
                ret = queue_limits_commit_update(ns->head->disk->queue, &lim);
index 68a5d971657bb5080f717f5ae1ec5645830aadd5..a5b29e9ad342df82730ba9a06ea28e755d1953ee 100644 (file)
@@ -2428,7 +2428,7 @@ nvme_fc_ctrl_get(struct nvme_fc_ctrl *ctrl)
  * controller. Called after last nvme_put_ctrl() call
  */
 static void
-nvme_fc_nvme_ctrl_freed(struct nvme_ctrl *nctrl)
+nvme_fc_free_ctrl(struct nvme_ctrl *nctrl)
 {
        struct nvme_fc_ctrl *ctrl = to_fc_ctrl(nctrl);
 
@@ -3384,7 +3384,7 @@ static const struct nvme_ctrl_ops nvme_fc_ctrl_ops = {
        .reg_read32             = nvmf_reg_read32,
        .reg_read64             = nvmf_reg_read64,
        .reg_write32            = nvmf_reg_write32,
-       .free_ctrl              = nvme_fc_nvme_ctrl_freed,
+       .free_ctrl              = nvme_fc_free_ctrl,
        .submit_async_event     = nvme_fc_submit_async_event,
        .delete_ctrl            = nvme_fc_delete_ctrl,
        .get_address            = nvmf_get_address,
index 24193fcb8bd584de277606d57738c1aea5a9cb49..d0ed64dc7380e51577bc6ece92db1d0a273905a7 100644 (file)
@@ -1036,10 +1036,18 @@ static inline bool nvme_disk_is_ns_head(struct gendisk *disk)
 }
 #endif /* CONFIG_NVME_MULTIPATH */
 
+struct nvme_zone_info {
+       u64 zone_size;
+       unsigned int max_open_zones;
+       unsigned int max_active_zones;
+};
+
 int nvme_ns_report_zones(struct nvme_ns *ns, sector_t sector,
                unsigned int nr_zones, report_zones_cb cb, void *data);
-int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf,
-               struct queue_limits *lim);
+int nvme_query_zone_info(struct nvme_ns *ns, unsigned lbaf,
+               struct nvme_zone_info *zi);
+void nvme_update_zone_info(struct nvme_ns *ns, struct queue_limits *lim,
+               struct nvme_zone_info *zi);
 #ifdef CONFIG_BLK_DEV_ZONED
 blk_status_t nvme_setup_zone_mgmt_send(struct nvme_ns *ns, struct request *req,
                                       struct nvme_command *cmnd,
index 722384bcc765cda778972c8a86345eaaf18a7353..77aa0f440a6d2a5a538ad06e9cae9ae97c409aef 100644 (file)
@@ -35,8 +35,8 @@ static int nvme_set_max_append(struct nvme_ctrl *ctrl)
        return 0;
 }
 
-int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf,
-               struct queue_limits *lim)
+int nvme_query_zone_info(struct nvme_ns *ns, unsigned lbaf,
+               struct nvme_zone_info *zi)
 {
        struct nvme_effects_log *log = ns->head->effects;
        struct nvme_command c = { };
@@ -89,27 +89,34 @@ int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf,
                goto free_data;
        }
 
-       ns->head->zsze =
-               nvme_lba_to_sect(ns->head, le64_to_cpu(id->lbafe[lbaf].zsze));
-       if (!is_power_of_2(ns->head->zsze)) {
+       zi->zone_size = le64_to_cpu(id->lbafe[lbaf].zsze);
+       if (!is_power_of_2(zi->zone_size)) {
                dev_warn(ns->ctrl->device,
-                       "invalid zone size:%llu for namespace:%u\n",
-                       ns->head->zsze, ns->head->ns_id);
+                       "invalid zone size: %llu for namespace: %u\n",
+                       zi->zone_size, ns->head->ns_id);
                status = -ENODEV;
                goto free_data;
        }
+       zi->max_open_zones = le32_to_cpu(id->mor) + 1;
+       zi->max_active_zones = le32_to_cpu(id->mar) + 1;
 
-       blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, ns->queue);
-       lim->zoned = 1;
-       lim->max_open_zones = le32_to_cpu(id->mor) + 1;
-       lim->max_active_zones = le32_to_cpu(id->mar) + 1;
-       lim->chunk_sectors = ns->head->zsze;
-       lim->max_zone_append_sectors = ns->ctrl->max_zone_append;
 free_data:
        kfree(id);
        return status;
 }
 
+void nvme_update_zone_info(struct nvme_ns *ns, struct queue_limits *lim,
+               struct nvme_zone_info *zi)
+{
+       lim->zoned = 1;
+       lim->max_open_zones = zi->max_open_zones;
+       lim->max_active_zones = zi->max_active_zones;
+       lim->max_zone_append_sectors = ns->ctrl->max_zone_append;
+       lim->chunk_sectors = ns->head->zsze =
+               nvme_lba_to_sect(ns->head, zi->zone_size);
+       blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, ns->queue);
+}
+
 static void *nvme_zns_alloc_report_buffer(struct nvme_ns *ns,
                                          unsigned int nr_zones, size_t *buflen)
 {
index 77a6e817b31596998e4424aa8205f8cfd9219f1d..a2325330bf22145202837aa5cf89d9ec6543ab59 100644 (file)
@@ -1613,6 +1613,11 @@ static struct config_group *nvmet_subsys_make(struct config_group *group,
                return ERR_PTR(-EINVAL);
        }
 
+       if (sysfs_streq(name, nvmet_disc_subsys->subsysnqn)) {
+               pr_err("can't create subsystem using unique discovery NQN\n");
+               return ERR_PTR(-EINVAL);
+       }
+
        subsys = nvmet_subsys_alloc(name, NVME_NQN_NVME);
        if (IS_ERR(subsys))
                return ERR_CAST(subsys);
@@ -2159,7 +2164,49 @@ static const struct config_item_type nvmet_hosts_type = {
 
 static struct config_group nvmet_hosts_group;
 
+static ssize_t nvmet_root_discovery_nqn_show(struct config_item *item,
+                                            char *page)
+{
+       return snprintf(page, PAGE_SIZE, "%s\n", nvmet_disc_subsys->subsysnqn);
+}
+
+static ssize_t nvmet_root_discovery_nqn_store(struct config_item *item,
+               const char *page, size_t count)
+{
+       struct list_head *entry;
+       size_t len;
+
+       len = strcspn(page, "\n");
+       if (!len || len > NVMF_NQN_FIELD_LEN - 1)
+               return -EINVAL;
+
+       down_write(&nvmet_config_sem);
+       list_for_each(entry, &nvmet_subsystems_group.cg_children) {
+               struct config_item *item =
+                       container_of(entry, struct config_item, ci_entry);
+
+               if (!strncmp(config_item_name(item), page, len)) {
+                       pr_err("duplicate NQN %s\n", config_item_name(item));
+                       up_write(&nvmet_config_sem);
+                       return -EINVAL;
+               }
+       }
+       memset(nvmet_disc_subsys->subsysnqn, 0, NVMF_NQN_FIELD_LEN);
+       memcpy(nvmet_disc_subsys->subsysnqn, page, len);
+       up_write(&nvmet_config_sem);
+
+       return len;
+}
+
+CONFIGFS_ATTR(nvmet_root_, discovery_nqn);
+
+static struct configfs_attribute *nvmet_root_attrs[] = {
+       &nvmet_root_attr_discovery_nqn,
+       NULL,
+};
+
 static const struct config_item_type nvmet_root_type = {
+       .ct_attrs               = nvmet_root_attrs,
        .ct_owner               = THIS_MODULE,
 };
 
index 6bbe4df0166ca56949a5f5b14ad90f68305d6f36..8860a3eb71ec891e948a34060f34b4b148553418 100644 (file)
@@ -1541,6 +1541,13 @@ static struct nvmet_subsys *nvmet_find_get_subsys(struct nvmet_port *port,
        }
 
        down_read(&nvmet_config_sem);
+       if (!strncmp(nvmet_disc_subsys->subsysnqn, subsysnqn,
+                               NVMF_NQN_SIZE)) {
+               if (kref_get_unless_zero(&nvmet_disc_subsys->ref)) {
+                       up_read(&nvmet_config_sem);
+                       return nvmet_disc_subsys;
+               }
+       }
        list_for_each_entry(p, &port->subsystems, entry) {
                if (!strncmp(p->subsys->subsysnqn, subsysnqn,
                                NVMF_NQN_SIZE)) {
index fd229f310c931fbfd6c3132185f2b73c135cd633..337ee1cb09ae644bb98bdf5e8da0525ff40230c3 100644 (file)
@@ -1115,16 +1115,21 @@ nvmet_fc_schedule_delete_assoc(struct nvmet_fc_tgt_assoc *assoc)
 }
 
 static bool
-nvmet_fc_assoc_exits(struct nvmet_fc_tgtport *tgtport, u64 association_id)
+nvmet_fc_assoc_exists(struct nvmet_fc_tgtport *tgtport, u64 association_id)
 {
        struct nvmet_fc_tgt_assoc *a;
+       bool found = false;
 
+       rcu_read_lock();
        list_for_each_entry_rcu(a, &tgtport->assoc_list, a_list) {
-               if (association_id == a->association_id)
-                       return true;
+               if (association_id == a->association_id) {
+                       found = true;
+                       break;
+               }
        }
+       rcu_read_unlock();
 
-       return false;
+       return found;
 }
 
 static struct nvmet_fc_tgt_assoc *
@@ -1164,13 +1169,11 @@ nvmet_fc_alloc_target_assoc(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
                ran = ran << BYTES_FOR_QID_SHIFT;
 
                spin_lock_irqsave(&tgtport->lock, flags);
-               rcu_read_lock();
-               if (!nvmet_fc_assoc_exits(tgtport, ran)) {
+               if (!nvmet_fc_assoc_exists(tgtport, ran)) {
                        assoc->association_id = ran;
                        list_add_tail_rcu(&assoc->a_list, &tgtport->assoc_list);
                        done = true;
                }
-               rcu_read_unlock();
                spin_unlock_irqrestore(&tgtport->lock, flags);
        } while (!done);
 
index 3bf27052832f302ac72d366b986629e03db4e900..4d57a4e34105466f8997b210271b231d216cb9b5 100644 (file)
@@ -9,6 +9,7 @@
 
 #define pr_fmt(fmt)    "OF: " fmt
 
+#include <linux/device.h>
 #include <linux/of.h>
 #include <linux/spinlock.h>
 #include <linux/slab.h>
@@ -667,6 +668,17 @@ void of_changeset_destroy(struct of_changeset *ocs)
 {
        struct of_changeset_entry *ce, *cen;
 
+       /*
+        * When a device is deleted, the device links to/from it are also queued
+        * for deletion. Until these device links are freed, the devices
+        * themselves aren't freed. If the device being deleted is due to an
+        * overlay change, this device might be holding a reference to a device
+        * node that will be freed. So, wait until all already pending device
+        * links are deleted before freeing a device node. This ensures we don't
+        * free any device node that has a non-zero reference count.
+        */
+       device_link_wait_removal();
+
        list_for_each_entry_safe_reverse(ce, cen, &ocs->entries, node)
                __of_changeset_entry_destroy(ce);
 }
index 0e8aa974f0f2bb5262dfbc8b03978b6381bfd61e..f58e624953a20f25f058841b70eb9468d6f5d11e 100644 (file)
@@ -16,6 +16,14 @@ ssize_t of_modalias(const struct device_node *np, char *str, ssize_t len)
        ssize_t csize;
        ssize_t tsize;
 
+       /*
+        * Prevent a kernel oops in vsnprintf() -- it only allows passing a
+        * NULL ptr when the length is also 0. Also filter out the negative
+        * lengths...
+        */
+       if ((len > 0 && !str) || len < 0)
+               return -EINVAL;
+
        /* Name & Type */
        /* %p eats all alphanum characters, so %c must be used here */
        csize = snprintf(str, len, "of:N%pOFn%c%s", np, 'T',
index c78a6fd6c57f612221749d44673d47845911231f..b4efdddb2ad91f2c8b6a52f769a1fcff84206811 100644 (file)
@@ -313,6 +313,10 @@ static int riscv_pmu_event_init(struct perf_event *event)
        u64 event_config = 0;
        uint64_t cmask;
 
+       /* driver does not support branch stack sampling */
+       if (has_branch_stack(event))
+               return -EOPNOTSUPP;
+
        hwc->flags = 0;
        mapped_event = rvpmu->event_map(event, &event_config);
        if (mapped_event < 0) {
index 8ea867c2a01a371a64e1eb10327931861d306dc8..62bc24f6dcc7a82cb11361b133d65bab77d90152 100644 (file)
@@ -263,12 +263,6 @@ static int cros_ec_uart_probe(struct serdev_device *serdev)
        if (!ec_dev)
                return -ENOMEM;
 
-       ret = devm_serdev_device_open(dev, serdev);
-       if (ret) {
-               dev_err(dev, "Unable to open UART device");
-               return ret;
-       }
-
        serdev_device_set_drvdata(serdev, ec_dev);
        init_waitqueue_head(&ec_uart->response.wait_queue);
 
@@ -280,14 +274,6 @@ static int cros_ec_uart_probe(struct serdev_device *serdev)
                return ret;
        }
 
-       ret = serdev_device_set_baudrate(serdev, ec_uart->baudrate);
-       if (ret < 0) {
-               dev_err(dev, "Failed to set up host baud rate (%d)", ret);
-               return ret;
-       }
-
-       serdev_device_set_flow_control(serdev, ec_uart->flowcontrol);
-
        /* Initialize ec_dev for cros_ec  */
        ec_dev->phys_name = dev_name(dev);
        ec_dev->dev = dev;
@@ -301,6 +287,20 @@ static int cros_ec_uart_probe(struct serdev_device *serdev)
 
        serdev_device_set_client_ops(serdev, &cros_ec_uart_client_ops);
 
+       ret = devm_serdev_device_open(dev, serdev);
+       if (ret) {
+               dev_err(dev, "Unable to open UART device");
+               return ret;
+       }
+
+       ret = serdev_device_set_baudrate(serdev, ec_uart->baudrate);
+       if (ret < 0) {
+               dev_err(dev, "Failed to set up host baud rate (%d)", ret);
+               return ret;
+       }
+
+       serdev_device_set_flow_control(serdev, ec_uart->flowcontrol);
+
        return cros_ec_register(ec_dev);
 }
 
index ee2e164f86b9c2973e317bfbfe3297571a8cab48..38c932df6446ac5714d225ac0545ff16345a6e27 100644 (file)
@@ -597,6 +597,15 @@ static const struct dmi_system_id acer_quirks[] __initconst = {
                },
                .driver_data = &quirk_acer_predator_v4,
        },
+       {
+               .callback = dmi_matched,
+               .ident = "Acer Predator PH18-71",
+               .matches = {
+                       DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
+                       DMI_MATCH(DMI_PRODUCT_NAME, "Predator PH18-71"),
+               },
+               .driver_data = &quirk_acer_predator_v4,
+       },
        {
                .callback = set_force_caps,
                .ident = "Acer Aspire Switch 10E SW3-016",
index 7457ca2b27a60b7adadcebb251dba45a0e675e97..c7a8276458640adc888f99fee23fcc10b5ddf2e0 100644 (file)
@@ -49,6 +49,8 @@ static const struct acpi_device_id intel_hid_ids[] = {
        {"INTC1076", 0},
        {"INTC1077", 0},
        {"INTC1078", 0},
+       {"INTC107B", 0},
+       {"INTC10CB", 0},
        {"", 0},
 };
 MODULE_DEVICE_TABLE(acpi, intel_hid_ids);
@@ -504,6 +506,7 @@ static void notify_handler(acpi_handle handle, u32 event, void *context)
        struct platform_device *device = context;
        struct intel_hid_priv *priv = dev_get_drvdata(&device->dev);
        unsigned long long ev_index;
+       struct key_entry *ke;
        int err;
 
        /*
@@ -545,11 +548,15 @@ static void notify_handler(acpi_handle handle, u32 event, void *context)
                if (event == 0xc0 || !priv->array)
                        return;
 
-               if (!sparse_keymap_entry_from_scancode(priv->array, event)) {
+               ke = sparse_keymap_entry_from_scancode(priv->array, event);
+               if (!ke) {
                        dev_info(&device->dev, "unknown event 0x%x\n", event);
                        return;
                }
 
+               if (ke->type == KE_IGNORE)
+                       return;
+
 wakeup:
                pm_wakeup_hard_event(&device->dev);
 
index 084c355c86f5fa9050ccb881a7efa6682b538773..79bb2c801daa972a74b96596e7129583c7abb39c 100644 (file)
@@ -136,8 +136,6 @@ static int intel_vbtn_input_setup(struct platform_device *device)
        priv->switches_dev->id.bustype = BUS_HOST;
 
        if (priv->has_switches) {
-               detect_tablet_mode(&device->dev);
-
                ret = input_register_device(priv->switches_dev);
                if (ret)
                        return ret;
@@ -258,9 +256,6 @@ static const struct dmi_system_id dmi_switches_allow_list[] = {
 
 static bool intel_vbtn_has_switches(acpi_handle handle, bool dual_accel)
 {
-       unsigned long long vgbs;
-       acpi_status status;
-
        /* See dual_accel_detect.h for more info */
        if (dual_accel)
                return false;
@@ -268,8 +263,7 @@ static bool intel_vbtn_has_switches(acpi_handle handle, bool dual_accel)
        if (!dmi_check_system(dmi_switches_allow_list))
                return false;
 
-       status = acpi_evaluate_integer(handle, "VGBS", NULL, &vgbs);
-       return ACPI_SUCCESS(status);
+       return acpi_has_method(handle, "VGBS");
 }
 
 static int intel_vbtn_probe(struct platform_device *device)
@@ -316,6 +310,9 @@ static int intel_vbtn_probe(struct platform_device *device)
                if (ACPI_FAILURE(status))
                        dev_err(&device->dev, "Error VBDL failed with ACPI status %d\n", status);
        }
+       // Check switches after buttons since VBDL may have side effects.
+       if (has_switches)
+               detect_tablet_mode(&device->dev);
 
        device_init_wakeup(&device->dev, true);
        /*
index ad3c39e9e9f586d301abd572c83e76d554a5c382..e714ee6298dda8a66637aa918e33c861508ce15e 100644 (file)
@@ -736,7 +736,7 @@ static int acpi_add(struct acpi_device *device)
                default:
                        year = 2019;
                }
-       pr_info("product: %s  year: %d\n", product, year);
+       pr_info("product: %s  year: %d\n", product ?: "unknown", year);
 
        if (year >= 2019)
                battery_limit_use_wmbb = 1;
index 291f14ef67024a35befa2ab2418e69b8c94c8302..77244c9aa60d233dd35316d764158ab6dcc378ae 100644 (file)
@@ -264,6 +264,7 @@ static const struct key_entry toshiba_acpi_keymap[] = {
        { KE_KEY, 0xb32, { KEY_NEXTSONG } },
        { KE_KEY, 0xb33, { KEY_PLAYPAUSE } },
        { KE_KEY, 0xb5a, { KEY_MEDIA } },
+       { KE_IGNORE, 0x0e00, { KEY_RESERVED } }, /* Wake from sleep */
        { KE_IGNORE, 0x1430, { KEY_RESERVED } }, /* Wake from sleep */
        { KE_IGNORE, 0x1501, { KEY_RESERVED } }, /* Output changed */
        { KE_IGNORE, 0x1502, { KEY_RESERVED } }, /* HDMI plugged/unplugged */
@@ -3523,9 +3524,10 @@ static void toshiba_acpi_notify(struct acpi_device *acpi_dev, u32 event)
                                        (dev->kbd_mode == SCI_KBD_MODE_ON) ?
                                        LED_FULL : LED_OFF);
                break;
+       case 0x8e: /* Power button pressed */
+               break;
        case 0x85: /* Unknown */
        case 0x8d: /* Unknown */
-       case 0x8e: /* Unknown */
        case 0x94: /* Unknown */
        case 0x95: /* Unknown */
        default:
index a06f5f2d79329d615807fcc51064705accfdcc63..9c2f0dd42613d43a456974c0fd0018607e2867fe 100644 (file)
@@ -267,10 +267,17 @@ static const struct i2c_device_id tps65132_id[] = {
 };
 MODULE_DEVICE_TABLE(i2c, tps65132_id);
 
+static const struct of_device_id __maybe_unused tps65132_of_match[] = {
+       { .compatible = "ti,tps65132" },
+       {},
+};
+MODULE_DEVICE_TABLE(of, tps65132_of_match);
+
 static struct i2c_driver tps65132_i2c_driver = {
        .driver = {
                .name = "tps65132",
                .probe_type = PROBE_PREFER_ASYNCHRONOUS,
+               .of_match_table = of_match_ptr(tps65132_of_match),
        },
        .probe = tps65132_probe,
        .id_table = tps65132_id,
index affb05521e146f3ec6de7ffaf005517b6a0caba7..2c8e964425dc38ca80fa5009b17b4e9dc29bbf10 100644 (file)
@@ -14,8 +14,6 @@
 #include <linux/err.h>
 #include <linux/ctype.h>
 #include <linux/processor.h>
-#include <linux/dma-mapping.h>
-#include <linux/mm.h>
 
 #include "ism.h"
 
@@ -294,15 +292,13 @@ out:
 static void ism_free_dmb(struct ism_dev *ism, struct ism_dmb *dmb)
 {
        clear_bit(dmb->sba_idx, ism->sba_bitmap);
-       dma_unmap_page(&ism->pdev->dev, dmb->dma_addr, dmb->dmb_len,
-                      DMA_FROM_DEVICE);
-       folio_put(virt_to_folio(dmb->cpu_addr));
+       dma_free_coherent(&ism->pdev->dev, dmb->dmb_len,
+                         dmb->cpu_addr, dmb->dma_addr);
 }
 
 static int ism_alloc_dmb(struct ism_dev *ism, struct ism_dmb *dmb)
 {
        unsigned long bit;
-       int rc;
 
        if (PAGE_ALIGN(dmb->dmb_len) > dma_get_max_seg_size(&ism->pdev->dev))
                return -EINVAL;
@@ -319,30 +315,14 @@ static int ism_alloc_dmb(struct ism_dev *ism, struct ism_dmb *dmb)
            test_and_set_bit(dmb->sba_idx, ism->sba_bitmap))
                return -EINVAL;
 
-       dmb->cpu_addr =
-               folio_address(folio_alloc(GFP_KERNEL | __GFP_NOWARN |
-                                         __GFP_NOMEMALLOC | __GFP_NORETRY,
-                                         get_order(dmb->dmb_len)));
+       dmb->cpu_addr = dma_alloc_coherent(&ism->pdev->dev, dmb->dmb_len,
+                                          &dmb->dma_addr,
+                                          GFP_KERNEL | __GFP_NOWARN |
+                                          __GFP_NOMEMALLOC | __GFP_NORETRY);
+       if (!dmb->cpu_addr)
+               clear_bit(dmb->sba_idx, ism->sba_bitmap);
 
-       if (!dmb->cpu_addr) {
-               rc = -ENOMEM;
-               goto out_bit;
-       }
-       dmb->dma_addr = dma_map_page(&ism->pdev->dev,
-                                    virt_to_page(dmb->cpu_addr), 0,
-                                    dmb->dmb_len, DMA_FROM_DEVICE);
-       if (dma_mapping_error(&ism->pdev->dev, dmb->dma_addr)) {
-               rc = -ENOMEM;
-               goto out_free;
-       }
-
-       return 0;
-
-out_free:
-       kfree(dmb->cpu_addr);
-out_bit:
-       clear_bit(dmb->sba_idx, ism->sba_bitmap);
-       return rc;
+       return dmb->cpu_addr ? 0 : -ENOMEM;
 }
 
 int ism_register_dmb(struct ism_dev *ism, struct ism_dmb *dmb,
index 097dfe4b620dce85736b8a0d5cf7f4b3c4842e9b..35f8e00850d6cb3e45063c2229c4f7532a9eae40 100644 (file)
@@ -1797,7 +1797,7 @@ static int hisi_sas_debug_I_T_nexus_reset(struct domain_device *device)
        if (dev_is_sata(device)) {
                struct ata_link *link = &device->sata_dev.ap->link;
 
-               rc = ata_wait_after_reset(link, HISI_SAS_WAIT_PHYUP_TIMEOUT,
+               rc = ata_wait_after_reset(link, jiffies + HISI_SAS_WAIT_PHYUP_TIMEOUT,
                                          smp_ata_check_ready_type);
        } else {
                msleep(2000);
index 7d2a33514538c2cd8083733d8303f4dc5934de7d..34f96cc35342bcb4ad2e4208d69b19073e6a9bb2 100644 (file)
@@ -2244,7 +2244,15 @@ slot_err_v3_hw(struct hisi_hba *hisi_hba, struct sas_task *task,
        case SAS_PROTOCOL_SATA | SAS_PROTOCOL_STP:
                if ((dw0 & CMPLT_HDR_RSPNS_XFRD_MSK) &&
                    (sipc_rx_err_type & RX_FIS_STATUS_ERR_MSK)) {
-                       ts->stat = SAS_PROTO_RESPONSE;
+                       if (task->ata_task.use_ncq) {
+                               struct domain_device *device = task->dev;
+                               struct hisi_sas_device *sas_dev = device->lldd_dev;
+
+                               sas_dev->dev_status = HISI_SAS_DEV_NCQ_ERR;
+                               slot->abort = 1;
+                       } else {
+                               ts->stat = SAS_PROTO_RESPONSE;
+                       }
                } else if (dma_rx_err_type & RX_DATA_LEN_UNDERFLOW_MSK) {
                        ts->residual = trans_tx_fail_type;
                        ts->stat = SAS_DATA_UNDERRUN;
index 5c261005b74e47063172d3a0d10fb6e34cf74764..f6e6db8b8aba9133410834ee819f11dbaf1efc4b 100644 (file)
@@ -135,7 +135,7 @@ static int smp_execute_task(struct domain_device *dev, void *req, int req_size,
 
 static inline void *alloc_smp_req(int size)
 {
-       u8 *p = kzalloc(size, GFP_KERNEL);
+       u8 *p = kzalloc(ALIGN(size, ARCH_DMA_MINALIGN), GFP_KERNEL);
        if (p)
                p[0] = SMP_REQUEST;
        return p;
index ca2e932dd9b7016a92649113df439627e4a1e32b..f684eb5e04898aff3a2d164bad4649dab716861f 100644 (file)
@@ -1775,9 +1775,9 @@ static ssize_t raid_state_show(struct device *dev,
 
                name = myrb_devstate_name(ldev_info->state);
                if (name)
-                       ret = snprintf(buf, 32, "%s\n", name);
+                       ret = snprintf(buf, 64, "%s\n", name);
                else
-                       ret = snprintf(buf, 32, "Invalid (%02X)\n",
+                       ret = snprintf(buf, 64, "Invalid (%02X)\n",
                                       ldev_info->state);
        } else {
                struct myrb_pdev_state *pdev_info = sdev->hostdata;
@@ -1796,9 +1796,9 @@ static ssize_t raid_state_show(struct device *dev,
                else
                        name = myrb_devstate_name(pdev_info->state);
                if (name)
-                       ret = snprintf(buf, 32, "%s\n", name);
+                       ret = snprintf(buf, 64, "%s\n", name);
                else
-                       ret = snprintf(buf, 32, "Invalid (%02X)\n",
+                       ret = snprintf(buf, 64, "Invalid (%02X)\n",
                                       pdev_info->state);
        }
        return ret;
@@ -1886,11 +1886,11 @@ static ssize_t raid_level_show(struct device *dev,
 
                name = myrb_raidlevel_name(ldev_info->raid_level);
                if (!name)
-                       return snprintf(buf, 32, "Invalid (%02X)\n",
+                       return snprintf(buf, 64, "Invalid (%02X)\n",
                                        ldev_info->state);
-               return snprintf(buf, 32, "%s\n", name);
+               return snprintf(buf, 64, "%s\n", name);
        }
-       return snprintf(buf, 32, "Physical Drive\n");
+       return snprintf(buf, 64, "Physical Drive\n");
 }
 static DEVICE_ATTR_RO(raid_level);
 
@@ -1903,15 +1903,15 @@ static ssize_t rebuild_show(struct device *dev,
        unsigned char status;
 
        if (sdev->channel < myrb_logical_channel(sdev->host))
-               return snprintf(buf, 32, "physical device - not rebuilding\n");
+               return snprintf(buf, 64, "physical device - not rebuilding\n");
 
        status = myrb_get_rbld_progress(cb, &rbld_buf);
 
        if (rbld_buf.ldev_num != sdev->id ||
            status != MYRB_STATUS_SUCCESS)
-               return snprintf(buf, 32, "not rebuilding\n");
+               return snprintf(buf, 64, "not rebuilding\n");
 
-       return snprintf(buf, 32, "rebuilding block %u of %u\n",
+       return snprintf(buf, 64, "rebuilding block %u of %u\n",
                        rbld_buf.ldev_size - rbld_buf.blocks_left,
                        rbld_buf.ldev_size);
 }
index a1eec65a9713f5bb79a25360f9554bc379f052b3..e824be9d9bbb94f1c1f88bf3da591d7e585dcf8d 100644 (file)
@@ -947,9 +947,9 @@ static ssize_t raid_state_show(struct device *dev,
 
                name = myrs_devstate_name(ldev_info->dev_state);
                if (name)
-                       ret = snprintf(buf, 32, "%s\n", name);
+                       ret = snprintf(buf, 64, "%s\n", name);
                else
-                       ret = snprintf(buf, 32, "Invalid (%02X)\n",
+                       ret = snprintf(buf, 64, "Invalid (%02X)\n",
                                       ldev_info->dev_state);
        } else {
                struct myrs_pdev_info *pdev_info;
@@ -958,9 +958,9 @@ static ssize_t raid_state_show(struct device *dev,
                pdev_info = sdev->hostdata;
                name = myrs_devstate_name(pdev_info->dev_state);
                if (name)
-                       ret = snprintf(buf, 32, "%s\n", name);
+                       ret = snprintf(buf, 64, "%s\n", name);
                else
-                       ret = snprintf(buf, 32, "Invalid (%02X)\n",
+                       ret = snprintf(buf, 64, "Invalid (%02X)\n",
                                       pdev_info->dev_state);
        }
        return ret;
@@ -1066,13 +1066,13 @@ static ssize_t raid_level_show(struct device *dev,
                ldev_info = sdev->hostdata;
                name = myrs_raid_level_name(ldev_info->raid_level);
                if (!name)
-                       return snprintf(buf, 32, "Invalid (%02X)\n",
+                       return snprintf(buf, 64, "Invalid (%02X)\n",
                                        ldev_info->dev_state);
 
        } else
                name = myrs_raid_level_name(MYRS_RAID_PHYSICAL);
 
-       return snprintf(buf, 32, "%s\n", name);
+       return snprintf(buf, 64, "%s\n", name);
 }
 static DEVICE_ATTR_RO(raid_level);
 
@@ -1086,7 +1086,7 @@ static ssize_t rebuild_show(struct device *dev,
        unsigned char status;
 
        if (sdev->channel < cs->ctlr_info->physchan_present)
-               return snprintf(buf, 32, "physical device - not rebuilding\n");
+               return snprintf(buf, 64, "physical device - not rebuilding\n");
 
        ldev_info = sdev->hostdata;
        ldev_num = ldev_info->ldev_num;
@@ -1098,11 +1098,11 @@ static ssize_t rebuild_show(struct device *dev,
                return -EIO;
        }
        if (ldev_info->rbld_active) {
-               return snprintf(buf, 32, "rebuilding block %zu of %zu\n",
+               return snprintf(buf, 64, "rebuilding block %zu of %zu\n",
                                (size_t)ldev_info->rbld_lba,
                                (size_t)ldev_info->cfg_devsize);
        } else
-               return snprintf(buf, 32, "not rebuilding\n");
+               return snprintf(buf, 64, "not rebuilding\n");
 }
 
 static ssize_t rebuild_store(struct device *dev,
@@ -1190,7 +1190,7 @@ static ssize_t consistency_check_show(struct device *dev,
        unsigned short ldev_num;
 
        if (sdev->channel < cs->ctlr_info->physchan_present)
-               return snprintf(buf, 32, "physical device - not checking\n");
+               return snprintf(buf, 64, "physical device - not checking\n");
 
        ldev_info = sdev->hostdata;
        if (!ldev_info)
@@ -1198,11 +1198,11 @@ static ssize_t consistency_check_show(struct device *dev,
        ldev_num = ldev_info->ldev_num;
        myrs_get_ldev_info(cs, ldev_num, ldev_info);
        if (ldev_info->cc_active)
-               return snprintf(buf, 32, "checking block %zu of %zu\n",
+               return snprintf(buf, 64, "checking block %zu of %zu\n",
                                (size_t)ldev_info->cc_lba,
                                (size_t)ldev_info->cfg_devsize);
        else
-               return snprintf(buf, 32, "not checking\n");
+               return snprintf(buf, 64, "not checking\n");
 }
 
 static ssize_t consistency_check_store(struct device *dev,
index 26e6b3e3af4317ca088941bc5bab37c15aebfd32..dcde55c8ee5deadd421b108087605c5c822c3b4c 100644 (file)
@@ -1100,7 +1100,7 @@ qla_edif_app_getstats(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
 
                list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list) {
                        if (fcport->edif.enable) {
-                               if (pcnt > app_req.num_ports)
+                               if (pcnt >= app_req.num_ports)
                                        break;
 
                                app_reply->elem[pcnt].rekey_count =
index 3cf89867029044ede7b2f6a844dca974b8d50e32..58fdf679341dc64ee1768f1d09dc7ce4d549b4bd 100644 (file)
@@ -3920,7 +3920,7 @@ static int sd_probe(struct device *dev)
 
        error = device_add_disk(dev, gd, NULL);
        if (error) {
-               put_device(&sdkp->disk_dev);
+               device_unregister(&sdkp->disk_dev);
                put_disk(gd);
                goto out;
        }
index 386981c6976a53d668632457a47fcf1db609f5fd..baf870a03ecf6c6516f90e599188c659dc986bae 100644 (file)
@@ -285,6 +285,7 @@ sg_open(struct inode *inode, struct file *filp)
        int dev = iminor(inode);
        int flags = filp->f_flags;
        struct request_queue *q;
+       struct scsi_device *device;
        Sg_device *sdp;
        Sg_fd *sfp;
        int retval;
@@ -301,11 +302,12 @@ sg_open(struct inode *inode, struct file *filp)
 
        /* This driver's module count bumped by fops_get in <linux/fs.h> */
        /* Prevent the device driver from vanishing while we sleep */
-       retval = scsi_device_get(sdp->device);
+       device = sdp->device;
+       retval = scsi_device_get(device);
        if (retval)
                goto sg_put;
 
-       retval = scsi_autopm_get_device(sdp->device);
+       retval = scsi_autopm_get_device(device);
        if (retval)
                goto sdp_put;
 
@@ -313,7 +315,7 @@ sg_open(struct inode *inode, struct file *filp)
         * check if O_NONBLOCK. Permits SCSI commands to be issued
         * during error recovery. Tread carefully. */
        if (!((flags & O_NONBLOCK) ||
-             scsi_block_when_processing_errors(sdp->device))) {
+             scsi_block_when_processing_errors(device))) {
                retval = -ENXIO;
                /* we are in error recovery for this device */
                goto error_out;
@@ -344,7 +346,7 @@ sg_open(struct inode *inode, struct file *filp)
 
        if (sdp->open_cnt < 1) {  /* no existing opens */
                sdp->sgdebug = 0;
-               q = sdp->device->request_queue;
+               q = device->request_queue;
                sdp->sg_tablesize = queue_max_segments(q);
        }
        sfp = sg_add_sfp(sdp);
@@ -370,10 +372,11 @@ out_undo:
 error_mutex_locked:
        mutex_unlock(&sdp->open_rel_lock);
 error_out:
-       scsi_autopm_put_device(sdp->device);
+       scsi_autopm_put_device(device);
 sdp_put:
-       scsi_device_put(sdp->device);
-       goto sg_put;
+       kref_put(&sdp->d_ref, sg_device_destroy);
+       scsi_device_put(device);
+       return retval;
 }
 
 /* Release resources associated with a successful sg_open()
@@ -2233,7 +2236,6 @@ sg_remove_sfp_usercontext(struct work_struct *work)
                        "sg_remove_sfp: sfp=0x%p\n", sfp));
        kfree(sfp);
 
-       WARN_ON_ONCE(kref_read(&sdp->d_ref) != 1);
        kref_put(&sdp->d_ref, sg_device_destroy);
        scsi_device_put(device);
        module_put(THIS_MODULE);
index 079035db7dd8592aa62a21d5e88480028c5941bb..92a662d1b55cf2ed044fcbbfe96fd03ef0035736 100644 (file)
@@ -852,39 +852,39 @@ static int fsl_lpspi_probe(struct platform_device *pdev)
        fsl_lpspi->base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
        if (IS_ERR(fsl_lpspi->base)) {
                ret = PTR_ERR(fsl_lpspi->base);
-               goto out_controller_put;
+               return ret;
        }
        fsl_lpspi->base_phys = res->start;
 
        irq = platform_get_irq(pdev, 0);
        if (irq < 0) {
                ret = irq;
-               goto out_controller_put;
+               return ret;
        }
 
        ret = devm_request_irq(&pdev->dev, irq, fsl_lpspi_isr, 0,
                               dev_name(&pdev->dev), fsl_lpspi);
        if (ret) {
                dev_err(&pdev->dev, "can't get irq%d: %d\n", irq, ret);
-               goto out_controller_put;
+               return ret;
        }
 
        fsl_lpspi->clk_per = devm_clk_get(&pdev->dev, "per");
        if (IS_ERR(fsl_lpspi->clk_per)) {
                ret = PTR_ERR(fsl_lpspi->clk_per);
-               goto out_controller_put;
+               return ret;
        }
 
        fsl_lpspi->clk_ipg = devm_clk_get(&pdev->dev, "ipg");
        if (IS_ERR(fsl_lpspi->clk_ipg)) {
                ret = PTR_ERR(fsl_lpspi->clk_ipg);
-               goto out_controller_put;
+               return ret;
        }
 
        /* enable the clock */
        ret = fsl_lpspi_init_rpm(fsl_lpspi);
        if (ret)
-               goto out_controller_put;
+               return ret;
 
        ret = pm_runtime_get_sync(fsl_lpspi->dev);
        if (ret < 0) {
@@ -945,8 +945,6 @@ out_pm_get:
        pm_runtime_dont_use_autosuspend(fsl_lpspi->dev);
        pm_runtime_put_sync(fsl_lpspi->dev);
        pm_runtime_disable(fsl_lpspi->dev);
-out_controller_put:
-       spi_controller_put(controller);
 
        return ret;
 }
index 969965d7bc98b538c6a9a0aa710e2083c7a4925a..cc18d320370f97523fae77bb5b34fc199b3e62e5 100644 (file)
@@ -725,6 +725,8 @@ static int pci1xxxx_spi_probe(struct pci_dev *pdev, const struct pci_device_id *
                spi_bus->spi_int[iter] = devm_kzalloc(&pdev->dev,
                                                      sizeof(struct pci1xxxx_spi_internal),
                                                      GFP_KERNEL);
+               if (!spi_bus->spi_int[iter])
+                       return -ENOMEM;
                spi_sub_ptr = spi_bus->spi_int[iter];
                spi_sub_ptr->spi_host = devm_spi_alloc_host(dev, sizeof(struct spi_controller));
                if (!spi_sub_ptr->spi_host)
index 9fcbe040cb2f2e0b8f8fb916d1c867c1c1949332..f726d86704287e56b5cffba6ed40d8f7e1ca4956 100644 (file)
@@ -430,7 +430,7 @@ static bool s3c64xx_spi_can_dma(struct spi_controller *host,
        struct s3c64xx_spi_driver_data *sdd = spi_controller_get_devdata(host);
 
        if (sdd->rx_dma.ch && sdd->tx_dma.ch)
-               return xfer->len > sdd->fifo_depth;
+               return xfer->len >= sdd->fifo_depth;
 
        return false;
 }
@@ -826,10 +826,9 @@ static int s3c64xx_spi_transfer_one(struct spi_controller *host,
                        return status;
        }
 
-       if (!is_polling(sdd) && (xfer->len > fifo_len) &&
+       if (!is_polling(sdd) && xfer->len >= fifo_len &&
            sdd->rx_dma.ch && sdd->tx_dma.ch) {
                use_dma = 1;
-
        } else if (xfer->len >= fifo_len) {
                tx_buf = xfer->tx_buf;
                rx_buf = xfer->rx_buf;
index c1fbcdd1618264f0cd09f5e4078ac600ad6dc22a..c40217f44b1bc53d149e8d5ea12c0e5297373800 100644 (file)
@@ -3672,6 +3672,8 @@ static int __init target_core_init_configfs(void)
 {
        struct configfs_subsystem *subsys = &target_core_fabrics;
        struct t10_alua_lu_gp *lu_gp;
+       struct cred *kern_cred;
+       const struct cred *old_cred;
        int ret;
 
        pr_debug("TARGET_CORE[0]: Loading Generic Kernel Storage"
@@ -3748,11 +3750,21 @@ static int __init target_core_init_configfs(void)
        if (ret < 0)
                goto out;
 
+       /* We use the kernel credentials to access the target directory */
+       kern_cred = prepare_kernel_cred(&init_task);
+       if (!kern_cred) {
+               ret = -ENOMEM;
+               goto out;
+       }
+       old_cred = override_creds(kern_cred);
        target_init_dbroot();
+       revert_creds(old_cred);
+       put_cred(kern_cred);
 
        return 0;
 
 out:
+       target_xcopy_release_pt();
        configfs_unregister_subsystem(subsys);
        core_dev_release_virtual_lun0();
        rd_module_exit();
index 1b17dc4c219cc94aae8bf030526298aca3e29eaa..e25e48d76aa79c843e6873fa2ee8bc1a830bc7f5 100644 (file)
@@ -606,7 +606,7 @@ static int allocate_actors_buffer(struct power_allocator_params *params,
 
        /* There might be no cooling devices yet. */
        if (!num_actors) {
-               ret = -EINVAL;
+               ret = 0;
                goto clean_state;
        }
 
@@ -679,11 +679,6 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
                return -ENOMEM;
 
        get_governor_trips(tz, params);
-       if (!params->trip_max) {
-               dev_warn(&tz->device, "power_allocator: missing trip_max\n");
-               kfree(params);
-               return -EINVAL;
-       }
 
        ret = check_power_actors(tz, params);
        if (ret < 0) {
@@ -714,9 +709,10 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
        else
                params->sustainable_power = tz->tzp->sustainable_power;
 
-       estimate_pid_constants(tz, tz->tzp->sustainable_power,
-                              params->trip_switch_on,
-                              params->trip_max->temperature);
+       if (params->trip_max)
+               estimate_pid_constants(tz, tz->tzp->sustainable_power,
+                                      params->trip_switch_on,
+                                      params->trip_max->temperature);
 
        reset_pid_controller(params);
 
index e30fd125988d7a8ca521d6fb30e97c671f269732..a0f8e930167d70aab48b315076fe76f9924e34b4 100644 (file)
@@ -3217,7 +3217,9 @@ retry:
 
                /* MCQ mode */
                if (is_mcq_enabled(hba)) {
-                       err = ufshcd_clear_cmd(hba, lrbp->task_tag);
+                       /* successfully cleared the command, retry if needed */
+                       if (ufshcd_clear_cmd(hba, lrbp->task_tag) == 0)
+                               err = -EAGAIN;
                        hba->dev_cmd.complete = NULL;
                        return err;
                }
@@ -9791,7 +9793,10 @@ static int __ufshcd_wl_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op)
 
        /* UFS device & link must be active before we enter in this function */
        if (!ufshcd_is_ufs_dev_active(hba) || !ufshcd_is_link_active(hba)) {
-               ret = -EINVAL;
+               /* Wait for err handler to finish or trigger err recovery */
+               if (!ufshcd_eh_in_progress(hba))
+                       ufshcd_force_error_recovery(hba);
+               ret = -EBUSY;
                goto enable_scaling;
        }
 
index 9cdaa2faa5363333627e0cba54a4efe75b45b144..0f4f531c97800c648437fb2eb7409ccc2b198536 100644 (file)
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1202,8 +1202,8 @@ static void aio_complete(struct aio_kiocb *iocb)
                spin_lock_irqsave(&ctx->wait.lock, flags);
                list_for_each_entry_safe(curr, next, &ctx->wait.head, w.entry)
                        if (avail >= curr->min_nr) {
-                               list_del_init_careful(&curr->w.entry);
                                wake_up_process(curr->w.private);
+                               list_del_init_careful(&curr->w.entry);
                        }
                spin_unlock_irqrestore(&ctx->wait.lock, flags);
        }
index 3640f417cce118b06e43ae4c8b38bb275b0097fc..5c180fdc3efbdf09791c7941f3f1522cd6d6f9dc 100644 (file)
@@ -281,7 +281,6 @@ struct posix_acl *bch2_get_acl(struct mnt_idmap *idmap,
        struct xattr_search_key search = X_SEARCH(acl_to_xattr_type(type), "", 0);
        struct btree_trans *trans = bch2_trans_get(c);
        struct btree_iter iter = { NULL };
-       struct bkey_s_c_xattr xattr;
        struct posix_acl *acl = NULL;
        struct bkey_s_c k;
        int ret;
@@ -290,28 +289,27 @@ retry:
 
        ret = bch2_hash_lookup(trans, &iter, bch2_xattr_hash_desc,
                        &hash, inode_inum(inode), &search, 0);
-       if (ret) {
-               if (!bch2_err_matches(ret, ENOENT))
-                       acl = ERR_PTR(ret);
-               goto out;
-       }
+       if (ret)
+               goto err;
 
        k = bch2_btree_iter_peek_slot(&iter);
        ret = bkey_err(k);
-       if (ret) {
-               acl = ERR_PTR(ret);
-               goto out;
-       }
+       if (ret)
+               goto err;
 
-       xattr = bkey_s_c_to_xattr(k);
+       struct bkey_s_c_xattr xattr = bkey_s_c_to_xattr(k);
        acl = bch2_acl_from_disk(trans, xattr_val(xattr.v),
-                       le16_to_cpu(xattr.v->x_val_len));
+                                le16_to_cpu(xattr.v->x_val_len));
+       ret = PTR_ERR_OR_ZERO(acl);
+err:
+       if (bch2_err_matches(ret, BCH_ERR_transaction_restart))
+               goto retry;
 
-       if (!IS_ERR(acl))
+       if (ret)
+               acl = !bch2_err_matches(ret, ENOENT) ? ERR_PTR(ret) : NULL;
+
+       if (!IS_ERR_OR_NULL(acl))
                set_cached_acl(&inode->v, type, acl);
-out:
-       if (bch2_err_matches(PTR_ERR_OR_ZERO(acl), BCH_ERR_transaction_restart))
-               goto retry;
 
        bch2_trans_iter_exit(trans, &iter);
        bch2_trans_put(trans);
index 63102992d9556d1b33b445a3116df61964a6ca01..364ae42022af1750f9887d3f16c566a676b30776 100644 (file)
@@ -1535,6 +1535,20 @@ enum btree_id {
        BTREE_ID_NR
 };
 
+static inline bool btree_id_is_alloc(enum btree_id id)
+{
+       switch (id) {
+       case BTREE_ID_alloc:
+       case BTREE_ID_backpointers:
+       case BTREE_ID_need_discard:
+       case BTREE_ID_freespace:
+       case BTREE_ID_bucket_gens:
+               return true;
+       default:
+               return false;
+       }
+}
+
 #define BTREE_MAX_DEPTH                4U
 
 /* Btree nodes */
index 6280da1244b55032beaf60c4e2b29df0ff2c3152..d2555da55c6da3750af9fab2538e3653118a48d9 100644 (file)
@@ -368,11 +368,16 @@ again:
                                buf.buf)) {
                        bch2_btree_node_evict(trans, cur_k.k);
                        cur = NULL;
-                       ret =   bch2_run_explicit_recovery_pass(c, BCH_RECOVERY_PASS_scan_for_btree_nodes) ?:
-                               bch2_journal_key_delete(c, b->c.btree_id,
-                                                       b->c.level, cur_k.k->k.p);
+                       ret = bch2_journal_key_delete(c, b->c.btree_id,
+                                                     b->c.level, cur_k.k->k.p);
                        if (ret)
                                break;
+
+                       if (!btree_id_is_alloc(b->c.btree_id)) {
+                               ret = bch2_run_explicit_recovery_pass(c, BCH_RECOVERY_PASS_scan_for_btree_nodes);
+                               if (ret)
+                                       break;
+                       }
                        continue;
                }
 
@@ -544,12 +549,12 @@ reconstruct_root:
                                bch2_btree_root_alloc_fake(c, i, 0);
                        } else {
                                bch2_btree_root_alloc_fake(c, i, 1);
+                               bch2_shoot_down_journal_keys(c, i, 1, BTREE_MAX_DEPTH, POS_MIN, SPOS_MAX);
                                ret = bch2_get_scanned_nodes(c, i, 0, POS_MIN, SPOS_MAX);
                                if (ret)
                                        break;
                        }
 
-                       bch2_shoot_down_journal_keys(c, i, 1, BTREE_MAX_DEPTH, POS_MIN, SPOS_MAX);
                        reconstructed_root = true;
                }
 
index 24772538e4cc74ada59851bd7847dd5ece5ea122..1d58d447b386cdf74ecce2609f8aa23851ae4057 100644 (file)
@@ -642,7 +642,7 @@ int __bch2_btree_trans_too_many_iters(struct btree_trans *);
 
 static inline int btree_trans_too_many_iters(struct btree_trans *trans)
 {
-       if (bitmap_weight(trans->paths_allocated, trans->nr_paths) > BTREE_ITER_INITIAL - 8)
+       if (bitmap_weight(trans->paths_allocated, trans->nr_paths) > BTREE_ITER_NORMAL_LIMIT - 8)
                return __bch2_btree_trans_too_many_iters(trans);
 
        return 0;
index 5cbcbfe85235b8de3777ae82b120d4627f99c8d7..1e8cf49a69353198774a0e5b798c2f1f135041fa 100644 (file)
@@ -130,12 +130,30 @@ struct bkey_i *bch2_journal_keys_peek_slot(struct bch_fs *c, enum btree_id btree
        return bch2_journal_keys_peek_upto(c, btree_id, level, pos, pos, &idx);
 }
 
+static void journal_iter_verify(struct journal_iter *iter)
+{
+       struct journal_keys *keys = iter->keys;
+       size_t gap_size = keys->size - keys->nr;
+
+       BUG_ON(iter->idx >= keys->gap &&
+              iter->idx <  keys->gap + gap_size);
+
+       if (iter->idx < keys->size) {
+               struct journal_key *k = keys->data + iter->idx;
+
+               int cmp = cmp_int(k->btree_id,  iter->btree_id) ?:
+                         cmp_int(k->level,     iter->level);
+               BUG_ON(cmp < 0);
+       }
+}
+
 static void journal_iters_fix(struct bch_fs *c)
 {
        struct journal_keys *keys = &c->journal_keys;
        /* The key we just inserted is immediately before the gap: */
        size_t gap_end = keys->gap + (keys->size - keys->nr);
-       struct btree_and_journal_iter *iter;
+       struct journal_key *new_key = &keys->data[keys->gap - 1];
+       struct journal_iter *iter;
 
        /*
         * If an iterator points one after the key we just inserted, decrement
@@ -143,9 +161,14 @@ static void journal_iters_fix(struct bch_fs *c)
         * decrement was unnecessary, bch2_btree_and_journal_iter_peek() will
         * handle that:
         */
-       list_for_each_entry(iter, &c->journal_iters, journal.list)
-               if (iter->journal.idx == gap_end)
-                       iter->journal.idx = keys->gap - 1;
+       list_for_each_entry(iter, &c->journal_iters, list) {
+               journal_iter_verify(iter);
+               if (iter->idx           == gap_end &&
+                   new_key->btree_id   == iter->btree_id &&
+                   new_key->level      == iter->level)
+                       iter->idx = keys->gap - 1;
+               journal_iter_verify(iter);
+       }
 }
 
 static void journal_iters_move_gap(struct bch_fs *c, size_t old_gap, size_t new_gap)
@@ -192,7 +215,12 @@ int bch2_journal_key_insert_take(struct bch_fs *c, enum btree_id id,
        if (idx > keys->gap)
                idx -= keys->size - keys->nr;
 
+       size_t old_gap = keys->gap;
+
        if (keys->nr == keys->size) {
+               journal_iters_move_gap(c, old_gap, keys->size);
+               old_gap = keys->size;
+
                struct journal_keys new_keys = {
                        .nr                     = keys->nr,
                        .size                   = max_t(size_t, keys->size, 8) * 2,
@@ -216,7 +244,7 @@ int bch2_journal_key_insert_take(struct bch_fs *c, enum btree_id id,
                keys->gap       = keys->nr;
        }
 
-       journal_iters_move_gap(c, keys->gap, idx);
+       journal_iters_move_gap(c, old_gap, idx);
 
        move_gap(keys, idx);
 
@@ -301,16 +329,21 @@ static void bch2_journal_iter_advance(struct journal_iter *iter)
 
 static struct bkey_s_c bch2_journal_iter_peek(struct journal_iter *iter)
 {
-       struct journal_key *k = iter->keys->data + iter->idx;
+       journal_iter_verify(iter);
+
+       while (iter->idx < iter->keys->size) {
+               struct journal_key *k = iter->keys->data + iter->idx;
+
+               int cmp = cmp_int(k->btree_id,  iter->btree_id) ?:
+                         cmp_int(k->level,     iter->level);
+               if (cmp > 0)
+                       break;
+               BUG_ON(cmp);
 
-       while (k < iter->keys->data + iter->keys->size &&
-              k->btree_id      == iter->btree_id &&
-              k->level         == iter->level) {
                if (!k->overwritten)
                        return bkey_i_to_s_c(k->k);
 
                bch2_journal_iter_advance(iter);
-               k = iter->keys->data + iter->idx;
        }
 
        return bkey_s_c_null;
@@ -330,6 +363,8 @@ static void bch2_journal_iter_init(struct bch_fs *c,
        iter->level     = level;
        iter->keys      = &c->journal_keys;
        iter->idx       = bch2_journal_key_search(&c->journal_keys, id, level, pos);
+
+       journal_iter_verify(iter);
 }
 
 static struct bkey_s_c bch2_journal_iter_peek_btree(struct btree_and_journal_iter *iter)
@@ -434,10 +469,15 @@ void __bch2_btree_and_journal_iter_init_node_iter(struct btree_trans *trans,
        iter->trans = trans;
        iter->b = b;
        iter->node_iter = node_iter;
-       bch2_journal_iter_init(trans->c, &iter->journal, b->c.btree_id, b->c.level, pos);
-       INIT_LIST_HEAD(&iter->journal.list);
        iter->pos = b->data->min_key;
        iter->at_end = false;
+       INIT_LIST_HEAD(&iter->journal.list);
+
+       if (trans->journal_replay_not_finished) {
+               bch2_journal_iter_init(trans->c, &iter->journal, b->c.btree_id, b->c.level, pos);
+               if (!test_bit(BCH_FS_may_go_rw, &trans->c->flags))
+                       list_add(&iter->journal.list, &trans->c->journal_iters);
+       }
 }
 
 /*
@@ -452,9 +492,6 @@ void bch2_btree_and_journal_iter_init_node_iter(struct btree_trans *trans,
 
        bch2_btree_node_iter_init_from_start(&node_iter, b);
        __bch2_btree_and_journal_iter_init_node_iter(trans, iter, b, node_iter, b->data->min_key);
-       if (trans->journal_replay_not_finished &&
-           !test_bit(BCH_FS_may_go_rw, &trans->c->flags))
-               list_add(&iter->journal.list, &trans->c->journal_iters);
 }
 
 /* sort and dedup all keys in the journal: */
index 581edcb0911bfa39e9ec6242686bd213c47f352c..88a3582a32757e34a28eb37143f9ff78a88a4085 100644 (file)
@@ -169,6 +169,7 @@ static void bkey_cached_move_to_freelist(struct btree_key_cache *bc,
        } else {
                mutex_lock(&bc->lock);
                list_move_tail(&ck->list, &bc->freed_pcpu);
+               bc->nr_freed_pcpu++;
                mutex_unlock(&bc->lock);
        }
 }
@@ -245,6 +246,7 @@ bkey_cached_alloc(struct btree_trans *trans, struct btree_path *path,
                if (!list_empty(&bc->freed_pcpu)) {
                        ck = list_last_entry(&bc->freed_pcpu, struct bkey_cached, list);
                        list_del_init(&ck->list);
+                       bc->nr_freed_pcpu--;
                }
                mutex_unlock(&bc->lock);
        }
@@ -659,7 +661,7 @@ static int btree_key_cache_flush_pos(struct btree_trans *trans,
                commit_flags |= BCH_WATERMARK_reclaim;
 
        if (ck->journal.seq != journal_last_seq(j) ||
-           j->watermark == BCH_WATERMARK_stripe)
+           !test_bit(JOURNAL_SPACE_LOW, &c->journal.flags))
                commit_flags |= BCH_TRANS_COMMIT_no_journal_res;
 
        ret   = bch2_btree_iter_traverse(&b_iter) ?:
index b9b151e693ed60ecc3dc9147cc34902643cfc7aa..f2caf491957efc2345c082323516e58fe2a35302 100644 (file)
@@ -440,33 +440,7 @@ void bch2_btree_node_lock_write_nofail(struct btree_trans *trans,
                                       struct btree_path *path,
                                       struct btree_bkey_cached_common *b)
 {
-       struct btree_path *linked;
-       unsigned i, iter;
-       int ret;
-
-       /*
-        * XXX BIG FAT NOTICE
-        *
-        * Drop all read locks before taking a write lock:
-        *
-        * This is a hack, because bch2_btree_node_lock_write_nofail() is a
-        * hack - but by dropping read locks first, this should never fail, and
-        * we only use this in code paths where whatever read locks we've
-        * already taken are no longer needed:
-        */
-
-       trans_for_each_path(trans, linked, iter) {
-               if (!linked->nodes_locked)
-                       continue;
-
-               for (i = 0; i < BTREE_MAX_DEPTH; i++)
-                       if (btree_node_read_locked(linked, i)) {
-                               btree_node_unlock(trans, linked, i);
-                               btree_path_set_dirty(linked, BTREE_ITER_NEED_RELOCK);
-                       }
-       }
-
-       ret = __btree_node_lock_write(trans, path, b, true);
+       int ret = __btree_node_lock_write(trans, path, b, true);
        BUG_ON(ret);
 }
 
index 3f33be7e5e5c26d9be6d0f1431ee04c3e545b825..556f76f5c84e1613c332e7443e6bb8b1602dd359 100644 (file)
@@ -133,6 +133,9 @@ static void try_read_btree_node(struct find_btree_nodes *f, struct bch_dev *ca,
        if (le64_to_cpu(bn->magic) != bset_magic(c))
                return;
 
+       if (btree_id_is_alloc(BTREE_NODE_ID(bn)))
+               return;
+
        rcu_read_lock();
        struct found_btree_node n = {
                .btree_id       = BTREE_NODE_ID(bn),
@@ -213,6 +216,9 @@ static int read_btree_nodes(struct find_btree_nodes *f)
        closure_init_stack(&cl);
 
        for_each_online_member(c, ca) {
+               if (!(ca->mi.data_allowed & BIT(BCH_DATA_btree)))
+                       continue;
+
                struct find_btree_nodes_worker *w = kmalloc(sizeof(*w), GFP_KERNEL);
                struct task_struct *t;
 
@@ -290,7 +296,7 @@ again:
                        found_btree_node_to_text(&buf, c, n);
                        bch_err(c, "%s", buf.buf);
                        printbuf_exit(&buf);
-                       return -1;
+                       return -BCH_ERR_fsck_repair_unimplemented;
                }
        }
 
@@ -436,6 +442,9 @@ bool bch2_btree_has_scanned_nodes(struct bch_fs *c, enum btree_id btree)
 int bch2_get_scanned_nodes(struct bch_fs *c, enum btree_id btree,
                           unsigned level, struct bpos node_min, struct bpos node_max)
 {
+       if (btree_id_is_alloc(btree))
+               return 0;
+
        struct find_btree_nodes *f = &c->found_btree_nodes;
 
        int ret = bch2_run_explicit_recovery_pass(c, BCH_RECOVERY_PASS_scan_for_btree_nodes);
index 9404d96c38f3b368726a6603b601b241b5106100..e0c982a4195c764ab8a415b5f7f80cbff88c1935 100644 (file)
@@ -364,7 +364,21 @@ struct btree_insert_entry {
        unsigned long           ip_allocated;
 };
 
+/* Number of btree paths we preallocate, usually enough */
 #define BTREE_ITER_INITIAL             64
+/*
+ * Limit for btree_trans_too_many_iters(); this is enough that almost all code
+ * paths should run inside this limit, and if they don't it usually indicates a
+ * bug (leaking/duplicated btree paths).
+ *
+ * exception: some fsck paths
+ *
+ * bugs with excessive path usage seem to have possibly been eliminated now, so
+ * we might consider eliminating this (and btree_trans_too_many_iters()) at some
+ * point.
+ */
+#define BTREE_ITER_NORMAL_LIMIT                256
+/* never exceed limit */
 #define BTREE_ITER_MAX                 (1U << 10)
 
 struct btree_trans_commit_hook;
index 32397b99752fd2ec3cfd553724c97c7f217ca56e..c4a5e83a56a436548263445e3b7a329644757cc7 100644 (file)
@@ -26,9 +26,9 @@
 
 #include <linux/random.h>
 
-const char * const bch2_btree_update_modes[] = {
+static const char * const bch2_btree_update_modes[] = {
 #define x(t) #t,
-       BCH_WATERMARKS()
+       BTREE_UPDATE_MODES()
 #undef x
        NULL
 };
@@ -704,9 +704,13 @@ static void btree_update_nodes_written(struct btree_update *as)
        bch2_fs_fatal_err_on(ret && !bch2_journal_error(&c->journal), c,
                             "%s", bch2_err_str(ret));
 err:
-       if (as->b) {
-
-               b = as->b;
+       /*
+        * We have to be careful because another thread might be getting ready
+        * to free as->b and calling btree_update_reparent() on us - we'll
+        * recheck under btree_update_lock below:
+        */
+       b = READ_ONCE(as->b);
+       if (b) {
                btree_path_idx_t path_idx = get_unlocked_mut_path(trans,
                                                as->btree_id, b->c.level, b->key.k.p);
                struct btree_path *path = trans->paths + path_idx;
@@ -850,15 +854,17 @@ static void btree_update_updated_node(struct btree_update *as, struct btree *b)
 {
        struct bch_fs *c = as->c;
 
-       mutex_lock(&c->btree_interior_update_lock);
-       list_add_tail(&as->unwritten_list, &c->btree_interior_updates_unwritten);
-
        BUG_ON(as->mode != BTREE_UPDATE_none);
+       BUG_ON(as->update_level_end < b->c.level);
        BUG_ON(!btree_node_dirty(b));
        BUG_ON(!b->c.level);
 
+       mutex_lock(&c->btree_interior_update_lock);
+       list_add_tail(&as->unwritten_list, &c->btree_interior_updates_unwritten);
+
        as->mode        = BTREE_UPDATE_node;
        as->b           = b;
+       as->update_level_end = b->c.level;
 
        set_btree_node_write_blocked(b);
        list_add(&as->write_blocked_list, &b->write_blocked);
@@ -1100,7 +1106,7 @@ static void bch2_btree_update_done(struct btree_update *as, struct btree_trans *
 
 static struct btree_update *
 bch2_btree_update_start(struct btree_trans *trans, struct btree_path *path,
-                       unsigned level, bool split, unsigned flags)
+                       unsigned level_start, bool split, unsigned flags)
 {
        struct bch_fs *c = trans->c;
        struct btree_update *as;
@@ -1108,7 +1114,7 @@ bch2_btree_update_start(struct btree_trans *trans, struct btree_path *path,
        int disk_res_flags = (flags & BCH_TRANS_COMMIT_no_enospc)
                ? BCH_DISK_RESERVATION_NOFAIL : 0;
        unsigned nr_nodes[2] = { 0, 0 };
-       unsigned update_level = level;
+       unsigned level_end = level_start;
        enum bch_watermark watermark = flags & BCH_WATERMARK_MASK;
        int ret = 0;
        u32 restart_count = trans->restart_count;
@@ -1123,34 +1129,30 @@ bch2_btree_update_start(struct btree_trans *trans, struct btree_path *path,
        flags &= ~BCH_WATERMARK_MASK;
        flags |= watermark;
 
-       if (watermark < c->journal.watermark) {
-               struct journal_res res = { 0 };
-               unsigned journal_flags = watermark|JOURNAL_RES_GET_CHECK;
+       if (watermark < BCH_WATERMARK_reclaim &&
+           test_bit(JOURNAL_SPACE_LOW, &c->journal.flags)) {
+               if (flags & BCH_TRANS_COMMIT_journal_reclaim)
+                       return ERR_PTR(-BCH_ERR_journal_reclaim_would_deadlock);
 
-               if ((flags & BCH_TRANS_COMMIT_journal_reclaim) &&
-                   watermark < BCH_WATERMARK_reclaim)
-                       journal_flags |= JOURNAL_RES_GET_NONBLOCK;
-
-               ret = drop_locks_do(trans,
-                       bch2_journal_res_get(&c->journal, &res, 1, journal_flags));
-               if (bch2_err_matches(ret, BCH_ERR_operation_blocked))
-                       ret = -BCH_ERR_journal_reclaim_would_deadlock;
+               bch2_trans_unlock(trans);
+               wait_event(c->journal.wait, !test_bit(JOURNAL_SPACE_LOW, &c->journal.flags));
+               ret = bch2_trans_relock(trans);
                if (ret)
                        return ERR_PTR(ret);
        }
 
        while (1) {
-               nr_nodes[!!update_level] += 1 + split;
-               update_level++;
+               nr_nodes[!!level_end] += 1 + split;
+               level_end++;
 
-               ret = bch2_btree_path_upgrade(trans, path, update_level + 1);
+               ret = bch2_btree_path_upgrade(trans, path, level_end + 1);
                if (ret)
                        return ERR_PTR(ret);
 
-               if (!btree_path_node(path, update_level)) {
+               if (!btree_path_node(path, level_end)) {
                        /* Allocating new root? */
                        nr_nodes[1] += split;
-                       update_level = BTREE_MAX_DEPTH;
+                       level_end = BTREE_MAX_DEPTH;
                        break;
                }
 
@@ -1158,11 +1160,11 @@ bch2_btree_update_start(struct btree_trans *trans, struct btree_path *path,
                 * Always check for space for two keys, even if we won't have to
                 * split at prior level - it might have been a merge instead:
                 */
-               if (bch2_btree_node_insert_fits(path->l[update_level].b,
+               if (bch2_btree_node_insert_fits(path->l[level_end].b,
                                                BKEY_BTREE_PTR_U64s_MAX * 2))
                        break;
 
-               split = path->l[update_level].b->nr.live_u64s > BTREE_SPLIT_THRESHOLD(c);
+               split = path->l[level_end].b->nr.live_u64s > BTREE_SPLIT_THRESHOLD(c);
        }
 
        if (!down_read_trylock(&c->gc_lock)) {
@@ -1176,14 +1178,15 @@ bch2_btree_update_start(struct btree_trans *trans, struct btree_path *path,
        as = mempool_alloc(&c->btree_interior_update_pool, GFP_NOFS);
        memset(as, 0, sizeof(*as));
        closure_init(&as->cl, NULL);
-       as->c           = c;
-       as->start_time  = start_time;
-       as->ip_started  = _RET_IP_;
-       as->mode        = BTREE_UPDATE_none;
-       as->watermark   = watermark;
-       as->took_gc_lock = true;
-       as->btree_id    = path->btree_id;
-       as->update_level = update_level;
+       as->c                   = c;
+       as->start_time          = start_time;
+       as->ip_started          = _RET_IP_;
+       as->mode                = BTREE_UPDATE_none;
+       as->watermark           = watermark;
+       as->took_gc_lock        = true;
+       as->btree_id            = path->btree_id;
+       as->update_level_start  = level_start;
+       as->update_level_end    = level_end;
        INIT_LIST_HEAD(&as->list);
        INIT_LIST_HEAD(&as->unwritten_list);
        INIT_LIST_HEAD(&as->write_blocked_list);
@@ -1373,12 +1376,12 @@ static void bch2_insert_fixup_btree_ptr(struct btree_update *as,
 }
 
 static void
-__bch2_btree_insert_keys_interior(struct btree_update *as,
-                                 struct btree_trans *trans,
-                                 struct btree_path *path,
-                                 struct btree *b,
-                                 struct btree_node_iter node_iter,
-                                 struct keylist *keys)
+bch2_btree_insert_keys_interior(struct btree_update *as,
+                               struct btree_trans *trans,
+                               struct btree_path *path,
+                               struct btree *b,
+                               struct btree_node_iter node_iter,
+                               struct keylist *keys)
 {
        struct bkey_i *insert = bch2_keylist_front(keys);
        struct bkey_packed *k;
@@ -1534,7 +1537,7 @@ static void btree_split_insert_keys(struct btree_update *as,
 
                bch2_btree_node_iter_init(&node_iter, b, &bch2_keylist_front(keys)->k.p);
 
-               __bch2_btree_insert_keys_interior(as, trans, path, b, node_iter, keys);
+               bch2_btree_insert_keys_interior(as, trans, path, b, node_iter, keys);
 
                BUG_ON(bch2_btree_node_check_topology(trans, b));
        }
@@ -1714,27 +1717,6 @@ err:
        goto out;
 }
 
-static void
-bch2_btree_insert_keys_interior(struct btree_update *as,
-                               struct btree_trans *trans,
-                               struct btree_path *path,
-                               struct btree *b,
-                               struct keylist *keys)
-{
-       struct btree_path *linked;
-       unsigned i;
-
-       __bch2_btree_insert_keys_interior(as, trans, path, b,
-                                         path->l[b->c.level].iter, keys);
-
-       btree_update_updated_node(as, b);
-
-       trans_for_each_path_with_node(trans, b, linked, i)
-               bch2_btree_node_iter_peek(&linked->l[b->c.level].iter, b);
-
-       bch2_trans_verify_paths(trans);
-}
-
 /**
  * bch2_btree_insert_node - insert bkeys into a given btree node
  *
@@ -1755,7 +1737,8 @@ static int bch2_btree_insert_node(struct btree_update *as, struct btree_trans *t
                                  struct keylist *keys)
 {
        struct bch_fs *c = as->c;
-       struct btree_path *path = trans->paths + path_idx;
+       struct btree_path *path = trans->paths + path_idx, *linked;
+       unsigned i;
        int old_u64s = le16_to_cpu(btree_bset_last(b)->u64s);
        int old_live_u64s = b->nr.live_u64s;
        int live_u64s_added, u64s_added;
@@ -1784,7 +1767,13 @@ static int bch2_btree_insert_node(struct btree_update *as, struct btree_trans *t
                return ret;
        }
 
-       bch2_btree_insert_keys_interior(as, trans, path, b, keys);
+       bch2_btree_insert_keys_interior(as, trans, path, b,
+                                       path->l[b->c.level].iter, keys);
+
+       trans_for_each_path_with_node(trans, b, linked, i)
+               bch2_btree_node_iter_peek(&linked->l[b->c.level].iter, b);
+
+       bch2_trans_verify_paths(trans);
 
        live_u64s_added = (int) b->nr.live_u64s - old_live_u64s;
        u64s_added = (int) le16_to_cpu(btree_bset_last(b)->u64s) - old_u64s;
@@ -1798,6 +1787,7 @@ static int bch2_btree_insert_node(struct btree_update *as, struct btree_trans *t
            bch2_maybe_compact_whiteouts(c, b))
                bch2_trans_node_reinit_iter(trans, b);
 
+       btree_update_updated_node(as, b);
        bch2_btree_node_unlock_write(trans, path, b);
 
        BUG_ON(bch2_btree_node_check_topology(trans, b));
@@ -1807,7 +1797,7 @@ split:
         * We could attempt to avoid the transaction restart, by calling
         * bch2_btree_path_upgrade() and allocating more nodes:
         */
-       if (b->c.level >= as->update_level) {
+       if (b->c.level >= as->update_level_end) {
                trace_and_count(c, trans_restart_split_race, trans, _THIS_IP_, b);
                return btree_trans_restart(trans, BCH_ERR_transaction_restart_split_race);
        }
@@ -2519,9 +2509,11 @@ void bch2_btree_root_alloc_fake(struct bch_fs *c, enum btree_id id, unsigned lev
 
 static void bch2_btree_update_to_text(struct printbuf *out, struct btree_update *as)
 {
-       prt_printf(out, "%ps: btree=%s watermark=%s mode=%s nodes_written=%u cl.remaining=%u journal_seq=%llu\n",
+       prt_printf(out, "%ps: btree=%s l=%u-%u watermark=%s mode=%s nodes_written=%u cl.remaining=%u journal_seq=%llu\n",
                   (void *) as->ip_started,
                   bch2_btree_id_str(as->btree_id),
+                  as->update_level_start,
+                  as->update_level_end,
                   bch2_watermarks[as->watermark],
                   bch2_btree_update_modes[as->mode],
                   as->nodes_written,
index 88dcf5a22a3bd628aaa22065f3cdd70ca3770d90..c1a479ebaad12120813f95a4af50b32cd542023d 100644 (file)
@@ -57,7 +57,8 @@ struct btree_update {
        unsigned                        took_gc_lock:1;
 
        enum btree_id                   btree_id;
-       unsigned                        update_level;
+       unsigned                        update_level_start;
+       unsigned                        update_level_end;
 
        struct disk_reservation         disk_res;
 
index cbfa6459bdbceec6a953f91a385fc5e4fe76691d..72781aad6ba70ccc774b688c6a9d50b2dc21f133 100644 (file)
@@ -134,42 +134,38 @@ static long bch2_ioctl_incremental(struct bch_ioctl_incremental __user *user_arg
 struct fsck_thread {
        struct thread_with_stdio thr;
        struct bch_fs           *c;
-       char                    **devs;
-       size_t                  nr_devs;
        struct bch_opts         opts;
 };
 
 static void bch2_fsck_thread_exit(struct thread_with_stdio *_thr)
 {
        struct fsck_thread *thr = container_of(_thr, struct fsck_thread, thr);
-       if (thr->devs)
-               for (size_t i = 0; i < thr->nr_devs; i++)
-                       kfree(thr->devs[i]);
-       kfree(thr->devs);
        kfree(thr);
 }
 
 static int bch2_fsck_offline_thread_fn(struct thread_with_stdio *stdio)
 {
        struct fsck_thread *thr = container_of(stdio, struct fsck_thread, thr);
-       struct bch_fs *c = bch2_fs_open(thr->devs, thr->nr_devs, thr->opts);
-
-       if (IS_ERR(c))
-               return PTR_ERR(c);
+       struct bch_fs *c = thr->c;
 
-       int ret = 0;
-       if (test_bit(BCH_FS_errors_fixed, &c->flags))
-               ret |= 1;
-       if (test_bit(BCH_FS_error, &c->flags))
-               ret |= 4;
+       int ret = PTR_ERR_OR_ZERO(c);
+       if (ret)
+               return ret;
 
-       bch2_fs_stop(c);
+       ret = bch2_fs_start(thr->c);
+       if (ret)
+               goto err;
 
-       if (ret & 1)
+       if (test_bit(BCH_FS_errors_fixed, &c->flags)) {
                bch2_stdio_redirect_printf(&stdio->stdio, false, "%s: errors fixed\n", c->name);
-       if (ret & 4)
+               ret |= 1;
+       }
+       if (test_bit(BCH_FS_error, &c->flags)) {
                bch2_stdio_redirect_printf(&stdio->stdio, false, "%s: still has errors\n", c->name);
-
+               ret |= 4;
+       }
+err:
+       bch2_fs_stop(c);
        return ret;
 }
 
@@ -182,7 +178,7 @@ static long bch2_ioctl_fsck_offline(struct bch_ioctl_fsck_offline __user *user_a
 {
        struct bch_ioctl_fsck_offline arg;
        struct fsck_thread *thr = NULL;
-       u64 *devs = NULL;
+       darray_str(devs) = {};
        long ret = 0;
 
        if (copy_from_user(&arg, user_arg, sizeof(arg)))
@@ -194,29 +190,32 @@ static long bch2_ioctl_fsck_offline(struct bch_ioctl_fsck_offline __user *user_a
        if (!capable(CAP_SYS_ADMIN))
                return -EPERM;
 
-       if (!(devs = kcalloc(arg.nr_devs, sizeof(*devs), GFP_KERNEL)) ||
-           !(thr = kzalloc(sizeof(*thr), GFP_KERNEL)) ||
-           !(thr->devs = kcalloc(arg.nr_devs, sizeof(*thr->devs), GFP_KERNEL))) {
-               ret = -ENOMEM;
-               goto err;
-       }
+       for (size_t i = 0; i < arg.nr_devs; i++) {
+               u64 dev_u64;
+               ret = copy_from_user_errcode(&dev_u64, &user_arg->devs[i], sizeof(u64));
+               if (ret)
+                       goto err;
 
-       thr->opts = bch2_opts_empty();
-       thr->nr_devs = arg.nr_devs;
+               char *dev_str = strndup_user((char __user *)(unsigned long) dev_u64, PATH_MAX);
+               ret = PTR_ERR_OR_ZERO(dev_str);
+               if (ret)
+                       goto err;
 
-       if (copy_from_user(devs, &user_arg->devs[0],
-                          array_size(sizeof(user_arg->devs[0]), arg.nr_devs))) {
-               ret = -EINVAL;
-               goto err;
+               ret = darray_push(&devs, dev_str);
+               if (ret) {
+                       kfree(dev_str);
+                       goto err;
+               }
        }
 
-       for (size_t i = 0; i < arg.nr_devs; i++) {
-               thr->devs[i] = strndup_user((char __user *)(unsigned long) devs[i], PATH_MAX);
-               ret = PTR_ERR_OR_ZERO(thr->devs[i]);
-               if (ret)
-                       goto err;
+       thr = kzalloc(sizeof(*thr), GFP_KERNEL);
+       if (!thr) {
+               ret = -ENOMEM;
+               goto err;
        }
 
+       thr->opts = bch2_opts_empty();
+
        if (arg.opts) {
                char *optstr = strndup_user((char __user *)(unsigned long) arg.opts, 1 << 16);
 
@@ -230,15 +229,26 @@ static long bch2_ioctl_fsck_offline(struct bch_ioctl_fsck_offline __user *user_a
 
        opt_set(thr->opts, stdio, (u64)(unsigned long)&thr->thr.stdio);
 
+       /* We need request_key() to be called before we punt to kthread: */
+       opt_set(thr->opts, nostart, true);
+
+       thr->c = bch2_fs_open(devs.data, arg.nr_devs, thr->opts);
+
+       if (!IS_ERR(thr->c) &&
+           thr->c->opts.errors == BCH_ON_ERROR_panic)
+               thr->c->opts.errors = BCH_ON_ERROR_ro;
+
        ret = bch2_run_thread_with_stdio(&thr->thr, &bch2_offline_fsck_ops);
-err:
-       if (ret < 0) {
-               if (thr)
-                       bch2_fsck_thread_exit(&thr->thr);
-               pr_err("ret %s", bch2_err_str(ret));
-       }
-       kfree(devs);
+out:
+       darray_for_each(devs, i)
+               kfree(*i);
+       darray_exit(&devs);
        return ret;
+err:
+       if (thr)
+               bch2_fsck_thread_exit(&thr->thr);
+       pr_err("ret %s", bch2_err_str(ret));
+       goto out;
 }
 
 static long bch2_global_ioctl(unsigned cmd, void __user *arg)
index 34731ee0217f62f6e43fb691e76083c46026b127..0022b51ce3c09cc9eafaab2f0639c944078d8c54 100644 (file)
@@ -598,6 +598,8 @@ int bch2_data_update_init(struct btree_trans *trans,
                i++;
        }
 
+       unsigned durability_required = max(0, (int) (io_opts.data_replicas - durability_have));
+
        /*
         * If current extent durability is less than io_opts.data_replicas,
         * we're not trying to rereplicate the extent up to data_replicas here -
@@ -607,7 +609,7 @@ int bch2_data_update_init(struct btree_trans *trans,
         * rereplicate, currently, so that users don't get an unexpected -ENOSPC
         */
        if (!(m->data_opts.write_flags & BCH_WRITE_CACHED) &&
-           durability_have >= io_opts.data_replicas) {
+           !durability_required) {
                m->data_opts.kill_ptrs |= m->data_opts.rewrite_ptrs;
                m->data_opts.rewrite_ptrs = 0;
                /* if iter == NULL, it's just a promote */
@@ -616,11 +618,18 @@ int bch2_data_update_init(struct btree_trans *trans,
                goto done;
        }
 
-       m->op.nr_replicas = min(durability_removing, io_opts.data_replicas - durability_have) +
+       m->op.nr_replicas = min(durability_removing, durability_required) +
                m->data_opts.extra_replicas;
-       m->op.nr_replicas_required = m->op.nr_replicas;
 
-       BUG_ON(!m->op.nr_replicas);
+       /*
+        * If device(s) were set to durability=0 after data was written to them
+        * we can end up with a durability=0 extent, and the normal algorithm
+        * that tries not to increase durability doesn't work:
+        */
+       if (!(durability_have + durability_removing))
+               m->op.nr_replicas = max((unsigned) m->op.nr_replicas, 1);
+
+       m->op.nr_replicas_required = m->op.nr_replicas;
 
        if (reserve_sectors) {
                ret = bch2_disk_reservation_add(c, &m->op.res, reserve_sectors,
index 208ce6f0fc4317d561582bae51785da2c016a1cd..cd99b739941447f4c54037c8dc87bffd5f5e0d25 100644 (file)
@@ -13,6 +13,7 @@
 #include "btree_iter.h"
 #include "btree_locking.h"
 #include "btree_update.h"
+#include "btree_update_interior.h"
 #include "buckets.h"
 #include "debug.h"
 #include "error.h"
@@ -668,7 +669,7 @@ static ssize_t bch2_journal_pins_read(struct file *file, char __user *buf,
        i->size = size;
        i->ret  = 0;
 
-       do {
+       while (1) {
                err = flush_buf(i);
                if (err)
                        return err;
@@ -676,9 +677,12 @@ static ssize_t bch2_journal_pins_read(struct file *file, char __user *buf,
                if (!i->size)
                        break;
 
+               if (done)
+                       break;
+
                done = bch2_journal_seq_pins_to_text(&i->buf, &c->journal, &i->iter);
                i->iter++;
-       } while (!done);
+       }
 
        if (i->buf.allocation_failure)
                return -ENOMEM;
@@ -693,13 +697,45 @@ static const struct file_operations journal_pins_ops = {
        .read           = bch2_journal_pins_read,
 };
 
+static ssize_t bch2_btree_updates_read(struct file *file, char __user *buf,
+                                      size_t size, loff_t *ppos)
+{
+       struct dump_iter *i = file->private_data;
+       struct bch_fs *c = i->c;
+       int err;
+
+       i->ubuf = buf;
+       i->size = size;
+       i->ret  = 0;
+
+       if (!i->iter) {
+               bch2_btree_updates_to_text(&i->buf, c);
+               i->iter++;
+       }
+
+       err = flush_buf(i);
+       if (err)
+               return err;
+
+       if (i->buf.allocation_failure)
+               return -ENOMEM;
+
+       return i->ret;
+}
+
+static const struct file_operations btree_updates_ops = {
+       .owner          = THIS_MODULE,
+       .open           = bch2_dump_open,
+       .release        = bch2_dump_release,
+       .read           = bch2_btree_updates_read,
+};
+
 static int btree_transaction_stats_open(struct inode *inode, struct file *file)
 {
        struct bch_fs *c = inode->i_private;
        struct dump_iter *i;
 
        i = kzalloc(sizeof(struct dump_iter), GFP_KERNEL);
-
        if (!i)
                return -ENOMEM;
 
@@ -866,6 +902,20 @@ void bch2_fs_debug_exit(struct bch_fs *c)
                debugfs_remove_recursive(c->fs_debug_dir);
 }
 
+static void bch2_fs_debug_btree_init(struct bch_fs *c, struct btree_debug *bd)
+{
+       struct dentry *d;
+
+       d = debugfs_create_dir(bch2_btree_id_str(bd->id), c->btree_debug_dir);
+
+       debugfs_create_file("keys", 0400, d, bd, &btree_debug_ops);
+
+       debugfs_create_file("formats", 0400, d, bd, &btree_format_debug_ops);
+
+       debugfs_create_file("bfloat-failed", 0400, d, bd,
+                           &bfloat_failed_debug_ops);
+}
+
 void bch2_fs_debug_init(struct bch_fs *c)
 {
        struct btree_debug *bd;
@@ -888,6 +938,9 @@ void bch2_fs_debug_init(struct bch_fs *c)
        debugfs_create_file("journal_pins", 0400, c->fs_debug_dir,
                            c->btree_debug, &journal_pins_ops);
 
+       debugfs_create_file("btree_updates", 0400, c->fs_debug_dir,
+                           c->btree_debug, &btree_updates_ops);
+
        debugfs_create_file("btree_transaction_stats", 0400, c->fs_debug_dir,
                            c, &btree_transaction_stats_op);
 
@@ -902,21 +955,7 @@ void bch2_fs_debug_init(struct bch_fs *c)
             bd < c->btree_debug + ARRAY_SIZE(c->btree_debug);
             bd++) {
                bd->id = bd - c->btree_debug;
-               debugfs_create_file(bch2_btree_id_str(bd->id),
-                                   0400, c->btree_debug_dir, bd,
-                                   &btree_debug_ops);
-
-               snprintf(name, sizeof(name), "%s-formats",
-                        bch2_btree_id_str(bd->id));
-
-               debugfs_create_file(name, 0400, c->btree_debug_dir, bd,
-                                   &btree_format_debug_ops);
-
-               snprintf(name, sizeof(name), "%s-bfloat-failed",
-                        bch2_btree_id_str(bd->id));
-
-               debugfs_create_file(name, 0400, c->btree_debug_dir, bd,
-                                   &bfloat_failed_debug_ops);
+               bch2_fs_debug_btree_init(c, bd);
        }
 }
 
index 4ce5e957a6e9162307d98b5b74b02338087f7e1e..0f955c3c76a7bcdce86556e4d09ba0e5cf4e7f9a 100644 (file)
@@ -115,7 +115,7 @@ static void swap_bytes(void *a, void *b, size_t n)
 
 struct wrapper {
        cmp_func_t cmp;
-       swap_func_t swap;
+       swap_func_t swap_func;
 };
 
 /*
@@ -125,7 +125,7 @@ struct wrapper {
 static void do_swap(void *a, void *b, size_t size, swap_r_func_t swap_func, const void *priv)
 {
        if (swap_func == SWAP_WRAPPER) {
-               ((const struct wrapper *)priv)->swap(a, b, (int)size);
+               ((const struct wrapper *)priv)->swap_func(a, b, (int)size);
                return;
        }
 
@@ -174,7 +174,7 @@ void eytzinger0_sort_r(void *base, size_t n, size_t size,
        int i, c, r;
 
        /* called from 'sort' without swap function, let's pick the default */
-       if (swap_func == SWAP_WRAPPER && !((struct wrapper *)priv)->swap)
+       if (swap_func == SWAP_WRAPPER && !((struct wrapper *)priv)->swap_func)
                swap_func = NULL;
 
        if (!swap_func) {
@@ -227,7 +227,7 @@ void eytzinger0_sort(void *base, size_t n, size_t size,
 {
        struct wrapper w = {
                .cmp  = cmp_func,
-               .swap = swap_func,
+               .swap_func = swap_func,
        };
 
        return eytzinger0_sort_r(base, n, size, _CMP_WRAPPER, SWAP_WRAPPER, &w);
index ee0e2df33322d2dccb60e1ed90257863769ead0d..24840aee335c0ffeabd3ad69c79665cc005e28d8 100644 (file)
@@ -242,8 +242,8 @@ static inline unsigned inorder_to_eytzinger0(unsigned i, unsigned size)
             (_i) = eytzinger0_next((_i), (_size)))
 
 /* return greatest node <= @search, or -1 if not found */
-static inline ssize_t eytzinger0_find_le(void *base, size_t nr, size_t size,
-                                        cmp_func_t cmp, const void *search)
+static inline int eytzinger0_find_le(void *base, size_t nr, size_t size,
+                                    cmp_func_t cmp, const void *search)
 {
        unsigned i, n = 0;
 
@@ -256,18 +256,32 @@ static inline ssize_t eytzinger0_find_le(void *base, size_t nr, size_t size,
        } while (n < nr);
 
        if (n & 1) {
-               /* @i was greater than @search, return previous node: */
+               /*
+                * @i was greater than @search, return previous node:
+                *
+                * if @i was leftmost/smallest element,
+                * eytzinger0_prev(eytzinger0_first()) returns -1, as expected
+                */
                return eytzinger0_prev(i, nr);
        } else {
                return i;
        }
 }
 
-static inline ssize_t eytzinger0_find_gt(void *base, size_t nr, size_t size,
-                                        cmp_func_t cmp, const void *search)
+static inline int eytzinger0_find_gt(void *base, size_t nr, size_t size,
+                                    cmp_func_t cmp, const void *search)
 {
        ssize_t idx = eytzinger0_find_le(base, nr, size, cmp, search);
-       return eytzinger0_next(idx, size);
+
+       /*
+        * if eytzinger0_find_le() returned -1 - no element was <= search - we
+        * want to return the first element; next/prev identities mean this
+        * works as expected
+        *
+        * similarly if find_le() returns last element, we should return -1;
+        * identities mean this all works out:
+        */
+       return eytzinger0_next(idx, nr);
 }
 
 #define eytzinger0_find(base, nr, size, _cmp, search)                  \
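The eytzinger.h hunks above lean on a simple contract: `find_le()` returns the greatest element <= search (or -1), and `find_gt()` is derived from it via the next/prev identities. A reference model of that contract over a plain sorted array (a hypothetical linear-scan sketch, not the actual eytzinger traversal) makes the corner cases in the comments concrete:

```c
#include <assert.h>
#include <stddef.h>

/* greatest index with base[i] <= search, or -1 if none */
static int sorted_find_le(const int *base, size_t nr, int search)
{
	int ret = -1;

	for (size_t i = 0; i < nr; i++)
		if (base[i] <= search)
			ret = (int) i;
	return ret;
}

/* first index with base[i] > search, or -1 if none */
static int sorted_find_gt(const int *base, size_t nr, int search)
{
	int le = sorted_find_le(base, nr, search);

	/*
	 * le == -1 (nothing <= search) -> first element, index 0;
	 * le == nr - 1 (everything <= search) -> -1.
	 * These are the same identities the eytzinger comment relies on.
	 */
	return le + 1 < (int) nr ? le + 1 : -1;
}
```

In the eytzinger version, "next index" is `eytzinger0_next()` rather than `+1`, which is also why the hunk's fix of passing `nr` instead of `size` to it matters.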
index ab811c0dad26accfb4924eaef4cccb3ab957087c..04a577848b015cd900a1a040ec0565ffb2f69811 100644 (file)
@@ -67,6 +67,8 @@ void bch2_journal_set_watermark(struct journal *j)
            track_event_change(&c->times[BCH_TIME_blocked_write_buffer_full], low_on_wb))
                trace_and_count(c, journal_full, c);
 
+       mod_bit(JOURNAL_SPACE_LOW, &j->flags, low_on_space || low_on_pin);
+
        swap(watermark, j->watermark);
        if (watermark > j->watermark)
                journal_wake(j);
index 8c053cb64ca5ee25b9a5b2613f2fcd9e03d517d3..b5161b5d76a00874ed9ed88a0969927f2cfc9dbe 100644 (file)
@@ -134,6 +134,7 @@ enum journal_flags {
        JOURNAL_STARTED,
        JOURNAL_MAY_SKIP_FLUSH,
        JOURNAL_NEED_FLUSH_WRITE,
+       JOURNAL_SPACE_LOW,
 };
 
 /* Reasons we may fail to get a journal reservation: */
index b76c16152579c6d3e5a51dbf54c839392c0ce0b2..0f328aba9760ba0e89fd015ee757239b6d8bd8c4 100644 (file)
@@ -47,20 +47,6 @@ void bch2_btree_lost_data(struct bch_fs *c, enum btree_id btree)
        }
 }
 
-static bool btree_id_is_alloc(enum btree_id id)
-{
-       switch (id) {
-       case BTREE_ID_alloc:
-       case BTREE_ID_backpointers:
-       case BTREE_ID_need_discard:
-       case BTREE_ID_freespace:
-       case BTREE_ID_bucket_gens:
-               return true;
-       default:
-               return false;
-       }
-}
-
 /* for -o reconstruct_alloc: */
 static void bch2_reconstruct_alloc(struct bch_fs *c)
 {
index 0e806f04f3d7c5117ade3d612b1c851da243aead..544322d5c2517070143d367fa15d4ff353642556 100644 (file)
@@ -125,6 +125,15 @@ static inline u32 get_ancestor_below(struct snapshot_table *t, u32 id, u32 ances
        return s->parent;
 }
 
+static bool test_ancestor_bitmap(struct snapshot_table *t, u32 id, u32 ancestor)
+{
+       const struct snapshot_t *s = __snapshot_t(t, id);
+       if (!s)
+               return false;
+
+       return test_bit(ancestor - id - 1, s->is_ancestor);
+}
+
 bool __bch2_snapshot_is_ancestor(struct bch_fs *c, u32 id, u32 ancestor)
 {
        bool ret;
@@ -140,13 +149,11 @@ bool __bch2_snapshot_is_ancestor(struct bch_fs *c, u32 id, u32 ancestor)
        while (id && id < ancestor - IS_ANCESTOR_BITMAP)
                id = get_ancestor_below(t, id, ancestor);
 
-       if (id && id < ancestor) {
-               ret = test_bit(ancestor - id - 1, __snapshot_t(t, id)->is_ancestor);
+       ret = id && id < ancestor
+               ? test_ancestor_bitmap(t, id, ancestor)
+               : id == ancestor;
 
-               EBUG_ON(ret != __bch2_snapshot_is_ancestor_early(t, id, ancestor));
-       } else {
-               ret = id == ancestor;
-       }
+       EBUG_ON(ret != __bch2_snapshot_is_ancestor_early(t, id, ancestor));
 out:
        rcu_read_unlock();
 
index e0aa3655b63b4cd7ca8dd2ee03e0370474427ab3..5eee055ee2721a3967fb31ca38adcbc5672521d6 100644 (file)
@@ -143,7 +143,7 @@ void bch2_free_super(struct bch_sb_handle *sb)
 {
        kfree(sb->bio);
        if (!IS_ERR_OR_NULL(sb->s_bdev_file))
-               fput(sb->s_bdev_file);
+               bdev_fput(sb->s_bdev_file);
        kfree(sb->holder);
        kfree(sb->sb_name);
 
index ed63018f21bef58b2aa854f9c3f05ad1b3f26202..8daf80a38d60c6e4fa97b97345d3d4ecb80e7e88 100644 (file)
@@ -288,8 +288,13 @@ static void __bch2_fs_read_only(struct bch_fs *c)
        if (test_bit(JOURNAL_REPLAY_DONE, &c->journal.flags) &&
            !test_bit(BCH_FS_emergency_ro, &c->flags))
                set_bit(BCH_FS_clean_shutdown, &c->flags);
+
        bch2_fs_journal_stop(&c->journal);
 
+       bch_info(c, "%sshutdown complete, journal seq %llu",
+                test_bit(BCH_FS_clean_shutdown, &c->flags) ? "" : "un",
+                c->journal.seq_ondisk);
+
        /*
         * After stopping journal:
         */
index c86a93a8d8fc81bbe373efcbec74f3e2563e6da5..b18b0cc81b594ad6144b43599418418a1caf5e95 100644 (file)
@@ -17,7 +17,6 @@
 #include "btree_iter.h"
 #include "btree_key_cache.h"
 #include "btree_update.h"
-#include "btree_update_interior.h"
 #include "btree_gc.h"
 #include "buckets.h"
 #include "clock.h"
@@ -166,7 +165,6 @@ read_attribute(btree_write_stats);
 read_attribute(btree_cache_size);
 read_attribute(compression_stats);
 read_attribute(journal_debug);
-read_attribute(btree_updates);
 read_attribute(btree_cache);
 read_attribute(btree_key_cache);
 read_attribute(stripes_heap);
@@ -415,9 +413,6 @@ SHOW(bch2_fs)
        if (attr == &sysfs_journal_debug)
                bch2_journal_debug_to_text(out, &c->journal);
 
-       if (attr == &sysfs_btree_updates)
-               bch2_btree_updates_to_text(out, c);
-
        if (attr == &sysfs_btree_cache)
                bch2_btree_cache_to_text(out, c);
 
@@ -639,7 +634,6 @@ SYSFS_OPS(bch2_fs_internal);
 struct attribute *bch2_fs_internal_files[] = {
        &sysfs_flags,
        &sysfs_journal_debug,
-       &sysfs_btree_updates,
        &sysfs_btree_cache,
        &sysfs_btree_key_cache,
        &sysfs_new_stripes,
index b3fe9fc577470ff14659df531959c9e7aa6c324b..bfec656f94c0758ee081ea7d36fe1e272baca810 100644 (file)
@@ -672,7 +672,7 @@ static int __do_delete(struct btree_trans *trans, struct bpos pos)
 
        bch2_trans_iter_init(trans, &iter, BTREE_ID_xattrs, pos,
                             BTREE_ITER_INTENT);
-       k = bch2_btree_iter_peek(&iter);
+       k = bch2_btree_iter_peek_upto(&iter, POS(0, U64_MAX));
        ret = bkey_err(k);
        if (ret)
                goto err;
index b7e7c29278fc052a90fe7c029e8fd0626c48ddc5..5cf885b09986ac95effa15f7a37fc78bd56323cb 100644 (file)
@@ -788,6 +788,14 @@ static inline int copy_from_user_errcode(void *to, const void __user *from, unsi
 
 #endif
 
+static inline void mod_bit(long nr, volatile unsigned long *addr, bool v)
+{
+       if (v)
+               set_bit(nr, addr);
+       else
+               clear_bit(nr, addr);
+}
+
 static inline void __set_bit_le64(size_t bit, __le64 *addr)
 {
        addr[bit / 64] |= cpu_to_le64(BIT_ULL(bit % 64));
@@ -795,7 +803,7 @@ static inline void __set_bit_le64(size_t bit, __le64 *addr)
 
 static inline void __clear_bit_le64(size_t bit, __le64 *addr)
 {
-       addr[bit / 64] &= !cpu_to_le64(BIT_ULL(bit % 64));
+       addr[bit / 64] &= ~cpu_to_le64(BIT_ULL(bit % 64));
 }
 
 static inline bool test_bit_le64(size_t bit, __le64 *addr)
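The one-character fix in `__clear_bit_le64()` above (`!` to `~`) is easy to miss but significant: logical NOT of a nonzero mask yields 0, so the buggy version ANDed the whole word with zero, wiping every bit instead of clearing one. A userspace sketch with both variants side by side (plain `uint64_t` stands in for the `__le64`/`cpu_to_le64` handling, which is irrelevant to the bug):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Buggy variant: !mask is 0 for any nonzero mask, so the AND clears
 * the entire 64-bit word, not just the requested bit. */
static void clear_bit64_buggy(size_t bit, uint64_t *addr)
{
	addr[bit / 64] &= !(UINT64_C(1) << (bit % 64));
}

/* Fixed variant: ~mask has every bit set except the target, so the
 * AND clears only that one bit. */
static void clear_bit64_fixed(size_t bit, uint64_t *addr)
{
	addr[bit / 64] &= ~(UINT64_C(1) << (bit % 64));
}
```

The same file's new `mod_bit()` helper is just a branch between `set_bit()` and `clear_bit()`, letting callers pass the desired bit value as a bool.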
index dd6f566a383f00e83c9f36125ee4ecd3fc0e3541..121ab890bd0557e4779bd25d00dc422ba8fb1b3f 100644 (file)
@@ -1133,6 +1133,9 @@ __btrfs_commit_inode_delayed_items(struct btrfs_trans_handle *trans,
        if (ret)
                return ret;
 
+       ret = btrfs_record_root_in_trans(trans, node->root);
+       if (ret)
+               return ret;
        ret = btrfs_update_delayed_inode(trans, node->root, path, node);
        return ret;
 }
index 37701531eeb1ba486cd8117f104794083dff8816..c65fe5de40220d3b51003bb73b3e6414eaefba08 100644 (file)
@@ -2533,7 +2533,7 @@ void btrfs_clear_delalloc_extent(struct btrfs_inode *inode,
                 */
                if (bits & EXTENT_CLEAR_META_RESV &&
                    root != fs_info->tree_root)
-                       btrfs_delalloc_release_metadata(inode, len, false);
+                       btrfs_delalloc_release_metadata(inode, len, true);
 
                /* For sanity tests. */
                if (btrfs_is_testing(fs_info))
@@ -4503,6 +4503,7 @@ int btrfs_delete_subvolume(struct btrfs_inode *dir, struct dentry *dentry)
        struct btrfs_trans_handle *trans;
        struct btrfs_block_rsv block_rsv;
        u64 root_flags;
+       u64 qgroup_reserved = 0;
        int ret;
 
        down_write(&fs_info->subvol_sem);
@@ -4547,12 +4548,20 @@ int btrfs_delete_subvolume(struct btrfs_inode *dir, struct dentry *dentry)
        ret = btrfs_subvolume_reserve_metadata(root, &block_rsv, 5, true);
        if (ret)
                goto out_undead;
+       qgroup_reserved = block_rsv.qgroup_rsv_reserved;
 
        trans = btrfs_start_transaction(root, 0);
        if (IS_ERR(trans)) {
                ret = PTR_ERR(trans);
                goto out_release;
        }
+       ret = btrfs_record_root_in_trans(trans, root);
+       if (ret) {
+               btrfs_abort_transaction(trans, ret);
+               goto out_end_trans;
+       }
+       btrfs_qgroup_convert_reserved_meta(root, qgroup_reserved);
+       qgroup_reserved = 0;
        trans->block_rsv = &block_rsv;
        trans->bytes_reserved = block_rsv.size;
 
@@ -4611,7 +4620,9 @@ out_end_trans:
        ret = btrfs_end_transaction(trans);
        inode->i_flags |= S_DEAD;
 out_release:
-       btrfs_subvolume_release_metadata(root, &block_rsv);
+       btrfs_block_rsv_release(fs_info, &block_rsv, (u64)-1, NULL);
+       if (qgroup_reserved)
+               btrfs_qgroup_free_meta_prealloc(root, qgroup_reserved);
 out_undead:
        if (ret) {
                spin_lock(&dest->root_item_lock);
index 294e31edec9d3bbe566e9234c8ef76d73612adbc..55f3ba6a831ca194e2d8405dbf7caa60fbd81dfc 100644 (file)
@@ -613,6 +613,7 @@ static noinline int create_subvol(struct mnt_idmap *idmap,
        int ret;
        dev_t anon_dev;
        u64 objectid;
+       u64 qgroup_reserved = 0;
 
        root_item = kzalloc(sizeof(*root_item), GFP_KERNEL);
        if (!root_item)
@@ -650,13 +651,18 @@ static noinline int create_subvol(struct mnt_idmap *idmap,
                                               trans_num_items, false);
        if (ret)
                goto out_new_inode_args;
+       qgroup_reserved = block_rsv.qgroup_rsv_reserved;
 
        trans = btrfs_start_transaction(root, 0);
        if (IS_ERR(trans)) {
                ret = PTR_ERR(trans);
-               btrfs_subvolume_release_metadata(root, &block_rsv);
-               goto out_new_inode_args;
+               goto out_release_rsv;
        }
+       ret = btrfs_record_root_in_trans(trans, BTRFS_I(dir)->root);
+       if (ret)
+               goto out;
+       btrfs_qgroup_convert_reserved_meta(root, qgroup_reserved);
+       qgroup_reserved = 0;
        trans->block_rsv = &block_rsv;
        trans->bytes_reserved = block_rsv.size;
        /* Tree log can't currently deal with an inode which is a new root. */
@@ -767,9 +773,11 @@ static noinline int create_subvol(struct mnt_idmap *idmap,
 out:
        trans->block_rsv = NULL;
        trans->bytes_reserved = 0;
-       btrfs_subvolume_release_metadata(root, &block_rsv);
-
        btrfs_end_transaction(trans);
+out_release_rsv:
+       btrfs_block_rsv_release(fs_info, &block_rsv, (u64)-1, NULL);
+       if (qgroup_reserved)
+               btrfs_qgroup_free_meta_prealloc(root, qgroup_reserved);
 out_new_inode_args:
        btrfs_new_inode_args_destroy(&new_inode_args);
 out_inode:
@@ -791,6 +799,8 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
        struct btrfs_pending_snapshot *pending_snapshot;
        unsigned int trans_num_items;
        struct btrfs_trans_handle *trans;
+       struct btrfs_block_rsv *block_rsv;
+       u64 qgroup_reserved = 0;
        int ret;
 
        /* We do not support snapshotting right now. */
@@ -827,19 +837,19 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
                goto free_pending;
        }
 
-       btrfs_init_block_rsv(&pending_snapshot->block_rsv,
-                            BTRFS_BLOCK_RSV_TEMP);
+       block_rsv = &pending_snapshot->block_rsv;
+       btrfs_init_block_rsv(block_rsv, BTRFS_BLOCK_RSV_TEMP);
        /*
         * 1 to add dir item
         * 1 to add dir index
         * 1 to update parent inode item
         */
        trans_num_items = create_subvol_num_items(inherit) + 3;
-       ret = btrfs_subvolume_reserve_metadata(BTRFS_I(dir)->root,
-                                              &pending_snapshot->block_rsv,
+       ret = btrfs_subvolume_reserve_metadata(BTRFS_I(dir)->root, block_rsv,
                                               trans_num_items, false);
        if (ret)
                goto free_pending;
+       qgroup_reserved = block_rsv->qgroup_rsv_reserved;
 
        pending_snapshot->dentry = dentry;
        pending_snapshot->root = root;
@@ -852,6 +862,13 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
                ret = PTR_ERR(trans);
                goto fail;
        }
+       ret = btrfs_record_root_in_trans(trans, BTRFS_I(dir)->root);
+       if (ret) {
+               btrfs_end_transaction(trans);
+               goto fail;
+       }
+       btrfs_qgroup_convert_reserved_meta(root, qgroup_reserved);
+       qgroup_reserved = 0;
 
        trans->pending_snapshot = pending_snapshot;
 
@@ -881,7 +898,9 @@ fail:
        if (ret && pending_snapshot->snap)
                pending_snapshot->snap->anon_dev = 0;
        btrfs_put_root(pending_snapshot->snap);
-       btrfs_subvolume_release_metadata(root, &pending_snapshot->block_rsv);
+       btrfs_block_rsv_release(fs_info, block_rsv, (u64)-1, NULL);
+       if (qgroup_reserved)
+               btrfs_qgroup_free_meta_prealloc(root, qgroup_reserved);
 free_pending:
        if (pending_snapshot->anon_dev)
                free_anon_bdev(pending_snapshot->anon_dev);
index 5f90f0605b12f7126e93d69e7fbec42720301fad..cf8820ce7aa2979920c6daafc1071c26571ecee6 100644 (file)
@@ -4495,6 +4495,8 @@ void btrfs_qgroup_convert_reserved_meta(struct btrfs_root *root, int num_bytes)
                                      BTRFS_QGROUP_RSV_META_PREALLOC);
        trace_qgroup_meta_convert(root, num_bytes);
        qgroup_convert_meta(fs_info, root->root_key.objectid, num_bytes);
+       if (!sb_rdonly(fs_info->sb))
+               add_root_meta_rsv(root, num_bytes, BTRFS_QGROUP_RSV_META_PERTRANS);
 }
 
 /*
index 4bb538a372ce56404de84d6ddbca7fb951715949..7007f9e0c97282bc5f415f56d14e02e79895aafc 100644 (file)
@@ -548,13 +548,3 @@ int btrfs_subvolume_reserve_metadata(struct btrfs_root *root,
        }
        return ret;
 }
-
-void btrfs_subvolume_release_metadata(struct btrfs_root *root,
-                                     struct btrfs_block_rsv *rsv)
-{
-       struct btrfs_fs_info *fs_info = root->fs_info;
-       u64 qgroup_to_release;
-
-       btrfs_block_rsv_release(fs_info, rsv, (u64)-1, &qgroup_to_release);
-       btrfs_qgroup_convert_reserved_meta(root, qgroup_to_release);
-}
index 6f929cf3bd4967560964659ee9f631e6766a07ab..8f5739e732b9b6c9cc1d47ee34e20d50a403d90c 100644 (file)
@@ -18,8 +18,6 @@ struct btrfs_trans_handle;
 int btrfs_subvolume_reserve_metadata(struct btrfs_root *root,
                                     struct btrfs_block_rsv *rsv,
                                     int nitems, bool use_global_rsv);
-void btrfs_subvolume_release_metadata(struct btrfs_root *root,
-                                     struct btrfs_block_rsv *rsv);
 int btrfs_add_root_ref(struct btrfs_trans_handle *trans, u64 root_id,
                       u64 ref_id, u64 dirid, u64 sequence,
                       const struct fscrypt_str *name);
index 46e8426adf4f15768507303430b38c9e6be56c7d..85f359e0e0a7f2ea078157c85a1f78b0ea2bcadd 100644 (file)
@@ -745,14 +745,6 @@ again:
                h->reloc_reserved = reloc_reserved;
        }
 
-       /*
-        * Now that we have found a transaction to be a part of, convert the
-        * qgroup reservation from prealloc to pertrans. A different transaction
-        * can't race in and free our pertrans out from under us.
-        */
-       if (qgroup_reserved)
-               btrfs_qgroup_convert_reserved_meta(root, qgroup_reserved);
-
 got_it:
        if (!current->journal_info)
                current->journal_info = h;
@@ -786,8 +778,15 @@ got_it:
                 * not just freed.
                 */
                btrfs_end_transaction(h);
-               return ERR_PTR(ret);
+               goto reserve_fail;
        }
+       /*
+        * Now that we have found a transaction to be a part of, convert the
+        * qgroup reservation from prealloc to pertrans. A different transaction
+        * can't race in and free our pertrans out from under us.
+        */
+       if (qgroup_reserved)
+               btrfs_qgroup_convert_reserved_meta(root, qgroup_reserved);
 
        return h;
 
@@ -1495,6 +1494,7 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
                        radix_tree_tag_clear(&fs_info->fs_roots_radix,
                                        (unsigned long)root->root_key.objectid,
                                        BTRFS_ROOT_TRANS_TAG);
+                       btrfs_qgroup_free_meta_all_pertrans(root);
                        spin_unlock(&fs_info->fs_roots_radix_lock);
 
                        btrfs_free_log(trans, root);
@@ -1519,7 +1519,6 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
                        if (ret2)
                                return ret2;
                        spin_lock(&fs_info->fs_roots_radix_lock);
-                       btrfs_qgroup_free_meta_all_pertrans(root);
                }
        }
        spin_unlock(&fs_info->fs_roots_radix_lock);
index 39e75131fd5aa01d732f703cb1f421a3696bffd6..9901057a15ba79a110c8a90423bc7707102590d8 100644 (file)
@@ -495,7 +495,7 @@ static void cramfs_kill_sb(struct super_block *sb)
                sb->s_mtd = NULL;
        } else if (IS_ENABLED(CONFIG_CRAMFS_BLOCKDEV) && sb->s_bdev) {
                sync_blockdev(sb->s_bdev);
-               fput(sb->s_bdev_file);
+               bdev_fput(sb->s_bdev_file);
        }
        kfree(sbi);
 }
index cfb8449c731f9ac53fb3add808e13493175508c4..044135796f2b6ebe86e56b69f57501e7567d761b 100644 (file)
@@ -5668,7 +5668,7 @@ failed_mount:
        brelse(sbi->s_sbh);
        if (sbi->s_journal_bdev_file) {
                invalidate_bdev(file_bdev(sbi->s_journal_bdev_file));
-               fput(sbi->s_journal_bdev_file);
+               bdev_fput(sbi->s_journal_bdev_file);
        }
 out_fail:
        invalidate_bdev(sb->s_bdev);
@@ -5913,7 +5913,7 @@ static struct file *ext4_get_journal_blkdev(struct super_block *sb,
 out_bh:
        brelse(bh);
 out_bdev:
-       fput(bdev_file);
+       bdev_fput(bdev_file);
        return ERR_PTR(errno);
 }
 
@@ -5952,7 +5952,7 @@ static journal_t *ext4_open_dev_journal(struct super_block *sb,
 out_journal:
        jbd2_journal_destroy(journal);
 out_bdev:
-       fput(bdev_file);
+       bdev_fput(bdev_file);
        return ERR_PTR(errno);
 }
 
@@ -7327,7 +7327,7 @@ static void ext4_kill_sb(struct super_block *sb)
        kill_block_super(sb);
 
        if (bdev_file)
-               fput(bdev_file);
+               bdev_fput(bdev_file);
 }
 
 static struct file_system_type ext4_fs_type = {
index a6867f26f141836dcd4a4f0136dd67a9de6c3c74..a4bc26dfdb1af5973783d2817bf2deed889f3c33 100644 (file)
@@ -1558,7 +1558,7 @@ static void destroy_device_list(struct f2fs_sb_info *sbi)
 
        for (i = 0; i < sbi->s_ndevs; i++) {
                if (i > 0)
-                       fput(FDEV(i).bdev_file);
+                       bdev_fput(FDEV(i).bdev_file);
 #ifdef CONFIG_BLK_DEV_ZONED
                kvfree(FDEV(i).blkz_seq);
 #endif
index 73389c68e25170c81d6f84483f09b43216ba4b52..9609349e92e5e1ba422369fa29a2f6345f7fe908 100644 (file)
@@ -1141,7 +1141,7 @@ journal_found:
        lbmLogShutdown(log);
 
       close:           /* close external log device */
-       fput(bdev_file);
+       bdev_fput(bdev_file);
 
       free:            /* free log descriptor */
        mutex_unlock(&jfs_log_mutex);
@@ -1485,7 +1485,7 @@ int lmLogClose(struct super_block *sb)
        bdev_file = log->bdev_file;
        rc = lmLogShutdown(log);
 
-       fput(bdev_file);
+       bdev_fput(bdev_file);
 
        kfree(log);
 
index 2391ab3c3231975bde1606875339b7736975054f..84d4093ca71317ebb7a70bde76819704e25ec7dc 100644 (file)
@@ -3042,12 +3042,9 @@ static void
 nfsd4_cb_recall_any_release(struct nfsd4_callback *cb)
 {
        struct nfs4_client *clp = cb->cb_clp;
-       struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 
-       spin_lock(&nn->client_lock);
        clear_bit(NFSD4_CLIENT_CB_RECALL_ANY, &clp->cl_flags);
-       put_client_renew_locked(clp);
-       spin_unlock(&nn->client_lock);
+       drop_client(clp);
 }
 
 static int
@@ -6616,7 +6613,7 @@ deleg_reaper(struct nfsd_net *nn)
                list_add(&clp->cl_ra_cblist, &cblist);
 
                /* release in nfsd4_cb_recall_any_release */
-               atomic_inc(&clp->cl_rpc_users);
+               kref_get(&clp->cl_nfsdfs.cl_ref);
                set_bit(NFSD4_CLIENT_CB_RECALL_ANY, &clp->cl_flags);
                clp->cl_ra_time = ktime_get_boottime_seconds();
        }
index 902b326e1e5607d5537721b51f68c28e602e2b92..87dcaae32ff87b40c3d65a0d2463a6e60cdc0dbc 100644 (file)
@@ -62,12 +62,12 @@ static int __init copy_xbc_key_value_list(char *dst, size_t size)
                                break;
                        dst += ret;
                }
-               if (ret >= 0 && boot_command_line[0]) {
-                       ret = snprintf(dst, rest(dst, end), "# Parameters from bootloader:\n# %s\n",
-                                      boot_command_line);
-                       if (ret > 0)
-                               dst += ret;
-               }
+       }
+       if (cmdline_has_extra_options() && ret >= 0 && boot_command_line[0]) {
+               ret = snprintf(dst, rest(dst, end), "# Parameters from bootloader:\n# %s\n",
+                              boot_command_line);
+               if (ret > 0)
+                       dst += ret;
        }
 out:
        kfree(key);
index 6474529c42530628fd3969573fb175283f4f51e8..e539ccd39e1ee74cd8bdfd35d29f826be6f514e1 100644 (file)
@@ -2589,7 +2589,7 @@ static void journal_list_init(struct super_block *sb)
 static void release_journal_dev(struct reiserfs_journal *journal)
 {
        if (journal->j_bdev_file) {
-               fput(journal->j_bdev_file);
+               bdev_fput(journal->j_bdev_file);
                journal->j_bdev_file = NULL;
        }
 }
index 2be227532f399788de82a03e55970d33c67dc695..2cbb924620747f68d04ac53783c3b0f21c5ea0ab 100644 (file)
@@ -594,7 +594,7 @@ static void romfs_kill_sb(struct super_block *sb)
 #ifdef CONFIG_ROMFS_ON_BLOCK
        if (sb->s_bdev) {
                sync_blockdev(sb->s_bdev);
-               fput(sb->s_bdev_file);
+               bdev_fput(sb->s_bdev_file);
        }
 #endif
 }
index a0017724d5239312c14644bd15d2867880337d91..13a9d7acf8f8ec151323d18d44a346e060bf0ce2 100644 (file)
@@ -417,6 +417,7 @@ smb2_close_cached_fid(struct kref *ref)
 {
        struct cached_fid *cfid = container_of(ref, struct cached_fid,
                                               refcount);
+       int rc;
 
        spin_lock(&cfid->cfids->cfid_list_lock);
        if (cfid->on_list) {
@@ -430,9 +431,10 @@ smb2_close_cached_fid(struct kref *ref)
        cfid->dentry = NULL;
 
        if (cfid->is_open) {
-               SMB2_close(0, cfid->tcon, cfid->fid.persistent_fid,
+               rc = SMB2_close(0, cfid->tcon, cfid->fid.persistent_fid,
                           cfid->fid.volatile_fid);
-               atomic_dec(&cfid->tcon->num_remote_opens);
+               if (rc != -EBUSY && rc != -EAGAIN)
+                       atomic_dec(&cfid->tcon->num_remote_opens);
        }
 
        free_cached_dir(cfid);
index 226d4835c92db8ba3f1f0540a16643d0b8ac3fd0..c71ae5c043060ebf5dd7f6d9e5f63e6e7bcf7841 100644 (file)
@@ -250,6 +250,8 @@ static int cifs_debug_files_proc_show(struct seq_file *m, void *v)
        spin_lock(&cifs_tcp_ses_lock);
        list_for_each_entry(server, &cifs_tcp_ses_list, tcp_ses_list) {
                list_for_each_entry(ses, &server->smb_ses_list, smb_ses_list) {
+                       if (cifs_ses_exiting(ses))
+                               continue;
                        list_for_each_entry(tcon, &ses->tcon_list, tcon_list) {
                                spin_lock(&tcon->open_file_lock);
                                list_for_each_entry(cfile, &tcon->openFileList, tlist) {
@@ -676,6 +678,8 @@ static ssize_t cifs_stats_proc_write(struct file *file,
                        }
 #endif /* CONFIG_CIFS_STATS2 */
                        list_for_each_entry(ses, &server->smb_ses_list, smb_ses_list) {
+                               if (cifs_ses_exiting(ses))
+                                       continue;
                                list_for_each_entry(tcon, &ses->tcon_list, tcon_list) {
                                        atomic_set(&tcon->num_smbs_sent, 0);
                                        spin_lock(&tcon->stat_lock);
@@ -755,6 +759,8 @@ static int cifs_stats_proc_show(struct seq_file *m, void *v)
                        }
 #endif /* STATS2 */
                list_for_each_entry(ses, &server->smb_ses_list, smb_ses_list) {
+                       if (cifs_ses_exiting(ses))
+                               continue;
                        list_for_each_entry(tcon, &ses->tcon_list, tcon_list) {
                                i++;
                                seq_printf(m, "\n%d) %s", i, tcon->tree_name);
index aa6f1ecb7c0e8fc11f9b1fb830d8c0ca8071b631..d41eedbff674abb0e62e52ae6cc585aaa5d83d77 100644 (file)
@@ -156,6 +156,7 @@ struct workqueue_struct     *decrypt_wq;
 struct workqueue_struct        *fileinfo_put_wq;
 struct workqueue_struct        *cifsoplockd_wq;
 struct workqueue_struct        *deferredclose_wq;
+struct workqueue_struct        *serverclose_wq;
 __u32 cifs_lock_secret;
 
 /*
@@ -1888,6 +1889,13 @@ init_cifs(void)
                goto out_destroy_cifsoplockd_wq;
        }
 
+       serverclose_wq = alloc_workqueue("serverclose",
+                                          WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
+       if (!serverclose_wq) {
+               rc = -ENOMEM;
+               goto out_destroy_serverclose_wq;
+       }
+
        rc = cifs_init_inodecache();
        if (rc)
                goto out_destroy_deferredclose_wq;
@@ -1962,6 +1970,8 @@ out_destroy_decrypt_wq:
        destroy_workqueue(decrypt_wq);
 out_destroy_cifsiod_wq:
        destroy_workqueue(cifsiod_wq);
+out_destroy_serverclose_wq:
+       destroy_workqueue(serverclose_wq);
 out_clean_proc:
        cifs_proc_clean();
        return rc;
@@ -1991,6 +2001,7 @@ exit_cifs(void)
        destroy_workqueue(cifsoplockd_wq);
        destroy_workqueue(decrypt_wq);
        destroy_workqueue(fileinfo_put_wq);
+       destroy_workqueue(serverclose_wq);
        destroy_workqueue(cifsiod_wq);
        cifs_proc_clean();
 }
index 7ed9d05f6890b4d40cb11a7b8d7384c6e0111461..f6a302205f89c456d9fa3adb3dae238deeb97d10 100644 (file)
@@ -442,10 +442,10 @@ struct smb_version_operations {
        /* set fid protocol-specific info */
        void (*set_fid)(struct cifsFileInfo *, struct cifs_fid *, __u32);
        /* close a file */
-       void (*close)(const unsigned int, struct cifs_tcon *,
+       int (*close)(const unsigned int, struct cifs_tcon *,
                      struct cifs_fid *);
        /* close a file, returning file attributes and timestamps */
-       void (*close_getattr)(const unsigned int xid, struct cifs_tcon *tcon,
+       int (*close_getattr)(const unsigned int xid, struct cifs_tcon *tcon,
                      struct cifsFileInfo *pfile_info);
        /* send a flush request to the server */
        int (*flush)(const unsigned int, struct cifs_tcon *, struct cifs_fid *);
@@ -1281,7 +1281,6 @@ struct cifs_tcon {
        struct cached_fids *cfids;
        /* BB add field for back pointer to sb struct(s)? */
 #ifdef CONFIG_CIFS_DFS_UPCALL
-       struct list_head dfs_ses_list;
        struct delayed_work dfs_cache_work;
 #endif
        struct delayed_work     query_interfaces; /* query interfaces workqueue job */
@@ -1440,6 +1439,7 @@ struct cifsFileInfo {
        bool swapfile:1;
        bool oplock_break_cancelled:1;
        bool status_file_deleted:1; /* file has been deleted */
+       bool offload:1; /* offload final part of _put to a wq */
        unsigned int oplock_epoch; /* epoch from the lease break */
        __u32 oplock_level; /* oplock/lease level from the lease break */
        int count;
@@ -1448,6 +1448,7 @@ struct cifsFileInfo {
        struct cifs_search_info srch_inf;
        struct work_struct oplock_break; /* work for oplock breaks */
        struct work_struct put; /* work for the final part of _put */
+       struct work_struct serverclose; /* work for serverclose */
        struct delayed_work deferred;
        bool deferred_close_scheduled; /* Flag to indicate close is scheduled */
        char *symlink_target;
@@ -1804,7 +1805,6 @@ struct cifs_mount_ctx {
        struct TCP_Server_Info *server;
        struct cifs_ses *ses;
        struct cifs_tcon *tcon;
-       struct list_head dfs_ses_list;
 };
 
 static inline void __free_dfs_info_param(struct dfs_info3_param *param)
@@ -2105,6 +2105,7 @@ extern struct workqueue_struct *decrypt_wq;
 extern struct workqueue_struct *fileinfo_put_wq;
 extern struct workqueue_struct *cifsoplockd_wq;
 extern struct workqueue_struct *deferredclose_wq;
+extern struct workqueue_struct *serverclose_wq;
 extern __u32 cifs_lock_secret;
 
 extern mempool_t *cifs_sm_req_poolp;
@@ -2324,4 +2325,14 @@ struct smb2_compound_vars {
        struct kvec ea_iov;
 };
 
+static inline bool cifs_ses_exiting(struct cifs_ses *ses)
+{
+       bool ret;
+
+       spin_lock(&ses->ses_lock);
+       ret = ses->ses_status == SES_EXITING;
+       spin_unlock(&ses->ses_lock);
+       return ret;
+}
+
 #endif /* _CIFS_GLOB_H */
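The new `cifs_ses_exiting()` helper above snapshots `ses_status` under `ses_lock`, so the many list walkers in this series can skip sessions that are being torn down without racing the teardown path. A userspace sketch of the pattern (hypothetical `struct ses` with a pthread mutex standing in for the kernel spinlock):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum ses_status_sim { SES_GOOD_SIM, SES_EXITING_SIM };

/* Hypothetical stand-in for struct cifs_ses: status is only ever
 * read or written with the lock held. */
struct ses_sim {
	pthread_mutex_t lock;
	enum ses_status_sim status;
};

/* Mirrors cifs_ses_exiting(): take the lock, snapshot the status,
 * drop the lock, return the snapshot.  Callers get a consistent
 * answer even if another thread flips the status concurrently. */
static bool ses_exiting_sim(struct ses_sim *s)
{
	bool ret;

	pthread_mutex_lock(&s->lock);
	ret = s->status == SES_EXITING_SIM;
	pthread_mutex_unlock(&s->lock);
	return ret;
}
```

Note the answer can be stale by the time the caller acts on it; that is fine here because the walkers only use it to skip dying sessions early, and `__cifs_put_smb_ses()` rechecks under the same locks.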
index 0723e1b57256b8fe0d07e0a4698d60074914ec38..8e0a348f1f660ebc14498c7fd7d342693411c106 100644 (file)
@@ -725,31 +725,31 @@ struct super_block *cifs_get_tcon_super(struct cifs_tcon *tcon);
 void cifs_put_tcon_super(struct super_block *sb);
 int cifs_wait_for_server_reconnect(struct TCP_Server_Info *server, bool retry);
 
-/* Put references of @ses and @ses->dfs_root_ses */
+/* Put references of @ses and its children */
 static inline void cifs_put_smb_ses(struct cifs_ses *ses)
 {
-       struct cifs_ses *rses = ses->dfs_root_ses;
+       struct cifs_ses *next;
 
-       __cifs_put_smb_ses(ses);
-       if (rses)
-               __cifs_put_smb_ses(rses);
+       do {
+               next = ses->dfs_root_ses;
+               __cifs_put_smb_ses(ses);
+       } while ((ses = next));
 }
 
-/* Get an active reference of @ses and @ses->dfs_root_ses.
+/* Get an active reference of @ses and its children.
  *
  * NOTE: make sure to call this function when incrementing reference count of
  * @ses to ensure that any DFS root session attached to it (@ses->dfs_root_ses)
  * will also get its reference count incremented.
  *
- * cifs_put_smb_ses() will put both references, so call it when you're done.
+ * cifs_put_smb_ses() will put all references, so call it when you're done.
  */
 static inline void cifs_smb_ses_inc_refcount(struct cifs_ses *ses)
 {
        lockdep_assert_held(&cifs_tcp_ses_lock);
 
-       ses->ses_count++;
-       if (ses->dfs_root_ses)
-               ses->dfs_root_ses->ses_count++;
+       for (; ses; ses = ses->dfs_root_ses)
+               ses->ses_count++;
 }
 
 static inline bool dfs_src_pathname_equal(const char *s1, const char *s2)
index 5aee555515730d78031734da95526ee729416216..23b5709ddc311c7366e33505528711363d151a14 100644 (file)
@@ -5854,10 +5854,8 @@ SetEARetry:
        parm_data->list.EA_flags = 0;
        /* we checked above that name len is less than 255 */
        parm_data->list.name_len = (__u8)name_len;
-       /* EA names are always ASCII */
-       if (ea_name)
-               strncpy(parm_data->list.name, ea_name, name_len);
-       parm_data->list.name[name_len] = '\0';
+       /* EA names are always ASCII and NUL-terminated */
+       strscpy(parm_data->list.name, ea_name ?: "", name_len + 1);
        parm_data->list.value_len = cpu_to_le16(ea_value_len);
        /* caller ensures that ea_value_len is less than 64K but
        we need to ensure that it fits within the smb */
index 9b85b5341822e73d9a4746a027f50121f01873c3..85679ae106fd50a4e3289349e2916204ae3f94fc 100644 (file)
@@ -175,6 +175,8 @@ cifs_signal_cifsd_for_reconnect(struct TCP_Server_Info *server,
 
        spin_lock(&cifs_tcp_ses_lock);
        list_for_each_entry(ses, &pserver->smb_ses_list, smb_ses_list) {
+               if (cifs_ses_exiting(ses))
+                       continue;
                spin_lock(&ses->chan_lock);
                for (i = 0; i < ses->chan_count; i++) {
                        if (!ses->chans[i].server)
@@ -232,7 +234,13 @@ cifs_mark_tcp_ses_conns_for_reconnect(struct TCP_Server_Info *server,
 
        spin_lock(&cifs_tcp_ses_lock);
        list_for_each_entry_safe(ses, nses, &pserver->smb_ses_list, smb_ses_list) {
-               /* check if iface is still active */
+               spin_lock(&ses->ses_lock);
+               if (ses->ses_status == SES_EXITING) {
+                       spin_unlock(&ses->ses_lock);
+                       continue;
+               }
+               spin_unlock(&ses->ses_lock);
+
                spin_lock(&ses->chan_lock);
                if (cifs_ses_get_chan_index(ses, server) ==
                    CIFS_INVAL_CHAN_INDEX) {
@@ -1860,6 +1868,9 @@ static int match_session(struct cifs_ses *ses, struct smb3_fs_context *ctx)
            ctx->sectype != ses->sectype)
                return 0;
 
+       if (ctx->dfs_root_ses != ses->dfs_root_ses)
+               return 0;
+
        /*
         * If an existing session is limited to less channels than
         * requested, it should not be reused
@@ -1963,31 +1974,6 @@ out:
        return rc;
 }
 
-/**
- * cifs_free_ipc - helper to release the session IPC tcon
- * @ses: smb session to unmount the IPC from
- *
- * Needs to be called everytime a session is destroyed.
- *
- * On session close, the IPC is closed and the server must release all tcons of the session.
- * No need to send a tree disconnect here.
- *
- * Besides, it will make the server to not close durable and resilient files on session close, as
- * specified in MS-SMB2 3.3.5.6 Receiving an SMB2 LOGOFF Request.
- */
-static int
-cifs_free_ipc(struct cifs_ses *ses)
-{
-       struct cifs_tcon *tcon = ses->tcon_ipc;
-
-       if (tcon == NULL)
-               return 0;
-
-       tconInfoFree(tcon);
-       ses->tcon_ipc = NULL;
-       return 0;
-}
-
 static struct cifs_ses *
 cifs_find_smb_ses(struct TCP_Server_Info *server, struct smb3_fs_context *ctx)
 {
@@ -2019,48 +2005,52 @@ cifs_find_smb_ses(struct TCP_Server_Info *server, struct smb3_fs_context *ctx)
 void __cifs_put_smb_ses(struct cifs_ses *ses)
 {
        struct TCP_Server_Info *server = ses->server;
+       struct cifs_tcon *tcon;
        unsigned int xid;
        size_t i;
+       bool do_logoff;
        int rc;
 
+       spin_lock(&cifs_tcp_ses_lock);
        spin_lock(&ses->ses_lock);
-       if (ses->ses_status == SES_EXITING) {
+       cifs_dbg(FYI, "%s: id=0x%llx ses_count=%d ses_status=%u ipc=%s\n",
+                __func__, ses->Suid, ses->ses_count, ses->ses_status,
+                ses->tcon_ipc ? ses->tcon_ipc->tree_name : "none");
+       if (ses->ses_status == SES_EXITING || --ses->ses_count > 0) {
                spin_unlock(&ses->ses_lock);
+               spin_unlock(&cifs_tcp_ses_lock);
                return;
        }
-       spin_unlock(&ses->ses_lock);
+       /* ses_count can never go negative */
+       WARN_ON(ses->ses_count < 0);
 
-       cifs_dbg(FYI, "%s: ses_count=%d\n", __func__, ses->ses_count);
-       cifs_dbg(FYI,
-                "%s: ses ipc: %s\n", __func__, ses->tcon_ipc ? ses->tcon_ipc->tree_name : "NONE");
+       spin_lock(&ses->chan_lock);
+       cifs_chan_clear_need_reconnect(ses, server);
+       spin_unlock(&ses->chan_lock);
 
-       spin_lock(&cifs_tcp_ses_lock);
-       if (--ses->ses_count > 0) {
-               spin_unlock(&cifs_tcp_ses_lock);
-               return;
-       }
-       spin_lock(&ses->ses_lock);
-       if (ses->ses_status == SES_GOOD)
-               ses->ses_status = SES_EXITING;
+       do_logoff = ses->ses_status == SES_GOOD && server->ops->logoff;
+       ses->ses_status = SES_EXITING;
+       tcon = ses->tcon_ipc;
+       ses->tcon_ipc = NULL;
        spin_unlock(&ses->ses_lock);
        spin_unlock(&cifs_tcp_ses_lock);
 
-       /* ses_count can never go negative */
-       WARN_ON(ses->ses_count < 0);
-
-       spin_lock(&ses->ses_lock);
-       if (ses->ses_status == SES_EXITING && server->ops->logoff) {
-               spin_unlock(&ses->ses_lock);
-               cifs_free_ipc(ses);
+       /*
+        * On session close, the IPC is closed and the server must release all
+        * tcons of the session.  No need to send a tree disconnect here.
+        *
+        * Besides, it will make the server not close durable and resilient
+        * files on session close, as specified in MS-SMB2 3.3.5.6 Receiving an
+        * SMB2 LOGOFF Request.
+        */
+       tconInfoFree(tcon);
+       if (do_logoff) {
                xid = get_xid();
                rc = server->ops->logoff(xid, ses);
                if (rc)
                        cifs_server_dbg(VFS, "%s: Session Logoff failure rc=%d\n",
                                __func__, rc);
                _free_xid(xid);
-       } else {
-               spin_unlock(&ses->ses_lock);
-               cifs_free_ipc(ses);
        }
 
        spin_lock(&cifs_tcp_ses_lock);
@@ -2373,9 +2363,9 @@ cifs_get_smb_ses(struct TCP_Server_Info *server, struct smb3_fs_context *ctx)
         * need to lock before changing something in the session.
         */
        spin_lock(&cifs_tcp_ses_lock);
+       if (ctx->dfs_root_ses)
+               cifs_smb_ses_inc_refcount(ctx->dfs_root_ses);
        ses->dfs_root_ses = ctx->dfs_root_ses;
-       if (ses->dfs_root_ses)
-               ses->dfs_root_ses->ses_count++;
        list_add(&ses->smb_ses_list, &server->smb_ses_list);
        spin_unlock(&cifs_tcp_ses_lock);
 
@@ -3326,6 +3316,9 @@ void cifs_mount_put_conns(struct cifs_mount_ctx *mnt_ctx)
                cifs_put_smb_ses(mnt_ctx->ses);
        else if (mnt_ctx->server)
                cifs_put_tcp_session(mnt_ctx->server, 0);
+       mnt_ctx->ses = NULL;
+       mnt_ctx->tcon = NULL;
+       mnt_ctx->server = NULL;
        mnt_ctx->cifs_sb->mnt_cifs_flags &= ~CIFS_MOUNT_POSIX_PATHS;
        free_xid(mnt_ctx->xid);
 }
@@ -3604,8 +3597,6 @@ int cifs_mount(struct cifs_sb_info *cifs_sb, struct smb3_fs_context *ctx)
        bool isdfs;
        int rc;
 
-       INIT_LIST_HEAD(&mnt_ctx.dfs_ses_list);
-
        rc = dfs_mount_share(&mnt_ctx, &isdfs);
        if (rc)
                goto error;
@@ -3636,7 +3627,6 @@ out:
        return rc;
 
 error:
-       dfs_put_root_smb_sessions(&mnt_ctx.dfs_ses_list);
        cifs_mount_put_conns(&mnt_ctx);
        return rc;
 }
@@ -3651,6 +3641,18 @@ int cifs_mount(struct cifs_sb_info *cifs_sb, struct smb3_fs_context *ctx)
                goto error;
 
        rc = cifs_mount_get_tcon(&mnt_ctx);
+       if (!rc) {
+               /*
+                * Prevent superblock from being created with any missing
+                * connections.
+                */
+               if (WARN_ON(!mnt_ctx.server))
+                       rc = -EHOSTDOWN;
+               else if (WARN_ON(!mnt_ctx.ses))
+                       rc = -EACCES;
+               else if (WARN_ON(!mnt_ctx.tcon))
+                       rc = -ENOENT;
+       }
        if (rc)
                goto error;
 
@@ -3988,13 +3990,14 @@ cifs_set_vol_auth(struct smb3_fs_context *ctx, struct cifs_ses *ses)
 }
 
 static struct cifs_tcon *
-cifs_construct_tcon(struct cifs_sb_info *cifs_sb, kuid_t fsuid)
+__cifs_construct_tcon(struct cifs_sb_info *cifs_sb, kuid_t fsuid)
 {
        int rc;
        struct cifs_tcon *master_tcon = cifs_sb_master_tcon(cifs_sb);
        struct cifs_ses *ses;
        struct cifs_tcon *tcon = NULL;
        struct smb3_fs_context *ctx;
+       char *origin_fullpath = NULL;
 
        ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
        if (ctx == NULL)
@@ -4018,6 +4021,7 @@ cifs_construct_tcon(struct cifs_sb_info *cifs_sb, kuid_t fsuid)
        ctx->sign = master_tcon->ses->sign;
        ctx->seal = master_tcon->seal;
        ctx->witness = master_tcon->use_witness;
+       ctx->dfs_root_ses = master_tcon->ses->dfs_root_ses;
 
        rc = cifs_set_vol_auth(ctx, master_tcon->ses);
        if (rc) {
@@ -4037,12 +4041,39 @@ cifs_construct_tcon(struct cifs_sb_info *cifs_sb, kuid_t fsuid)
                goto out;
        }
 
+#ifdef CONFIG_CIFS_DFS_UPCALL
+       spin_lock(&master_tcon->tc_lock);
+       if (master_tcon->origin_fullpath) {
+               spin_unlock(&master_tcon->tc_lock);
+               origin_fullpath = dfs_get_path(cifs_sb, cifs_sb->ctx->source);
+               if (IS_ERR(origin_fullpath)) {
+                       tcon = ERR_CAST(origin_fullpath);
+                       origin_fullpath = NULL;
+                       cifs_put_smb_ses(ses);
+                       goto out;
+               }
+       } else {
+               spin_unlock(&master_tcon->tc_lock);
+       }
+#endif
+
        tcon = cifs_get_tcon(ses, ctx);
        if (IS_ERR(tcon)) {
                cifs_put_smb_ses(ses);
                goto out;
        }
 
+#ifdef CONFIG_CIFS_DFS_UPCALL
+       if (origin_fullpath) {
+               spin_lock(&tcon->tc_lock);
+               tcon->origin_fullpath = origin_fullpath;
+               spin_unlock(&tcon->tc_lock);
+               origin_fullpath = NULL;
+               queue_delayed_work(dfscache_wq, &tcon->dfs_cache_work,
+                                  dfs_cache_get_ttl() * HZ);
+       }
+#endif
+
 #ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY
        if (cap_unix(ses))
                reset_cifs_unix_caps(0, tcon, NULL, ctx);
@@ -4051,11 +4082,23 @@ cifs_construct_tcon(struct cifs_sb_info *cifs_sb, kuid_t fsuid)
 out:
        kfree(ctx->username);
        kfree_sensitive(ctx->password);
+       kfree(origin_fullpath);
        kfree(ctx);
 
        return tcon;
 }
 
+static struct cifs_tcon *
+cifs_construct_tcon(struct cifs_sb_info *cifs_sb, kuid_t fsuid)
+{
+       struct cifs_tcon *ret;
+
+       cifs_mount_lock();
+       ret = __cifs_construct_tcon(cifs_sb, fsuid);
+       cifs_mount_unlock();
+       return ret;
+}
+
 struct cifs_tcon *
 cifs_sb_master_tcon(struct cifs_sb_info *cifs_sb)
 {
index 449c59830039bc04897e5031dba2dbc9c6649bad..3ec965547e3d4d5979da80f41681b291c29c9256 100644 (file)
@@ -66,33 +66,20 @@ static int get_session(struct cifs_mount_ctx *mnt_ctx, const char *full_path)
 }
 
 /*
- * Track individual DFS referral servers used by new DFS mount.
- *
- * On success, their lifetime will be shared by final tcon (dfs_ses_list).
- * Otherwise, they will be put by dfs_put_root_smb_sessions() in cifs_mount().
+ * Get an active reference of @ses so that next call to cifs_put_tcon() won't
+ * release it as any new DFS referrals must go through its IPC tcon.
  */
-static int add_root_smb_session(struct cifs_mount_ctx *mnt_ctx)
+static void add_root_smb_session(struct cifs_mount_ctx *mnt_ctx)
 {
        struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
-       struct dfs_root_ses *root_ses;
        struct cifs_ses *ses = mnt_ctx->ses;
 
        if (ses) {
-               root_ses = kmalloc(sizeof(*root_ses), GFP_KERNEL);
-               if (!root_ses)
-                       return -ENOMEM;
-
-               INIT_LIST_HEAD(&root_ses->list);
-
                spin_lock(&cifs_tcp_ses_lock);
                cifs_smb_ses_inc_refcount(ses);
                spin_unlock(&cifs_tcp_ses_lock);
-               root_ses->ses = ses;
-               list_add_tail(&root_ses->list, &mnt_ctx->dfs_ses_list);
        }
-       /* Select new DFS referral server so that new referrals go through it */
        ctx->dfs_root_ses = ses;
-       return 0;
 }
 
 static inline int parse_dfs_target(struct smb3_fs_context *ctx,
@@ -185,11 +172,8 @@ again:
                                        continue;
                        }
 
-                       if (is_refsrv) {
-                               rc = add_root_smb_session(mnt_ctx);
-                               if (rc)
-                                       goto out;
-                       }
+                       if (is_refsrv)
+                               add_root_smb_session(mnt_ctx);
 
                        rc = ref_walk_advance(rw);
                        if (!rc) {
@@ -232,6 +216,7 @@ static int __dfs_mount_share(struct cifs_mount_ctx *mnt_ctx)
        struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
        struct cifs_tcon *tcon;
        char *origin_fullpath;
+       bool new_tcon = true;
        int rc;
 
        origin_fullpath = dfs_get_path(cifs_sb, ctx->source);
@@ -239,6 +224,18 @@ static int __dfs_mount_share(struct cifs_mount_ctx *mnt_ctx)
                return PTR_ERR(origin_fullpath);
 
        rc = dfs_referral_walk(mnt_ctx);
+       if (!rc) {
+               /*
+                * Prevent superblock from being created with any missing
+                * connections.
+                */
+               if (WARN_ON(!mnt_ctx->server))
+                       rc = -EHOSTDOWN;
+               else if (WARN_ON(!mnt_ctx->ses))
+                       rc = -EACCES;
+               else if (WARN_ON(!mnt_ctx->tcon))
+                       rc = -ENOENT;
+       }
        if (rc)
                goto out;
 
@@ -247,15 +244,14 @@ static int __dfs_mount_share(struct cifs_mount_ctx *mnt_ctx)
        if (!tcon->origin_fullpath) {
                tcon->origin_fullpath = origin_fullpath;
                origin_fullpath = NULL;
+       } else {
+               new_tcon = false;
        }
        spin_unlock(&tcon->tc_lock);
 
-       if (list_empty(&tcon->dfs_ses_list)) {
-               list_replace_init(&mnt_ctx->dfs_ses_list, &tcon->dfs_ses_list);
+       if (new_tcon) {
                queue_delayed_work(dfscache_wq, &tcon->dfs_cache_work,
                                   dfs_cache_get_ttl() * HZ);
-       } else {
-               dfs_put_root_smb_sessions(&mnt_ctx->dfs_ses_list);
        }
 
 out:
@@ -298,7 +294,6 @@ int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs)
        if (rc)
                return rc;
 
-       ctx->dfs_root_ses = mnt_ctx->ses;
        /*
         * If called with 'nodfs' mount option, then skip DFS resolving.  Otherwise unconditionally
         * try to get an DFS referral (even cached) to determine whether it is an DFS mount.
@@ -324,7 +319,9 @@ int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs)
 
        *isdfs = true;
        add_root_smb_session(mnt_ctx);
-       return __dfs_mount_share(mnt_ctx);
+       rc = __dfs_mount_share(mnt_ctx);
+       dfs_put_root_smb_sessions(mnt_ctx);
+       return rc;
 }
 
 /* Update dfs referral path of superblock */
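The new tail of `dfs_mount_share()` drops the extra root-session references on every return path, so the caller no longer needs a cleanup call in its error path. A small sketch of that shape, with hypothetical stand-ins for `add_root_smb_session()` and `dfs_put_root_smb_sessions()`:

```c
#include <assert.h>
#include <errno.h>

struct mount_ctx {
        int root_refs;          /* extra references held for the DFS walk */
};

static void take_root_ref(struct mount_ctx *c)  { c->root_refs++; }
static void put_root_refs(struct mount_ctx *c)  { c->root_refs--; }

/* Hypothetical mount body; @fail forces the error path. */
static int do_mount(struct mount_ctx *c, int fail)
{
        (void)c;
        return fail ? -EHOSTDOWN : 0;
}

/*
 * Mirrors the patched dfs_mount_share(): take the extra reference,
 * run the mount, then put the reference regardless of the outcome.
 */
static int mount_share(struct mount_ctx *c, int fail)
{
        int rc;

        take_root_ref(c);
        rc = do_mount(c, fail);
        put_root_refs(c);
        return rc;
}
```

Balancing the put against the take in the same function is what lets the diff delete `dfs_put_root_smb_sessions()` from `cifs_mount()`'s error label.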
index 875ab7ae57fcdf4493237084d41c6e3617128623..e5c4dcf837503aa2851f9b01680b6f5b7eb8874d 100644 (file)
@@ -7,7 +7,9 @@
 #define _CIFS_DFS_H
 
 #include "cifsglob.h"
+#include "cifsproto.h"
 #include "fs_context.h"
+#include "dfs_cache.h"
 #include "cifs_unicode.h"
 #include <linux/namei.h>
 
@@ -114,11 +116,6 @@ static inline void ref_walk_set_tgt_hint(struct dfs_ref_walk *rw)
                                       ref_walk_tit(rw));
 }
 
-struct dfs_root_ses {
-       struct list_head list;
-       struct cifs_ses *ses;
-};
-
 int dfs_parse_target_referral(const char *full_path, const struct dfs_info3_param *ref,
                              struct smb3_fs_context *ctx);
 int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs);
@@ -133,20 +130,32 @@ static inline int dfs_get_referral(struct cifs_mount_ctx *mnt_ctx, const char *p
 {
        struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
        struct cifs_sb_info *cifs_sb = mnt_ctx->cifs_sb;
+       struct cifs_ses *rses = ctx->dfs_root_ses ?: mnt_ctx->ses;
 
-       return dfs_cache_find(mnt_ctx->xid, ctx->dfs_root_ses, cifs_sb->local_nls,
+       return dfs_cache_find(mnt_ctx->xid, rses, cifs_sb->local_nls,
                              cifs_remap(cifs_sb), path, ref, tl);
 }
 
-static inline void dfs_put_root_smb_sessions(struct list_head *head)
+/*
+ * cifs_get_smb_ses() already guarantees an active reference of
+ * @ses->dfs_root_ses when a new session is created, so we need to put extra
+ * references of all DFS root sessions that were used across the mount process
+ * in dfs_mount_share().
+ */
+static inline void dfs_put_root_smb_sessions(struct cifs_mount_ctx *mnt_ctx)
 {
-       struct dfs_root_ses *root, *tmp;
+       const struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
+       struct cifs_ses *ses = ctx->dfs_root_ses;
+       struct cifs_ses *cur;
+
+       if (!ses)
+               return;
 
-       list_for_each_entry_safe(root, tmp, head, list) {
-               list_del_init(&root->list);
-               cifs_put_smb_ses(root->ses);
-               kfree(root);
+       for (cur = ses; cur; cur = cur->dfs_root_ses) {
+               if (cur->dfs_root_ses)
+                       cifs_put_smb_ses(cur->dfs_root_ses);
        }
+       cifs_put_smb_ses(ses);
 }
 
 #endif /* _CIFS_DFS_H */
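The rewritten `dfs_put_root_smb_sessions()` replaces the old `dfs_root_ses` list with a walk up the session's parent chain: each parent is put by its child, and the head is put explicitly, so every session in the chain gives up exactly one reference. A self-contained sketch with a simplified refcount (types here are illustrative, not the real `struct cifs_ses`):

```c
#include <assert.h>
#include <stddef.h>

struct ses {
        int refcount;
        struct ses *dfs_root_ses;       /* parent DFS root session, or NULL */
};

static void put_ses(struct ses *s)
{
        s->refcount--;
}

/* Same shape as the new dfs_put_root_smb_sessions(). */
static void put_root_sessions(struct ses *ses)
{
        struct ses *cur;

        if (!ses)
                return;

        for (cur = ses; cur; cur = cur->dfs_root_ses) {
                if (cur->dfs_root_ses)
                        put_ses(cur->dfs_root_ses);
        }
        put_ses(ses);
}
```

Note the parent is put *before* the walk advances to it; that is safe here because these are extra references on top of the one `cifs_get_smb_ses()` already guarantees, so no session is freed mid-walk.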
index 508d831fabe37899fb016b05431307e0c3cd8a80..11c8efecf7aa128d30527ac1e44a3124f03ec5fb 100644 (file)
@@ -1172,8 +1172,8 @@ static bool is_ses_good(struct cifs_ses *ses)
        return ret;
 }
 
-/* Refresh dfs referral of tcon and mark it for reconnect if needed */
-static int __refresh_tcon(const char *path, struct cifs_ses *ses, bool force_refresh)
+/* Refresh dfs referral of @ses and mark it for reconnect if needed */
+static void __refresh_ses_referral(struct cifs_ses *ses, bool force_refresh)
 {
        struct TCP_Server_Info *server = ses->server;
        DFS_CACHE_TGT_LIST(old_tl);
@@ -1181,10 +1181,21 @@ static int __refresh_tcon(const char *path, struct cifs_ses *ses, bool force_ref
        bool needs_refresh = false;
        struct cache_entry *ce;
        unsigned int xid;
+       char *path = NULL;
        int rc = 0;
 
        xid = get_xid();
 
+       mutex_lock(&server->refpath_lock);
+       if (server->leaf_fullpath) {
+               path = kstrdup(server->leaf_fullpath + 1, GFP_ATOMIC);
+               if (!path)
+                       rc = -ENOMEM;
+       }
+       mutex_unlock(&server->refpath_lock);
+       if (!path)
+               goto out;
+
        down_read(&htable_rw_lock);
        ce = lookup_cache_entry(path);
        needs_refresh = force_refresh || IS_ERR(ce) || cache_entry_expired(ce);
@@ -1218,19 +1229,17 @@ out:
        free_xid(xid);
        dfs_cache_free_tgts(&old_tl);
        dfs_cache_free_tgts(&new_tl);
-       return rc;
+       kfree(path);
 }
 
-static int refresh_tcon(struct cifs_tcon *tcon, bool force_refresh)
+static inline void refresh_ses_referral(struct cifs_ses *ses)
 {
-       struct TCP_Server_Info *server = tcon->ses->server;
-       struct cifs_ses *ses = tcon->ses;
+       __refresh_ses_referral(ses, false);
+}
 
-       mutex_lock(&server->refpath_lock);
-       if (server->leaf_fullpath)
-               __refresh_tcon(server->leaf_fullpath + 1, ses, force_refresh);
-       mutex_unlock(&server->refpath_lock);
-       return 0;
+static inline void force_refresh_ses_referral(struct cifs_ses *ses)
+{
+       __refresh_ses_referral(ses, true);
 }
 
 /**
@@ -1271,34 +1280,20 @@ int dfs_cache_remount_fs(struct cifs_sb_info *cifs_sb)
         */
        cifs_sb->mnt_cifs_flags |= CIFS_MOUNT_USE_PREFIX_PATH;
 
-       return refresh_tcon(tcon, true);
+       force_refresh_ses_referral(tcon->ses);
+       return 0;
 }
 
 /* Refresh all DFS referrals related to DFS tcon */
 void dfs_cache_refresh(struct work_struct *work)
 {
-       struct TCP_Server_Info *server;
-       struct dfs_root_ses *rses;
        struct cifs_tcon *tcon;
        struct cifs_ses *ses;
 
        tcon = container_of(work, struct cifs_tcon, dfs_cache_work.work);
-       ses = tcon->ses;
-       server = ses->server;
 
-       mutex_lock(&server->refpath_lock);
-       if (server->leaf_fullpath)
-               __refresh_tcon(server->leaf_fullpath + 1, ses, false);
-       mutex_unlock(&server->refpath_lock);
-
-       list_for_each_entry(rses, &tcon->dfs_ses_list, list) {
-               ses = rses->ses;
-               server = ses->server;
-               mutex_lock(&server->refpath_lock);
-               if (server->leaf_fullpath)
-                       __refresh_tcon(server->leaf_fullpath + 1, ses, false);
-               mutex_unlock(&server->refpath_lock);
-       }
+       for (ses = tcon->ses; ses; ses = ses->dfs_root_ses)
+               refresh_ses_referral(ses);
 
        queue_delayed_work(dfscache_wq, &tcon->dfs_cache_work,
                           atomic_read(&dfs_cache_ttl) * HZ);
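With the `dfs_root_ses` list gone, `dfs_cache_refresh()` above also becomes a walk of the same parent chain, refreshing each session's referral in turn. A sketch of just that loop (field and function names are stand-ins for `refresh_ses_referral()` and the real session struct):

```c
#include <assert.h>
#include <stddef.h>

struct ses {
        struct ses *dfs_root_ses;       /* parent DFS root session, or NULL */
        int refreshed;                  /* counts refresh calls, for demo */
};

/* Stand-in for refresh_ses_referral(ses). */
static void refresh(struct ses *s)
{
        s->refreshed++;
}

/* The new dfs_cache_refresh() loop: walk from the tcon's session up. */
static void refresh_chain(struct ses *ses)
{
        for (; ses; ses = ses->dfs_root_ses)
                refresh(ses);
}
```

Because each child holds a reference on its parent (see `dfs_put_root_smb_sessions()` in dfs.h), the chain cannot be torn down underneath this walk.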
index d11dc3aa458ba2d545c5a80786230b54cb62166b..864b194dbaa0a0bffc3cecac0a3c9fd4ccc9b7a4 100644 (file)
@@ -189,6 +189,7 @@ static int cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned
        int disposition;
        struct TCP_Server_Info *server = tcon->ses->server;
        struct cifs_open_parms oparms;
+       int rdwr_for_fscache = 0;
 
        *oplock = 0;
        if (tcon->ses->server->oplocks)
@@ -200,6 +201,10 @@ static int cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned
                return PTR_ERR(full_path);
        }
 
+       /* If we're caching, we need to be able to fill in around partial writes. */
+       if (cifs_fscache_enabled(inode) && (oflags & O_ACCMODE) == O_WRONLY)
+               rdwr_for_fscache = 1;
+
 #ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY
        if (tcon->unix_ext && cap_unix(tcon->ses) && !tcon->broken_posix_open &&
            (CIFS_UNIX_POSIX_PATH_OPS_CAP &
@@ -276,6 +281,8 @@ static int cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned
                desired_access |= GENERIC_READ; /* is this too little? */
        if (OPEN_FMODE(oflags) & FMODE_WRITE)
                desired_access |= GENERIC_WRITE;
+       if (rdwr_for_fscache == 1)
+               desired_access |= GENERIC_READ;
 
        disposition = FILE_OVERWRITE_IF;
        if ((oflags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))
@@ -304,6 +311,7 @@ static int cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned
        if (!tcon->unix_ext && (mode & S_IWUGO) == 0)
                create_options |= CREATE_OPTION_READONLY;
 
+retry_open:
        oparms = (struct cifs_open_parms) {
                .tcon = tcon,
                .cifs_sb = cifs_sb,
@@ -317,8 +325,15 @@ static int cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned
        rc = server->ops->open(xid, &oparms, oplock, buf);
        if (rc) {
                cifs_dbg(FYI, "cifs_create returned 0x%x\n", rc);
+               if (rc == -EACCES && rdwr_for_fscache == 1) {
+                       desired_access &= ~GENERIC_READ;
+                       rdwr_for_fscache = 2;
+                       goto retry_open;
+               }
                goto out;
        }
+       if (rdwr_for_fscache == 2)
+               cifs_invalidate_cache(inode, FSCACHE_INVAL_DIO_WRITE);
 
 #ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY
        /*
index 16aadce492b2ec67b973c8726e7cefd4857197d5..9be37d0fe724e90af1981c109512a64f01d86a0a 100644 (file)
@@ -206,12 +206,12 @@ cifs_mark_open_files_invalid(struct cifs_tcon *tcon)
         */
 }
 
-static inline int cifs_convert_flags(unsigned int flags)
+static inline int cifs_convert_flags(unsigned int flags, int rdwr_for_fscache)
 {
        if ((flags & O_ACCMODE) == O_RDONLY)
                return GENERIC_READ;
        else if ((flags & O_ACCMODE) == O_WRONLY)
-               return GENERIC_WRITE;
+               return rdwr_for_fscache == 1 ? (GENERIC_READ | GENERIC_WRITE) : GENERIC_WRITE;
        else if ((flags & O_ACCMODE) == O_RDWR) {
                /* GENERIC_ALL is too much permission to request
                   can cause unnecessary access denied on create */
@@ -348,11 +348,16 @@ static int cifs_nt_open(const char *full_path, struct inode *inode, struct cifs_
        int create_options = CREATE_NOT_DIR;
        struct TCP_Server_Info *server = tcon->ses->server;
        struct cifs_open_parms oparms;
+       int rdwr_for_fscache = 0;
 
        if (!server->ops->open)
                return -ENOSYS;
 
-       desired_access = cifs_convert_flags(f_flags);
+       /* If we're caching, we need to be able to fill in around partial writes. */
+       if (cifs_fscache_enabled(inode) && (f_flags & O_ACCMODE) == O_WRONLY)
+               rdwr_for_fscache = 1;
+
+       desired_access = cifs_convert_flags(f_flags, rdwr_for_fscache);
 
 /*********************************************************************
  *  open flag mapping table:
@@ -389,6 +394,7 @@ static int cifs_nt_open(const char *full_path, struct inode *inode, struct cifs_
        if (f_flags & O_DIRECT)
                create_options |= CREATE_NO_BUFFER;
 
+retry_open:
        oparms = (struct cifs_open_parms) {
                .tcon = tcon,
                .cifs_sb = cifs_sb,
@@ -400,8 +406,16 @@ static int cifs_nt_open(const char *full_path, struct inode *inode, struct cifs_
        };
 
        rc = server->ops->open(xid, &oparms, oplock, buf);
-       if (rc)
+       if (rc) {
+               if (rc == -EACCES && rdwr_for_fscache == 1) {
+                       desired_access = cifs_convert_flags(f_flags, 0);
+                       rdwr_for_fscache = 2;
+                       goto retry_open;
+               }
                return rc;
+       }
+       if (rdwr_for_fscache == 2)
+               cifs_invalidate_cache(inode, FSCACHE_INVAL_DIO_WRITE);
 
        /* TODO: Add support for calling posix query info but with passing in fid */
        if (tcon->unix_ext)
@@ -445,6 +459,7 @@ cifs_down_write(struct rw_semaphore *sem)
 }
 
 static void cifsFileInfo_put_work(struct work_struct *work);
+void serverclose_work(struct work_struct *work);
 
 struct cifsFileInfo *cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
                                       struct tcon_link *tlink, __u32 oplock,
@@ -491,6 +506,7 @@ struct cifsFileInfo *cifs_new_fileinfo(struct cifs_fid *fid, struct file *file,
        cfile->tlink = cifs_get_tlink(tlink);
        INIT_WORK(&cfile->oplock_break, cifs_oplock_break);
        INIT_WORK(&cfile->put, cifsFileInfo_put_work);
+       INIT_WORK(&cfile->serverclose, serverclose_work);
        INIT_DELAYED_WORK(&cfile->deferred, smb2_deferred_work_close);
        mutex_init(&cfile->fh_mutex);
        spin_lock_init(&cfile->file_info_lock);
@@ -582,6 +598,40 @@ static void cifsFileInfo_put_work(struct work_struct *work)
        cifsFileInfo_put_final(cifs_file);
 }
 
+void serverclose_work(struct work_struct *work)
+{
+       struct cifsFileInfo *cifs_file = container_of(work,
+                       struct cifsFileInfo, serverclose);
+
+       struct cifs_tcon *tcon = tlink_tcon(cifs_file->tlink);
+
+       struct TCP_Server_Info *server = tcon->ses->server;
+       int rc = 0;
+       int retries = 0;
+       int MAX_RETRIES = 4;
+
+       do {
+               if (server->ops->close_getattr)
+                       rc = server->ops->close_getattr(0, tcon, cifs_file);
+               else if (server->ops->close)
+                       rc = server->ops->close(0, tcon, &cifs_file->fid);
+
+               if (rc == -EBUSY || rc == -EAGAIN) {
+                       retries++;
+                       msleep(250);
+               }
+       } while ((rc == -EBUSY || rc == -EAGAIN) &&
+                (retries < MAX_RETRIES));
+
+       if (retries == MAX_RETRIES)
+               pr_warn("Serverclose failed %d times, giving up\n", MAX_RETRIES);
+
+       if (cifs_file->offload)
+               queue_work(fileinfo_put_wq, &cifs_file->put);
+       else
+               cifsFileInfo_put_final(cifs_file);
+}
+
 /**
  * cifsFileInfo_put - release a reference of file priv data
  *
@@ -622,10 +672,13 @@ void _cifsFileInfo_put(struct cifsFileInfo *cifs_file,
        struct cifs_fid fid = {};
        struct cifs_pending_open open;
        bool oplock_break_cancelled;
+       bool serverclose_offloaded = false;
 
        spin_lock(&tcon->open_file_lock);
        spin_lock(&cifsi->open_file_lock);
        spin_lock(&cifs_file->file_info_lock);
+
+       cifs_file->offload = offload;
        if (--cifs_file->count > 0) {
                spin_unlock(&cifs_file->file_info_lock);
                spin_unlock(&cifsi->open_file_lock);
@@ -667,13 +720,20 @@ void _cifsFileInfo_put(struct cifsFileInfo *cifs_file,
        if (!tcon->need_reconnect && !cifs_file->invalidHandle) {
                struct TCP_Server_Info *server = tcon->ses->server;
                unsigned int xid;
+               int rc = 0;
 
                xid = get_xid();
                if (server->ops->close_getattr)
-                       server->ops->close_getattr(xid, tcon, cifs_file);
+                       rc = server->ops->close_getattr(xid, tcon, cifs_file);
                else if (server->ops->close)
-                       server->ops->close(xid, tcon, &cifs_file->fid);
+                       rc = server->ops->close(xid, tcon, &cifs_file->fid);
                _free_xid(xid);
+
+               if (rc == -EBUSY || rc == -EAGAIN) {
+                       // Server close failed, hence offloading it as an async op
+                       queue_work(serverclose_wq, &cifs_file->serverclose);
+                       serverclose_offloaded = true;
+               }
        }
 
        if (oplock_break_cancelled)
@@ -681,10 +741,15 @@ void _cifsFileInfo_put(struct cifsFileInfo *cifs_file,
 
        cifs_del_pending_open(&open);
 
-       if (offload)
-               queue_work(fileinfo_put_wq, &cifs_file->put);
-       else
-               cifsFileInfo_put_final(cifs_file);
+       // if serverclose has been offloaded to wq (on failure), it will
+       // handle offloading put as well. If serverclose not offloaded,
+       // we need to handle offloading put here.
+       if (!serverclose_offloaded) {
+               if (offload)
+                       queue_work(fileinfo_put_wq, &cifs_file->put);
+               else
+                       cifsFileInfo_put_final(cifs_file);
+       }
 }
 
 int cifs_open(struct inode *inode, struct file *file)
@@ -834,11 +899,11 @@ int cifs_open(struct inode *inode, struct file *file)
 use_cache:
        fscache_use_cookie(cifs_inode_cookie(file_inode(file)),
                           file->f_mode & FMODE_WRITE);
-       if (file->f_flags & O_DIRECT &&
-           (!((file->f_flags & O_ACCMODE) != O_RDONLY) ||
-            file->f_flags & O_APPEND))
-               cifs_invalidate_cache(file_inode(file),
-                                     FSCACHE_INVAL_DIO_WRITE);
+       if (!(file->f_flags & O_DIRECT))
+               goto out;
+       if ((file->f_flags & (O_ACCMODE | O_APPEND)) == O_RDONLY)
+               goto out;
+       cifs_invalidate_cache(file_inode(file), FSCACHE_INVAL_DIO_WRITE);
 
 out:
        free_dentry_path(page);
@@ -903,6 +968,7 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
        int disposition = FILE_OPEN;
        int create_options = CREATE_NOT_DIR;
        struct cifs_open_parms oparms;
+       int rdwr_for_fscache = 0;
 
        xid = get_xid();
        mutex_lock(&cfile->fh_mutex);
@@ -966,7 +1032,11 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
        }
 #endif /* CONFIG_CIFS_ALLOW_INSECURE_LEGACY */
 
-       desired_access = cifs_convert_flags(cfile->f_flags);
+       /* If we're caching, we need to be able to fill in around partial writes. */
+       if (cifs_fscache_enabled(inode) && (cfile->f_flags & O_ACCMODE) == O_WRONLY)
+               rdwr_for_fscache = 1;
+
+       desired_access = cifs_convert_flags(cfile->f_flags, rdwr_for_fscache);
 
        /* O_SYNC also has bit for O_DSYNC so following check picks up either */
        if (cfile->f_flags & O_SYNC)
@@ -978,6 +1048,7 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
        if (server->ops->get_lease_key)
                server->ops->get_lease_key(inode, &cfile->fid);
 
+retry_open:
        oparms = (struct cifs_open_parms) {
                .tcon = tcon,
                .cifs_sb = cifs_sb,
@@ -1003,6 +1074,11 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
                /* indicate that we need to relock the file */
                oparms.reconnect = true;
        }
+       if (rc == -EACCES && rdwr_for_fscache == 1) {
+               desired_access = cifs_convert_flags(cfile->f_flags, 0);
+               rdwr_for_fscache = 2;
+               goto retry_open;
+       }
 
        if (rc) {
                mutex_unlock(&cfile->fh_mutex);
@@ -1011,6 +1087,9 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
                goto reopen_error_exit;
        }
 
+       if (rdwr_for_fscache == 2)
+               cifs_invalidate_cache(inode, FSCACHE_INVAL_DIO_WRITE);
+
 #ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY
 reopen_success:
 #endif /* CONFIG_CIFS_ALLOW_INSECURE_LEGACY */
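The `rdwr_for_fscache` changes in this file (and in the create path above) all follow one state machine: a write-only open is first tried with `GENERIC_READ` added so the local cache can fill in around partial writes (state 1); if the server answers `-EACCES`, the open is retried without read access (state 2) and the cache is invalidated since it can no longer be kept coherent. A userspace sketch with a mock server (the mock and its flags are hypothetical):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define GENERIC_READ  0x1
#define GENERIC_WRITE 0x2

/* Mock server: refuses opens that request read when @read_denied. */
static int server_open(int access, bool read_denied)
{
        if (read_denied && (access & GENERIC_READ))
                return -EACCES;
        return 0;
}

/*
 * Mirrors the retry_open logic: state 0 = not caching, 1 = caching and
 * asking for extra read, 2 = retried without read; state 2 on success
 * means the cache must be invalidated.
 */
static int open_for_write(bool caching, bool read_denied, bool *invalidate)
{
        int rdwr_for_fscache = caching ? 1 : 0;
        int access = GENERIC_WRITE | (rdwr_for_fscache ? GENERIC_READ : 0);
        int rc;

retry_open:
        rc = server_open(access, read_denied);
        if (rc == -EACCES && rdwr_for_fscache == 1) {
                access &= ~GENERIC_READ;
                rdwr_for_fscache = 2;
                goto retry_open;
        }
        *invalidate = (rc == 0 && rdwr_for_fscache == 2);
        return rc;
}
```

The same three-state pattern appears in `cifs_do_create()`, `cifs_nt_open()`, and `cifs_reopen_file()`, which is why `cifs_convert_flags()` grew the extra parameter.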
index bdcbe6ff2739ab4539c128e7945258a8914a477a..b7bfe705b2c498b83a60131713246bb9d37abf98 100644 (file)
@@ -37,7 +37,7 @@
 #include "rfc1002pdu.h"
 #include "fs_context.h"
 
-static DEFINE_MUTEX(cifs_mount_mutex);
+DEFINE_MUTEX(cifs_mount_mutex);
 
 static const match_table_t cifs_smb_version_tokens = {
        { Smb_1, SMB1_VERSION_STRING },
@@ -783,9 +783,9 @@ static int smb3_get_tree(struct fs_context *fc)
 
        if (err)
                return err;
-       mutex_lock(&cifs_mount_mutex);
+       cifs_mount_lock();
        ret = smb3_get_tree_common(fc);
-       mutex_unlock(&cifs_mount_mutex);
+       cifs_mount_unlock();
        return ret;
 }
 
index 7863f2248c4df8f1e892c2b8589813cca1f1bdd4..8a35645e0b65b244741da59177a2bcb0acea0256 100644 (file)
@@ -304,4 +304,16 @@ extern void smb3_update_mnt_flags(struct cifs_sb_info *cifs_sb);
 #define MAX_CACHED_FIDS 16
 extern char *cifs_sanitize_prepath(char *prepath, gfp_t gfp);
 
+extern struct mutex cifs_mount_mutex;
+
+static inline void cifs_mount_lock(void)
+{
+       mutex_lock(&cifs_mount_mutex);
+}
+
+static inline void cifs_mount_unlock(void)
+{
+       mutex_unlock(&cifs_mount_mutex);
+}
+
 #endif
index a3d73720914f888cead5bec48c96c633f21370e8..1f2ea9f5cc9a8a5f900a5a5367cf093d54f2f80d 100644 (file)
@@ -109,6 +109,11 @@ static inline void cifs_readahead_to_fscache(struct inode *inode,
                __cifs_readahead_to_fscache(inode, pos, len);
 }
 
+static inline bool cifs_fscache_enabled(struct inode *inode)
+{
+       return fscache_cookie_enabled(cifs_inode_cookie(inode));
+}
+
 #else /* CONFIG_CIFS_FSCACHE */
 static inline
 void cifs_fscache_fill_coherency(struct inode *inode,
@@ -124,6 +129,7 @@ static inline void cifs_fscache_release_inode_cookie(struct inode *inode) {}
 static inline void cifs_fscache_unuse_inode_cookie(struct inode *inode, bool update) {}
 static inline struct fscache_cookie *cifs_inode_cookie(struct inode *inode) { return NULL; }
 static inline void cifs_invalidate_cache(struct inode *inode, unsigned int flags) {}
+static inline bool cifs_fscache_enabled(struct inode *inode) { return false; }
 
 static inline int cifs_fscache_query_occupancy(struct inode *inode,
                                               pgoff_t first, unsigned int nr_pages,
index c012dfdba80d457e27dc51996c55309840d3f892..855ac5a62edfaa50215cfed46e361dcb79f0c8fc 100644 (file)
@@ -247,7 +247,9 @@ static int cifs_dump_full_key(struct cifs_tcon *tcon, struct smb3_full_key_debug
                spin_lock(&cifs_tcp_ses_lock);
                list_for_each_entry(server_it, &cifs_tcp_ses_list, tcp_ses_list) {
                        list_for_each_entry(ses_it, &server_it->smb_ses_list, smb_ses_list) {
-                               if (ses_it->Suid == out.session_id) {
+                               spin_lock(&ses_it->ses_lock);
+                               if (ses_it->ses_status != SES_EXITING &&
+                                   ses_it->Suid == out.session_id) {
                                        ses = ses_it;
                                        /*
                                         * since we are using the session outside the crit
@@ -255,9 +257,11 @@ static int cifs_dump_full_key(struct cifs_tcon *tcon, struct smb3_full_key_debug
                                         * so increment its refcount
                                         */
                                        cifs_smb_ses_inc_refcount(ses);
+                                       spin_unlock(&ses_it->ses_lock);
                                        found = true;
                                        goto search_end;
                                }
+                               spin_unlock(&ses_it->ses_lock);
                        }
                }
 search_end:
index c3771fc81328ff12d7d075cadab77c97b16b7e05..33ac4f8f5050c416cd2004ee4516756edd3b11d8 100644 (file)
@@ -138,9 +138,6 @@ tcon_info_alloc(bool dir_leases_enabled)
        atomic_set(&ret_buf->num_local_opens, 0);
        atomic_set(&ret_buf->num_remote_opens, 0);
        ret_buf->stats_from_time = ktime_get_real_seconds();
-#ifdef CONFIG_CIFS_DFS_UPCALL
-       INIT_LIST_HEAD(&ret_buf->dfs_ses_list);
-#endif
 
        return ret_buf;
 }
@@ -156,9 +153,6 @@ tconInfoFree(struct cifs_tcon *tcon)
        atomic_dec(&tconInfoAllocCount);
        kfree(tcon->nativeFileSystem);
        kfree_sensitive(tcon->password);
-#ifdef CONFIG_CIFS_DFS_UPCALL
-       dfs_put_root_smb_sessions(&tcon->dfs_ses_list);
-#endif
        kfree(tcon->origin_fullpath);
        kfree(tcon);
 }
@@ -487,6 +481,8 @@ is_valid_oplock_break(char *buffer, struct TCP_Server_Info *srv)
        /* look up tcon based on tid & uid */
        spin_lock(&cifs_tcp_ses_lock);
        list_for_each_entry(ses, &pserver->smb_ses_list, smb_ses_list) {
+               if (cifs_ses_exiting(ses))
+                       continue;
                list_for_each_entry(tcon, &ses->tcon_list, tcon_list) {
                        if (tcon->tid != buf->Tid)
                                continue;
index a9eaba8083b0d6b2745ebedb1d3d4705c0f4809d..212ec6f66ec65b15f50275803bbc38b0153c9645 100644 (file)
@@ -753,11 +753,11 @@ cifs_set_fid(struct cifsFileInfo *cfile, struct cifs_fid *fid, __u32 oplock)
        cinode->can_cache_brlcks = CIFS_CACHE_WRITE(cinode);
 }
 
-static void
+static int
 cifs_close_file(const unsigned int xid, struct cifs_tcon *tcon,
                struct cifs_fid *fid)
 {
-       CIFSSMBClose(xid, tcon, fid->netfid);
+       return CIFSSMBClose(xid, tcon, fid->netfid);
 }
 
 static int
index 82b84a4941dd2f05e8d516b54b6a209dbd7985d1..cc72be5a93a933b09c45256c2a7d7615478f54c0 100644 (file)
@@ -622,6 +622,8 @@ smb2_is_valid_lease_break(char *buffer, struct TCP_Server_Info *server)
        /* look up tcon based on tid & uid */
        spin_lock(&cifs_tcp_ses_lock);
        list_for_each_entry(ses, &pserver->smb_ses_list, smb_ses_list) {
+               if (cifs_ses_exiting(ses))
+                       continue;
                list_for_each_entry(tcon, &ses->tcon_list, tcon_list) {
                        spin_lock(&tcon->open_file_lock);
                        cifs_stats_inc(
@@ -697,6 +699,8 @@ smb2_is_valid_oplock_break(char *buffer, struct TCP_Server_Info *server)
        /* look up tcon based on tid & uid */
        spin_lock(&cifs_tcp_ses_lock);
        list_for_each_entry(ses, &pserver->smb_ses_list, smb_ses_list) {
+               if (cifs_ses_exiting(ses))
+                       continue;
                list_for_each_entry(tcon, &ses->tcon_list, tcon_list) {
 
                        spin_lock(&tcon->open_file_lock);
index 2ed456948f34ca0cd88dbe626429694456a681e2..b156eefa75d7cb4b13d1bf402234f08271a558ad 100644 (file)
@@ -1412,14 +1412,14 @@ smb2_set_fid(struct cifsFileInfo *cfile, struct cifs_fid *fid, __u32 oplock)
        memcpy(cfile->fid.create_guid, fid->create_guid, 16);
 }
 
-static void
+static int
 smb2_close_file(const unsigned int xid, struct cifs_tcon *tcon,
                struct cifs_fid *fid)
 {
-       SMB2_close(xid, tcon, fid->persistent_fid, fid->volatile_fid);
+       return SMB2_close(xid, tcon, fid->persistent_fid, fid->volatile_fid);
 }
 
-static void
+static int
 smb2_close_getattr(const unsigned int xid, struct cifs_tcon *tcon,
                   struct cifsFileInfo *cfile)
 {
@@ -1430,7 +1430,7 @@ smb2_close_getattr(const unsigned int xid, struct cifs_tcon *tcon,
        rc = __SMB2_close(xid, tcon, cfile->fid.persistent_fid,
                   cfile->fid.volatile_fid, &file_inf);
        if (rc)
-               return;
+               return rc;
 
        inode = d_inode(cfile->dentry);
 
@@ -1459,6 +1459,7 @@ smb2_close_getattr(const unsigned int xid, struct cifs_tcon *tcon,
 
        /* End of file and Attributes should not have to be updated on close */
        spin_unlock(&inode->i_lock);
+       return rc;
 }
 
 static int
@@ -2480,6 +2481,8 @@ smb2_is_network_name_deleted(char *buf, struct TCP_Server_Info *server)
 
        spin_lock(&cifs_tcp_ses_lock);
        list_for_each_entry(ses, &pserver->smb_ses_list, smb_ses_list) {
+               if (cifs_ses_exiting(ses))
+                       continue;
                list_for_each_entry(tcon, &ses->tcon_list, tcon_list) {
                        if (tcon->tid == le32_to_cpu(shdr->Id.SyncId.TreeId)) {
                                spin_lock(&tcon->tc_lock);
@@ -3913,7 +3916,7 @@ smb21_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock,
                strcat(message, "W");
        }
        if (!new_oplock)
-               strncpy(message, "None", sizeof(message));
+               strscpy(message, "None");
 
        cinode->oplock = new_oplock;
        cifs_dbg(FYI, "%s Lease granted on inode %p\n", message,
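The `strncpy` → `strscpy` conversions above swap in a copier that always NUL-terminates and reports truncation. A minimal userspace sketch of those semantics (the `strscpy_demo` name and `-1` error value are stand-ins for the kernel's `strscpy()`/`-E2BIG`):

```c
#include <assert.h>
#include <string.h>

/* Userspace sketch of the kernel's strscpy(): copies at most size-1
 * bytes and always NUL-terminates, unlike strncpy(), which neither
 * guarantees termination nor reports truncation. */
static long strscpy_demo(char *dst, const char *src, size_t size)
{
	size_t len;

	if (size == 0)
		return -1;		/* kernel returns -E2BIG */
	len = strnlen(src, size - 1);
	memcpy(dst, src, len);
	dst[len] = '\0';
	/* truncated if src had more bytes than we could copy */
	return strnlen(src, size) > len ? -1 : (long)len;
}
```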
index 3ea688558e6c9b96f454e29197023ec74f274e2a..c0c4933af5fc386911922b4e23c7869bdea8098b 100644 (file)
@@ -3628,9 +3628,9 @@ replay_again:
                        memcpy(&pbuf->network_open_info,
                               &rsp->network_open_info,
                               sizeof(pbuf->network_open_info));
+               atomic_dec(&tcon->num_remote_opens);
        }
 
-       atomic_dec(&tcon->num_remote_opens);
 close_exit:
        SMB2_close_free(&rqst);
        free_rsp_buf(resp_buftype, rsp);
index 5a3ca62d2f07f72584392975221cbc9b12276fe8..1d6e54f7879e6a5e8034a90d30471fecc02d2d1b 100644 (file)
@@ -659,7 +659,7 @@ smb2_sign_rqst(struct smb_rqst *rqst, struct TCP_Server_Info *server)
        }
        spin_unlock(&server->srv_lock);
        if (!is_binding && !server->session_estab) {
-               strncpy(shdr->Signature, "BSRSPYL", 8);
+               strscpy(shdr->Signature, "BSRSPYL");
                return 0;
        }
 
index 8ca8a45c4c621c5dabf56664bec27e5df0db8356..686b321c5a8bb5f0a1189023a3311e85aaf84c9e 100644 (file)
@@ -167,7 +167,8 @@ struct ksmbd_share_config_response {
        __u16   force_uid;
        __u16   force_gid;
        __s8    share_name[KSMBD_REQ_MAX_SHARE_NAME];
-       __u32   reserved[112];          /* Reserved room */
+       __u32   reserved[111];          /* Reserved room */
+       __u32   payload_sz;
        __u32   veto_list_sz;
        __s8    ____payload[];
 };
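The hunk above shrinks `reserved[]` from 112 to 111 entries to make room for the new `payload_sz` field, keeping the IPC struct size (and thus the userspace ABI) unchanged. A sanity sketch with hypothetical mirrored tail layouts:

```c
#include <stdint.h>

/* Hypothetical mirrors of the old and new trailing layout of
 * ksmbd_share_config_response: payload_sz is carved out of the
 * reserved area, so the overall size stays identical. */
struct tail_old {
	uint32_t reserved[112];
	uint32_t veto_list_sz;
};

struct tail_new {
	uint32_t reserved[111];
	uint32_t payload_sz;
	uint32_t veto_list_sz;
};
```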
index 328a412259dc1b935eb2ce3eee2977d39155157b..a2f0a2edceb8ae49852dde1b40cf7594df6084d9 100644 (file)
@@ -158,7 +158,12 @@ static struct ksmbd_share_config *share_config_request(struct unicode_map *um,
        share->name = kstrdup(name, GFP_KERNEL);
 
        if (!test_share_config_flag(share, KSMBD_SHARE_FLAG_PIPE)) {
-               share->path = kstrdup(ksmbd_share_config_path(resp),
+               int path_len = PATH_MAX;
+
+               if (resp->payload_sz)
+                       path_len = resp->payload_sz - resp->veto_list_sz;
+
+               share->path = kstrndup(ksmbd_share_config_path(resp), path_len,
                                      GFP_KERNEL);
                if (share->path)
                        share->path_sz = strlen(share->path);
index a45f7dca482e01897720cd69d90291cbaf3ff388..606aa3c5189a28de2e49e602e2c08362c91f46b7 100644 (file)
@@ -228,6 +228,11 @@ void init_smb3_0_server(struct ksmbd_conn *conn)
            conn->cli_cap & SMB2_GLOBAL_CAP_ENCRYPTION)
                conn->vals->capabilities |= SMB2_GLOBAL_CAP_ENCRYPTION;
 
+       if (server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_ENCRYPTION ||
+           (!(server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_ENCRYPTION_OFF) &&
+            conn->cli_cap & SMB2_GLOBAL_CAP_ENCRYPTION))
+               conn->vals->capabilities |= SMB2_GLOBAL_CAP_ENCRYPTION;
+
        if (server_conf.flags & KSMBD_GLOBAL_FLAG_SMB3_MULTICHANNEL)
                conn->vals->capabilities |= SMB2_GLOBAL_CAP_MULTI_CHANNEL;
 }
@@ -278,11 +283,6 @@ int init_smb3_11_server(struct ksmbd_conn *conn)
                conn->vals->capabilities |= SMB2_GLOBAL_CAP_LEASING |
                        SMB2_GLOBAL_CAP_DIRECTORY_LEASING;
 
-       if (server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_ENCRYPTION ||
-           (!(server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_ENCRYPTION_OFF) &&
-            conn->cli_cap & SMB2_GLOBAL_CAP_ENCRYPTION))
-               conn->vals->capabilities |= SMB2_GLOBAL_CAP_ENCRYPTION;
-
        if (server_conf.flags & KSMBD_GLOBAL_FLAG_SMB3_MULTICHANNEL)
                conn->vals->capabilities |= SMB2_GLOBAL_CAP_MULTI_CHANNEL;
 
index d478fa0c57abdbc7b8478624edf5133e202c85bf..5723bbf372d7cc93c9e1b2dbdd5082c2824f85f8 100644 (file)
@@ -5857,8 +5857,9 @@ static int smb2_rename(struct ksmbd_work *work,
        if (!file_info->ReplaceIfExists)
                flags = RENAME_NOREPLACE;
 
-       smb_break_all_levII_oplock(work, fp, 0);
        rc = ksmbd_vfs_rename(work, &fp->filp->f_path, new_name, flags);
+       if (!rc)
+               smb_break_all_levII_oplock(work, fp, 0);
 out:
        kfree(new_name);
        return rc;
index f29bb03f0dc47bfcb0fe3fc5c5acff16d5a314a8..8752ac82c557bf92985bd4d87a3e37f4cd4a60dc 100644 (file)
@@ -65,6 +65,7 @@ struct ipc_msg_table_entry {
        struct hlist_node       ipc_table_hlist;
 
        void                    *response;
+       unsigned int            msg_sz;
 };
 
 static struct delayed_work ipc_timer_work;
@@ -275,6 +276,7 @@ static int handle_response(int type, void *payload, size_t sz)
                }
 
                memcpy(entry->response, payload, sz);
+               entry->msg_sz = sz;
                wake_up_interruptible(&entry->wait);
                ret = 0;
                break;
@@ -453,6 +455,34 @@ out:
        return ret;
 }
 
+static int ipc_validate_msg(struct ipc_msg_table_entry *entry)
+{
+       unsigned int msg_sz = entry->msg_sz;
+
+       if (entry->type == KSMBD_EVENT_RPC_REQUEST) {
+               struct ksmbd_rpc_command *resp = entry->response;
+
+               msg_sz = sizeof(struct ksmbd_rpc_command) + resp->payload_sz;
+       } else if (entry->type == KSMBD_EVENT_SPNEGO_AUTHEN_REQUEST) {
+               struct ksmbd_spnego_authen_response *resp = entry->response;
+
+               msg_sz = sizeof(struct ksmbd_spnego_authen_response) +
+                               resp->session_key_len + resp->spnego_blob_len;
+       } else if (entry->type == KSMBD_EVENT_SHARE_CONFIG_REQUEST) {
+               struct ksmbd_share_config_response *resp = entry->response;
+
+               if (resp->payload_sz) {
+                       if (resp->payload_sz < resp->veto_list_sz)
+                               return -EINVAL;
+
+                       msg_sz = sizeof(struct ksmbd_share_config_response) +
+                                       resp->payload_sz;
+               }
+       }
+
+       return entry->msg_sz != msg_sz ? -EINVAL : 0;
+}
+
 static void *ipc_msg_send_request(struct ksmbd_ipc_msg *msg, unsigned int handle)
 {
        struct ipc_msg_table_entry entry;
@@ -477,6 +507,13 @@ static void *ipc_msg_send_request(struct ksmbd_ipc_msg *msg, unsigned int handle
        ret = wait_event_interruptible_timeout(entry.wait,
                                               entry.response != NULL,
                                               IPC_WAIT_TIMEOUT);
+       if (entry.response) {
+               ret = ipc_validate_msg(&entry);
+               if (ret) {
+                       kvfree(entry.response);
+                       entry.response = NULL;
+               }
+       }
 out:
        down_write(&ipc_msg_table_lock);
        hash_del(&entry.ipc_table_hlist);
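The `ipc_validate_msg()` helper added above cross-checks the received message length against the sizes the payload fields claim, so a malformed daemon response is dropped instead of being parsed past its buffer. A userspace sketch of the share-config branch (struct layout and names are simplified stand-ins, with `-1` in place of `-EINVAL`):

```c
#include <stddef.h>

/* Hypothetical mirror of ksmbd_share_config_response's size fields. */
struct share_config_resp {
	unsigned int payload_sz;	/* trailing bytes: veto list + path */
	unsigned int veto_list_sz;
};

/* Returns 0 if the received size matches what the payload fields
 * claim, -1 otherwise (the kernel code returns -EINVAL). */
static int validate_share_config(const struct share_config_resp *resp,
				 size_t msg_sz)
{
	if (resp->payload_sz) {
		if (resp->payload_sz < resp->veto_list_sz)
			return -1;	/* veto list larger than payload */
		if (msg_sz != sizeof(*resp) + resp->payload_sz)
			return -1;	/* claimed size != bytes received */
	}
	return 0;
}
```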
index 71d9779c42b10aca8bd4e0b7b667fc62386e2305..69ce6c600968479bd6832a6705352eb2d88427c1 100644 (file)
@@ -1515,29 +1515,11 @@ static int fs_bdev_thaw(struct block_device *bdev)
        return error;
 }
 
-static void fs_bdev_super_get(void *data)
-{
-       struct super_block *sb = data;
-
-       spin_lock(&sb_lock);
-       sb->s_count++;
-       spin_unlock(&sb_lock);
-}
-
-static void fs_bdev_super_put(void *data)
-{
-       struct super_block *sb = data;
-
-       put_super(sb);
-}
-
 const struct blk_holder_ops fs_holder_ops = {
        .mark_dead              = fs_bdev_mark_dead,
        .sync                   = fs_bdev_sync,
        .freeze                 = fs_bdev_freeze,
        .thaw                   = fs_bdev_thaw,
-       .get_holder             = fs_bdev_super_get,
-       .put_holder             = fs_bdev_super_put,
 };
 EXPORT_SYMBOL_GPL(fs_holder_ops);
 
@@ -1562,7 +1544,7 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
         * writable from userspace even for a read-only block device.
         */
        if ((mode & BLK_OPEN_WRITE) && bdev_read_only(bdev)) {
-               fput(bdev_file);
+               bdev_fput(bdev_file);
                return -EACCES;
        }
 
@@ -1573,7 +1555,7 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
        if (atomic_read(&bdev->bd_fsfreeze_count) > 0) {
                if (fc)
                        warnf(fc, "%pg: Can't mount, blockdev is frozen", bdev);
-               fput(bdev_file);
+               bdev_fput(bdev_file);
                return -EBUSY;
        }
        spin_lock(&sb_lock);
@@ -1693,7 +1675,7 @@ void kill_block_super(struct super_block *sb)
        generic_shutdown_super(sb);
        if (bdev) {
                sync_blockdev(bdev);
-               fput(sb->s_bdev_file);
+               bdev_fput(sb->s_bdev_file);
        }
 }
 
index 1a18c381127e2183169eaa8280aa620d66340a71..f0fa02264edaaeef2d23d1101d1953d6454e8832 100644 (file)
@@ -2030,7 +2030,7 @@ xfs_free_buftarg(
        fs_put_dax(btp->bt_daxdev, btp->bt_mount);
        /* the main block device is closed by kill_block_super */
        if (btp->bt_bdev != btp->bt_mount->m_super->s_bdev)
-               fput(btp->bt_bdev_file);
+               bdev_fput(btp->bt_bdev_file);
        kfree(btp);
 }
 
index ea48774f6b76d398bf6f9731145483751a4bf506..d55b42b2480d6c53f3367e4453cc69c5b80c6870 100644 (file)
@@ -1301,8 +1301,19 @@ xfs_link(
         */
        if (unlikely((tdp->i_diflags & XFS_DIFLAG_PROJINHERIT) &&
                     tdp->i_projid != sip->i_projid)) {
-               error = -EXDEV;
-               goto error_return;
+               /*
+                * Project quota setup skips special files which can
+                * leave inodes in a PROJINHERIT directory without a
+                * project ID set. We need to allow links to be made
+                * to these "project-less" inodes because userspace
+                * expects them to succeed after project ID setup,
+                * but everything else should be rejected.
+                */
+               if (!special_file(VFS_I(sip)->i_mode) ||
+                   sip->i_projid != 0) {
+                       error = -EXDEV;
+                       goto error_return;
+               }
        }
 
        if (!resblks) {
index c21f10ab0f5dbef4051b6bef01eb64c77247e056..bce020374c5eba5255a98d71176d7198024603d2 100644 (file)
@@ -485,7 +485,7 @@ xfs_open_devices(
                mp->m_logdev_targp = mp->m_ddev_targp;
                /* Handle won't be used, drop it */
                if (logdev_file)
-                       fput(logdev_file);
+                       bdev_fput(logdev_file);
        }
 
        return 0;
@@ -497,10 +497,10 @@ xfs_open_devices(
        xfs_free_buftarg(mp->m_ddev_targp);
  out_close_rtdev:
         if (rtdev_file)
-               fput(rtdev_file);
+               bdev_fput(rtdev_file);
  out_close_logdev:
        if (logdev_file)
-               fput(logdev_file);
+               bdev_fput(logdev_file);
        return error;
 }
 
index c3e8f7cf96be9e1c10169d2e7afe31696082eb8f..172c918799995f8ce6ed285bc5a8ea3341ddb6ed 100644 (file)
@@ -1505,16 +1505,6 @@ struct blk_holder_ops {
         * Thaw the file system mounted on the block device.
         */
        int (*thaw)(struct block_device *bdev);
-
-       /*
-        * If needed, get a reference to the holder.
-        */
-       void (*get_holder)(void *holder);
-
-       /*
-        * Release the holder.
-        */
-       void (*put_holder)(void *holder);
 };
 
 /*
@@ -1585,6 +1575,7 @@ static inline int early_lookup_bdev(const char *pathname, dev_t *dev)
 
 int bdev_freeze(struct block_device *bdev);
 int bdev_thaw(struct block_device *bdev);
+void bdev_fput(struct file *bdev_file);
 
 struct io_comp_batch {
        struct request *req_list;
index ca73940e26df83ddd65301b39b77c350b0f4c3c2..e5ee2c694401e0972d4a644f01ca523f63f83475 100644 (file)
@@ -10,6 +10,7 @@
 #ifdef __KERNEL__
 #include <linux/kernel.h>
 #include <linux/types.h>
+bool __init cmdline_has_extra_options(void);
 #else /* !__KERNEL__ */
 /*
  * NOTE: This is only for tools/bootconfig, because tools/bootconfig will
index cb0d6cd1c12f24e1dd8681b5f9f0302675bec7d5..60693a1458946223f791aae517210cb67ba13050 100644 (file)
@@ -90,6 +90,14 @@ enum cc_attr {
         * Examples include TDX Guest.
         */
        CC_ATTR_HOTPLUG_DISABLED,
+
+       /**
+        * @CC_ATTR_HOST_SEV_SNP: AMD SNP enabled on the host.
+        *
+        * The host kernel is running with the necessary features
+        * enabled to run SEV-SNP guests.
+        */
+       CC_ATTR_HOST_SEV_SNP,
 };
 
 #ifdef CONFIG_ARCH_HAS_CC_PLATFORM
@@ -107,10 +115,14 @@ enum cc_attr {
  * * FALSE - Specified Confidential Computing attribute is not active
  */
 bool cc_platform_has(enum cc_attr attr);
+void cc_platform_set(enum cc_attr attr);
+void cc_platform_clear(enum cc_attr attr);
 
 #else  /* !CONFIG_ARCH_HAS_CC_PLATFORM */
 
 static inline bool cc_platform_has(enum cc_attr attr) { return false; }
+static inline void cc_platform_set(enum cc_attr attr) { }
+static inline void cc_platform_clear(enum cc_attr attr) { }
 
 #endif /* CONFIG_ARCH_HAS_CC_PLATFORM */
 
index c00cc6c0878a1e173701a6267ac062fa3d41b790..8c252e073bd8103c8b13f39d90a965370b16d37d 100644 (file)
@@ -268,7 +268,7 @@ static inline void *offset_to_ptr(const int *off)
  *   - When one operand is a null pointer constant (i.e. when x is an integer
  *     constant expression) and the other is an object pointer (i.e. our
  *     third operand), the conditional operator returns the type of the
- *     object pointer operand (i.e. "int *). Here, within the sizeof(), we
+ *     object pointer operand (i.e. "int *"). Here, within the sizeof(), we
  *     would then get:
  *       sizeof(*((int *)(...))  == sizeof(int)  == 4
  *   - When one operand is a void pointer (i.e. when x is not an integer
index 97c4b046c09d9464243c81f294724985dc4a292a..b9f5464f44ed81134cb90032f3533f015d51389a 100644 (file)
@@ -1247,6 +1247,7 @@ void device_link_del(struct device_link *link);
 void device_link_remove(void *consumer, struct device *supplier);
 void device_links_supplier_sync_state_pause(void);
 void device_links_supplier_sync_state_resume(void);
+void device_link_wait_removal(void);
 
 /* Create alias, so I can be autoloaded. */
 #define MODULE_ALIAS_CHARDEV(major,minor) \
index 770755df852f14b31be1cefa5e37693acd6a0421..70cd7258cd29f5fc1396762f239908322ec97933 100644 (file)
@@ -245,7 +245,6 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
         * max utilization to the allowed CPU capacity before calculating
         * effective performance.
         */
-       max_util = map_util_perf(max_util);
        max_util = min(max_util, allowed_cpu_cap);
 
        /*
index 00fc429b0af0fb9bbab2382a9e347fdbac383981..8dfd53b52744a4dfffb8ccb350364972658f00eb 100644 (file)
@@ -121,6 +121,8 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 #define FMODE_PWRITE           ((__force fmode_t)0x10)
 /* File is opened for execution with sys_execve / sys_uselib */
 #define FMODE_EXEC             ((__force fmode_t)0x20)
+/* File writes are restricted (block device specific) */
+#define FMODE_WRITE_RESTRICTED  ((__force fmode_t)0x40)
 /* 32bit hashes as llseek() offset (for directories) */
 #define FMODE_32BITHASH         ((__force fmode_t)0x200)
 /* 64bit hashes as llseek() offset (for directories) */
index 868c8fb1bbc1c2dabd708bc2c6485c2e42dee8fe..13becafe41df00f94dddb5e4f0417d3447c6456c 100644 (file)
@@ -2,6 +2,8 @@
 #ifndef __LINUX_GFP_TYPES_H
 #define __LINUX_GFP_TYPES_H
 
+#include <linux/bits.h>
+
 /* The typedef is in types.h but we want the documentation here */
 #if 0
 /**
index e248936250852dd09c7e71b958a0072c1ac86193..05df0e399d7c0b84236198f57e0c61e90412beaa 100644 (file)
@@ -294,7 +294,6 @@ struct io_ring_ctx {
 
                struct io_submit_state  submit_state;
 
-               struct io_buffer_list   *io_bl;
                struct xarray           io_bl_xa;
 
                struct io_hash_table    cancel_table_locked;
index 0436b919f1c7fc535b30400bf95affc7e4534186..7b0ee64225de9cefebaf63e0d759c082684e3ab5 100644 (file)
@@ -2207,11 +2207,6 @@ static inline int arch_make_folio_accessible(struct folio *folio)
  */
 #include <linux/vmstat.h>
 
-static __always_inline void *lowmem_page_address(const struct page *page)
-{
-       return page_to_virt(page);
-}
-
 #if defined(CONFIG_HIGHMEM) && !defined(WANT_PAGE_VIRTUAL)
 #define HASHED_PAGE_VIRTUAL
 #endif
@@ -2234,6 +2229,11 @@ void set_page_address(struct page *page, void *virtual);
 void page_address_init(void);
 #endif
 
+static __always_inline void *lowmem_page_address(const struct page *page)
+{
+       return page_to_virt(page);
+}
+
 #if !defined(HASHED_PAGE_VIRTUAL) && !defined(WANT_PAGE_VIRTUAL)
 #define page_address(page) lowmem_page_address(page)
 #define set_page_address(page, address)  do { } while(0)
index 5d868505a94e43fe4f6124915e90dc814c269278..6d92b68efbf6c3afe86b9f6c22a8759ce51e7a28 100644 (file)
@@ -80,7 +80,7 @@ DECLARE_PER_CPU(u32, kstack_offset);
        if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
                                &randomize_kstack_offset)) {            \
                u32 offset = raw_cpu_read(kstack_offset);               \
-               offset ^= (rand);                                       \
+               offset = ror32(offset, 5) ^ (rand);                     \
                raw_cpu_write(kstack_offset, offset);                   \
        }                                                               \
 } while (0)
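The hunk above changes the per-CPU stack-offset mixing from a plain XOR to rotate-then-XOR: with `offset ^= rand`, two identical random values cancel each other out, while rotating the accumulator first keeps earlier entropy diffusing. A small sketch (the `_demo` names are stand-ins for the kernel's `ror32()` and the macro body):

```c
#include <stdint.h>

/* Rotate a 32-bit value right, like the kernel's ror32(). */
static uint32_t ror32_demo(uint32_t word, unsigned int shift)
{
	return (word >> (shift & 31)) | (word << ((-shift) & 31));
}

/* New mixing: offset = ror32(offset, 5) ^ rand.
 * Old mixing was offset ^= rand, where repeated inputs cancel. */
static uint32_t mix_new(uint32_t offset, uint32_t rand)
{
	return ror32_demo(offset, 5) ^ rand;
}
```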
index 35f3a4a8ceb1e3276f3be9ebb68a1dd275f50de0..acf7e1a3f3def9fd4027aa659ba1d4dec8e944b0 100644 (file)
@@ -13,10 +13,10 @@ static inline bool folio_is_secretmem(struct folio *folio)
        /*
         * Using folio_mapping() is quite slow because of the actual call
         * instruction.
-        * We know that secretmem pages are not compound and LRU so we can
+        * We know that secretmem pages are not compound, so we can
         * save a couple of cycles here.
         */
-       if (folio_test_large(folio) || !folio_test_lru(folio))
+       if (folio_test_large(folio))
                return false;
 
        mapping = (struct address_space *)
index 3c6caa5abc7c4262b8d62f9ef46d3fa273e7fb1e..e9ec32fb97d4a729e06aaabc7cd5627bb384d72f 100644 (file)
@@ -44,10 +44,9 @@ typedef u32 depot_stack_handle_t;
 union handle_parts {
        depot_stack_handle_t handle;
        struct {
-               /* pool_index is offset by 1 */
-               u32 pool_index  : DEPOT_POOL_INDEX_BITS;
-               u32 offset      : DEPOT_OFFSET_BITS;
-               u32 extra       : STACK_DEPOT_EXTRA_BITS;
+               u32 pool_index_plus_1   : DEPOT_POOL_INDEX_BITS;
+               u32 offset              : DEPOT_OFFSET_BITS;
+               u32 extra               : STACK_DEPOT_EXTRA_BITS;
        };
 };
 
index c6540ceea14303151317627d3216d9f86a549341..0982d1d52b24d9cbb5da81ad2adaa9f4340e66e5 100644 (file)
@@ -22,7 +22,7 @@
  *
  * @read:              returns the current cycle value
  * @mask:              bitmask for two's complement
- *                     subtraction of non 64 bit counters,
+ *                     subtraction of non-64-bit counters,
  *                     see CYCLECOUNTER_MASK() helper macro
  * @mult:              cycle to nanosecond multiplier
  * @shift:             cycle to nanosecond divisor (power of two)
@@ -35,7 +35,7 @@ struct cyclecounter {
 };
 
 /**
- * struct timecounter - layer above a %struct cyclecounter which counts nanoseconds
+ * struct timecounter - layer above a &struct cyclecounter which counts nanoseconds
  *     Contains the state needed by timecounter_read() to detect
  *     cycle counter wrap around. Initialize with
  *     timecounter_init(). Also used to convert cycle counts into the
@@ -66,6 +66,8 @@ struct timecounter {
  * @cycles:    Cycles
  * @mask:      bit mask for maintaining the 'frac' field
  * @frac:      pointer to storage for the fractional nanoseconds.
+ *
+ * Returns: cycle counter cycles converted to nanoseconds
  */
 static inline u64 cyclecounter_cyc2ns(const struct cyclecounter *cc,
                                      u64 cycles, u64 mask, u64 *frac)
@@ -79,6 +81,7 @@ static inline u64 cyclecounter_cyc2ns(const struct cyclecounter *cc,
 
 /**
  * timecounter_adjtime - Shifts the time of the clock.
+ * @tc:                The &struct timecounter to adjust
  * @delta:     Desired change in nanoseconds.
  */
 static inline void timecounter_adjtime(struct timecounter *tc, s64 delta)
@@ -107,6 +110,8 @@ extern void timecounter_init(struct timecounter *tc,
  *
  * In other words, keeps track of time since the same epoch as
  * the function which generated the initial time stamp.
+ *
+ * Returns: nanoseconds since the initial time stamp
  */
 extern u64 timecounter_read(struct timecounter *tc);
 
@@ -123,6 +128,8 @@ extern u64 timecounter_read(struct timecounter *tc);
  *
  * This allows conversion of cycle counter values which were generated
  * in the past.
+ *
+ * Returns: cycle counter converted to nanoseconds since the initial time stamp
  */
 extern u64 timecounter_cyc2time(const struct timecounter *tc,
                                u64 cycle_tstamp);
index 7e50cbd97f86e3de4160373aaf1aff05179ed9e4..0ea7823b7f31f6c3cd653ddce0204a44e1f61a8f 100644 (file)
@@ -22,14 +22,14 @@ extern int do_sys_settimeofday64(const struct timespec64 *tv,
                                 const struct timezone *tz);
 
 /*
- * ktime_get() family: read the current time in a multitude of ways,
+ * ktime_get() family - read the current time in a multitude of ways.
  *
  * The default time reference is CLOCK_MONOTONIC, starting at
  * boot time but not counting the time spent in suspend.
  * For other references, use the functions with "real", "clocktai",
  * "boottime" and "raw" suffixes.
  *
- * To get the time in a different format, use the ones wit
+ * To get the time in a different format, use the ones with
  * "ns", "ts64" and "seconds" suffix.
  *
  * See Documentation/core-api/timekeeping.rst for more details.
@@ -74,6 +74,8 @@ extern u32 ktime_get_resolution_ns(void);
 
 /**
  * ktime_get_real - get the real (wall-) time in ktime_t format
+ *
+ * Returns: real (wall) time in ktime_t format
  */
 static inline ktime_t ktime_get_real(void)
 {
@@ -86,10 +88,12 @@ static inline ktime_t ktime_get_coarse_real(void)
 }
 
 /**
- * ktime_get_boottime - Returns monotonic time since boot in ktime_t format
+ * ktime_get_boottime - Get monotonic time since boot in ktime_t format
  *
  * This is similar to CLOCK_MONTONIC/ktime_get, but also includes the
  * time spent in suspend.
+ *
+ * Returns: monotonic time since boot in ktime_t format
  */
 static inline ktime_t ktime_get_boottime(void)
 {
@@ -102,7 +106,9 @@ static inline ktime_t ktime_get_coarse_boottime(void)
 }
 
 /**
- * ktime_get_clocktai - Returns the TAI time of day in ktime_t format
+ * ktime_get_clocktai - Get the TAI time of day in ktime_t format
+ *
+ * Returns: the TAI time of day in ktime_t format
  */
 static inline ktime_t ktime_get_clocktai(void)
 {
@@ -144,32 +150,60 @@ static inline u64 ktime_get_coarse_clocktai_ns(void)
 
 /**
  * ktime_mono_to_real - Convert monotonic time to clock realtime
+ * @mono: monotonic time to convert
+ *
+ * Returns: time converted to realtime clock
  */
 static inline ktime_t ktime_mono_to_real(ktime_t mono)
 {
        return ktime_mono_to_any(mono, TK_OFFS_REAL);
 }
 
+/**
+ * ktime_get_ns - Get the current time in nanoseconds
+ *
+ * Returns: current time converted to nanoseconds
+ */
 static inline u64 ktime_get_ns(void)
 {
        return ktime_to_ns(ktime_get());
 }
 
+/**
+ * ktime_get_real_ns - Get the current real/wall time in nanoseconds
+ *
+ * Returns: current real time converted to nanoseconds
+ */
 static inline u64 ktime_get_real_ns(void)
 {
        return ktime_to_ns(ktime_get_real());
 }
 
+/**
+ * ktime_get_boottime_ns - Get the monotonic time since boot in nanoseconds
+ *
+ * Returns: current boottime converted to nanoseconds
+ */
 static inline u64 ktime_get_boottime_ns(void)
 {
        return ktime_to_ns(ktime_get_boottime());
 }
 
+/**
+ * ktime_get_clocktai_ns - Get the current TAI time of day in nanoseconds
+ *
+ * Returns: current TAI time converted to nanoseconds
+ */
 static inline u64 ktime_get_clocktai_ns(void)
 {
        return ktime_to_ns(ktime_get_clocktai());
 }
 
+/**
+ * ktime_get_raw_ns - Get the raw monotonic time in nanoseconds
+ *
+ * Returns: current raw monotonic time converted to nanoseconds
+ */
 static inline u64 ktime_get_raw_ns(void)
 {
        return ktime_to_ns(ktime_get_raw());
@@ -224,8 +258,8 @@ extern bool timekeeping_rtc_skipresume(void);
 
 extern void timekeeping_inject_sleeptime64(const struct timespec64 *delta);
 
-/*
- * struct ktime_timestanps - Simultaneous mono/boot/real timestamps
+/**
+ * struct ktime_timestamps - Simultaneous mono/boot/real timestamps
  * @mono:      Monotonic timestamp
  * @boot:      Boottime timestamp
  * @real:      Realtime timestamp
@@ -242,7 +276,8 @@ struct ktime_timestamps {
  * @cycles:    Clocksource counter value to produce the system times
  * @real:      Realtime system time
  * @raw:       Monotonic raw system time
- * @clock_was_set_seq: The sequence number of clock was set events
+ * @cs_id:     Clocksource ID
+ * @clock_was_set_seq: The sequence number of clock-was-set events
  * @cs_was_changed_seq:        The sequence number of clocksource change events
  */
 struct system_time_snapshot {
index 14a633ba61d6433400a488ff21d73172f3616281..e67ecd1cbc97d6b92994c15b688cdde5ec3c998f 100644 (file)
@@ -22,7 +22,7 @@
 #define __TIMER_LOCKDEP_MAP_INITIALIZER(_kn)
 #endif
 
-/**
+/*
  * @TIMER_DEFERRABLE: A deferrable timer will work normally when the
  * system is busy, but will not cause a CPU to come out of idle just
  * to service it; instead, the timer will be serviced when the CPU
@@ -140,7 +140,7 @@ static inline void destroy_timer_on_stack(struct timer_list *timer) { }
  * or not. Callers must ensure serialization wrt. other operations done
  * to this timer, eg. interrupt contexts, or other CPUs on SMP.
  *
- * return value: 1 if the timer is pending, 0 if not.
+ * Returns: 1 if the timer is pending, 0 if not.
  */
 static inline int timer_pending(const struct timer_list * timer)
 {
@@ -175,6 +175,10 @@ extern int timer_shutdown(struct timer_list *timer);
  * See timer_delete_sync() for detailed explanation.
  *
  * Do not use in new code. Use timer_delete_sync() instead.
+ *
+ * Returns:
+ * * %0        - The timer was not pending
+ * * %1        - The timer was pending and deactivated
  */
 static inline int del_timer_sync(struct timer_list *timer)
 {
@@ -188,6 +192,10 @@ static inline int del_timer_sync(struct timer_list *timer)
  * See timer_delete() for detailed explanation.
  *
  * Do not use in new code. Use timer_delete() instead.
+ *
+ * Returns:
+ * * %0        - The timer was not pending
+ * * %1        - The timer was pending and deactivated
  */
 static inline int del_timer(struct timer_list *timer)
 {
index 9fe95a22abeb7e2fb12d7384a974e5689db61211..eaec5d6caa29d293902f86666c91cceebd6f388c 100644 (file)
@@ -585,6 +585,15 @@ static inline struct sk_buff *bt_skb_sendmmsg(struct sock *sk,
        return skb;
 }
 
+static inline int bt_copy_from_sockptr(void *dst, size_t dst_size,
+                                      sockptr_t src, size_t src_size)
+{
+       if (dst_size > src_size)
+               return -EINVAL;
+
+       return copy_from_sockptr(dst, src, dst_size);
+}
+
 int bt_to_errno(u16 code);
 __u8 bt_status(int err);
 
index a8bebac1e4b28dd3d0195894dc96e18f74184992..957295364a5e3c1aa3bc8a9108a7da02e7b6ee44 100644 (file)
@@ -56,6 +56,9 @@ struct hdac_ext_stream {
        u32 pphcldpl;
        u32 pphcldpu;
 
+       u32 pplcllpl;
+       u32 pplcllpu;
+
        bool decoupled:1;
        bool link_locked:1;
        bool link_prepared;
index 4038dd421150a3f03182be5d1f0a503812029c15..1dc59005d241fbe683fb98d62d86e8c04f3708eb 100644 (file)
@@ -15,7 +15,7 @@
 #ifndef __TAS2781_TLV_H__
 #define __TAS2781_TLV_H__
 
-static const DECLARE_TLV_DB_SCALE(dvc_tlv, -10000, 100, 0);
+static const __maybe_unused DECLARE_TLV_DB_SCALE(dvc_tlv, -10000, 100, 0);
 static const DECLARE_TLV_DB_SCALE(amp_vol_tlv, 1100, 50, 0);
 
 #endif
index 5d5c0b8efff2d44be1c29c66a9c914ed03d5b95f..c71ddb6d46914be3d91758608e92248aea98fcba 100644 (file)
 #include <vdso/time32.h>
 #include <vdso/time64.h>
 
-#ifdef CONFIG_ARM64
-#include <asm/page-def.h>
-#else
-#include <asm/page.h>
-#endif
-
 #ifdef CONFIG_ARCH_HAS_VDSO_DATA
 #include <asm/vdso/data.h>
 #else
@@ -132,7 +126,7 @@ extern struct vdso_data _timens_data[CS_BASES] __attribute__((visibility("hidden
  */
 union vdso_data_store {
        struct vdso_data        data[CS_BASES];
-       u8                      page[PAGE_SIZE];
+       u8                      page[1U << CONFIG_PAGE_SHIFT];
 };
 
 /*
index 3127e0bf7bbd15487ea6b50bd277801f6c01ae7e..a298a3854a8018923dbbd26adbc271550ef9a198 100644 (file)
@@ -367,7 +367,7 @@ static int __init do_name(void)
        if (S_ISREG(mode)) {
                int ml = maybe_link();
                if (ml >= 0) {
-                       int openflags = O_WRONLY|O_CREAT;
+                       int openflags = O_WRONLY|O_CREAT|O_LARGEFILE;
                        if (ml != 1)
                                openflags |= O_TRUNC;
                        wfile = filp_open(collected, openflags, mode);
index 2ca52474d0c3032e44ae410add7b2430218f9c41..881f6230ee59e9675eb98b62adf761ee74823a16 100644 (file)
@@ -487,6 +487,11 @@ static int __init warn_bootconfig(char *str)
 
 early_param("bootconfig", warn_bootconfig);
 
+bool __init cmdline_has_extra_options(void)
+{
+       return extra_command_line || extra_init_args;
+}
+
 /* Change NUL term back to "=", to make "param" the whole string. */
 static void __init repair_env_string(char *param, char *val)
 {
index 5d4b448fdc503822cb97ca5ca93d545ade642db5..4521c2b66b98db3c3affc55c7aeb4a69b8eec0a7 100644 (file)
@@ -147,6 +147,7 @@ static bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
 static void io_queue_sqe(struct io_kiocb *req);
 
 struct kmem_cache *req_cachep;
+static struct workqueue_struct *iou_wq __ro_after_init;
 
 static int __read_mostly sysctl_io_uring_disabled;
 static int __read_mostly sysctl_io_uring_group = -1;
@@ -350,7 +351,6 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 err:
        kfree(ctx->cancel_table.hbs);
        kfree(ctx->cancel_table_locked.hbs);
-       kfree(ctx->io_bl);
        xa_destroy(&ctx->io_bl_xa);
        kfree(ctx);
        return NULL;
@@ -1982,10 +1982,15 @@ fail:
                err = -EBADFD;
                if (!io_file_can_poll(req))
                        goto fail;
-               err = -ECANCELED;
-               if (io_arm_poll_handler(req, issue_flags) != IO_APOLL_OK)
-                       goto fail;
-               return;
+               if (req->file->f_flags & O_NONBLOCK ||
+                   req->file->f_mode & FMODE_NOWAIT) {
+                       err = -ECANCELED;
+                       if (io_arm_poll_handler(req, issue_flags) != IO_APOLL_OK)
+                               goto fail;
+                       return;
+               } else {
+                       req->flags &= ~REQ_F_APOLL_MULTISHOT;
+               }
        }
 
        if (req->flags & REQ_F_FORCE_ASYNC) {
@@ -2926,7 +2931,6 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
        io_napi_free(ctx);
        kfree(ctx->cancel_table.hbs);
        kfree(ctx->cancel_table_locked.hbs);
-       kfree(ctx->io_bl);
        xa_destroy(&ctx->io_bl_xa);
        kfree(ctx);
 }
@@ -3161,7 +3165,7 @@ static __cold void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
         * noise and overhead, there's no discernable change in runtime
         * over using system_wq.
         */
-       queue_work(system_unbound_wq, &ctx->exit_work);
+       queue_work(iou_wq, &ctx->exit_work);
 }
 
 static int io_uring_release(struct inode *inode, struct file *file)
@@ -3443,14 +3447,15 @@ static void *io_uring_validate_mmap_request(struct file *file,
                ptr = ctx->sq_sqes;
                break;
        case IORING_OFF_PBUF_RING: {
+               struct io_buffer_list *bl;
                unsigned int bgid;
 
                bgid = (offset & ~IORING_OFF_MMAP_MASK) >> IORING_OFF_PBUF_SHIFT;
-               rcu_read_lock();
-               ptr = io_pbuf_get_address(ctx, bgid);
-               rcu_read_unlock();
-               if (!ptr)
-                       return ERR_PTR(-EINVAL);
+               bl = io_pbuf_get_bl(ctx, bgid);
+               if (IS_ERR(bl))
+                       return bl;
+               ptr = bl->buf_ring;
+               io_put_bl(ctx, bl);
                break;
                }
        default:
@@ -4185,6 +4190,8 @@ static int __init io_uring_init(void)
        io_buf_cachep = KMEM_CACHE(io_buffer,
                                          SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
 
+       iou_wq = alloc_workqueue("iou_exit", WQ_UNBOUND, 64);
+
 #ifdef CONFIG_SYSCTL
        register_sysctl_init("kernel", kernel_io_uring_disabled_table);
 #endif
index 693c26da4ee1a36b4b0acee0b0cc7a7d0cfde3e6..3aa16e27f5099a426abe1f991c991ffdd9ab379f 100644 (file)
@@ -17,8 +17,6 @@
 
 #define IO_BUFFER_LIST_BUF_PER_PAGE (PAGE_SIZE / sizeof(struct io_uring_buf))
 
-#define BGID_ARRAY     64
-
 /* BIDs are addressed by a 16-bit field in a CQE */
 #define MAX_BIDS_PER_BGID (1 << 16)
 
@@ -40,13 +38,9 @@ struct io_buf_free {
        int                             inuse;
 };
 
-static struct io_buffer_list *__io_buffer_get_list(struct io_ring_ctx *ctx,
-                                                  struct io_buffer_list *bl,
-                                                  unsigned int bgid)
+static inline struct io_buffer_list *__io_buffer_get_list(struct io_ring_ctx *ctx,
+                                                         unsigned int bgid)
 {
-       if (bl && bgid < BGID_ARRAY)
-               return &bl[bgid];
-
        return xa_load(&ctx->io_bl_xa, bgid);
 }
 
@@ -55,7 +49,7 @@ static inline struct io_buffer_list *io_buffer_get_list(struct io_ring_ctx *ctx,
 {
        lockdep_assert_held(&ctx->uring_lock);
 
-       return __io_buffer_get_list(ctx, ctx->io_bl, bgid);
+       return __io_buffer_get_list(ctx, bgid);
 }
 
 static int io_buffer_add_list(struct io_ring_ctx *ctx,
@@ -67,11 +61,7 @@ static int io_buffer_add_list(struct io_ring_ctx *ctx,
         * always under the ->uring_lock, but the RCU lookup from mmap does.
         */
        bl->bgid = bgid;
-       smp_store_release(&bl->is_ready, 1);
-
-       if (bgid < BGID_ARRAY)
-               return 0;
-
+       atomic_set(&bl->refs, 1);
        return xa_err(xa_store(&ctx->io_bl_xa, bgid, bl, GFP_KERNEL));
 }
 
@@ -208,24 +198,6 @@ void __user *io_buffer_select(struct io_kiocb *req, size_t *len,
        return ret;
 }
 
-static __cold int io_init_bl_list(struct io_ring_ctx *ctx)
-{
-       struct io_buffer_list *bl;
-       int i;
-
-       bl = kcalloc(BGID_ARRAY, sizeof(struct io_buffer_list), GFP_KERNEL);
-       if (!bl)
-               return -ENOMEM;
-
-       for (i = 0; i < BGID_ARRAY; i++) {
-               INIT_LIST_HEAD(&bl[i].buf_list);
-               bl[i].bgid = i;
-       }
-
-       smp_store_release(&ctx->io_bl, bl);
-       return 0;
-}
-
 /*
  * Mark the given mapped range as free for reuse
  */
@@ -294,24 +266,24 @@ static int __io_remove_buffers(struct io_ring_ctx *ctx,
        return i;
 }
 
+void io_put_bl(struct io_ring_ctx *ctx, struct io_buffer_list *bl)
+{
+       if (atomic_dec_and_test(&bl->refs)) {
+               __io_remove_buffers(ctx, bl, -1U);
+               kfree_rcu(bl, rcu);
+       }
+}
+
 void io_destroy_buffers(struct io_ring_ctx *ctx)
 {
        struct io_buffer_list *bl;
        struct list_head *item, *tmp;
        struct io_buffer *buf;
        unsigned long index;
-       int i;
-
-       for (i = 0; i < BGID_ARRAY; i++) {
-               if (!ctx->io_bl)
-                       break;
-               __io_remove_buffers(ctx, &ctx->io_bl[i], -1U);
-       }
 
        xa_for_each(&ctx->io_bl_xa, index, bl) {
                xa_erase(&ctx->io_bl_xa, bl->bgid);
-               __io_remove_buffers(ctx, bl, -1U);
-               kfree_rcu(bl, rcu);
+               io_put_bl(ctx, bl);
        }
 
        /*
@@ -489,12 +461,6 @@ int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
 
        io_ring_submit_lock(ctx, issue_flags);
 
-       if (unlikely(p->bgid < BGID_ARRAY && !ctx->io_bl)) {
-               ret = io_init_bl_list(ctx);
-               if (ret)
-                       goto err;
-       }
-
        bl = io_buffer_get_list(ctx, p->bgid);
        if (unlikely(!bl)) {
                bl = kzalloc(sizeof(*bl), GFP_KERNEL_ACCOUNT);
@@ -507,14 +473,9 @@ int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
                if (ret) {
                        /*
                         * Doesn't need rcu free as it was never visible, but
-                        * let's keep it consistent throughout. Also can't
-                        * be a lower indexed array group, as adding one
-                        * where lookup failed cannot happen.
+                        * let's keep it consistent throughout.
                         */
-                       if (p->bgid >= BGID_ARRAY)
-                               kfree_rcu(bl, rcu);
-                       else
-                               WARN_ON_ONCE(1);
+                       kfree_rcu(bl, rcu);
                        goto err;
                }
        }
@@ -679,12 +640,6 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
        if (reg.ring_entries >= 65536)
                return -EINVAL;
 
-       if (unlikely(reg.bgid < BGID_ARRAY && !ctx->io_bl)) {
-               int ret = io_init_bl_list(ctx);
-               if (ret)
-                       return ret;
-       }
-
        bl = io_buffer_get_list(ctx, reg.bgid);
        if (bl) {
                /* if mapped buffer ring OR classic exists, don't allow */
@@ -733,11 +688,8 @@ int io_unregister_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
        if (!bl->is_buf_ring)
                return -EINVAL;
 
-       __io_remove_buffers(ctx, bl, -1U);
-       if (bl->bgid >= BGID_ARRAY) {
-               xa_erase(&ctx->io_bl_xa, bl->bgid);
-               kfree_rcu(bl, rcu);
-       }
+       xa_erase(&ctx->io_bl_xa, bl->bgid);
+       io_put_bl(ctx, bl);
        return 0;
 }
 
@@ -767,23 +719,35 @@ int io_register_pbuf_status(struct io_ring_ctx *ctx, void __user *arg)
        return 0;
 }
 
-void *io_pbuf_get_address(struct io_ring_ctx *ctx, unsigned long bgid)
+struct io_buffer_list *io_pbuf_get_bl(struct io_ring_ctx *ctx,
+                                     unsigned long bgid)
 {
        struct io_buffer_list *bl;
+       bool ret;
 
-       bl = __io_buffer_get_list(ctx, smp_load_acquire(&ctx->io_bl), bgid);
-
-       if (!bl || !bl->is_mmap)
-               return NULL;
        /*
-        * Ensure the list is fully setup. Only strictly needed for RCU lookup
-        * via mmap, and in that case only for the array indexed groups. For
-        * the xarray lookups, it's either visible and ready, or not at all.
+        * We have to be a bit careful here - we're inside mmap and cannot grab
+        * the uring_lock. This means the buffer_list could be simultaneously
+        * going away, if someone is trying to be sneaky. Look it up under rcu
+        * so we know it's not going away, and attempt to grab a reference to
+        * it. If the ref is already zero, then fail the mapping. If successful,
+        * the caller will call io_put_bl() to drop the reference at the
+        * end. This may then safely free the buffer_list (and drop the pages)
+        * at that point, vm_insert_pages() would've already grabbed the
+        * necessary vma references.
         */
-       if (!smp_load_acquire(&bl->is_ready))
-               return NULL;
-
-       return bl->buf_ring;
+       rcu_read_lock();
+       bl = xa_load(&ctx->io_bl_xa, bgid);
+       /* must be a mmap'able buffer ring and have pages */
+       ret = false;
+       if (bl && bl->is_mmap)
+               ret = atomic_inc_not_zero(&bl->refs);
+       rcu_read_unlock();
+
+       if (ret)
+               return bl;
+
+       return ERR_PTR(-EINVAL);
 }
 
 /*
index 1c7b654ee7263af16e1491934dcbc21744a4a337..df365b8860cf1eeb7eff261e1f5a5a7fc8d9b77f 100644 (file)
@@ -25,12 +25,12 @@ struct io_buffer_list {
        __u16 head;
        __u16 mask;
 
+       atomic_t refs;
+
        /* ring mapped provided buffers */
        __u8 is_buf_ring;
        /* ring mapped provided buffers, but mmap'ed by application */
        __u8 is_mmap;
-       /* bl is visible from an RCU point of view for lookup */
-       __u8 is_ready;
 };
 
 struct io_buffer {
@@ -61,7 +61,9 @@ void __io_put_kbuf(struct io_kiocb *req, unsigned issue_flags);
 
 bool io_kbuf_recycle_legacy(struct io_kiocb *req, unsigned issue_flags);
 
-void *io_pbuf_get_address(struct io_ring_ctx *ctx, unsigned long bgid);
+void io_put_bl(struct io_ring_ctx *ctx, struct io_buffer_list *bl);
+struct io_buffer_list *io_pbuf_get_bl(struct io_ring_ctx *ctx,
+                                     unsigned long bgid);
 
 static inline bool io_kbuf_recycle_ring(struct io_kiocb *req)
 {
index 0585ebcc9773d3349b007e6ff436c0320e26356d..c8d48287439e5a06d19113ecb07f1c05db47dc3f 100644 (file)
@@ -936,6 +936,13 @@ int io_read_mshot(struct io_kiocb *req, unsigned int issue_flags)
 
        ret = __io_read(req, issue_flags);
 
+       /*
+        * If the file doesn't support proper NOWAIT, then disable multishot
+        * and stay in single shot mode.
+        */
+       if (!io_file_supports_nowait(req))
+               req->flags &= ~REQ_F_APOLL_MULTISHOT;
+
        /*
         * If we get -EAGAIN, recycle our buffer and just let normal poll
         * handling arm it.
@@ -955,7 +962,7 @@ int io_read_mshot(struct io_kiocb *req, unsigned int issue_flags)
        /*
         * Any successful return value will keep the multishot read armed.
         */
-       if (ret > 0) {
+       if (ret > 0 && req->flags & REQ_F_APOLL_MULTISHOT) {
                /*
                 * Put our buffer and post a CQE. If we fail to post a CQE, then
                 * jump to the termination path. This request is then done.
index 9d9095e817928658d2c6d54d5da6f4826ff7c6be..65adc815fc6e63027e1b7f0b23c597475a3fea1e 100644 (file)
@@ -1567,10 +1567,17 @@ static int check_kprobe_address_safe(struct kprobe *p,
        jump_label_lock();
        preempt_disable();
 
-       /* Ensure it is not in reserved area nor out of text */
-       if (!(core_kernel_text((unsigned long) p->addr) ||
-           is_module_text_address((unsigned long) p->addr)) ||
-           in_gate_area_no_mm((unsigned long) p->addr) ||
+       /* Ensure the address is in a text area, and find the module if it exists. */
+       *probed_mod = NULL;
+       if (!core_kernel_text((unsigned long) p->addr)) {
+               *probed_mod = __module_text_address((unsigned long) p->addr);
+               if (!(*probed_mod)) {
+                       ret = -EINVAL;
+                       goto out;
+               }
+       }
+       /* Ensure it is not in reserved area. */
+       if (in_gate_area_no_mm((unsigned long) p->addr) ||
            within_kprobe_blacklist((unsigned long) p->addr) ||
            jump_label_text_reserved(p->addr, p->addr) ||
            static_call_text_reserved(p->addr, p->addr) ||
@@ -1580,8 +1587,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
                goto out;
        }
 
-       /* Check if 'p' is probing a module. */
-       *probed_mod = __module_text_address((unsigned long) p->addr);
+       /* Get module refcount and reject __init functions for loaded modules. */
        if (*probed_mod) {
                /*
                 * We must hold a refcount of the probed module while updating
index 269e21590df5368bd6fc84d422f44162501a3f31..1331216a9cae749cce5e13b7ff4adcf9ba5fefaa 100644 (file)
@@ -697,6 +697,7 @@ bool tick_nohz_tick_stopped_cpu(int cpu)
 
 /**
  * tick_nohz_update_jiffies - update jiffies when idle was interrupted
+ * @now: current ktime_t
  *
  * Called from interrupt entry when the CPU was idle
  *
@@ -794,7 +795,7 @@ static u64 get_cpu_sleep_time_us(struct tick_sched *ts, ktime_t *sleeptime,
  * This time is measured via accounting rather than sampling,
  * and is as accurate as ktime_get() is.
  *
- * This function returns -1 if NOHZ is not enabled.
+ * Return: -1 if NOHZ is not enabled, else total idle time of the @cpu
  */
 u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
 {
@@ -820,7 +821,7 @@ EXPORT_SYMBOL_GPL(get_cpu_idle_time_us);
  * This time is measured via accounting rather than sampling,
  * and is as accurate as ktime_get() is.
  *
- * This function returns -1 if NOHZ is not enabled.
+ * Return: -1 if NOHZ is not enabled, else total iowait time of @cpu
  */
 u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
 {
@@ -1287,6 +1288,8 @@ void tick_nohz_irq_exit(void)
 
 /**
  * tick_nohz_idle_got_tick - Check whether or not the tick handler has run
+ *
+ * Return: %true if the tick handler has run, otherwise %false
  */
 bool tick_nohz_idle_got_tick(void)
 {
@@ -1305,6 +1308,8 @@ bool tick_nohz_idle_got_tick(void)
  * stopped, it returns the next hrtimer.
  *
  * Called from power state control code with interrupts disabled
+ *
+ * Return: the next expiration time
  */
 ktime_t tick_nohz_get_next_hrtimer(void)
 {
@@ -1320,6 +1325,8 @@ ktime_t tick_nohz_get_next_hrtimer(void)
  * The return value of this function and/or the value returned by it through the
  * @delta_next pointer can be negative which must be taken into account by its
  * callers.
+ *
+ * Return: the expected length of the current sleep
  */
 ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next)
 {
@@ -1357,8 +1364,11 @@ ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next)
 /**
  * tick_nohz_get_idle_calls_cpu - return the current idle calls counter value
  * for a particular CPU.
+ * @cpu: target CPU number
  *
  * Called from the schedutil frequency scaling governor in scheduler context.
+ *
+ * Return: the current idle calls counter value for @cpu
  */
 unsigned long tick_nohz_get_idle_calls_cpu(int cpu)
 {
@@ -1371,6 +1381,8 @@ unsigned long tick_nohz_get_idle_calls_cpu(int cpu)
  * tick_nohz_get_idle_calls - return the current idle calls counter value
  *
  * Called from the schedutil frequency scaling governor in scheduler context.
+ *
+ * Return: the current idle calls counter value for the current CPU
  */
 unsigned long tick_nohz_get_idle_calls(void)
 {
@@ -1559,7 +1571,7 @@ early_param("skew_tick", skew_tick);
 
 /**
  * tick_setup_sched_timer - setup the tick emulation timer
- * @mode: tick_nohz_mode to setup for
+ * @hrtimer: whether to use the hrtimer or not
  */
 void tick_setup_sched_timer(bool hrtimer)
 {
index e11c4dc65bcb24b3b4200c83d538e5a800ad4fc2..b4a7822f495d3460b636089dc5024cd92b61a467 100644 (file)
@@ -46,8 +46,8 @@ struct tick_device {
  * @next_tick:         Next tick to be fired when in dynticks mode.
  * @idle_jiffies:      jiffies at the entry to idle for idle time accounting
  * @idle_waketime:     Time when the idle was interrupted
+ * @idle_sleeptime_seq:        sequence counter for data consistency
  * @idle_entrytime:    Time when the idle call was entered
- * @nohz_mode:         Mode - one state of tick_nohz_mode
  * @last_jiffies:      Base jiffies snapshot when next event was last computed
  * @timer_expires_base:        Base time clock monotonic for @timer_expires
  * @timer_expires:     Anticipated timer expiration time (in case sched tick is stopped)
index dee29f1f5b75f3c059f831a2cd1c86cd8907fef9..3baf2fbe6848f03efb7418c5d5f10b279b30cf6c 100644 (file)
@@ -64,15 +64,15 @@ EXPORT_SYMBOL(jiffies_64);
 
 /*
  * The timer wheel has LVL_DEPTH array levels. Each level provides an array of
- * LVL_SIZE buckets. Each level is driven by its own clock and therefor each
+ * LVL_SIZE buckets. Each level is driven by its own clock and therefore each
  * level has a different granularity.
  *
- * The level granularity is:           LVL_CLK_DIV ^ lvl
+ * The level granularity is:           LVL_CLK_DIV ^ level
  * The level clock frequency is:       HZ / (LVL_CLK_DIV ^ level)
  *
  * The array level of a newly armed timer depends on the relative expiry
  * time. The farther the expiry time is away the higher the array level and
- * therefor the granularity becomes.
+ * therefore the granularity becomes.
  *
  * Contrary to the original timer wheel implementation, which aims for 'exact'
  * expiry of the timers, this implementation removes the need for recascading
@@ -207,7 +207,7 @@ EXPORT_SYMBOL(jiffies_64);
  * struct timer_base - Per CPU timer base (number of base depends on config)
  * @lock:              Lock protecting the timer_base
  * @running_timer:     When expiring timers, the lock is dropped. To make
- *                     sure not to race agains deleting/modifying a
+ *                     sure not to race against deleting/modifying a
  *                     currently running timer, the pointer is set to the
  *                     timer, which expires at the moment. If no timer is
  *                     running, the pointer is NULL.
@@ -737,7 +737,7 @@ static bool timer_is_static_object(void *addr)
 }
 
 /*
- * fixup_init is called when:
+ * timer_fixup_init is called when:
  * - an active object is initialized
  */
 static bool timer_fixup_init(void *addr, enum debug_obj_state state)
@@ -761,7 +761,7 @@ static void stub_timer(struct timer_list *unused)
 }
 
 /*
- * fixup_activate is called when:
+ * timer_fixup_activate is called when:
  * - an active object is activated
  * - an unknown non-static object is activated
  */
@@ -783,7 +783,7 @@ static bool timer_fixup_activate(void *addr, enum debug_obj_state state)
 }
 
 /*
- * fixup_free is called when:
+ * timer_fixup_free is called when:
  * - an active object is freed
  */
 static bool timer_fixup_free(void *addr, enum debug_obj_state state)
@@ -801,7 +801,7 @@ static bool timer_fixup_free(void *addr, enum debug_obj_state state)
 }
 
 /*
- * fixup_assert_init is called when:
+ * timer_fixup_assert_init is called when:
  * - an untracked/uninit-ed object is found
  */
 static bool timer_fixup_assert_init(void *addr, enum debug_obj_state state)
@@ -914,7 +914,7 @@ static void do_init_timer(struct timer_list *timer,
  * @key: lockdep class key of the fake lock used for tracking timer
  *       sync lock dependencies
  *
- * init_timer_key() must be done to a timer prior calling *any* of the
+ * init_timer_key() must be done to a timer prior to calling *any* of the
  * other timer functions.
  */
 void init_timer_key(struct timer_list *timer,
@@ -1417,7 +1417,7 @@ static int __timer_delete(struct timer_list *timer, bool shutdown)
         * If @shutdown is set then the lock has to be taken whether the
         * timer is pending or not to protect against a concurrent rearm
         * which might hit between the lockless pending check and the lock
-        * aquisition. By taking the lock it is ensured that such a newly
+        * acquisition. By taking the lock it is ensured that such a newly
         * enqueued timer is dequeued and cannot end up with
         * timer->function == NULL in the expiry code.
         *
@@ -2306,7 +2306,7 @@ static inline u64 __get_next_timer_interrupt(unsigned long basej, u64 basem,
 
                /*
                 * When timer base is not set idle, undo the effect of
-                * tmigr_cpu_deactivate() to prevent inconsitent states - active
+                * tmigr_cpu_deactivate() to prevent inconsistent states - active
                 * timer base but inactive timer migration hierarchy.
                 *
                 * When timer base was already marked idle, nothing will be
index c63a0afdcebed5c1e8b7bff161647d194f810ef2..ccba875d2234fe582264e7d802dcb62f4864e4f6 100644 (file)
@@ -751,6 +751,33 @@ bool tmigr_update_events(struct tmigr_group *group, struct tmigr_group *child,
 
                first_childevt = evt = data->evt;
 
+               /*
+                * Walking the hierarchy is required in any case when a
+                * remote expiry was done before. This ensures to not lose
+                * already queued events in non active groups (see section
+                * "Required event and timerqueue update after a remote
+                * expiry" in the documentation at the top).
+                *
+                * The two call sites which are executed without a remote expiry
+                * before, are not prevented from propagating changes through
+                * the hierarchy by the return:
+                *  - When entering this path by tmigr_new_timer(), @evt->ignore
+                *    is never set.
+                *  - tmigr_inactive_up() takes care of the propagation by
+                *    itself and ignores the return value. But an immediate
+                *    return is possible if there is a parent, sparing group
+                *    locking at this level, because the upper walking call to
+                *    the parent will take care about removing this event from
+                *    within the group and update next_expiry accordingly.
+                *
+                * However, if there is no parent, i.e. the hierarchy has only a
+                * single level so @group is the top level group, make sure the
+                * group's first event information is updated and handled
+                * properly, so skip this fast return path.
+                */
+               if (evt->ignore && !remote && group->parent)
+                       return true;
+
                raw_spin_lock(&group->lock);
 
                childstate.state = 0;
@@ -762,8 +789,11 @@ bool tmigr_update_events(struct tmigr_group *group, struct tmigr_group *child,
         * queue when the expiry time changed only or when it could be ignored.
         */
        if (timerqueue_node_queued(&evt->nextevt)) {
-               if ((evt->nextevt.expires == nextexp) && !evt->ignore)
+               if ((evt->nextevt.expires == nextexp) && !evt->ignore) {
+                       /* Make sure not to miss a new CPU event with the same expiry */
+                       evt->cpu = first_childevt->cpu;
                        goto check_toplvl;
+               }
 
                if (!timerqueue_del(&group->events, &evt->nextevt))
                        WRITE_ONCE(group->next_expiry, KTIME_MAX);
index af6cc19a200331aa0c37cf2e497384f0b19d8db0..68c97387aa54e2728b79b2cacf23815fcb2f90ca 100644 (file)
@@ -330,7 +330,7 @@ static struct stack_record *depot_pop_free_pool(void **prealloc, size_t size)
        stack = current_pool + pool_offset;
 
        /* Pre-initialize handle once. */
-       stack->handle.pool_index = pool_index + 1;
+       stack->handle.pool_index_plus_1 = pool_index + 1;
        stack->handle.offset = pool_offset >> DEPOT_STACK_ALIGN;
        stack->handle.extra = 0;
        INIT_LIST_HEAD(&stack->hash_list);
@@ -441,7 +441,7 @@ static struct stack_record *depot_fetch_stack(depot_stack_handle_t handle)
        const int pools_num_cached = READ_ONCE(pools_num);
        union handle_parts parts = { .handle = handle };
        void *pool;
-       u32 pool_index = parts.pool_index - 1;
+       u32 pool_index = parts.pool_index_plus_1 - 1;
        size_t offset = parts.offset << DEPOT_STACK_ALIGN;
        struct stack_record *stack;
 
index 276c12140ee26dac37137e400f4c303d34045bb1..c288df9372ede1cbda1371ae92e20a25ae9ebe91 100644 (file)
@@ -134,7 +134,7 @@ static const test_ubsan_fp test_ubsan_array[] = {
 };
 
 /* Excluded because they Oops the module. */
-static const test_ubsan_fp skip_ubsan_array[] = {
+static __used const test_ubsan_fp skip_ubsan_array[] = {
        test_ubsan_divrem_overflow,
 };
 
index 904f70b994985a7682b59a917522f284fd786950..d2155ced45f8f84ef8eac74ba3eda42a67d37102 100644 (file)
@@ -5973,6 +5973,10 @@ int follow_phys(struct vm_area_struct *vma,
                goto out;
        pte = ptep_get(ptep);
 
+       /* Never return PFNs of anon folios in COW mappings. */
+       if (vm_normal_folio(vma, address, pte))
+               goto unlock;
+
        if ((flags & FOLL_WRITE) && !pte_write(pte))
                goto unlock;
 
index 22aa63f4ef6322a71030c5dd163706c83ef5cd8d..68fa001648cc1cb766d8fa4111e88c3fbbf257e8 100644 (file)
@@ -989,6 +989,27 @@ unsigned long vmalloc_nr_pages(void)
        return atomic_long_read(&nr_vmalloc_pages);
 }
 
+static struct vmap_area *__find_vmap_area(unsigned long addr, struct rb_root *root)
+{
+       struct rb_node *n = root->rb_node;
+
+       addr = (unsigned long)kasan_reset_tag((void *)addr);
+
+       while (n) {
+               struct vmap_area *va;
+
+               va = rb_entry(n, struct vmap_area, rb_node);
+               if (addr < va->va_start)
+                       n = n->rb_left;
+               else if (addr >= va->va_end)
+                       n = n->rb_right;
+               else
+                       return va;
+       }
+
+       return NULL;
+}
+
 /* Look up the first VA which satisfies addr < va_end, NULL if none. */
 static struct vmap_area *
 __find_vmap_area_exceed_addr(unsigned long addr, struct rb_root *root)
@@ -1025,47 +1046,39 @@ __find_vmap_area_exceed_addr(unsigned long addr, struct rb_root *root)
 static struct vmap_node *
 find_vmap_area_exceed_addr_lock(unsigned long addr, struct vmap_area **va)
 {
-       struct vmap_node *vn, *va_node = NULL;
-       struct vmap_area *va_lowest;
+       unsigned long va_start_lowest;
+       struct vmap_node *vn;
        int i;
 
-       for (i = 0; i < nr_vmap_nodes; i++) {
+repeat:
+       for (i = 0, va_start_lowest = 0; i < nr_vmap_nodes; i++) {
                vn = &vmap_nodes[i];
 
                spin_lock(&vn->busy.lock);
-               va_lowest = __find_vmap_area_exceed_addr(addr, &vn->busy.root);
-               if (va_lowest) {
-                       if (!va_node || va_lowest->va_start < (*va)->va_start) {
-                               if (va_node)
-                                       spin_unlock(&va_node->busy.lock);
-
-                               *va = va_lowest;
-                               va_node = vn;
-                               continue;
-                       }
-               }
+               *va = __find_vmap_area_exceed_addr(addr, &vn->busy.root);
+
+               if (*va)
+                       if (!va_start_lowest || (*va)->va_start < va_start_lowest)
+                               va_start_lowest = (*va)->va_start;
                spin_unlock(&vn->busy.lock);
        }
 
-       return va_node;
-}
-
-static struct vmap_area *__find_vmap_area(unsigned long addr, struct rb_root *root)
-{
-       struct rb_node *n = root->rb_node;
+       /*
+        * Check if found VA exists, it might have gone away.  In this case we
+        * repeat the search because a VA has been removed concurrently and we
+        * need to proceed to the next one, which is a rare case.
+        */
+       if (va_start_lowest) {
+               vn = addr_to_node(va_start_lowest);
 
-       addr = (unsigned long)kasan_reset_tag((void *)addr);
+               spin_lock(&vn->busy.lock);
+               *va = __find_vmap_area(va_start_lowest, &vn->busy.root);
 
-       while (n) {
-               struct vmap_area *va;
+               if (*va)
+                       return vn;
 
-               va = rb_entry(n, struct vmap_area, rb_node);
-               if (addr < va->va_start)
-                       n = n->rb_left;
-               else if (addr >= va->va_end)
-                       n = n->rb_right;
-               else
-                       return va;
+               spin_unlock(&vn->busy.lock);
+               goto repeat;
        }
 
        return NULL;
@@ -2343,6 +2356,9 @@ struct vmap_area *find_vmap_area(unsigned long addr)
        struct vmap_area *va;
        int i, j;
 
+       if (unlikely(!vmap_initialized))
+               return NULL;
+
        /*
         * An addr_to_node_id(addr) converts an address to a node index
         * where a VA is located. If VA spans several zones and passed
index e265a0ca6bddd40711235c8d7560a6f409a51241..f7e90b4769bba92ef8187b0a96cb310f0c13d5f8 100644 (file)
@@ -1583,7 +1583,7 @@ p9_client_read_once(struct p9_fid *fid, u64 offset, struct iov_iter *to,
                received = rsize;
        }
 
-       p9_debug(P9_DEBUG_9P, "<<< RREAD count %d\n", count);
+       p9_debug(P9_DEBUG_9P, "<<< RREAD count %d\n", received);
 
        if (non_zc) {
                int n = copy_to_iter(dataptr, received, to);
@@ -1609,9 +1609,6 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err)
        int total = 0;
        *err = 0;
 
-       p9_debug(P9_DEBUG_9P, ">>> TWRITE fid %d offset %llu count %zd\n",
-                fid->fid, offset, iov_iter_count(from));
-
        while (iov_iter_count(from)) {
                int count = iov_iter_count(from);
                int rsize = fid->iounit;
@@ -1623,6 +1620,9 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err)
                if (count < rsize)
                        rsize = count;
 
+               p9_debug(P9_DEBUG_9P, ">>> TWRITE fid %d offset %llu count %d (/%d)\n",
+                        fid->fid, offset, rsize, count);
+
                /* Don't bother zerocopy for small IO (< 1024) */
                if (clnt->trans_mod->zc_request && rsize > 1024) {
                        req = p9_client_zc_rpc(clnt, P9_TWRITE, NULL, from, 0,
@@ -1650,7 +1650,7 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err)
                        written = rsize;
                }
 
-               p9_debug(P9_DEBUG_9P, "<<< RWRITE count %d\n", count);
+               p9_debug(P9_DEBUG_9P, "<<< RWRITE count %d\n", written);
 
                p9_req_put(clnt, req);
                iov_iter_revert(from, count - written - iov_iter_count(from));
index 1a3948b8c493eda3aca297896bd8adf7a63d443a..196060dc6138af10e99ad04a76ee36a11f770c65 100644 (file)
@@ -95,7 +95,6 @@ struct p9_poll_wait {
  * @unsent_req_list: accounting for requests that haven't been sent
  * @rreq: read request
  * @wreq: write request
- * @req: current request being processed (if any)
  * @tmp_buf: temporary buffer to read in header
  * @rc: temporary fcall for reading current frame
  * @wpos: write position for current frame
index 00e02138003ecefef75714c950056ced5ccd5fda..efea25eb56ce036364c7325916326b687180bbcf 100644 (file)
@@ -105,8 +105,10 @@ void hci_req_sync_complete(struct hci_dev *hdev, u8 result, u16 opcode,
        if (hdev->req_status == HCI_REQ_PEND) {
                hdev->req_result = result;
                hdev->req_status = HCI_REQ_DONE;
-               if (skb)
+               if (skb) {
+                       kfree_skb(hdev->req_skb);
                        hdev->req_skb = skb_get(skb);
+               }
                wake_up_interruptible(&hdev->req_wait_q);
        }
 }
index 4ee1b976678b2525ff135fb947221b93923f2aee..703b84bd48d5befc51d787bcd6c04dcbcff61675 100644 (file)
@@ -1946,10 +1946,9 @@ static int hci_sock_setsockopt_old(struct socket *sock, int level, int optname,
 
        switch (optname) {
        case HCI_DATA_DIR:
-               if (copy_from_sockptr(&opt, optval, sizeof(opt))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, len);
+               if (err)
                        break;
-               }
 
                if (opt)
                        hci_pi(sk)->cmsg_mask |= HCI_CMSG_DIR;
@@ -1958,10 +1957,9 @@ static int hci_sock_setsockopt_old(struct socket *sock, int level, int optname,
                break;
 
        case HCI_TIME_STAMP:
-               if (copy_from_sockptr(&opt, optval, sizeof(opt))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, len);
+               if (err)
                        break;
-               }
 
                if (opt)
                        hci_pi(sk)->cmsg_mask |= HCI_CMSG_TSTAMP;
@@ -1979,11 +1977,9 @@ static int hci_sock_setsockopt_old(struct socket *sock, int level, int optname,
                        uf.event_mask[1] = *((u32 *) f->event_mask + 1);
                }
 
-               len = min_t(unsigned int, len, sizeof(uf));
-               if (copy_from_sockptr(&uf, optval, len)) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&uf, sizeof(uf), optval, len);
+               if (err)
                        break;
-               }
 
                if (!capable(CAP_NET_RAW)) {
                        uf.type_mask &= hci_sec_filter.type_mask;
@@ -2042,10 +2038,9 @@ static int hci_sock_setsockopt(struct socket *sock, int level, int optname,
                        goto done;
                }
 
-               if (copy_from_sockptr(&opt, optval, sizeof(opt))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, len);
+               if (err)
                        break;
-               }
 
                hci_pi(sk)->mtu = opt;
                break;
index 8fe02921adf15d4b968be310415bf9383ae3d63d..c5d8799046ccffbf798e6f47ffaef3dddcb364ca 100644 (file)
@@ -2814,8 +2814,8 @@ static int hci_le_set_ext_scan_param_sync(struct hci_dev *hdev, u8 type,
                                if (qos->bcast.in.phy & BT_ISO_PHY_CODED) {
                                        cp->scanning_phys |= LE_SCAN_PHY_CODED;
                                        hci_le_scan_phy_params(phy, type,
-                                                              interval,
-                                                              window);
+                                                              interval * 3,
+                                                              window * 3);
                                        num_phy++;
                                        phy++;
                                }
@@ -2835,7 +2835,7 @@ static int hci_le_set_ext_scan_param_sync(struct hci_dev *hdev, u8 type,
 
        if (scan_coded(hdev)) {
                cp->scanning_phys |= LE_SCAN_PHY_CODED;
-               hci_le_scan_phy_params(phy, type, interval, window);
+               hci_le_scan_phy_params(phy, type, interval * 3, window * 3);
                num_phy++;
                phy++;
        }
index c8793e57f4b547d5bd465b80575143083b867624..ef0cc80b4c0cc1ff4043d05c05fc0c429a64a6c2 100644 (file)
@@ -1451,8 +1451,8 @@ static bool check_ucast_qos(struct bt_iso_qos *qos)
 
 static bool check_bcast_qos(struct bt_iso_qos *qos)
 {
-       if (qos->bcast.sync_factor == 0x00)
-               return false;
+       if (!qos->bcast.sync_factor)
+               qos->bcast.sync_factor = 0x01;
 
        if (qos->bcast.packing > 0x01)
                return false;
@@ -1475,6 +1475,9 @@ static bool check_bcast_qos(struct bt_iso_qos *qos)
        if (qos->bcast.skip > 0x01f3)
                return false;
 
+       if (!qos->bcast.sync_timeout)
+               qos->bcast.sync_timeout = BT_ISO_SYNC_TIMEOUT;
+
        if (qos->bcast.sync_timeout < 0x000a || qos->bcast.sync_timeout > 0x4000)
                return false;
 
@@ -1484,6 +1487,9 @@ static bool check_bcast_qos(struct bt_iso_qos *qos)
        if (qos->bcast.mse > 0x1f)
                return false;
 
+       if (!qos->bcast.timeout)
+               qos->bcast.timeout = BT_ISO_SYNC_TIMEOUT;
+
        if (qos->bcast.timeout < 0x000a || qos->bcast.timeout > 0x4000)
                return false;
 
@@ -1494,7 +1500,7 @@ static int iso_sock_setsockopt(struct socket *sock, int level, int optname,
                               sockptr_t optval, unsigned int optlen)
 {
        struct sock *sk = sock->sk;
-       int len, err = 0;
+       int err = 0;
        struct bt_iso_qos qos = default_qos;
        u32 opt;
 
@@ -1509,10 +1515,9 @@ static int iso_sock_setsockopt(struct socket *sock, int level, int optname,
                        break;
                }
 
-               if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (opt)
                        set_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags);
@@ -1521,10 +1526,9 @@ static int iso_sock_setsockopt(struct socket *sock, int level, int optname,
                break;
 
        case BT_PKT_STATUS:
-               if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (opt)
                        set_bit(BT_SK_PKT_STATUS, &bt_sk(sk)->flags);
@@ -1539,17 +1543,9 @@ static int iso_sock_setsockopt(struct socket *sock, int level, int optname,
                        break;
                }
 
-               len = min_t(unsigned int, sizeof(qos), optlen);
-
-               if (copy_from_sockptr(&qos, optval, len)) {
-                       err = -EFAULT;
-                       break;
-               }
-
-               if (len == sizeof(qos.ucast) && !check_ucast_qos(&qos)) {
-                       err = -EINVAL;
+               err = bt_copy_from_sockptr(&qos, sizeof(qos), optval, optlen);
+               if (err)
                        break;
-               }
 
                iso_pi(sk)->qos = qos;
                iso_pi(sk)->qos_user_set = true;
@@ -1564,18 +1560,16 @@ static int iso_sock_setsockopt(struct socket *sock, int level, int optname,
                }
 
                if (optlen > sizeof(iso_pi(sk)->base)) {
-                       err = -EOVERFLOW;
+                       err = -EINVAL;
                        break;
                }
 
-               len = min_t(unsigned int, sizeof(iso_pi(sk)->base), optlen);
-
-               if (copy_from_sockptr(iso_pi(sk)->base, optval, len)) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(iso_pi(sk)->base, optlen, optval,
+                                          optlen);
+               if (err)
                        break;
-               }
 
-               iso_pi(sk)->base_len = len;
+               iso_pi(sk)->base_len = optlen;
 
                break;
 
index 467b242d8be071da16bd48d04e1520ce1e1aa8a6..dc089740879363dd0d6d973dcdb4fc05cfc7070a 100644 (file)
@@ -4054,8 +4054,7 @@ static int l2cap_connect_req(struct l2cap_conn *conn,
                return -EPROTO;
 
        hci_dev_lock(hdev);
-       if (hci_dev_test_flag(hdev, HCI_MGMT) &&
-           !test_and_set_bit(HCI_CONN_MGMT_CONNECTED, &hcon->flags))
+       if (hci_dev_test_flag(hdev, HCI_MGMT))
                mgmt_device_connected(hdev, hcon, NULL, 0);
        hci_dev_unlock(hdev);
 
index 4287aa6cc988e3ce34849d1f317be8fd8645832c..e7d810b23082f5ffd8ea4b506366b2684f2e1ece 100644 (file)
@@ -727,7 +727,7 @@ static int l2cap_sock_setsockopt_old(struct socket *sock, int optname,
        struct sock *sk = sock->sk;
        struct l2cap_chan *chan = l2cap_pi(sk)->chan;
        struct l2cap_options opts;
-       int len, err = 0;
+       int err = 0;
        u32 opt;
 
        BT_DBG("sk %p", sk);
@@ -754,11 +754,9 @@ static int l2cap_sock_setsockopt_old(struct socket *sock, int optname,
                opts.max_tx   = chan->max_tx;
                opts.txwin_size = chan->tx_win;
 
-               len = min_t(unsigned int, sizeof(opts), optlen);
-               if (copy_from_sockptr(&opts, optval, len)) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opts, sizeof(opts), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (opts.txwin_size > L2CAP_DEFAULT_EXT_WINDOW) {
                        err = -EINVAL;
@@ -801,10 +799,9 @@ static int l2cap_sock_setsockopt_old(struct socket *sock, int optname,
                break;
 
        case L2CAP_LM:
-               if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (opt & L2CAP_LM_FIPS) {
                        err = -EINVAL;
@@ -885,7 +882,7 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
        struct bt_security sec;
        struct bt_power pwr;
        struct l2cap_conn *conn;
-       int len, err = 0;
+       int err = 0;
        u32 opt;
        u16 mtu;
        u8 mode;
@@ -911,11 +908,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
 
                sec.level = BT_SECURITY_LOW;
 
-               len = min_t(unsigned int, sizeof(sec), optlen);
-               if (copy_from_sockptr(&sec, optval, len)) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&sec, sizeof(sec), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (sec.level < BT_SECURITY_LOW ||
                    sec.level > BT_SECURITY_FIPS) {
@@ -960,10 +955,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
                        break;
                }
 
-               if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (opt) {
                        set_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags);
@@ -975,10 +969,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
                break;
 
        case BT_FLUSHABLE:
-               if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (opt > BT_FLUSHABLE_ON) {
                        err = -EINVAL;
@@ -1010,11 +1003,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
 
                pwr.force_active = BT_POWER_FORCE_ACTIVE_ON;
 
-               len = min_t(unsigned int, sizeof(pwr), optlen);
-               if (copy_from_sockptr(&pwr, optval, len)) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&pwr, sizeof(pwr), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (pwr.force_active)
                        set_bit(FLAG_FORCE_ACTIVE, &chan->flags);
@@ -1023,10 +1014,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
                break;
 
        case BT_CHANNEL_POLICY:
-               if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
+               if (err)
                        break;
-               }
 
                err = -EOPNOTSUPP;
                break;
@@ -1055,10 +1045,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
                        break;
                }
 
-               if (copy_from_sockptr(&mtu, optval, sizeof(u16))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&mtu, sizeof(mtu), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (chan->mode == L2CAP_MODE_EXT_FLOWCTL &&
                    sk->sk_state == BT_CONNECTED)
@@ -1086,10 +1075,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
                        break;
                }
 
-               if (copy_from_sockptr(&mode, optval, sizeof(u8))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&mode, sizeof(mode), optval, optlen);
+               if (err)
                        break;
-               }
 
                BT_DBG("mode %u", mode);
 
index b54e8a530f55a1ff9547a2a5546db34059ebd672..29aa07e9db9d7122bac6ac0c6dfcd76765f11cb8 100644 (file)
@@ -629,7 +629,7 @@ static int rfcomm_sock_setsockopt_old(struct socket *sock, int optname,
 
        switch (optname) {
        case RFCOMM_LM:
-               if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
+               if (bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen)) {
                        err = -EFAULT;
                        break;
                }
@@ -664,7 +664,6 @@ static int rfcomm_sock_setsockopt(struct socket *sock, int level, int optname,
        struct sock *sk = sock->sk;
        struct bt_security sec;
        int err = 0;
-       size_t len;
        u32 opt;
 
        BT_DBG("sk %p", sk);
@@ -686,11 +685,9 @@ static int rfcomm_sock_setsockopt(struct socket *sock, int level, int optname,
 
                sec.level = BT_SECURITY_LOW;
 
-               len = min_t(unsigned int, sizeof(sec), optlen);
-               if (copy_from_sockptr(&sec, optval, len)) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&sec, sizeof(sec), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (sec.level > BT_SECURITY_HIGH) {
                        err = -EINVAL;
@@ -706,10 +703,9 @@ static int rfcomm_sock_setsockopt(struct socket *sock, int level, int optname,
                        break;
                }
 
-               if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (opt)
                        set_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags);
index 43daf965a01e4ac5c9329150080b00dcd63c7e1c..368e026f4d15ca4711737af941ad30c7b48b827f 100644 (file)
@@ -824,7 +824,7 @@ static int sco_sock_setsockopt(struct socket *sock, int level, int optname,
                               sockptr_t optval, unsigned int optlen)
 {
        struct sock *sk = sock->sk;
-       int len, err = 0;
+       int err = 0;
        struct bt_voice voice;
        u32 opt;
        struct bt_codecs *codecs;
@@ -843,10 +843,9 @@ static int sco_sock_setsockopt(struct socket *sock, int level, int optname,
                        break;
                }
 
-               if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (opt)
                        set_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags);
@@ -863,11 +862,10 @@ static int sco_sock_setsockopt(struct socket *sock, int level, int optname,
 
                voice.setting = sco_pi(sk)->setting;
 
-               len = min_t(unsigned int, sizeof(voice), optlen);
-               if (copy_from_sockptr(&voice, optval, len)) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&voice, sizeof(voice), optval,
+                                          optlen);
+               if (err)
                        break;
-               }
 
                /* Explicitly check for these values */
                if (voice.setting != BT_VOICE_TRANSPARENT &&
@@ -890,10 +888,9 @@ static int sco_sock_setsockopt(struct socket *sock, int level, int optname,
                break;
 
        case BT_PKT_STATUS:
-               if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
-                       err = -EFAULT;
+               err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
+               if (err)
                        break;
-               }
 
                if (opt)
                        set_bit(BT_SK_PKT_STATUS, &bt_sk(sk)->flags);
@@ -934,9 +931,9 @@ static int sco_sock_setsockopt(struct socket *sock, int level, int optname,
                        break;
                }
 
-               if (copy_from_sockptr(buffer, optval, optlen)) {
+               err = bt_copy_from_sockptr(buffer, optlen, optval, optlen);
+               if (err) {
                        hci_dev_put(hdev);
-                       err = -EFAULT;
                        break;
                }
 
index b150c9929b12e86219a55c77da480e0c538b3449..14365b20f1c5c09964dd7024060116737f22cb63 100644 (file)
@@ -966,6 +966,8 @@ static int do_replace(struct net *net, sockptr_t arg, unsigned int len)
                return -ENOMEM;
        if (tmp.num_counters == 0)
                return -EINVAL;
+       if ((u64)len < (u64)tmp.size + sizeof(tmp))
+               return -EINVAL;
 
        tmp.name[sizeof(tmp.name)-1] = 0;
 
@@ -1266,6 +1268,8 @@ static int compat_do_replace(struct net *net, sockptr_t arg, unsigned int len)
                return -ENOMEM;
        if (tmp.num_counters == 0)
                return -EINVAL;
+       if ((u64)len < (u64)tmp.size + sizeof(tmp))
+               return -EINVAL;
 
        tmp.name[sizeof(tmp.name)-1] = 0;
 
index 487670759578168c5ff53bce6642898fc41936b3..fe89a056eb06c43743b2d7449e59f4e9360ba223 100644 (file)
@@ -1118,6 +1118,8 @@ do_replace(struct net *net, sockptr_t arg, unsigned int len)
                return -ENOMEM;
        if (tmp.num_counters == 0)
                return -EINVAL;
+       if ((u64)len < (u64)tmp.size + sizeof(tmp))
+               return -EINVAL;
 
        tmp.name[sizeof(tmp.name)-1] = 0;
 
@@ -1504,6 +1506,8 @@ compat_do_replace(struct net *net, sockptr_t arg, unsigned int len)
                return -ENOMEM;
        if (tmp.num_counters == 0)
                return -EINVAL;
+       if ((u64)len < (u64)tmp.size + sizeof(tmp))
+               return -EINVAL;
 
        tmp.name[sizeof(tmp.name)-1] = 0;
 
index 636b360311c5365fba2330f6ca2f7f1b6dd1363e..131f7bb2110d3a08244c6da40ff9be45a2be711b 100644 (file)
@@ -1135,6 +1135,8 @@ do_replace(struct net *net, sockptr_t arg, unsigned int len)
                return -ENOMEM;
        if (tmp.num_counters == 0)
                return -EINVAL;
+       if ((u64)len < (u64)tmp.size + sizeof(tmp))
+               return -EINVAL;
 
        tmp.name[sizeof(tmp.name)-1] = 0;
 
@@ -1513,6 +1515,8 @@ compat_do_replace(struct net *net, sockptr_t arg, unsigned int len)
                return -ENOMEM;
        if (tmp.num_counters == 0)
                return -EINVAL;
+       if ((u64)len < (u64)tmp.size + sizeof(tmp))
+               return -EINVAL;
 
        tmp.name[sizeof(tmp.name)-1] = 0;
 
index 545017a3daa4d6b20255c51c6c0dea73ec32ecfc..6b3f01beb294b99740ae4364acbe31cc92e4a980 100644 (file)
@@ -1206,15 +1206,6 @@ err_noclose:
  * MSG_SPLICE_PAGES is used exclusively to reduce the number of
  * copy operations in this path. Therefore the caller must ensure
  * that the pages backing @xdr are unchanging.
- *
- * Note that the send is non-blocking. The caller has incremented
- * the reference count on each page backing the RPC message, and
- * the network layer will "put" these pages when transmission is
- * complete.
- *
- * This is safe for our RPC services because the memory backing
- * the head and tail components is never kmalloc'd. These always
- * come from pages in the svc_rqst::rq_pages array.
  */
 static int svc_tcp_sendmsg(struct svc_sock *svsk, struct svc_rqst *rqstp,
                           rpc_fraghdr marker, unsigned int *sentp)
@@ -1244,6 +1235,7 @@ static int svc_tcp_sendmsg(struct svc_sock *svsk, struct svc_rqst *rqstp,
        iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
                      1 + count, sizeof(marker) + rqstp->rq_res.len);
        ret = sock_sendmsg(svsk->sk_sock, &msg);
+       page_frag_free(buf);
        if (ret < 0)
                return ret;
        *sentp += ret;
index fa39b626523851df29275f1448d30a7390e7e0fb..6433a414acf8624a1d98727f4e309b7c040710b9 100644 (file)
@@ -274,11 +274,22 @@ static void __unix_gc(struct work_struct *work)
         * receive queues.  Other, non candidate sockets _can_ be
         * added to queue, so we must make sure only to touch
         * candidates.
+        *
+        * Embryos, though never candidates themselves, affect which
+        * candidates are reachable by the garbage collector.  Before
+        * being added to a listener's queue, an embryo may already
+        * receive data carrying SCM_RIGHTS, potentially making the
+        * passed socket a candidate that is not yet reachable by the
+        * collector.  It becomes reachable once the embryo is
+        * enqueued.  Therefore, we must ensure that no SCM-laden
+        * embryo appears in a (candidate) listener's queue between
+        * consecutive scan_children() calls.
         */
        list_for_each_entry_safe(u, next, &gc_inflight_list, link) {
+               struct sock *sk = &u->sk;
                long total_refs;
 
-               total_refs = file_count(u->sk.sk_socket->file);
+               total_refs = file_count(sk->sk_socket->file);
 
                WARN_ON_ONCE(!u->inflight);
                WARN_ON_ONCE(total_refs < u->inflight);
@@ -286,6 +297,11 @@ static void __unix_gc(struct work_struct *work)
                        list_move_tail(&u->link, &gc_candidates);
                        __set_bit(UNIX_GC_CANDIDATE, &u->gc_flags);
                        __set_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags);
+
+                       if (sk->sk_state == TCP_LISTEN) {
+                               unix_state_lock(sk);
+                               unix_state_unlock(sk);
+                       }
                }
        }
 
index c5c2ce113c9232c331c4ebac2ba4384a24424640..d20c47d21ad8352973c93ccefc7b0931e93de545 100644 (file)
@@ -467,6 +467,8 @@ static bool stackleak_gate(void)
                        return false;
                if (STRING_EQUAL(section, ".entry.text"))
                        return false;
+               if (STRING_EQUAL(section, ".head.text"))
+                       return false;
        }
 
        return track_frame_size >= 0;
index 0ba8f0c4cd99a27aaff1d3b43edd306daf0ad857..3a593da09280dca9e5b59dc96c6c2cb27cac6b6b 100644 (file)
@@ -725,7 +725,13 @@ static void __exit amiga_audio_remove(struct platform_device *pdev)
        dmasound_deinit();
 }
 
-static struct platform_driver amiga_audio_driver = {
+/*
+ * amiga_audio_remove() lives in .exit.text. For drivers registered via
+ * module_platform_driver_probe() this is ok because they cannot get unbound at
+ * runtime. So mark the driver struct with __refdata to prevent modpost
+ * triggering a section mismatch warning.
+ */
+static struct platform_driver amiga_audio_driver __refdata = {
        .remove_new = __exit_p(amiga_audio_remove),
        .driver = {
                .name   = "amiga-audio",
index d36234b88fb4219fec68e3f68103d01ee4c224ab..941bfbf812ed305bbfb368771d66134703ba8bea 100644 (file)
@@ -255,7 +255,7 @@ lookup_voices(struct snd_emux *emu, struct snd_emu10k1 *hw,
                /* check if sample is finished playing (non-looping only) */
                if (bp != best + V_OFF && bp != best + V_FREE &&
                    (vp->reg.sample_mode & SNDRV_SFNT_SAMPLE_SINGLESHOT)) {
-                       val = snd_emu10k1_ptr_read(hw, CCCA_CURRADDR, vp->ch) - 64;
+                       val = snd_emu10k1_ptr_read(hw, CCCA_CURRADDR, vp->ch);
                        if (val >= vp->reg.loopstart)
                                bp = best + V_OFF;
                }
@@ -362,7 +362,7 @@ start_voice(struct snd_emux_voice *vp)
 
        map = (hw->silent_page.addr << hw->address_mode) | (hw->address_mode ? MAP_PTI_MASK1 : MAP_PTI_MASK0);
 
-       addr = vp->reg.start + 64;
+       addr = vp->reg.start;
        temp = vp->reg.parm.filterQ;
        ccca = (temp << 28) | addr;
        if (vp->apitch < 0xe400)
@@ -430,9 +430,6 @@ start_voice(struct snd_emux_voice *vp)
                /* Q & current address (Q 4bit value, MSB) */
                CCCA, ccca,
 
-               /* cache */
-               CCR, REG_VAL_PUT(CCR_CACHEINVALIDSIZE, 64),
-
                /* reset volume */
                VTFT, vtarget | vp->ftarget,
                CVCF, vtarget | CVCF_CURRENTFILTER_MASK,
index 72ec872afb8d27de1d2b23288988bf4a5e4f4b88..8fb688e4141485cdf1d29ba73ed25ddc17a2a936 100644 (file)
@@ -108,7 +108,10 @@ static const struct cs35l41_config cs35l41_config_table[] = {
        { "10431F12", 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 1000, 4500, 24 },
        { "10431F1F", 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, -1, 0, 0, 0, 0 },
        { "10431F62", 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 0, 0, 0 },
+       { "10433A60", 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 1000, 4500, 24 },
        { "17AA386F", 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, -1, -1, 0, 0, 0 },
+       { "17AA3877", 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 0, 0, 0 },
+       { "17AA3878", 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 0, 0, 0 },
        { "17AA38A9", 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 2, -1, 0, 0, 0 },
        { "17AA38AB", 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 2, -1, 0, 0, 0 },
        { "17AA38B4", 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 0, 0, 0 },
@@ -496,7 +499,10 @@ static const struct cs35l41_prop_model cs35l41_prop_model_table[] = {
        { "CSC3551", "10431F12", generic_dsd_config },
        { "CSC3551", "10431F1F", generic_dsd_config },
        { "CSC3551", "10431F62", generic_dsd_config },
+       { "CSC3551", "10433A60", generic_dsd_config },
        { "CSC3551", "17AA386F", generic_dsd_config },
+       { "CSC3551", "17AA3877", generic_dsd_config },
+       { "CSC3551", "17AA3878", generic_dsd_config },
        { "CSC3551", "17AA38A9", generic_dsd_config },
        { "CSC3551", "17AA38AB", generic_dsd_config },
        { "CSC3551", "17AA38B4", generic_dsd_config },
index 13beee807308f1763145cc1f9c1590e427236dc6..40f2f97944d54c916d58279a94f84e6be69b5bba 100644 (file)
@@ -56,10 +56,19 @@ static const struct i2c_device_id cs35l56_hda_i2c_id[] = {
        {}
 };
 
+static const struct acpi_device_id cs35l56_acpi_hda_match[] = {
+       { "CSC3554", 0 },
+       { "CSC3556", 0 },
+       { "CSC3557", 0 },
+       {}
+};
+MODULE_DEVICE_TABLE(acpi, cs35l56_acpi_hda_match);
+
 static struct i2c_driver cs35l56_hda_i2c_driver = {
        .driver = {
-               .name           = "cs35l56-hda",
-               .pm             = &cs35l56_hda_pm_ops,
+               .name             = "cs35l56-hda",
+               .acpi_match_table = cs35l56_acpi_hda_match,
+               .pm               = &cs35l56_hda_pm_ops,
        },
        .id_table       = cs35l56_hda_i2c_id,
        .probe          = cs35l56_hda_i2c_probe,
index a3b2fa76663d3685cf404e59785cdc00c502e493..7f02155fe61e3cd529e4c688bf6d734848baf45d 100644 (file)
@@ -56,10 +56,19 @@ static const struct spi_device_id cs35l56_hda_spi_id[] = {
        {}
 };
 
+static const struct acpi_device_id cs35l56_acpi_hda_match[] = {
+       { "CSC3554", 0 },
+       { "CSC3556", 0 },
+       { "CSC3557", 0 },
+       {}
+};
+MODULE_DEVICE_TABLE(acpi, cs35l56_acpi_hda_match);
+
 static struct spi_driver cs35l56_hda_spi_driver = {
        .driver = {
-               .name           = "cs35l56-hda",
-               .pm             = &cs35l56_hda_pm_ops,
+               .name             = "cs35l56-hda",
+               .acpi_match_table = cs35l56_acpi_hda_match,
+               .pm               = &cs35l56_hda_pm_ops,
        },
        .id_table       = cs35l56_hda_spi_id,
        .probe          = cs35l56_hda_spi_probe,
index a17c36a36aa5375fd8295911a2ffc707cb14263e..cdcb28aa9d7bf028d429aeea0016cec7c6bc0c22 100644 (file)
@@ -6875,11 +6875,38 @@ static void alc287_fixup_legion_16ithg6_speakers(struct hda_codec *cdc, const st
        comp_generic_fixup(cdc, action, "i2c", "CLSA0101", "-%s:00-cs35l41-hda.%d", 2);
 }
 
+static void cs35l56_fixup_i2c_two(struct hda_codec *cdc, const struct hda_fixup *fix, int action)
+{
+       comp_generic_fixup(cdc, action, "i2c", "CSC3556", "-%s:00-cs35l56-hda.%d", 2);
+}
+
+static void cs35l56_fixup_i2c_four(struct hda_codec *cdc, const struct hda_fixup *fix, int action)
+{
+       comp_generic_fixup(cdc, action, "i2c", "CSC3556", "-%s:00-cs35l56-hda.%d", 4);
+}
+
+static void cs35l56_fixup_spi_two(struct hda_codec *cdc, const struct hda_fixup *fix, int action)
+{
+       comp_generic_fixup(cdc, action, "spi", "CSC3556", "-%s:00-cs35l56-hda.%d", 2);
+}
+
 static void cs35l56_fixup_spi_four(struct hda_codec *cdc, const struct hda_fixup *fix, int action)
 {
        comp_generic_fixup(cdc, action, "spi", "CSC3556", "-%s:00-cs35l56-hda.%d", 4);
 }
 
+static void alc285_fixup_asus_ga403u(struct hda_codec *cdc, const struct hda_fixup *fix, int action)
+{
+       /*
+        * The same SSID has been re-used in different hardware; they have
+        * different codecs and the newer GA403U has an ALC285.
+        */
+       if (cdc->core.vendor_id == 0x10ec0285)
+               cs35l56_fixup_i2c_two(cdc, fix, action);
+       else
+               alc_fixup_inv_dmic(cdc, fix, action);
+}
+
 static void tas2781_fixup_i2c(struct hda_codec *cdc,
        const struct hda_fixup *fix, int action)
 {
@@ -7436,6 +7463,10 @@ enum {
        ALC256_FIXUP_ACER_SFG16_MICMUTE_LED,
        ALC256_FIXUP_HEADPHONE_AMP_VOL,
        ALC245_FIXUP_HP_SPECTRE_X360_EU0XXX,
+       ALC285_FIXUP_CS35L56_SPI_2,
+       ALC285_FIXUP_CS35L56_I2C_2,
+       ALC285_FIXUP_CS35L56_I2C_4,
+       ALC285_FIXUP_ASUS_GA403U,
 };
 
 /* A special fixup for Lenovo C940 and Yoga Duet 7;
@@ -9643,6 +9674,22 @@ static const struct hda_fixup alc269_fixups[] = {
                .type = HDA_FIXUP_FUNC,
                .v.func = alc245_fixup_hp_spectre_x360_eu0xxx,
        },
+       [ALC285_FIXUP_CS35L56_SPI_2] = {
+               .type = HDA_FIXUP_FUNC,
+               .v.func = cs35l56_fixup_spi_two,
+       },
+       [ALC285_FIXUP_CS35L56_I2C_2] = {
+               .type = HDA_FIXUP_FUNC,
+               .v.func = cs35l56_fixup_i2c_two,
+       },
+       [ALC285_FIXUP_CS35L56_I2C_4] = {
+               .type = HDA_FIXUP_FUNC,
+               .v.func = cs35l56_fixup_i2c_four,
+       },
+       [ALC285_FIXUP_ASUS_GA403U] = {
+               .type = HDA_FIXUP_FUNC,
+               .v.func = alc285_fixup_asus_ga403u,
+       },
 };
 
 static const struct snd_pci_quirk alc269_fixup_tbl[] = {
@@ -10096,7 +10143,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
        SND_PCI_QUIRK(0x1043, 0x1a83, "ASUS UM5302LA", ALC294_FIXUP_CS35L41_I2C_2),
        SND_PCI_QUIRK(0x1043, 0x1a8f, "ASUS UX582ZS", ALC245_FIXUP_CS35L41_SPI_2),
        SND_PCI_QUIRK(0x1043, 0x1b11, "ASUS UX431DA", ALC294_FIXUP_ASUS_COEF_1B),
-       SND_PCI_QUIRK(0x1043, 0x1b13, "Asus U41SV", ALC269_FIXUP_INV_DMIC),
+       SND_PCI_QUIRK(0x1043, 0x1b13, "ASUS U41SV/GA403U", ALC285_FIXUP_ASUS_GA403U),
        SND_PCI_QUIRK(0x1043, 0x1b93, "ASUS G614JVR/JIR", ALC245_FIXUP_CS35L41_SPI_2),
        SND_PCI_QUIRK(0x1043, 0x1bbd, "ASUS Z550MA", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE),
        SND_PCI_QUIRK(0x1043, 0x1c03, "ASUS UM3406HA", ALC287_FIXUP_CS35L41_I2C_2),
@@ -10104,6 +10151,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
        SND_PCI_QUIRK(0x1043, 0x1c33, "ASUS UX5304MA", ALC245_FIXUP_CS35L41_SPI_2),
        SND_PCI_QUIRK(0x1043, 0x1c43, "ASUS UX8406MA", ALC245_FIXUP_CS35L41_SPI_2),
        SND_PCI_QUIRK(0x1043, 0x1c62, "ASUS GU603", ALC289_FIXUP_ASUS_GA401),
+       SND_PCI_QUIRK(0x1043, 0x1c63, "ASUS GU605M", ALC285_FIXUP_CS35L56_SPI_2),
        SND_PCI_QUIRK(0x1043, 0x1c92, "ASUS ROG Strix G15", ALC285_FIXUP_ASUS_G533Z_PINS),
        SND_PCI_QUIRK(0x1043, 0x1c9f, "ASUS G614JU/JV/JI", ALC285_FIXUP_ASUS_HEADSET_MIC),
        SND_PCI_QUIRK(0x1043, 0x1caf, "ASUS G634JY/JZ/JI/JG", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
@@ -10115,11 +10163,14 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
        SND_PCI_QUIRK(0x1043, 0x1d42, "ASUS Zephyrus G14 2022", ALC289_FIXUP_ASUS_GA401),
        SND_PCI_QUIRK(0x1043, 0x1d4e, "ASUS TM420", ALC256_FIXUP_ASUS_HPE),
        SND_PCI_QUIRK(0x1043, 0x1da2, "ASUS UP6502ZA/ZD", ALC245_FIXUP_CS35L41_SPI_2),
+       SND_PCI_QUIRK(0x1043, 0x1df3, "ASUS UM5606", ALC285_FIXUP_CS35L56_I2C_4),
        SND_PCI_QUIRK(0x1043, 0x1e02, "ASUS UX3402ZA", ALC245_FIXUP_CS35L41_SPI_2),
        SND_PCI_QUIRK(0x1043, 0x1e11, "ASUS Zephyrus G15", ALC289_FIXUP_ASUS_GA502),
        SND_PCI_QUIRK(0x1043, 0x1e12, "ASUS UM3402", ALC287_FIXUP_CS35L41_I2C_2),
        SND_PCI_QUIRK(0x1043, 0x1e51, "ASUS Zephyrus M15", ALC294_FIXUP_ASUS_GU502_PINS),
        SND_PCI_QUIRK(0x1043, 0x1e5e, "ASUS ROG Strix G513", ALC294_FIXUP_ASUS_G513_PINS),
+       SND_PCI_QUIRK(0x1043, 0x1e63, "ASUS H7606W", ALC285_FIXUP_CS35L56_I2C_2),
+       SND_PCI_QUIRK(0x1043, 0x1e83, "ASUS GA605W", ALC285_FIXUP_CS35L56_I2C_2),
        SND_PCI_QUIRK(0x1043, 0x1e8e, "ASUS Zephyrus G15", ALC289_FIXUP_ASUS_GA401),
        SND_PCI_QUIRK(0x1043, 0x1ee2, "ASUS UM6702RA/RC", ALC287_FIXUP_CS35L41_I2C_2),
        SND_PCI_QUIRK(0x1043, 0x1c52, "ASUS Zephyrus G15 2022", ALC289_FIXUP_ASUS_GA401),
@@ -10133,7 +10184,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
        SND_PCI_QUIRK(0x1043, 0x3a30, "ASUS G814JVR/JIR", ALC245_FIXUP_CS35L41_SPI_2),
        SND_PCI_QUIRK(0x1043, 0x3a40, "ASUS G814JZR", ALC245_FIXUP_CS35L41_SPI_2),
        SND_PCI_QUIRK(0x1043, 0x3a50, "ASUS G834JYR/JZR", ALC245_FIXUP_CS35L41_SPI_2),
-       SND_PCI_QUIRK(0x1043, 0x3a60, "ASUS G634JYR/JZR", ALC245_FIXUP_CS35L41_SPI_2),
+       SND_PCI_QUIRK(0x1043, 0x3a60, "ASUS G634JYR/JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
        SND_PCI_QUIRK(0x1043, 0x831a, "ASUS P901", ALC269_FIXUP_STEREO_DMIC),
        SND_PCI_QUIRK(0x1043, 0x834a, "ASUS S101", ALC269_FIXUP_STEREO_DMIC),
        SND_PCI_QUIRK(0x1043, 0x8398, "ASUS P1005", ALC269_FIXUP_STEREO_DMIC),
@@ -10159,7 +10210,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
        SND_PCI_QUIRK(0x10ec, 0x1254, "Intel Reference board", ALC295_FIXUP_CHROME_BOOK),
        SND_PCI_QUIRK(0x10ec, 0x12cc, "Intel Reference board", ALC295_FIXUP_CHROME_BOOK),
        SND_PCI_QUIRK(0x10ec, 0x12f6, "Intel Reference board", ALC295_FIXUP_CHROME_BOOK),
-       SND_PCI_QUIRK(0x10f7, 0x8338, "Panasonic CF-SZ6", ALC269_FIXUP_HEADSET_MODE),
+       SND_PCI_QUIRK(0x10f7, 0x8338, "Panasonic CF-SZ6", ALC269_FIXUP_ASPIRE_HEADSET_MIC),
        SND_PCI_QUIRK(0x144d, 0xc109, "Samsung Ativ book 9 (NP900X3G)", ALC269_FIXUP_INV_DMIC),
        SND_PCI_QUIRK(0x144d, 0xc169, "Samsung Notebook 9 Pen (NP930SBE-K01US)", ALC298_FIXUP_SAMSUNG_AMP),
        SND_PCI_QUIRK(0x144d, 0xc176, "Samsung Notebook 9 Pro (NP930MBE-K04US)", ALC298_FIXUP_SAMSUNG_AMP),
@@ -10333,6 +10384,8 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
        SND_PCI_QUIRK(0x17aa, 0x3869, "Lenovo Yoga7 14IAL7", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN),
        SND_PCI_QUIRK(0x17aa, 0x386f, "Legion 7i 16IAX7", ALC287_FIXUP_CS35L41_I2C_2),
        SND_PCI_QUIRK(0x17aa, 0x3870, "Lenovo Yoga 7 14ARB7", ALC287_FIXUP_YOGA7_14ARB7_I2C),
+       SND_PCI_QUIRK(0x17aa, 0x3877, "Lenovo Legion 7 Slim 16ARHA7", ALC287_FIXUP_CS35L41_I2C_2),
+       SND_PCI_QUIRK(0x17aa, 0x3878, "Lenovo Legion 7 Slim 16ARHA7", ALC287_FIXUP_CS35L41_I2C_2),
        SND_PCI_QUIRK(0x17aa, 0x387d, "Yoga S780-16 pro Quad AAC", ALC287_FIXUP_TAS2781_I2C),
        SND_PCI_QUIRK(0x17aa, 0x387e, "Yoga S780-16 pro Quad YC", ALC287_FIXUP_TAS2781_I2C),
        SND_PCI_QUIRK(0x17aa, 0x3881, "YB9 dual power mode2 YC", ALC287_FIXUP_TAS2781_I2C),
@@ -10403,6 +10456,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
        SND_PCI_QUIRK(0x1d05, 0x1147, "TongFang GMxTGxx", ALC269_FIXUP_NO_SHUTUP),
        SND_PCI_QUIRK(0x1d05, 0x115c, "TongFang GMxTGxx", ALC269_FIXUP_NO_SHUTUP),
        SND_PCI_QUIRK(0x1d05, 0x121b, "TongFang GMxAGxx", ALC269_FIXUP_NO_SHUTUP),
+       SND_PCI_QUIRK(0x1d05, 0x1387, "TongFang GMxIXxx", ALC2XX_FIXUP_HEADSET_MIC),
        SND_PCI_QUIRK(0x1d72, 0x1602, "RedmiBook", ALC255_FIXUP_XIAOMI_HEADSET_MIC),
        SND_PCI_QUIRK(0x1d72, 0x1701, "XiaomiNotebook Pro", ALC298_FIXUP_DELL1_MIC_NO_PRESENCE),
        SND_PCI_QUIRK(0x1d72, 0x1901, "RedmiBook 14", ALC256_FIXUP_ASUS_HEADSET_MIC),
index 8c8b1dcac6281c1d7448a529ba316d5346fd9030..5f35b90eab8d3f1aa46e6d4ea6bdd8d6d49638a7 100644 (file)
@@ -115,7 +115,10 @@ static int acp_pci_probe(struct pci_dev *pci, const struct pci_device_id *pci_id
                goto unregister_dmic_dev;
        }
 
-       acp_init(chip);
+       ret = acp_init(chip);
+       if (ret)
+               goto unregister_dmic_dev;
+
        res = devm_kcalloc(&pci->dev, num_res, sizeof(struct resource), GFP_KERNEL);
        if (!res) {
                ret = -ENOMEM;
@@ -133,11 +136,9 @@ static int acp_pci_probe(struct pci_dev *pci, const struct pci_device_id *pci_id
                }
        }
 
-       if (flag == FLAG_AMD_LEGACY_ONLY_DMIC) {
-               ret = check_acp_pdm(pci, chip);
-               if (ret < 0)
-                       goto skip_pdev_creation;
-       }
+       ret = check_acp_pdm(pci, chip);
+       if (ret < 0)
+               goto skip_pdev_creation;
 
        chip->flag = flag;
        memset(&pdevinfo, 0, sizeof(pdevinfo));
index 01ef4db5407da52cc04c6813ba07970c24d91541..287ac01a387357beffe2a8f76ab5f3ec15dc7b3d 100644 (file)
@@ -56,6 +56,11 @@ static int _cs_amp_write_cal_coeffs(struct cs_dsp *dsp,
        dev_dbg(dsp->dev, "Calibration: Ambient=%#x, Status=%#x, CalR=%d\n",
                data->calAmbient, data->calStatus, data->calR);
 
+       if (list_empty(&dsp->ctl_list)) {
+               dev_info(dsp->dev, "Calibration disabled due to missing firmware controls\n");
+               return -ENOENT;
+       }
+
        ret = cs_amp_write_cal_coeff(dsp, controls, controls->ambient, data->calAmbient);
        if (ret)
                return ret;
index 860d5cda67bffe83c5c4934497cdf8c1c510922b..94685449f0f48c9b7bd534b1947da07ab26fad53 100644 (file)
@@ -2364,7 +2364,8 @@ static int cs42l43_codec_runtime_resume(struct device *dev)
 
 static int cs42l43_codec_suspend(struct device *dev)
 {
-       struct cs42l43 *cs42l43 = dev_get_drvdata(dev);
+       struct cs42l43_codec *priv = dev_get_drvdata(dev);
+       struct cs42l43 *cs42l43 = priv->core;
 
        disable_irq(cs42l43->irq);
 
@@ -2373,7 +2374,8 @@ static int cs42l43_codec_suspend(struct device *dev)
 
 static int cs42l43_codec_suspend_noirq(struct device *dev)
 {
-       struct cs42l43 *cs42l43 = dev_get_drvdata(dev);
+       struct cs42l43_codec *priv = dev_get_drvdata(dev);
+       struct cs42l43 *cs42l43 = priv->core;
 
        enable_irq(cs42l43->irq);
 
@@ -2382,7 +2384,8 @@ static int cs42l43_codec_suspend_noirq(struct device *dev)
 
 static int cs42l43_codec_resume(struct device *dev)
 {
-       struct cs42l43 *cs42l43 = dev_get_drvdata(dev);
+       struct cs42l43_codec *priv = dev_get_drvdata(dev);
+       struct cs42l43 *cs42l43 = priv->core;
 
        enable_irq(cs42l43->irq);
 
@@ -2391,7 +2394,8 @@ static int cs42l43_codec_resume(struct device *dev)
 
 static int cs42l43_codec_resume_noirq(struct device *dev)
 {
-       struct cs42l43 *cs42l43 = dev_get_drvdata(dev);
+       struct cs42l43_codec *priv = dev_get_drvdata(dev);
+       struct cs42l43 *cs42l43 = priv->core;
 
        disable_irq(cs42l43->irq);
 
index 15289dadafea091d2693149e600d72e0cbb975c0..17bd6b5160772e01d8597767868a9d2472cae276 100644 (file)
@@ -412,9 +412,9 @@ static const struct _coeff_div coeff_div_v3[] = {
        {125, 48000, 6000000, 0x04, 0x04, 0x1F, 0x2D, 0x8A, 0x0A, 0x27, 0x27},
 
        {128, 8000, 1024000, 0x60, 0x00, 0x05, 0x75, 0x8A, 0x1B, 0x1F, 0x7F},
-       {128, 16000, 2048000, 0x20, 0x00, 0x31, 0x35, 0x8A, 0x1B, 0x1F, 0x3F},
-       {128, 44100, 5644800, 0xE0, 0x00, 0x01, 0x2D, 0xCA, 0x0A, 0x1F, 0x1F},
-       {128, 48000, 6144000, 0xE0, 0x00, 0x01, 0x2D, 0xCA, 0x0A, 0x1F, 0x1F},
+       {128, 16000, 2048000, 0x20, 0x00, 0x31, 0x35, 0x08, 0x19, 0x1F, 0x3F},
+       {128, 44100, 5644800, 0xE0, 0x00, 0x01, 0x2D, 0x48, 0x08, 0x1F, 0x1F},
+       {128, 48000, 6144000, 0xE0, 0x00, 0x01, 0x2D, 0x48, 0x08, 0x1F, 0x1F},
        {144, 8000, 1152000, 0x20, 0x00, 0x03, 0x35, 0x8A, 0x1B, 0x23, 0x47},
        {144, 16000, 2304000, 0x20, 0x00, 0x11, 0x35, 0x8A, 0x1B, 0x23, 0x47},
        {192, 8000, 1536000, 0x60, 0x02, 0x0D, 0x75, 0x8A, 0x1B, 0x1F, 0x7F},
@@ -423,10 +423,10 @@ static const struct _coeff_div coeff_div_v3[] = {
 
        {200, 48000, 9600000, 0x04, 0x04, 0x0F, 0x2D, 0xCA, 0x0A, 0x1F, 0x1F},
        {250, 48000, 12000000, 0x04, 0x04, 0x0F, 0x2D, 0xCA, 0x0A, 0x27, 0x27},
-       {256, 8000, 2048000, 0x60, 0x00, 0x31, 0x35, 0x8A, 0x1B, 0x1F, 0x7F},
-       {256, 16000, 4096000, 0x20, 0x00, 0x01, 0x35, 0x8A, 0x1B, 0x1F, 0x3F},
-       {256, 44100, 11289600, 0xE0, 0x00, 0x30, 0x2D, 0xCA, 0x0A, 0x1F, 0x1F},
-       {256, 48000, 12288000, 0xE0, 0x00, 0x30, 0x2D, 0xCA, 0x0A, 0x1F, 0x1F},
+       {256, 8000, 2048000, 0x60, 0x00, 0x31, 0x35, 0x08, 0x19, 0x1F, 0x7F},
+       {256, 16000, 4096000, 0x20, 0x00, 0x01, 0x35, 0x08, 0x19, 0x1F, 0x3F},
+       {256, 44100, 11289600, 0xE0, 0x01, 0x01, 0x2D, 0x48, 0x08, 0x1F, 0x1F},
+       {256, 48000, 12288000, 0xE0, 0x01, 0x01, 0x2D, 0x48, 0x08, 0x1F, 0x1F},
        {288, 8000, 2304000, 0x20, 0x00, 0x01, 0x35, 0x8A, 0x1B, 0x23, 0x47},
        {384, 8000, 3072000, 0x60, 0x02, 0x05, 0x75, 0x8A, 0x1B, 0x1F, 0x7F},
        {384, 16000, 6144000, 0x20, 0x02, 0x03, 0x35, 0x8A, 0x1B, 0x1F, 0x3F},
@@ -435,10 +435,10 @@ static const struct _coeff_div coeff_div_v3[] = {
 
        {400, 48000, 19200000, 0xE4, 0x04, 0x35, 0x6d, 0xCA, 0x0A, 0x1F, 0x1F},
        {500, 48000, 24000000, 0xF8, 0x04, 0x3F, 0x6D, 0xCA, 0x0A, 0x1F, 0x1F},
-       {512, 8000, 4096000, 0x60, 0x00, 0x01, 0x35, 0x8A, 0x1B, 0x1F, 0x7F},
-       {512, 16000, 8192000, 0x20, 0x00, 0x30, 0x35, 0x8A, 0x1B, 0x1F, 0x3F},
-       {512, 44100, 22579200, 0xE0, 0x00, 0x00, 0x2D, 0xCA, 0x0A, 0x1F, 0x1F},
-       {512, 48000, 24576000, 0xE0, 0x00, 0x00, 0x2D, 0xCA, 0x0A, 0x1F, 0x1F},
+       {512, 8000, 4096000, 0x60, 0x00, 0x01, 0x08, 0x19, 0x1B, 0x1F, 0x7F},
+       {512, 16000, 8192000, 0x20, 0x00, 0x30, 0x35, 0x08, 0x19, 0x1F, 0x3F},
+       {512, 44100, 22579200, 0xE0, 0x00, 0x00, 0x2D, 0x48, 0x08, 0x1F, 0x1F},
+       {512, 48000, 24576000, 0xE0, 0x00, 0x00, 0x2D, 0x48, 0x08, 0x1F, 0x1F},
        {768, 8000, 6144000, 0x60, 0x02, 0x11, 0x35, 0x8A, 0x1B, 0x1F, 0x7F},
        {768, 16000, 12288000, 0x20, 0x02, 0x01, 0x35, 0x8A, 0x1B, 0x1F, 0x3F},
        {768, 32000, 24576000, 0xE0, 0x02, 0x30, 0x2D, 0xCA, 0x0A, 0x1F, 0x1F},
@@ -835,7 +835,6 @@ static void es8326_jack_detect_handler(struct work_struct *work)
                        dev_dbg(comp->dev, "Report hp remove event\n");
                        snd_soc_jack_report(es8326->jack, 0, SND_JACK_HEADSET);
                        /* mute adc when mic path switch */
-                       regmap_write(es8326->regmap, ES8326_ADC_SCALE, 0x33);
                        regmap_write(es8326->regmap, ES8326_ADC1_SRC, 0x44);
                        regmap_write(es8326->regmap, ES8326_ADC2_SRC, 0x66);
                        es8326->hp = 0;
@@ -843,6 +842,7 @@ static void es8326_jack_detect_handler(struct work_struct *work)
                regmap_update_bits(es8326->regmap, ES8326_HPDET_TYPE, 0x03, 0x01);
                regmap_write(es8326->regmap, ES8326_SYS_BIAS, 0x0a);
                regmap_update_bits(es8326->regmap, ES8326_HP_DRIVER_REF, 0x0f, 0x03);
+               regmap_write(es8326->regmap, ES8326_INT_SOURCE, ES8326_INT_SRC_PIN9);
                /*
                 * Inverted HPJACK_POL bit to trigger one IRQ to double check HP Removal event
                 */
@@ -865,6 +865,8 @@ static void es8326_jack_detect_handler(struct work_struct *work)
                         * set auto-check mode, then restart jack_detect_work after 400ms.
                         * Don't report jack status.
                         */
+                       regmap_write(es8326->regmap, ES8326_INT_SOURCE,
+                                       (ES8326_INT_SRC_PIN9 | ES8326_INT_SRC_BUTTON));
                        regmap_update_bits(es8326->regmap, ES8326_HPDET_TYPE, 0x03, 0x01);
                        es8326_enable_micbias(es8326->component);
                        usleep_range(50000, 70000);
@@ -891,7 +893,6 @@ static void es8326_jack_detect_handler(struct work_struct *work)
                        snd_soc_jack_report(es8326->jack,
                                        SND_JACK_HEADSET, SND_JACK_HEADSET);
 
-                       regmap_write(es8326->regmap, ES8326_ADC_SCALE, 0x33);
                        regmap_update_bits(es8326->regmap, ES8326_PGA_PDN,
                                        0x08, 0x08);
                        regmap_update_bits(es8326->regmap, ES8326_PGAGAIN,
@@ -987,7 +988,7 @@ static int es8326_resume(struct snd_soc_component *component)
        regmap_write(es8326->regmap, ES8326_VMIDSEL, 0x0E);
        regmap_write(es8326->regmap, ES8326_ANA_LP, 0xf0);
        usleep_range(10000, 15000);
-       regmap_write(es8326->regmap, ES8326_HPJACK_TIMER, 0xe9);
+       regmap_write(es8326->regmap, ES8326_HPJACK_TIMER, 0xd9);
        regmap_write(es8326->regmap, ES8326_ANA_MICBIAS, 0xcb);
        /* set headphone default type and detect pin */
        regmap_write(es8326->regmap, ES8326_HPDET_TYPE, 0x83);
@@ -1038,8 +1039,7 @@ static int es8326_resume(struct snd_soc_component *component)
        es8326_enable_micbias(es8326->component);
        usleep_range(50000, 70000);
        regmap_update_bits(es8326->regmap, ES8326_HPDET_TYPE, 0x03, 0x00);
-       regmap_write(es8326->regmap, ES8326_INT_SOURCE,
-                   (ES8326_INT_SRC_PIN9 | ES8326_INT_SRC_BUTTON));
+       regmap_write(es8326->regmap, ES8326_INT_SOURCE, ES8326_INT_SRC_PIN9);
        regmap_write(es8326->regmap, ES8326_INTOUT_IO,
                     es8326->interrupt_clk);
        regmap_write(es8326->regmap, ES8326_SDINOUT1_IO,
@@ -1060,6 +1060,8 @@ static int es8326_resume(struct snd_soc_component *component)
        es8326->hp = 0;
        es8326->hpl_vol = 0x03;
        es8326->hpr_vol = 0x03;
+
+       es8326_irq(es8326->irq, es8326);
        return 0;
 }
 
@@ -1070,6 +1072,9 @@ static int es8326_suspend(struct snd_soc_component *component)
        cancel_delayed_work_sync(&es8326->jack_detect_work);
        es8326_disable_micbias(component);
        es8326->calibrated = false;
+       regmap_write(es8326->regmap, ES8326_CLK_MUX, 0x2d);
+       regmap_write(es8326->regmap, ES8326_DAC2HPMIX, 0x00);
+       regmap_write(es8326->regmap, ES8326_ANA_PDN, 0x3b);
        regmap_write(es8326->regmap, ES8326_CLK_CTL, ES8326_CLK_OFF);
        regcache_cache_only(es8326->regmap, true);
        regcache_mark_dirty(es8326->regmap);
index ee12caef810532380cdf1b5d6b0b204afef78e63..c3e52e7bdef57de0377cb7b467bf6fd8fd62b8c9 100644 (file)
 #define ES8326_MUTE (3 << 0)
 
 /* ES8326_CLK_CTL */
-#define ES8326_CLK_ON (0x7e << 0)
+#define ES8326_CLK_ON (0x7f << 0)
 #define ES8326_CLK_OFF (0 << 0)
 
 /* ES8326_CLK_INV */
index 47511f70119ae3b1d810ce8561d6026ccbbd98da..0b3bf920bcab2307c0107387e0ad728552bb6b9c 100644 (file)
@@ -537,7 +537,7 @@ static int rt1316_sdw_hw_params(struct snd_pcm_substream *substream,
        retval = sdw_stream_add_slave(rt1316->sdw_slave, &stream_config,
                                &port_config, 1, sdw_stream);
        if (retval) {
-               dev_err(dai->dev, "Unable to configure port\n");
+               dev_err(dai->dev, "%s: Unable to configure port\n", __func__);
                return retval;
        }
 
@@ -577,12 +577,12 @@ static int rt1316_sdw_parse_dt(struct rt1316_sdw_priv *rt1316, struct device *de
        if (rt1316->bq_params_cnt) {
                rt1316->bq_params = devm_kzalloc(dev, rt1316->bq_params_cnt, GFP_KERNEL);
                if (!rt1316->bq_params) {
-                       dev_err(dev, "Could not allocate bq_params memory\n");
+                       dev_err(dev, "%s: Could not allocate bq_params memory\n", __func__);
                        ret = -ENOMEM;
                } else {
                        ret = device_property_read_u8_array(dev, "realtek,bq-params", rt1316->bq_params, rt1316->bq_params_cnt);
                        if (ret < 0)
-                               dev_err(dev, "Could not read list of realtek,bq-params\n");
+                               dev_err(dev, "%s: Could not read list of realtek,bq-params\n", __func__);
                }
        }
 
@@ -759,7 +759,7 @@ static int __maybe_unused rt1316_dev_resume(struct device *dev)
        time = wait_for_completion_timeout(&slave->initialization_complete,
                                msecs_to_jiffies(RT1316_PROBE_TIMEOUT));
        if (!time) {
-               dev_err(&slave->dev, "Initialization not complete, timed out\n");
+               dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__);
                sdw_show_ping_status(slave->bus, true);
 
                return -ETIMEDOUT;
index ff364bde4a084943d78da479dc876f7b328eb02b..462c9a4b1be5ddb27c078b4c49e9c2ee3737e467 100644 (file)
@@ -606,7 +606,7 @@ static int rt1318_sdw_hw_params(struct snd_pcm_substream *substream,
        retval = sdw_stream_add_slave(rt1318->sdw_slave, &stream_config,
                                &port_config, 1, sdw_stream);
        if (retval) {
-               dev_err(dai->dev, "Unable to configure port\n");
+               dev_err(dai->dev, "%s: Unable to configure port\n", __func__);
                return retval;
        }
 
@@ -631,8 +631,8 @@ static int rt1318_sdw_hw_params(struct snd_pcm_substream *substream,
                sampling_rate = RT1318_SDCA_RATE_192000HZ;
                break;
        default:
-               dev_err(component->dev, "Rate %d is not supported\n",
-                       params_rate(params));
+               dev_err(component->dev, "%s: Rate %d is not supported\n",
+                       __func__, params_rate(params));
                return -EINVAL;
        }
 
@@ -835,7 +835,7 @@ static int __maybe_unused rt1318_dev_resume(struct device *dev)
        time = wait_for_completion_timeout(&slave->initialization_complete,
                                msecs_to_jiffies(RT1318_PROBE_TIMEOUT));
        if (!time) {
-               dev_err(&slave->dev, "Initialization not complete, timed out\n");
+               dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__);
                return -ETIMEDOUT;
        }
 
index e67c2e19cb1a7291170ada3b69cbdda4aadb8b6c..f9ee42c13dbac34afd0f79ff5299050106d82357 100644 (file)
@@ -132,7 +132,7 @@ static int rt5682_sdw_hw_params(struct snd_pcm_substream *substream,
        retval = sdw_stream_add_slave(rt5682->slave, &stream_config,
                                      &port_config, 1, sdw_stream);
        if (retval) {
-               dev_err(dai->dev, "Unable to configure port\n");
+               dev_err(dai->dev, "%s: Unable to configure port\n", __func__);
                return retval;
        }
 
@@ -315,8 +315,8 @@ static int rt5682_sdw_init(struct device *dev, struct regmap *regmap,
                                          &rt5682_sdw_indirect_regmap);
        if (IS_ERR(rt5682->regmap)) {
                ret = PTR_ERR(rt5682->regmap);
-               dev_err(dev, "Failed to allocate register map: %d\n",
-                       ret);
+               dev_err(dev, "%s: Failed to allocate register map: %d\n",
+                       __func__, ret);
                return ret;
        }
 
@@ -400,7 +400,7 @@ static int rt5682_io_init(struct device *dev, struct sdw_slave *slave)
        }
 
        if (val != DEVICE_ID) {
-               dev_err(dev, "Device with ID register %x is not rt5682\n", val);
+               dev_err(dev, "%s: Device with ID register %x is not rt5682\n", __func__, val);
                ret = -ENODEV;
                goto err_nodev;
        }
@@ -648,7 +648,7 @@ static int rt5682_bus_config(struct sdw_slave *slave,
 
        ret = rt5682_clock_config(&slave->dev);
        if (ret < 0)
-               dev_err(&slave->dev, "Invalid clk config");
+               dev_err(&slave->dev, "%s: Invalid clk config", __func__);
 
        return ret;
 }
@@ -763,19 +763,19 @@ static int __maybe_unused rt5682_dev_resume(struct device *dev)
                return 0;
 
        if (!slave->unattach_request) {
+               mutex_lock(&rt5682->disable_irq_lock);
                if (rt5682->disable_irq == true) {
-                       mutex_lock(&rt5682->disable_irq_lock);
                        sdw_write_no_pm(slave, SDW_SCP_INTMASK1, SDW_SCP_INT1_IMPL_DEF);
                        rt5682->disable_irq = false;
-                       mutex_unlock(&rt5682->disable_irq_lock);
                }
+               mutex_unlock(&rt5682->disable_irq_lock);
                goto regmap_sync;
        }
 
        time = wait_for_completion_timeout(&slave->initialization_complete,
                                msecs_to_jiffies(RT5682_PROBE_TIMEOUT));
        if (!time) {
-               dev_err(&slave->dev, "Initialization not complete, timed out\n");
+               dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__);
                sdw_show_ping_status(slave->bus, true);
 
                return -ETIMEDOUT;
index 0ebf344a1b6094a38a4c38c6817b8cf0c9242f48..434b926f96c8376c1c5b8b73c37b01b2d64641fe 100644 (file)
@@ -37,8 +37,8 @@ static int rt700_index_write(struct regmap *regmap,
 
        ret = regmap_write(regmap, addr, value);
        if (ret < 0)
-               pr_err("Failed to set private value: %06x <= %04x ret=%d\n",
-                       addr, value, ret);
+               pr_err("%s: Failed to set private value: %06x <= %04x ret=%d\n",
+                      __func__, addr, value, ret);
 
        return ret;
 }
@@ -52,8 +52,8 @@ static int rt700_index_read(struct regmap *regmap,
        *value = 0;
        ret = regmap_read(regmap, addr, value);
        if (ret < 0)
-               pr_err("Failed to get private value: %06x => %04x ret=%d\n",
-                       addr, *value, ret);
+               pr_err("%s: Failed to get private value: %06x => %04x ret=%d\n",
+                      __func__, addr, *value, ret);
 
        return ret;
 }
@@ -930,14 +930,14 @@ static int rt700_pcm_hw_params(struct snd_pcm_substream *substream,
                port_config.num += 2;
                break;
        default:
-               dev_err(component->dev, "Invalid DAI id %d\n", dai->id);
+               dev_err(component->dev, "%s: Invalid DAI id %d\n", __func__, dai->id);
                return -EINVAL;
        }
 
        retval = sdw_stream_add_slave(rt700->slave, &stream_config,
                                        &port_config, 1, sdw_stream);
        if (retval) {
-               dev_err(dai->dev, "Unable to configure port\n");
+               dev_err(dai->dev, "%s: Unable to configure port\n", __func__);
                return retval;
        }
 
@@ -945,8 +945,8 @@ static int rt700_pcm_hw_params(struct snd_pcm_substream *substream,
                /* bit 3:0 Number of Channel */
                val |= (params_channels(params) - 1);
        } else {
-               dev_err(component->dev, "Unsupported channels %d\n",
-                       params_channels(params));
+               dev_err(component->dev, "%s: Unsupported channels %d\n",
+                       __func__, params_channels(params));
                return -EINVAL;
        }
 
index 935e597022d3242187b378107e302a36bedd17f5..2636c2eea4bc8be6af732d3c2d5f03b8d78be22f 100644 (file)
@@ -438,20 +438,20 @@ static int __maybe_unused rt711_sdca_dev_resume(struct device *dev)
                return 0;
 
        if (!slave->unattach_request) {
+               mutex_lock(&rt711->disable_irq_lock);
                if (rt711->disable_irq == true) {
-                       mutex_lock(&rt711->disable_irq_lock);
                        sdw_write_no_pm(slave, SDW_SCP_SDCA_INTMASK1, SDW_SCP_SDCA_INTMASK_SDCA_0);
                        sdw_write_no_pm(slave, SDW_SCP_SDCA_INTMASK2, SDW_SCP_SDCA_INTMASK_SDCA_8);
                        rt711->disable_irq = false;
-                       mutex_unlock(&rt711->disable_irq_lock);
                }
+               mutex_unlock(&rt711->disable_irq_lock);
                goto regmap_sync;
        }
 
        time = wait_for_completion_timeout(&slave->initialization_complete,
                                msecs_to_jiffies(RT711_PROBE_TIMEOUT));
        if (!time) {
-               dev_err(&slave->dev, "Initialization not complete, timed out\n");
+               dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__);
                sdw_show_ping_status(slave->bus, true);
 
                return -ETIMEDOUT;
index 447154cb60104d31bb66ef268d7b60c513e273e3..1e8dbfc3ecd969be3a87cb5f00aed853a56e2a41 100644 (file)
@@ -36,8 +36,8 @@ static int rt711_sdca_index_write(struct rt711_sdca_priv *rt711,
        ret = regmap_write(regmap, addr, value);
        if (ret < 0)
                dev_err(&rt711->slave->dev,
-                       "Failed to set private value: %06x <= %04x ret=%d\n",
-                       addr, value, ret);
+                       "%s: Failed to set private value: %06x <= %04x ret=%d\n",
+                       __func__, addr, value, ret);
 
        return ret;
 }
@@ -52,8 +52,8 @@ static int rt711_sdca_index_read(struct rt711_sdca_priv *rt711,
        ret = regmap_read(regmap, addr, value);
        if (ret < 0)
                dev_err(&rt711->slave->dev,
-                       "Failed to get private value: %06x => %04x ret=%d\n",
-                       addr, *value, ret);
+                       "%s: Failed to get private value: %06x => %04x ret=%d\n",
+                       __func__, addr, *value, ret);
 
        return ret;
 }
@@ -1293,13 +1293,13 @@ static int rt711_sdca_pcm_hw_params(struct snd_pcm_substream *substream,
        retval = sdw_stream_add_slave(rt711->slave, &stream_config,
                                        &port_config, 1, sdw_stream);
        if (retval) {
-               dev_err(dai->dev, "Unable to configure port\n");
+               dev_err(dai->dev, "%s: Unable to configure port\n", __func__);
                return retval;
        }
 
        if (params_channels(params) > 16) {
-               dev_err(component->dev, "Unsupported channels %d\n",
-                       params_channels(params));
+               dev_err(component->dev, "%s: Unsupported channels %d\n",
+                       __func__, params_channels(params));
                return -EINVAL;
        }
 
@@ -1318,8 +1318,8 @@ static int rt711_sdca_pcm_hw_params(struct snd_pcm_substream *substream,
                sampling_rate = RT711_SDCA_RATE_192000HZ;
                break;
        default:
-               dev_err(component->dev, "Rate %d is not supported\n",
-                       params_rate(params));
+               dev_err(component->dev, "%s: Rate %d is not supported\n",
+                       __func__, params_rate(params));
                return -EINVAL;
        }
 
index 3f5773310ae8cc3b5d94f76aa481724ac35bad0a..0d3b43dd22e63d2343b0ce882166ca7bedc7b67c 100644 (file)
@@ -408,7 +408,7 @@ static int rt711_bus_config(struct sdw_slave *slave,
 
        ret = rt711_clock_config(&slave->dev);
        if (ret < 0)
-               dev_err(&slave->dev, "Invalid clk config");
+               dev_err(&slave->dev, "%s: Invalid clk config", __func__);
 
        return ret;
 }
@@ -536,19 +536,19 @@ static int __maybe_unused rt711_dev_resume(struct device *dev)
                return 0;
 
        if (!slave->unattach_request) {
+               mutex_lock(&rt711->disable_irq_lock);
                if (rt711->disable_irq == true) {
-                       mutex_lock(&rt711->disable_irq_lock);
                        sdw_write_no_pm(slave, SDW_SCP_INTMASK1, SDW_SCP_INT1_IMPL_DEF);
                        rt711->disable_irq = false;
-                       mutex_unlock(&rt711->disable_irq_lock);
                }
+               mutex_unlock(&rt711->disable_irq_lock);
                goto regmap_sync;
        }
 
        time = wait_for_completion_timeout(&slave->initialization_complete,
                                msecs_to_jiffies(RT711_PROBE_TIMEOUT));
        if (!time) {
-               dev_err(&slave->dev, "Initialization not complete, timed out\n");
+               dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__);
                return -ETIMEDOUT;
        }
 
index 66eaed13b0d6a06ff1a649be6924e16850e65997..5446f9506a16722e8a43571631d109fa27c9fe65 100644 (file)
@@ -37,8 +37,8 @@ static int rt711_index_write(struct regmap *regmap,
 
        ret = regmap_write(regmap, addr, value);
        if (ret < 0)
-               pr_err("Failed to set private value: %06x <= %04x ret=%d\n",
-                       addr, value, ret);
+               pr_err("%s: Failed to set private value: %06x <= %04x ret=%d\n",
+                      __func__, addr, value, ret);
 
        return ret;
 }
@@ -52,8 +52,8 @@ static int rt711_index_read(struct regmap *regmap,
        *value = 0;
        ret = regmap_read(regmap, addr, value);
        if (ret < 0)
-               pr_err("Failed to get private value: %06x => %04x ret=%d\n",
-                       addr, *value, ret);
+               pr_err("%s: Failed to get private value: %06x => %04x ret=%d\n",
+                      __func__, addr, *value, ret);
 
        return ret;
 }
@@ -428,7 +428,7 @@ static void rt711_jack_init(struct rt711_priv *rt711)
                                RT711_HP_JD_FINAL_RESULT_CTL_JD12);
                        break;
                default:
-                       dev_warn(rt711->component->dev, "Wrong JD source\n");
+                       dev_warn(rt711->component->dev, "%s: Wrong JD source\n", __func__);
                        break;
                }
 
@@ -1020,7 +1020,7 @@ static int rt711_pcm_hw_params(struct snd_pcm_substream *substream,
        retval = sdw_stream_add_slave(rt711->slave, &stream_config,
                                        &port_config, 1, sdw_stream);
        if (retval) {
-               dev_err(dai->dev, "Unable to configure port\n");
+               dev_err(dai->dev, "%s: Unable to configure port\n", __func__);
                return retval;
        }
 
@@ -1028,8 +1028,8 @@ static int rt711_pcm_hw_params(struct snd_pcm_substream *substream,
                /* bit 3:0 Number of Channel */
                val |= (params_channels(params) - 1);
        } else {
-               dev_err(component->dev, "Unsupported channels %d\n",
-                       params_channels(params));
+               dev_err(component->dev, "%s: Unsupported channels %d\n",
+                       __func__, params_channels(params));
                return -EINVAL;
        }
 
index 0926b26619bd45b69f5b0b4b5508d677ca207766..012b79e72cf6b64e1e5c2837aaf034237cddcaa6 100644 (file)
@@ -139,8 +139,8 @@ static int rt712_sdca_dmic_index_write(struct rt712_sdca_dmic_priv *rt712,
        ret = regmap_write(regmap, addr, value);
        if (ret < 0)
                dev_err(&rt712->slave->dev,
-                       "Failed to set private value: %06x <= %04x ret=%d\n",
-                       addr, value, ret);
+                       "%s: Failed to set private value: %06x <= %04x ret=%d\n",
+                       __func__, addr, value, ret);
 
        return ret;
 }
@@ -155,8 +155,8 @@ static int rt712_sdca_dmic_index_read(struct rt712_sdca_dmic_priv *rt712,
        ret = regmap_read(regmap, addr, value);
        if (ret < 0)
                dev_err(&rt712->slave->dev,
-                       "Failed to get private value: %06x => %04x ret=%d\n",
-                       addr, *value, ret);
+                       "%s: Failed to get private value: %06x => %04x ret=%d\n",
+                       __func__, addr, *value, ret);
 
        return ret;
 }
@@ -317,7 +317,8 @@ static int rt712_sdca_dmic_set_gain_put(struct snd_kcontrol *kcontrol,
        for (i = 0; i < p->count; i++) {
                err = regmap_write(rt712->mbq_regmap, p->reg_base + i, gain_val[i]);
                if (err < 0)
-                       dev_err(&rt712->slave->dev, "0x%08x can't be set\n", p->reg_base + i);
+                       dev_err(&rt712->slave->dev, "%s: 0x%08x can't be set\n",
+                               __func__, p->reg_base + i);
        }
 
        return changed;
@@ -667,13 +668,13 @@ static int rt712_sdca_dmic_hw_params(struct snd_pcm_substream *substream,
        retval = sdw_stream_add_slave(rt712->slave, &stream_config,
                                        &port_config, 1, sdw_stream);
        if (retval) {
-               dev_err(dai->dev, "Unable to configure port\n");
+               dev_err(dai->dev, "%s: Unable to configure port\n", __func__);
                return retval;
        }
 
        if (params_channels(params) > 4) {
-               dev_err(component->dev, "Unsupported channels %d\n",
-                       params_channels(params));
+               dev_err(component->dev, "%s: Unsupported channels %d\n",
+                       __func__, params_channels(params));
                return -EINVAL;
        }
 
@@ -698,8 +699,8 @@ static int rt712_sdca_dmic_hw_params(struct snd_pcm_substream *substream,
                sampling_rate = RT712_SDCA_RATE_192000HZ;
                break;
        default:
-               dev_err(component->dev, "Rate %d is not supported\n",
-                       params_rate(params));
+               dev_err(component->dev, "%s: Rate %d is not supported\n",
+                       __func__, params_rate(params));
                return -EINVAL;
        }
 
@@ -923,7 +924,8 @@ static int __maybe_unused rt712_sdca_dmic_dev_resume(struct device *dev)
        time = wait_for_completion_timeout(&slave->initialization_complete,
                                msecs_to_jiffies(RT712_PROBE_TIMEOUT));
        if (!time) {
-               dev_err(&slave->dev, "Initialization not complete, timed out\n");
+               dev_err(&slave->dev, "%s: Initialization not complete, timed out\n",
+                       __func__);
                sdw_show_ping_status(slave->bus, true);
 
                return -ETIMEDOUT;
index 01ac555cd79b84a0b1aabe57899b1fedc214a2b5..4e9ab3ef135b34946d37d6280a9afb568cceef51 100644 (file)
@@ -438,20 +438,21 @@ static int __maybe_unused rt712_sdca_dev_resume(struct device *dev)
                return 0;
 
        if (!slave->unattach_request) {
+               mutex_lock(&rt712->disable_irq_lock);
                if (rt712->disable_irq == true) {
-                       mutex_lock(&rt712->disable_irq_lock);
+
                        sdw_write_no_pm(slave, SDW_SCP_SDCA_INTMASK1, SDW_SCP_SDCA_INTMASK_SDCA_0);
                        sdw_write_no_pm(slave, SDW_SCP_SDCA_INTMASK2, SDW_SCP_SDCA_INTMASK_SDCA_8);
                        rt712->disable_irq = false;
-                       mutex_unlock(&rt712->disable_irq_lock);
                }
+               mutex_unlock(&rt712->disable_irq_lock);
                goto regmap_sync;
        }
 
        time = wait_for_completion_timeout(&slave->initialization_complete,
                                msecs_to_jiffies(RT712_PROBE_TIMEOUT));
        if (!time) {
-               dev_err(&slave->dev, "Initialization not complete, timed out\n");
+               dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__);
                sdw_show_ping_status(slave->bus, true);
 
                return -ETIMEDOUT;
index 6954fbe7ec5f3bb79f8693c23f302a7a1003e11e..b503de9fda80e71cbe78e8916a6a7f41286ac5b2 100644 (file)
@@ -34,8 +34,8 @@ static int rt712_sdca_index_write(struct rt712_sdca_priv *rt712,
        ret = regmap_write(regmap, addr, value);
        if (ret < 0)
                dev_err(&rt712->slave->dev,
-                       "Failed to set private value: %06x <= %04x ret=%d\n",
-                       addr, value, ret);
+                       "%s: Failed to set private value: %06x <= %04x ret=%d\n",
+                       __func__, addr, value, ret);
 
        return ret;
 }
@@ -50,8 +50,8 @@ static int rt712_sdca_index_read(struct rt712_sdca_priv *rt712,
        ret = regmap_read(regmap, addr, value);
        if (ret < 0)
                dev_err(&rt712->slave->dev,
-                       "Failed to get private value: %06x => %04x ret=%d\n",
-                       addr, *value, ret);
+                       "%s: Failed to get private value: %06x => %04x ret=%d\n",
+                       __func__, addr, *value, ret);
 
        return ret;
 }
@@ -1060,13 +1060,13 @@ static int rt712_sdca_pcm_hw_params(struct snd_pcm_substream *substream,
        retval = sdw_stream_add_slave(rt712->slave, &stream_config,
                                        &port_config, 1, sdw_stream);
        if (retval) {
-               dev_err(dai->dev, "Unable to configure port\n");
+               dev_err(dai->dev, "%s: Unable to configure port\n", __func__);
                return retval;
        }
 
        if (params_channels(params) > 16) {
-               dev_err(component->dev, "Unsupported channels %d\n",
-                       params_channels(params));
+               dev_err(component->dev, "%s: Unsupported channels %d\n",
+                       __func__, params_channels(params));
                return -EINVAL;
        }
 
@@ -1085,8 +1085,8 @@ static int rt712_sdca_pcm_hw_params(struct snd_pcm_substream *substream,
                sampling_rate = RT712_SDCA_RATE_192000HZ;
                break;
        default:
-               dev_err(component->dev, "Rate %d is not supported\n",
-                       params_rate(params));
+               dev_err(component->dev, "%s: Rate %d is not supported\n",
+                       __func__, params_rate(params));
                return -EINVAL;
        }
 
@@ -1106,7 +1106,7 @@ static int rt712_sdca_pcm_hw_params(struct snd_pcm_substream *substream,
                        sampling_rate);
                break;
        default:
-               dev_err(component->dev, "Wrong DAI id\n");
+               dev_err(component->dev, "%s: Wrong DAI id\n", __func__);
                return -EINVAL;
        }
 
index ab54a67a27ebbfc8fbe19837fa07ed7d084bd429..ee450126106f969588ab52b83434309f8cfb8036 100644 (file)
@@ -237,7 +237,7 @@ static int __maybe_unused rt715_dev_resume(struct device *dev)
        time = wait_for_completion_timeout(&slave->enumeration_complete,
                                           msecs_to_jiffies(RT715_PROBE_TIMEOUT));
        if (!time) {
-               dev_err(&slave->dev, "Enumeration not complete, timed out\n");
+               dev_err(&slave->dev, "%s: Enumeration not complete, timed out\n", __func__);
                sdw_show_ping_status(slave->bus, true);
 
                return -ETIMEDOUT;
index 4533eedd7e189f3b48e36175eb5494f20a6f1be0..3fb7b9adb61de628705d784fbe64e259bf031089 100644 (file)
@@ -41,8 +41,8 @@ static int rt715_sdca_index_write(struct rt715_sdca_priv *rt715,
        ret = regmap_write(regmap, addr, value);
        if (ret < 0)
                dev_err(&rt715->slave->dev,
-                       "Failed to set private value: %08x <= %04x %d\n",
-                       addr, value, ret);
+                       "%s: Failed to set private value: %08x <= %04x %d\n",
+                       __func__, addr, value, ret);
 
        return ret;
 }
@@ -59,8 +59,8 @@ static int rt715_sdca_index_read(struct rt715_sdca_priv *rt715,
        ret = regmap_read(regmap, addr, value);
        if (ret < 0)
                dev_err(&rt715->slave->dev,
-                               "Failed to get private value: %06x => %04x ret=%d\n",
-                               addr, *value, ret);
+                       "%s: Failed to get private value: %06x => %04x ret=%d\n",
+                       __func__, addr, *value, ret);
 
        return ret;
 }
@@ -152,8 +152,8 @@ static int rt715_sdca_set_amp_gain_put(struct snd_kcontrol *kcontrol,
                                mc->shift);
                ret = regmap_write(rt715->mbq_regmap, mc->reg + i, gain_val);
                if (ret != 0) {
-                       dev_err(component->dev, "Failed to write 0x%x=0x%x\n",
-                               mc->reg + i, gain_val);
+                       dev_err(component->dev, "%s: Failed to write 0x%x=0x%x\n",
+                               __func__, mc->reg + i, gain_val);
                        return ret;
                }
        }
@@ -188,8 +188,8 @@ static int rt715_sdca_set_amp_gain_4ch_put(struct snd_kcontrol *kcontrol,
                ret = regmap_write(rt715->mbq_regmap, reg_base + i,
                                gain_val);
                if (ret != 0) {
-                       dev_err(component->dev, "Failed to write 0x%x=0x%x\n",
-                               reg_base + i, gain_val);
+                       dev_err(component->dev, "%s: Failed to write 0x%x=0x%x\n",
+                               __func__, reg_base + i, gain_val);
                        return ret;
                }
        }
@@ -224,8 +224,8 @@ static int rt715_sdca_set_amp_gain_8ch_put(struct snd_kcontrol *kcontrol,
                reg = i < 7 ? reg_base + i : (reg_base - 1) | BIT(15);
                ret = regmap_write(rt715->mbq_regmap, reg, gain_val);
                if (ret != 0) {
-                       dev_err(component->dev, "Failed to write 0x%x=0x%x\n",
-                               reg, gain_val);
+                       dev_err(component->dev, "%s: Failed to write 0x%x=0x%x\n",
+                               __func__, reg, gain_val);
                        return ret;
                }
        }
@@ -246,8 +246,8 @@ static int rt715_sdca_set_amp_gain_get(struct snd_kcontrol *kcontrol,
        for (i = 0; i < 2; i++) {
                ret = regmap_read(rt715->mbq_regmap, mc->reg + i, &val);
                if (ret < 0) {
-                       dev_err(component->dev, "Failed to read 0x%x, ret=%d\n",
-                               mc->reg + i, ret);
+                       dev_err(component->dev, "%s: Failed to read 0x%x, ret=%d\n",
+                               __func__, mc->reg + i, ret);
                        return ret;
                }
                ucontrol->value.integer.value[i] = rt715_sdca_get_gain(val, mc->shift);
@@ -271,8 +271,8 @@ static int rt715_sdca_set_amp_gain_4ch_get(struct snd_kcontrol *kcontrol,
        for (i = 0; i < 4; i++) {
                ret = regmap_read(rt715->mbq_regmap, reg_base + i, &val);
                if (ret < 0) {
-                       dev_err(component->dev, "Failed to read 0x%x, ret=%d\n",
-                               reg_base + i, ret);
+                       dev_err(component->dev, "%s: Failed to read 0x%x, ret=%d\n",
+                               __func__, reg_base + i, ret);
                        return ret;
                }
                ucontrol->value.integer.value[i] = rt715_sdca_get_gain(val, gain_sft);
@@ -297,8 +297,8 @@ static int rt715_sdca_set_amp_gain_8ch_get(struct snd_kcontrol *kcontrol,
        for (i = 0; i < 8; i += 2) {
                ret = regmap_read(rt715->mbq_regmap, reg_base + i, &val_l);
                if (ret < 0) {
-                       dev_err(component->dev, "Failed to read 0x%x, ret=%d\n",
-                                       reg_base + i, ret);
+                       dev_err(component->dev, "%s: Failed to read 0x%x, ret=%d\n",
+                               __func__, reg_base + i, ret);
                        return ret;
                }
                ucontrol->value.integer.value[i] = (val_l >> gain_sft) / 10;
@@ -306,8 +306,8 @@ static int rt715_sdca_set_amp_gain_8ch_get(struct snd_kcontrol *kcontrol,
                reg = (i == 6) ? (reg_base - 1) | BIT(15) : reg_base + 1 + i;
                ret = regmap_read(rt715->mbq_regmap, reg, &val_r);
                if (ret < 0) {
-                       dev_err(component->dev, "Failed to read 0x%x, ret=%d\n",
-                                       reg, ret);
+                       dev_err(component->dev, "%s: Failed to read 0x%x, ret=%d\n",
+                               __func__, reg, ret);
                        return ret;
                }
                ucontrol->value.integer.value[i + 1] = (val_r >> gain_sft) / 10;
@@ -834,15 +834,15 @@ static int rt715_sdca_pcm_hw_params(struct snd_pcm_substream *substream,
                        0xaf00);
                break;
        default:
-               dev_err(component->dev, "Invalid DAI id %d\n", dai->id);
+               dev_err(component->dev, "%s: Invalid DAI id %d\n", __func__, dai->id);
                return -EINVAL;
        }
 
        retval = sdw_stream_add_slave(rt715->slave, &stream_config,
                                        &port_config, 1, sdw_stream);
        if (retval) {
-               dev_err(component->dev, "Unable to configure port, retval:%d\n",
-                       retval);
+               dev_err(component->dev, "%s: Unable to configure port, retval:%d\n",
+                       __func__, retval);
                return retval;
        }
 
@@ -893,8 +893,8 @@ static int rt715_sdca_pcm_hw_params(struct snd_pcm_substream *substream,
                val = 0xf;
                break;
        default:
-               dev_err(component->dev, "Unsupported sample rate %d\n",
-                       params_rate(params));
+               dev_err(component->dev, "%s: Unsupported sample rate %d\n",
+                       __func__, params_rate(params));
                return -EINVAL;
        }
 
index 21f37babd148a487e82568144a791bff07fdf6c0..7e13868ff99f03110c165dcd706cff46a8eeba5d 100644 (file)
@@ -482,7 +482,7 @@ static int rt715_bus_config(struct sdw_slave *slave,
 
        ret = rt715_clock_config(&slave->dev);
        if (ret < 0)
-               dev_err(&slave->dev, "Invalid clk config");
+               dev_err(&slave->dev, "%s: Invalid clk config", __func__);
 
        return 0;
 }
@@ -554,7 +554,7 @@ static int __maybe_unused rt715_dev_resume(struct device *dev)
        time = wait_for_completion_timeout(&slave->initialization_complete,
                                           msecs_to_jiffies(RT715_PROBE_TIMEOUT));
        if (!time) {
-               dev_err(&slave->dev, "Initialization not complete, timed out\n");
+               dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__);
                sdw_show_ping_status(slave->bus, true);
 
                return -ETIMEDOUT;
index 9f732a5abd53f37cd24382522f9dc3ab97ecd7b0..299c9b12377c6ada95a40b4a876a43dd127786be 100644 (file)
@@ -40,8 +40,8 @@ static int rt715_index_write(struct regmap *regmap, unsigned int reg,
 
        ret = regmap_write(regmap, addr, value);
        if (ret < 0) {
-               pr_err("Failed to set private value: %08x <= %04x %d\n",
-                      addr, value, ret);
+               pr_err("%s: Failed to set private value: %08x <= %04x %d\n",
+                      __func__, addr, value, ret);
        }
 
        return ret;
@@ -55,8 +55,8 @@ static int rt715_index_write_nid(struct regmap *regmap,
 
        ret = regmap_write(regmap, addr, value);
        if (ret < 0)
-               pr_err("Failed to set private value: %06x <= %04x ret=%d\n",
-                       addr, value, ret);
+               pr_err("%s: Failed to set private value: %06x <= %04x ret=%d\n",
+                      __func__, addr, value, ret);
 
        return ret;
 }
@@ -70,8 +70,8 @@ static int rt715_index_read_nid(struct regmap *regmap,
        *value = 0;
        ret = regmap_read(regmap, addr, value);
        if (ret < 0)
-               pr_err("Failed to get private value: %06x => %04x ret=%d\n",
-                       addr, *value, ret);
+               pr_err("%s: Failed to get private value: %06x => %04x ret=%d\n",
+                      __func__, addr, *value, ret);
 
        return ret;
 }
@@ -862,14 +862,14 @@ static int rt715_pcm_hw_params(struct snd_pcm_substream *substream,
                rt715_index_write(rt715->regmap, RT715_SDW_INPUT_SEL, 0xa000);
                break;
        default:
-               dev_err(component->dev, "Invalid DAI id %d\n", dai->id);
+               dev_err(component->dev, "%s: Invalid DAI id %d\n", __func__, dai->id);
                return -EINVAL;
        }
 
        retval = sdw_stream_add_slave(rt715->slave, &stream_config,
                                        &port_config, 1, sdw_stream);
        if (retval) {
-               dev_err(dai->dev, "Unable to configure port\n");
+               dev_err(dai->dev, "%s: Unable to configure port\n", __func__);
                return retval;
        }
 
@@ -883,8 +883,8 @@ static int rt715_pcm_hw_params(struct snd_pcm_substream *substream,
                val |= 0x0 << 8;
                break;
        default:
-               dev_err(component->dev, "Unsupported sample rate %d\n",
-                       params_rate(params));
+               dev_err(component->dev, "%s: Unsupported sample rate %d\n",
+                       __func__, params_rate(params));
                return -EINVAL;
        }
 
@@ -892,8 +892,8 @@ static int rt715_pcm_hw_params(struct snd_pcm_substream *substream,
                /* bit 3:0 Number of Channel */
                val |= (params_channels(params) - 1);
        } else {
-               dev_err(component->dev, "Unsupported channels %d\n",
-                       params_channels(params));
+               dev_err(component->dev, "%s: Unsupported channels %d\n",
+                       __func__, params_channels(params));
                return -EINVAL;
        }
 
index eb76f4c675b67fd59df00cb41a17955c751bbd44..65d584c1886e819597577ed10551d2fe5d104e53 100644 (file)
@@ -467,13 +467,13 @@ static int __maybe_unused rt722_sdca_dev_resume(struct device *dev)
                return 0;
 
        if (!slave->unattach_request) {
+               mutex_lock(&rt722->disable_irq_lock);
                if (rt722->disable_irq == true) {
-                       mutex_lock(&rt722->disable_irq_lock);
                        sdw_write_no_pm(slave, SDW_SCP_SDCA_INTMASK1, SDW_SCP_SDCA_INTMASK_SDCA_6);
                        sdw_write_no_pm(slave, SDW_SCP_SDCA_INTMASK2, SDW_SCP_SDCA_INTMASK_SDCA_8);
                        rt722->disable_irq = false;
-                       mutex_unlock(&rt722->disable_irq_lock);
                }
+               mutex_unlock(&rt722->disable_irq_lock);
                goto regmap_sync;
        }
 
index 0e1c65a20392addb92a6bdbc39319884f4d2f9c9..e0ea3a23f7cc6844691338ff8daae7f2843d2c6e 100644 (file)
@@ -35,8 +35,8 @@ int rt722_sdca_index_write(struct rt722_sdca_priv *rt722,
        ret = regmap_write(regmap, addr, value);
        if (ret < 0)
                dev_err(&rt722->slave->dev,
-                       "Failed to set private value: %06x <= %04x ret=%d\n",
-                       addr, value, ret);
+                       "%s: Failed to set private value: %06x <= %04x ret=%d\n",
+                       __func__, addr, value, ret);
 
        return ret;
 }
@@ -51,8 +51,8 @@ int rt722_sdca_index_read(struct rt722_sdca_priv *rt722,
        ret = regmap_read(regmap, addr, value);
        if (ret < 0)
                dev_err(&rt722->slave->dev,
-                       "Failed to get private value: %06x => %04x ret=%d\n",
-                       addr, *value, ret);
+                       "%s: Failed to get private value: %06x => %04x ret=%d\n",
+                       __func__, addr, *value, ret);
 
        return ret;
 }
@@ -663,7 +663,8 @@ static int rt722_sdca_dmic_set_gain_put(struct snd_kcontrol *kcontrol,
        for (i = 0; i < p->count; i++) {
                err = regmap_write(rt722->mbq_regmap, p->reg_base + i, gain_val[i]);
                if (err < 0)
-                       dev_err(&rt722->slave->dev, "%#08x can't be set\n", p->reg_base + i);
+                       dev_err(&rt722->slave->dev, "%s: %#08x can't be set\n",
+                               __func__, p->reg_base + i);
        }
 
        return changed;
@@ -1211,13 +1212,13 @@ static int rt722_sdca_pcm_hw_params(struct snd_pcm_substream *substream,
        retval = sdw_stream_add_slave(rt722->slave, &stream_config,
                                        &port_config, 1, sdw_stream);
        if (retval) {
-               dev_err(dai->dev, "Unable to configure port\n");
+               dev_err(dai->dev, "%s: Unable to configure port\n", __func__);
                return retval;
        }
 
        if (params_channels(params) > 16) {
-               dev_err(component->dev, "Unsupported channels %d\n",
-                       params_channels(params));
+               dev_err(component->dev, "%s: Unsupported channels %d\n",
+                       __func__, params_channels(params));
                return -EINVAL;
        }
 
@@ -1236,8 +1237,8 @@ static int rt722_sdca_pcm_hw_params(struct snd_pcm_substream *substream,
                sampling_rate = RT722_SDCA_RATE_192000HZ;
                break;
        default:
-               dev_err(component->dev, "Rate %d is not supported\n",
-                       params_rate(params));
+               dev_err(component->dev, "%s: Rate %d is not supported\n",
+                       __func__, params_rate(params));
                return -EINVAL;
        }
 
index e451c009f2d99980bab20dd5d4c55cc26bd73cd5..7d5c096e06cd32b77fc6b73f18002af63bd6c8d5 100644 (file)
@@ -683,11 +683,12 @@ static void wm_adsp_control_remove(struct cs_dsp_coeff_ctl *cs_ctl)
 int wm_adsp_write_ctl(struct wm_adsp *dsp, const char *name, int type,
                      unsigned int alg, void *buf, size_t len)
 {
-       struct cs_dsp_coeff_ctl *cs_ctl = cs_dsp_get_ctl(&dsp->cs_dsp, name, type, alg);
+       struct cs_dsp_coeff_ctl *cs_ctl;
        struct wm_coeff_ctl *ctl;
        int ret;
 
        mutex_lock(&dsp->cs_dsp.pwr_lock);
+       cs_ctl = cs_dsp_get_ctl(&dsp->cs_dsp, name, type, alg);
        ret = cs_dsp_coeff_write_ctrl(cs_ctl, 0, buf, len);
        mutex_unlock(&dsp->cs_dsp.pwr_lock);
 
index c018f84fe02529322455035e6ca4fff7ddf2afaf..fc072dc58968cb80d6481cb0c84a4b776bdef150 100644 (file)
@@ -296,5 +296,6 @@ static struct platform_driver avs_da7219_driver = {
 
 module_platform_driver(avs_da7219_driver);
 
+MODULE_DESCRIPTION("Intel da7219 machine driver");
 MODULE_AUTHOR("Cezary Rojewski <cezary.rojewski@intel.com>");
 MODULE_LICENSE("GPL");
index ba2bc7f689eb603051870bfcbf28dc5640b7ac66..d9e5e85f523358d26a85218c5050fe8b31876e21 100644 (file)
@@ -96,4 +96,5 @@ static struct platform_driver avs_dmic_driver = {
 
 module_platform_driver(avs_dmic_driver);
 
+MODULE_DESCRIPTION("Intel DMIC machine driver");
 MODULE_LICENSE("GPL");
index 1090082e7d5bfcd47e92ecbd6bed22269fab3678..5c90a60075773409431f957f05f2e3bc03303334 100644 (file)
@@ -326,4 +326,5 @@ static struct platform_driver avs_es8336_driver = {
 
 module_platform_driver(avs_es8336_driver);
 
+MODULE_DESCRIPTION("Intel es8336 machine driver");
 MODULE_LICENSE("GPL");
index 28f254eb0d03fcfa6f5fc8c4bd0184d73f9c298d..027373d6a16d602c62b07d34b2b0bd984c14ae78 100644 (file)
@@ -204,4 +204,5 @@ static struct platform_driver avs_i2s_test_driver = {
 
 module_platform_driver(avs_i2s_test_driver);
 
+MODULE_DESCRIPTION("Intel i2s test machine driver");
 MODULE_LICENSE("GPL");
index a83b95f25129f90e1a0fd6f5d34e7e6fa799d34f..1ff85e4d8e160b7c61a74fc6dbdf9a32ee410614 100644 (file)
@@ -154,4 +154,5 @@ static struct platform_driver avs_max98357a_driver = {
 
 module_platform_driver(avs_max98357a_driver)
 
+MODULE_DESCRIPTION("Intel max98357a machine driver");
 MODULE_LICENSE("GPL");
index 3b980a025e6f697446419f6efec4f071e45495cb..8d31586b73eaec7c10edb319002f1b024e41480b 100644 (file)
@@ -211,4 +211,5 @@ static struct platform_driver avs_max98373_driver = {
 
 module_platform_driver(avs_max98373_driver)
 
+MODULE_DESCRIPTION("Intel max98373 machine driver");
 MODULE_LICENSE("GPL");
index 86dd2b228df3a5ce1a2751221659834cb458c775..572ec58073d06bce6e07bc7dbd05f2b9a3cf5462 100644 (file)
@@ -208,4 +208,5 @@ static struct platform_driver avs_max98927_driver = {
 
 module_platform_driver(avs_max98927_driver)
 
+MODULE_DESCRIPTION("Intel max98927 machine driver");
 MODULE_LICENSE("GPL");
index 1c1e2083f474df122259c41f774e246dd7223a1f..55db75efae41425684bdd1c67441e76cae4d4062 100644 (file)
@@ -313,4 +313,5 @@ static struct platform_driver avs_nau8825_driver = {
 
 module_platform_driver(avs_nau8825_driver)
 
+MODULE_DESCRIPTION("Intel nau8825 machine driver");
 MODULE_LICENSE("GPL");
index a9469b5ecb402f1af389c52fd6e1e5d9991cc3e9..8be6887bbc6e81cb0f6af16685524fd01b96e36a 100644 (file)
@@ -69,4 +69,5 @@ static struct platform_driver avs_probe_mb_driver = {
 
 module_platform_driver(avs_probe_mb_driver);
 
+MODULE_DESCRIPTION("Intel probe machine driver");
 MODULE_LICENSE("GPL");
index bfcb8845fd15d06ec39d7360008ac60f73491d3b..1cf52421608753e1cca23d333eab4a7a9d624b63 100644 (file)
@@ -276,4 +276,5 @@ static struct platform_driver avs_rt274_driver = {
 
 module_platform_driver(avs_rt274_driver);
 
+MODULE_DESCRIPTION("Intel rt274 machine driver");
 MODULE_LICENSE("GPL");
index 28d7d86b1cc99dabed8c76a94ee2c5dc3064582e..4740bba1057032128c60594b9339b820f9f7bc70 100644 (file)
@@ -247,4 +247,5 @@ static struct platform_driver avs_rt286_driver = {
 
 module_platform_driver(avs_rt286_driver);
 
+MODULE_DESCRIPTION("Intel rt286 machine driver");
 MODULE_LICENSE("GPL");
index 80f490b9e11842c34859ff8b6cebc8b7cee51e71..6e409e29f6974654a0a5cbe4eba105f78c055164 100644 (file)
@@ -266,4 +266,5 @@ static struct platform_driver avs_rt298_driver = {
 
 module_platform_driver(avs_rt298_driver);
 
+MODULE_DESCRIPTION("Intel rt298 machine driver");
 MODULE_LICENSE("GPL");
index 60105f453ae235c6affdcfef2181b95b101a3e1c..097ae5f73241efea14cf187e85bc88d936be1956 100644 (file)
@@ -192,4 +192,5 @@ static struct platform_driver avs_rt5514_driver = {
 
 module_platform_driver(avs_rt5514_driver);
 
+MODULE_DESCRIPTION("Intel rt5514 machine driver");
 MODULE_LICENSE("GPL");
index b4762c2a7bf2d1a3b0237479145380243861c4d8..1880c315cc4d1f9be4b7f42113382b322bd9c4cd 100644 (file)
@@ -265,4 +265,5 @@ static struct platform_driver avs_rt5663_driver = {
 
 module_platform_driver(avs_rt5663_driver);
 
+MODULE_DESCRIPTION("Intel rt5663 machine driver");
 MODULE_LICENSE("GPL");
index 243f979fda98a4d6e67b76cef8b5a192dd6a8ff7..594a971ded9eb2ea339ab2e45ae84da8b4b1dd6d 100644 (file)
@@ -341,5 +341,6 @@ static struct platform_driver avs_rt5682_driver = {
 
 module_platform_driver(avs_rt5682_driver)
 
+MODULE_DESCRIPTION("Intel rt5682 machine driver");
 MODULE_AUTHOR("Cezary Rojewski <cezary.rojewski@intel.com>");
 MODULE_LICENSE("GPL");
index 4a0e136835ff5d05118b1d802d2884beabb68a95..d6f7f046c24e5d12bd3189fe800bb05b18ee4444 100644 (file)
@@ -200,4 +200,5 @@ static struct platform_driver avs_ssm4567_driver = {
 
 module_platform_driver(avs_ssm4567_driver)
 
+MODULE_DESCRIPTION("Intel ssm4567 machine driver");
 MODULE_LICENSE("GPL");
index 2d25748ca70662bf771c6896297ccb6a0fb0798f..b27e89ff6a1673f57db6e253a818d6fbe3d1ab91 100644 (file)
@@ -263,7 +263,7 @@ int snd_soc_get_volsw(struct snd_kcontrol *kcontrol,
        int max = mc->max;
        int min = mc->min;
        int sign_bit = mc->sign_bit;
-       unsigned int mask = (1 << fls(max)) - 1;
+       unsigned int mask = (1ULL << fls(max)) - 1;
        unsigned int invert = mc->invert;
        int val;
        int ret;
index be7dc1e02284ab62f8cbaeffdd70f26a19ff6232..c12c7f820529476de0273474082b8174ab0ae052 100644 (file)
@@ -704,6 +704,10 @@ int amd_sof_acp_probe(struct snd_sof_dev *sdev)
                goto unregister_dev;
        }
 
+       ret = acp_init(sdev);
+       if (ret < 0)
+               goto free_smn_dev;
+
        sdev->ipc_irq = pci->irq;
        ret = request_threaded_irq(sdev->ipc_irq, acp_irq_handler, acp_irq_thread,
                                   IRQF_SHARED, "AudioDSP", sdev);
@@ -713,10 +717,6 @@ int amd_sof_acp_probe(struct snd_sof_dev *sdev)
                goto free_smn_dev;
        }
 
-       ret = acp_init(sdev);
-       if (ret < 0)
-               goto free_ipc_irq;
-
        /* scan SoundWire capabilities exposed by DSDT */
        ret = acp_sof_scan_sdw_devices(sdev, chip->sdw_acpi_dev_addr);
        if (ret < 0) {
index 9b00ede2a486a2ff2d619ab714ed2c1665eb3463..cc84d4c81be9d363d701b1d5c658e26a62079435 100644 (file)
@@ -339,8 +339,7 @@ static int sof_init_environment(struct snd_sof_dev *sdev)
        ret = snd_sof_probe(sdev);
        if (ret < 0) {
                dev_err(sdev->dev, "failed to probe DSP %d\n", ret);
-               sof_ops_free(sdev);
-               return ret;
+               goto err_sof_probe;
        }
 
        /* check machine info */
@@ -358,15 +357,18 @@ static int sof_init_environment(struct snd_sof_dev *sdev)
                ret = validate_sof_ops(sdev);
                if (ret < 0) {
                        snd_sof_remove(sdev);
+                       snd_sof_remove_late(sdev);
                        return ret;
                }
        }
 
+       return 0;
+
 err_machine_check:
-       if (ret) {
-               snd_sof_remove(sdev);
-               sof_ops_free(sdev);
-       }
+       snd_sof_remove(sdev);
+err_sof_probe:
+       snd_sof_remove_late(sdev);
+       sof_ops_free(sdev);
 
        return ret;
 }
index 2b385cddc385c5bd59e11acfe8e6bda45704fdd1..d71bb66b9991164cdb8b0ed000e461d9e3a0719c 100644 (file)
@@ -57,6 +57,9 @@ struct snd_sof_dsp_ops sof_hda_common_ops = {
        .pcm_pointer    = hda_dsp_pcm_pointer,
        .pcm_ack        = hda_dsp_pcm_ack,
 
+       .get_dai_frame_counter = hda_dsp_get_stream_llp,
+       .get_host_byte_counter = hda_dsp_get_stream_ldp,
+
        /* firmware loading */
        .load_firmware = snd_sof_load_firmware_raw,
 
index c50ca9e72d37385816ddb3cd6ef7456ed50a58e9..b073720b4cf432466e18bf8840dd87eb5efac98e 100644 (file)
@@ -7,6 +7,7 @@
 
 #include <sound/pcm_params.h>
 #include <sound/hdaudio_ext.h>
+#include <sound/hda_register.h>
 #include <sound/hda-mlink.h>
 #include <sound/sof/ipc4/header.h>
 #include <uapi/sound/sof/header.h>
@@ -362,6 +363,16 @@ static int hda_trigger(struct snd_sof_dev *sdev, struct snd_soc_dai *cpu_dai,
        case SNDRV_PCM_TRIGGER_STOP:
        case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
                snd_hdac_ext_stream_clear(hext_stream);
+
+               /*
+                * Save the LLP registers in case the stream is
+                * restarting due to PAUSE_RELEASE, or START without a pcm
+                * close/open, since in this case the LLP register is not reset
+                * to 0 and the delay calculation would return invalid
+                * results.
+                */
+               hext_stream->pplcllpl = readl(hext_stream->pplc_addr + AZX_REG_PPLCLLPL);
+               hext_stream->pplcllpu = readl(hext_stream->pplc_addr + AZX_REG_PPLCLLPU);
                break;
        default:
                dev_err(sdev->dev, "unknown trigger command %d\n", cmd);
index 31ffa1a8f2ac04ddd5c31aadec5400c52757dd19..ef5c915db8ffb47a622a8c753f14dd950fb9c45c 100644 (file)
@@ -681,17 +681,27 @@ static int hda_suspend(struct snd_sof_dev *sdev, bool runtime_suspend)
        struct sof_intel_hda_dev *hda = sdev->pdata->hw_pdata;
        const struct sof_intel_dsp_desc *chip = hda->desc;
        struct hdac_bus *bus = sof_to_bus(sdev);
+       bool imr_lost = false;
        int ret, j;
 
        /*
-        * The memory used for IMR boot loses its content in deeper than S3 state
-        * We must not try IMR boot on next power up (as it will fail).
-        *
+        * The memory used for IMR boot loses its content in deeper than S3
+        * state on CAVS platforms.
+        * On ACE platforms, due to the system architecture, the IMR content is
+        * lost already at S3 state, as they are tailored for s2idle use.
+        * We must not try IMR boot on next power up in these cases as it will
+        * fail.
+        */
+       if (sdev->system_suspend_target > SOF_SUSPEND_S3 ||
+           (chip->hw_ip_version >= SOF_INTEL_ACE_1_0 &&
+            sdev->system_suspend_target == SOF_SUSPEND_S3))
+               imr_lost = true;
+
+       /*
         * In case of firmware crash or boot failure set the skip_imr_boot to true
         * as well in order to try to re-load the firmware to do a 'cold' boot.
         */
-       if (sdev->system_suspend_target > SOF_SUSPEND_S3 ||
-           sdev->fw_state == SOF_FW_CRASHED ||
+       if (imr_lost || sdev->fw_state == SOF_FW_CRASHED ||
            sdev->fw_state == SOF_FW_BOOT_FAILED)
                hda->skip_imr_boot = true;
 
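The suspend-state check in the hunk above can be sketched as a standalone predicate. This is a minimal illustration, not the kernel code: the enum values are placeholders, and the `is_ace` flag stands in for the `chip->hw_ip_version >= SOF_INTEL_ACE_1_0` comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative suspend targets; values are placeholders, not the
 * kernel's actual enum. */
enum { SOF_SUSPEND_S3 = 2, SOF_SUSPEND_S4 = 3 };

/*
 * IMR content is lost (so IMR boot must be skipped on next power up)
 * when suspending deeper than S3, or already at S3 on ACE platforms.
 */
static bool imr_lost(int suspend_target, bool is_ace)
{
	if (suspend_target > SOF_SUSPEND_S3)
		return true;
	return is_ace && suspend_target == SOF_SUSPEND_S3;
}
```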
index 18f07364d2198425bffd3e111a546e11b536cd63..d7b446f3f973e3d532d8eaef241aac2f3a30a54d 100644 (file)
@@ -259,8 +259,37 @@ int hda_dsp_pcm_open(struct snd_sof_dev *sdev,
                snd_pcm_hw_constraint_mask64(substream->runtime, SNDRV_PCM_HW_PARAM_FORMAT,
                                             SNDRV_PCM_FMTBIT_S16 | SNDRV_PCM_FMTBIT_S32);
 
+       /*
+        * The dsp_max_burst_size_in_ms is the length of the maximum burst size
+        * of the host DMA in the ALSA buffer.
+        *
+        * On playback start the DMA will transfer dsp_max_burst_size_in_ms
+        * amount of data in one initial burst to fill up the host DMA buffer.
+        * Consequent DMA burst sizes are shorter and their length can vary.
+        * To make sure that userspace allocates a large enough ALSA buffer, we
+        * need to place a constraint on the buffer time.
+        *
+        * On capture the DMA will transfer 1ms chunks.
+        *
+        * Exact dsp_max_burst_size_in_ms constraint is racy, so set the
+        * constraint to a minimum of 2x dsp_max_burst_size_in_ms.
+        */
+       if (spcm->stream[direction].dsp_max_burst_size_in_ms)
+               snd_pcm_hw_constraint_minmax(substream->runtime,
+                       SNDRV_PCM_HW_PARAM_BUFFER_TIME,
+                       spcm->stream[direction].dsp_max_burst_size_in_ms * USEC_PER_MSEC * 2,
+                       UINT_MAX);
+
        /* binding pcm substream to hda stream */
        substream->runtime->private_data = &dsp_stream->hstream;
+
+       /*
+        * Reset the llp cache values (they are used for LLP compensation in
+        * case the counter is not reset)
+        */
+       dsp_stream->pplcllpl = 0;
+       dsp_stream->pplcllpu = 0;
+
        return 0;
 }
 
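The buffer-time constraint computed above reduces to a small formula; a minimal sketch of it (the parameter name is taken from the diff, and the 2x margin mirrors the value passed to `snd_pcm_hw_constraint_minmax()`):

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_MSEC 1000ULL

/*
 * Minimum ALSA buffer time in microseconds: twice the maximum host-DMA
 * burst size, so the initial burst cannot overrun the buffer.
 */
static uint64_t min_buffer_time_us(uint32_t dsp_max_burst_size_in_ms)
{
	return (uint64_t)dsp_max_burst_size_in_ms * USEC_PER_MSEC * 2;
}
```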
index b387b1a69d7ea3ceaed9fe814b174d9040e3eae1..0c189d3b19c1af6448d5d1264802ef493e5c7b14 100644 (file)
@@ -1063,3 +1063,73 @@ snd_pcm_uframes_t hda_dsp_stream_get_position(struct hdac_stream *hstream,
 
        return pos;
 }
+
+#define merge_u64(u32_u, u32_l) (((u64)(u32_u) << 32) | (u32_l))
+
+/**
+ * hda_dsp_get_stream_llp - Retrieve the LLP (Linear Link Position) of the stream
+ * @sdev: SOF device
+ * @component: ASoC component
+ * @substream: PCM substream
+ *
+ * Returns the raw Linear Link Position value
+ */
+u64 hda_dsp_get_stream_llp(struct snd_sof_dev *sdev,
+                          struct snd_soc_component *component,
+                          struct snd_pcm_substream *substream)
+{
+       struct hdac_stream *hstream = substream->runtime->private_data;
+       struct hdac_ext_stream *hext_stream = stream_to_hdac_ext_stream(hstream);
+       u32 llp_l, llp_u;
+
+       /*
+        * The pplc_addr has been calculated during probe in
+        * hda_dsp_stream_init():
+        * pplc_addr = sdev->bar[HDA_DSP_PP_BAR] +
+        *             SOF_HDA_PPLC_BASE +
+        *             SOF_HDA_PPLC_MULTI * total_stream +
+        *             SOF_HDA_PPLC_INTERVAL * stream_index
+        *
+        * Use this pre-calculated address to avoid repeated re-calculation.
+        */
+       llp_l = readl(hext_stream->pplc_addr + AZX_REG_PPLCLLPL);
+       llp_u = readl(hext_stream->pplc_addr + AZX_REG_PPLCLLPU);
+
+       /* Compensate the LLP counter with the saved offset */
+       if (hext_stream->pplcllpl || hext_stream->pplcllpu)
+               return merge_u64(llp_u, llp_l) -
+                      merge_u64(hext_stream->pplcllpu, hext_stream->pplcllpl);
+
+       return merge_u64(llp_u, llp_l);
+}
+
+/**
+ * hda_dsp_get_stream_ldp - Retrieve the LDP (Linear DMA Position) of the stream
+ * @sdev: SOF device
+ * @component: ASoC component
+ * @substream: PCM substream
+ *
+ * Returns the raw Linear DMA Position value
+ */
+u64 hda_dsp_get_stream_ldp(struct snd_sof_dev *sdev,
+                          struct snd_soc_component *component,
+                          struct snd_pcm_substream *substream)
+{
+       struct hdac_stream *hstream = substream->runtime->private_data;
+       struct hdac_ext_stream *hext_stream = stream_to_hdac_ext_stream(hstream);
+       u32 ldp_l, ldp_u;
+
+       /*
+        * The pphc_addr has been calculated during probe in
+        * hda_dsp_stream_init():
+        * pphc_addr = sdev->bar[HDA_DSP_PP_BAR] +
+        *             SOF_HDA_PPHC_BASE +
+        *             SOF_HDA_PPHC_INTERVAL * stream_index
+        *
+        * Use this pre-calculated address to avoid repeated re-calculation.
+        */
+       ldp_l = readl(hext_stream->pphc_addr + AZX_REG_PPHCLDPL);
+       ldp_u = readl(hext_stream->pphc_addr + AZX_REG_PPHCLDPU);
+
+       return ((u64)ldp_u << 32) | ldp_l;
+}
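The register-merging and compensation pattern used by `hda_dsp_get_stream_llp()` above can be sketched in isolation; plain integer arguments stand in for the MMIO reads of the upper/lower halves and the saved snapshot:

```c
#include <assert.h>
#include <stdint.h>

/* Combine upper/lower 32-bit register halves into one 64-bit counter,
 * as the merge_u64() macro above does. */
static uint64_t merge_u64(uint32_t u32_u, uint32_t u32_l)
{
	return ((uint64_t)u32_u << 32) | u32_l;
}

/* Subtract the LLP snapshot saved at trigger time, if one was taken,
 * mirroring the compensation in hda_dsp_get_stream_llp(). */
static uint64_t llp_compensated(uint32_t llp_u, uint32_t llp_l,
				uint32_t save_u, uint32_t save_l)
{
	uint64_t llp = merge_u64(llp_u, llp_l);

	if (save_u || save_l)
		llp -= merge_u64(save_u, save_l);
	return llp;
}
```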
index b36eb7c7891335a3038d5e1402d6f73ede754b81..81a1d4606d3cde8ecb1b9b2ef859c7b0393555f5 100644 (file)
@@ -662,6 +662,12 @@ bool hda_dsp_check_stream_irq(struct snd_sof_dev *sdev);
 
 snd_pcm_uframes_t hda_dsp_stream_get_position(struct hdac_stream *hstream,
                                              int direction, bool can_sleep);
+u64 hda_dsp_get_stream_llp(struct snd_sof_dev *sdev,
+                          struct snd_soc_component *component,
+                          struct snd_pcm_substream *substream);
+u64 hda_dsp_get_stream_ldp(struct snd_sof_dev *sdev,
+                          struct snd_soc_component *component,
+                          struct snd_pcm_substream *substream);
 
 struct hdac_ext_stream *
        hda_dsp_stream_get(struct snd_sof_dev *sdev, int direction, u32 flags);
index 7ae017a00184e52371052c188d75f13fbfc053df..aeb4350cce6bba3b229af876e188c4ead7f2b201 100644 (file)
@@ -29,15 +29,17 @@ static const struct snd_sof_debugfs_map lnl_dsp_debugfs[] = {
 };
 
 /* this helper allows the DSP to set up DMIC/SSP */
-static int hdac_bus_offload_dmic_ssp(struct hdac_bus *bus)
+static int hdac_bus_offload_dmic_ssp(struct hdac_bus *bus, bool enable)
 {
        int ret;
 
-       ret = hdac_bus_eml_enable_offload(bus, true,  AZX_REG_ML_LEPTR_ID_INTEL_SSP, true);
+       ret = hdac_bus_eml_enable_offload(bus, true,
+                                         AZX_REG_ML_LEPTR_ID_INTEL_SSP, enable);
        if (ret < 0)
                return ret;
 
-       ret = hdac_bus_eml_enable_offload(bus, true,  AZX_REG_ML_LEPTR_ID_INTEL_DMIC, true);
+       ret = hdac_bus_eml_enable_offload(bus, true,
+                                         AZX_REG_ML_LEPTR_ID_INTEL_DMIC, enable);
        if (ret < 0)
                return ret;
 
@@ -52,7 +54,19 @@ static int lnl_hda_dsp_probe(struct snd_sof_dev *sdev)
        if (ret < 0)
                return ret;
 
-       return hdac_bus_offload_dmic_ssp(sof_to_bus(sdev));
+       return hdac_bus_offload_dmic_ssp(sof_to_bus(sdev), true);
+}
+
+static void lnl_hda_dsp_remove(struct snd_sof_dev *sdev)
+{
+       int ret;
+
+       ret = hdac_bus_offload_dmic_ssp(sof_to_bus(sdev), false);
+       if (ret < 0)
+               dev_warn(sdev->dev,
+                        "Failed to disable offload for DMIC/SSP: %d\n", ret);
+
+       hda_dsp_remove(sdev);
 }
 
 static int lnl_hda_dsp_resume(struct snd_sof_dev *sdev)
@@ -63,7 +77,7 @@ static int lnl_hda_dsp_resume(struct snd_sof_dev *sdev)
        if (ret < 0)
                return ret;
 
-       return hdac_bus_offload_dmic_ssp(sof_to_bus(sdev));
+       return hdac_bus_offload_dmic_ssp(sof_to_bus(sdev), true);
 }
 
 static int lnl_hda_dsp_runtime_resume(struct snd_sof_dev *sdev)
@@ -74,7 +88,7 @@ static int lnl_hda_dsp_runtime_resume(struct snd_sof_dev *sdev)
        if (ret < 0)
                return ret;
 
-       return hdac_bus_offload_dmic_ssp(sof_to_bus(sdev));
+       return hdac_bus_offload_dmic_ssp(sof_to_bus(sdev), true);
 }
 
 static int lnl_dsp_post_fw_run(struct snd_sof_dev *sdev)
@@ -97,9 +111,11 @@ int sof_lnl_ops_init(struct snd_sof_dev *sdev)
        /* common defaults */
        memcpy(&sof_lnl_ops, &sof_hda_common_ops, sizeof(struct snd_sof_dsp_ops));
 
-       /* probe */
-       if (!sdev->dspless_mode_selected)
+       /* probe/remove */
+       if (!sdev->dspless_mode_selected) {
                sof_lnl_ops.probe = lnl_hda_dsp_probe;
+               sof_lnl_ops.remove = lnl_hda_dsp_remove;
+       }
 
        /* shutdown */
        sof_lnl_ops.shutdown = hda_dsp_shutdown;
@@ -134,8 +150,6 @@ int sof_lnl_ops_init(struct snd_sof_dev *sdev)
                sof_lnl_ops.runtime_resume = lnl_hda_dsp_runtime_resume;
        }
 
-       sof_lnl_ops.get_stream_position = mtl_dsp_get_stream_hda_link_position;
-
        /* dsp core get/put */
        sof_lnl_ops.core_get = mtl_dsp_core_get;
        sof_lnl_ops.core_put = mtl_dsp_core_put;
index df05dc77b8d5e3bef5f3e55ea7e82837b5a89504..060c34988e90d122caf12cc30fe42ba5f1d0c87d 100644 (file)
@@ -626,18 +626,6 @@ static int mtl_dsp_disable_interrupts(struct snd_sof_dev *sdev)
        return mtl_enable_interrupts(sdev, false);
 }
 
-u64 mtl_dsp_get_stream_hda_link_position(struct snd_sof_dev *sdev,
-                                        struct snd_soc_component *component,
-                                        struct snd_pcm_substream *substream)
-{
-       struct hdac_stream *hstream = substream->runtime->private_data;
-       u32 llp_l, llp_u;
-
-       llp_l = snd_sof_dsp_read(sdev, HDA_DSP_HDA_BAR, MTL_PPLCLLPL(hstream->index));
-       llp_u = snd_sof_dsp_read(sdev, HDA_DSP_HDA_BAR, MTL_PPLCLLPU(hstream->index));
-       return ((u64)llp_u << 32) | llp_l;
-}
-
 int mtl_dsp_core_get(struct snd_sof_dev *sdev, int core)
 {
        const struct sof_ipc_pm_ops *pm_ops = sdev->ipc->ops->pm;
@@ -707,8 +695,6 @@ int sof_mtl_ops_init(struct snd_sof_dev *sdev)
        sof_mtl_ops.core_get = mtl_dsp_core_get;
        sof_mtl_ops.core_put = mtl_dsp_core_put;
 
-       sof_mtl_ops.get_stream_position = mtl_dsp_get_stream_hda_link_position;
-
        sdev->private = kzalloc(sizeof(struct sof_ipc4_fw_data), GFP_KERNEL);
        if (!sdev->private)
                return -ENOMEM;
index cc5a1f46fd09560e9fefc10d6b4775b82294bfd4..ea8c1b83f7127d58f76bd5db018eeb0f0d9b1d7f 100644 (file)
@@ -6,12 +6,6 @@
  * Copyright(c) 2020-2022 Intel Corporation. All rights reserved.
  */
 
-/* HDA Registers */
-#define MTL_PPLCLLPL_BASE              0x948
-#define MTL_PPLCLLPU_STRIDE            0x10
-#define MTL_PPLCLLPL(x)                        (MTL_PPLCLLPL_BASE + (x) * MTL_PPLCLLPU_STRIDE)
-#define MTL_PPLCLLPU(x)                        (MTL_PPLCLLPL_BASE + 0x4 + (x) * MTL_PPLCLLPU_STRIDE)
-
 /* DSP Registers */
 #define MTL_HFDSSCS                    0x1000
 #define MTL_HFDSSCS_SPA_MASK           BIT(16)
@@ -103,9 +97,5 @@ int mtl_dsp_ipc_get_window_offset(struct snd_sof_dev *sdev, u32 id);
 
 void mtl_ipc_dump(struct snd_sof_dev *sdev);
 
-u64 mtl_dsp_get_stream_hda_link_position(struct snd_sof_dev *sdev,
-                                        struct snd_soc_component *component,
-                                        struct snd_pcm_substream *substream);
-
 int mtl_dsp_core_get(struct snd_sof_dev *sdev, int core);
 int mtl_dsp_core_put(struct snd_sof_dev *sdev, int core);
index 9f1e33ee8826123cdc57bd0da78b876e79bf6f27..0e04bea9432ddab2e60b2f61d209689d560b39fb 100644 (file)
@@ -4,6 +4,7 @@
 
 #include <linux/debugfs.h>
 #include <linux/sched/signal.h>
+#include <linux/sched/clock.h>
 #include <sound/sof/ipc4/header.h>
 #include "sof-priv.h"
 #include "ipc4-priv.h"
@@ -412,7 +413,6 @@ static int ipc4_mtrace_enable(struct snd_sof_dev *sdev)
        const struct sof_ipc_ops *iops = sdev->ipc->ops;
        struct sof_ipc4_msg msg;
        u64 system_time;
-       ktime_t kt;
        int ret;
 
        if (priv->mtrace_state != SOF_MTRACE_DISABLED)
@@ -424,9 +424,12 @@ static int ipc4_mtrace_enable(struct snd_sof_dev *sdev)
        msg.primary |= SOF_IPC4_MOD_INSTANCE(SOF_IPC4_MOD_INIT_BASEFW_INSTANCE_ID);
        msg.extension = SOF_IPC4_MOD_EXT_MSG_PARAM_ID(SOF_IPC4_FW_PARAM_SYSTEM_TIME);
 
-       /* The system time is in usec, UTC, epoch is 1601-01-01 00:00:00 */
-       kt = ktime_add_us(ktime_get_real(), FW_EPOCH_DELTA * USEC_PER_SEC);
-       system_time = ktime_to_us(kt);
+       /*
+        * local_clock() is used to align with dmesg, so both kernel and firmware logs have
+        * the same base and a minor delta due to the IPC. System time is in us
+        * format but local_clock() returns the time in ns, so convert to us.
+        */
+       system_time = div64_u64(local_clock(), NSEC_PER_USEC);
        msg.data_size = sizeof(system_time);
        msg.data_ptr = &system_time;
        ret = iops->set_get_data(sdev, &msg, msg.data_size, true);
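The timestamp conversion above is a single division; sketched here with plain 64-bit division standing in for the kernel's `div64_u64()`:

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL

/* Convert a nanosecond timestamp (what local_clock() returns) to the
 * microsecond resolution the firmware's SYSTEM_TIME parameter expects. */
static uint64_t ns_to_us(uint64_t ns)
{
	return ns / NSEC_PER_USEC;
}
```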
index 0f332c8cdbe6afe6fc9449b48194ebb749db5d3d..e915f9f87a6c35d74f1cf7096accca70dce688da 100644 (file)
 #include "ipc4-topology.h"
 #include "ipc4-fw-reg.h"
 
+/**
+ * struct sof_ipc4_timestamp_info - IPC4 timestamp info
+ * @host_copier: the host copier of the pcm stream
+ * @dai_copier: the dai copier of the pcm stream
+ * @stream_start_offset: reported by fw in memory window (converted to frames)
+ * @stream_end_offset: reported by fw in memory window (converted to frames)
+ * @llp_offset: llp offset in memory window
+ * @boundary: wrap boundary to be used for the LLP frame counter
+ * @delay: Calculated and stored in pointer callback. The stored value is
+ *        returned in the delay callback.
+ */
+struct sof_ipc4_timestamp_info {
+       struct sof_ipc4_copier *host_copier;
+       struct sof_ipc4_copier *dai_copier;
+       u64 stream_start_offset;
+       u64 stream_end_offset;
+       u32 llp_offset;
+
+       u64 boundary;
+       snd_pcm_sframes_t delay;
+};
+
 static int sof_ipc4_set_multi_pipeline_state(struct snd_sof_dev *sdev, u32 state,
                                             struct ipc4_pipeline_set_state_data *trigger_list)
 {
@@ -423,8 +445,19 @@ static int sof_ipc4_trigger_pipelines(struct snd_soc_component *component,
        }
 
        /* return if this is the final state */
-       if (state == SOF_IPC4_PIPE_PAUSED)
+       if (state == SOF_IPC4_PIPE_PAUSED) {
+               struct sof_ipc4_timestamp_info *time_info;
+
+               /*
+                * Invalidate the stream_start_offset to make sure that it is
+                * going to be updated if the stream resumes
+                */
+               time_info = spcm->stream[substream->stream].private;
+               if (time_info)
+                       time_info->stream_start_offset = SOF_IPC4_INVALID_STREAM_POSITION;
+
                goto free;
+       }
 skip_pause_transition:
        /* else set the RUNNING/RESET state in the DSP */
        ret = sof_ipc4_set_multi_pipeline_state(sdev, state, trigger_list);
@@ -464,14 +497,12 @@ static int sof_ipc4_pcm_trigger(struct snd_soc_component *component,
 
        /* determine the pipeline state */
        switch (cmd) {
-       case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
-               state = SOF_IPC4_PIPE_PAUSED;
-               break;
        case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
        case SNDRV_PCM_TRIGGER_RESUME:
        case SNDRV_PCM_TRIGGER_START:
                state = SOF_IPC4_PIPE_RUNNING;
                break;
+       case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
        case SNDRV_PCM_TRIGGER_SUSPEND:
        case SNDRV_PCM_TRIGGER_STOP:
                state = SOF_IPC4_PIPE_PAUSED;
@@ -703,6 +734,10 @@ static int sof_ipc4_pcm_setup(struct snd_sof_dev *sdev, struct snd_sof_pcm *spcm
        if (abi_version < SOF_IPC4_FW_REGS_ABI_VER)
                support_info = false;
 
+       /* For delay reporting the get_host_byte_counter callback is needed */
+       if (!sof_ops(sdev) || !sof_ops(sdev)->get_host_byte_counter)
+               support_info = false;
+
        for_each_pcm_streams(stream) {
                pipeline_list = &spcm->stream[stream].pipeline_list;
 
@@ -835,7 +870,6 @@ static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev,
        struct sof_ipc4_copier *host_copier = time_info->host_copier;
        struct sof_ipc4_copier *dai_copier = time_info->dai_copier;
        struct sof_ipc4_pipeline_registers ppl_reg;
-       u64 stream_start_position;
        u32 dai_sample_size;
        u32 ch, node_index;
        u32 offset;
@@ -852,38 +886,51 @@ static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev,
        if (ppl_reg.stream_start_offset == SOF_IPC4_INVALID_STREAM_POSITION)
                return -EINVAL;
 
-       stream_start_position = ppl_reg.stream_start_offset;
        ch = dai_copier->data.out_format.fmt_cfg;
        ch = SOF_IPC4_AUDIO_FORMAT_CFG_CHANNELS_COUNT(ch);
        dai_sample_size = (dai_copier->data.out_format.bit_depth >> 3) * ch;
-       /* convert offset to sample count */
-       do_div(stream_start_position, dai_sample_size);
-       time_info->stream_start_offset = stream_start_position;
+
+       /* convert offsets to frame count */
+       time_info->stream_start_offset = ppl_reg.stream_start_offset;
+       do_div(time_info->stream_start_offset, dai_sample_size);
+       time_info->stream_end_offset = ppl_reg.stream_end_offset;
+       do_div(time_info->stream_end_offset, dai_sample_size);
+
+       /*
+        * Calculate the wrap boundary needed for the delay calculation.
+        * The host counter is in bytes; it will wrap earlier than the
+        * frame-based link counter.
+        */
+       time_info->boundary = div64_u64(~((u64)0),
+                                       frames_to_bytes(substream->runtime, 1));
+       /* Initialize the delay value to 0 (no delay) */
+       time_info->delay = 0;
 
        return 0;
 }
 
-static snd_pcm_sframes_t sof_ipc4_pcm_delay(struct snd_soc_component *component,
-                                           struct snd_pcm_substream *substream)
+static int sof_ipc4_pcm_pointer(struct snd_soc_component *component,
+                               struct snd_pcm_substream *substream,
+                               snd_pcm_uframes_t *pointer)
 {
        struct snd_sof_dev *sdev = snd_soc_component_get_drvdata(component);
        struct snd_soc_pcm_runtime *rtd = snd_soc_substream_to_rtd(substream);
        struct sof_ipc4_timestamp_info *time_info;
        struct sof_ipc4_llp_reading_slot llp;
-       snd_pcm_uframes_t head_ptr, tail_ptr;
+       snd_pcm_uframes_t head_cnt, tail_cnt;
        struct snd_sof_pcm_stream *stream;
+       u64 dai_cnt, host_cnt, host_ptr;
        struct snd_sof_pcm *spcm;
-       u64 tmp_ptr;
        int ret;
 
        spcm = snd_sof_find_spcm_dai(component, rtd);
        if (!spcm)
-               return 0;
+               return -EOPNOTSUPP;
 
        stream = &spcm->stream[substream->stream];
        time_info = stream->private;
        if (!time_info)
-               return 0;
+               return -EOPNOTSUPP;
 
        /*
         * stream_start_offset is updated to memory window by FW based on
@@ -893,45 +940,116 @@ static snd_pcm_sframes_t sof_ipc4_pcm_delay(struct snd_soc_component *component,
        if (time_info->stream_start_offset == SOF_IPC4_INVALID_STREAM_POSITION) {
                ret = sof_ipc4_get_stream_start_offset(sdev, substream, stream, time_info);
                if (ret < 0)
-                       return 0;
+                       return -EOPNOTSUPP;
        }
 
+       /* For delay calculation we need the host counter */
+       host_cnt = snd_sof_pcm_get_host_byte_counter(sdev, component, substream);
+       host_ptr = host_cnt;
+
+       /* convert the host_cnt to frames */
+       host_cnt = div64_u64(host_cnt, frames_to_bytes(substream->runtime, 1));
+
        /*
-        * HDaudio links don't support the LLP counter reported by firmware
-        * the link position is read directly from hardware registers.
+        * If the LLP counter is not reported by firmware in the SRAM window
+        * then read the dai (link) counter via host accessible means if
+        * available.
         */
        if (!time_info->llp_offset) {
-               tmp_ptr = snd_sof_pcm_get_stream_position(sdev, component, substream);
-               if (!tmp_ptr)
-                       return 0;
+               dai_cnt = snd_sof_pcm_get_dai_frame_counter(sdev, component, substream);
+               if (!dai_cnt)
+                       return -EOPNOTSUPP;
        } else {
                sof_mailbox_read(sdev, time_info->llp_offset, &llp, sizeof(llp));
-               tmp_ptr = ((u64)llp.reading.llp_u << 32) | llp.reading.llp_l;
+               dai_cnt = ((u64)llp.reading.llp_u << 32) | llp.reading.llp_l;
        }
+       dai_cnt += time_info->stream_end_offset;
 
-       /* In two cases dai dma position is not accurate
+       /* In two cases dai dma counter is not accurate
         * (1) dai pipeline is started before host pipeline
-        * (2) multiple streams mixed into one. Each stream has the same dai dma position
+        * (2) multiple streams mixed into one. Each stream has the same dai dma
+        *     counter
         *
-        * Firmware calculates correct stream_start_offset for all cases including above two.
-        * Driver subtracts stream_start_offset from dai dma position to get accurate one
+        * Firmware calculates the correct stream_start_offset for all cases,
+        * including the two above.
+        * The driver subtracts stream_start_offset from the dai dma counter
+        * to get an accurate one.
         */
-       tmp_ptr -= time_info->stream_start_offset;
 
-       /* Calculate the delay taking into account that both pointer can wrap */
-       div64_u64_rem(tmp_ptr, substream->runtime->boundary, &tmp_ptr);
+       /*
+        * On stream start the dai counter might not yet have reached the
+        * stream_start_offset value which means that no frames have left the
+        * DSP yet from the audio stream (on playback, capture streams have
+        * offset of 0 as we start capturing right away).
+        * In this case we need to adjust the distance between the counters by
+        * increasing the host counter by (offset - dai_counter).
+        * Otherwise the dai_counter needs to be adjusted to reflect the number
+        * of valid frames passed on the DAI side.
+        *
+        * The delay is the difference between the counters on the two
+        * sides of the DSP.
+        */
+       if (dai_cnt < time_info->stream_start_offset) {
+               host_cnt += time_info->stream_start_offset - dai_cnt;
+               dai_cnt = 0;
+       } else {
+               dai_cnt -= time_info->stream_start_offset;
+       }
+
+       /* Wrap the dai counter at the boundary where the host counter wraps */
+       div64_u64_rem(dai_cnt, time_info->boundary, &dai_cnt);
+
        if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
-               head_ptr = substream->runtime->status->hw_ptr;
-               tail_ptr = tmp_ptr;
+               head_cnt = host_cnt;
+               tail_cnt = dai_cnt;
        } else {
-               head_ptr = tmp_ptr;
-               tail_ptr = substream->runtime->status->hw_ptr;
+               head_cnt = dai_cnt;
+               tail_cnt = host_cnt;
+       }
+
+       if (head_cnt < tail_cnt) {
+               time_info->delay = time_info->boundary - tail_cnt + head_cnt;
+               goto out;
        }
 
-       if (head_ptr < tail_ptr)
-               return substream->runtime->boundary - tail_ptr + head_ptr;
+       time_info->delay = head_cnt - tail_cnt;
+
+out:
+       /*
+        * Convert the host byte counter to PCM pointer which wraps in buffer
+        * and it is in frames
+        */
+       div64_u64_rem(host_ptr, snd_pcm_lib_buffer_bytes(substream), &host_ptr);
+       *pointer = bytes_to_frames(substream->runtime, host_ptr);
+
+       return 0;
+}
+
+static snd_pcm_sframes_t sof_ipc4_pcm_delay(struct snd_soc_component *component,
+                                           struct snd_pcm_substream *substream)
+{
+       struct snd_soc_pcm_runtime *rtd = snd_soc_substream_to_rtd(substream);
+       struct sof_ipc4_timestamp_info *time_info;
+       struct snd_sof_pcm_stream *stream;
+       struct snd_sof_pcm *spcm;
+
+       spcm = snd_sof_find_spcm_dai(component, rtd);
+       if (!spcm)
+               return 0;
+
+       stream = &spcm->stream[substream->stream];
+       time_info = stream->private;
+       /*
+        * Report the stored delay value calculated in the pointer callback.
+        * In the unlikely event that the calculation was skipped/aborted, the
+        * default delay of 0 is returned.
+        */
+       if (time_info)
+               return time_info->delay;
+
+       /* No delay information available, report 0 as delay */
+       return 0;
 
-       return head_ptr - tail_ptr;
 }
 
 const struct sof_ipc_pcm_ops ipc4_pcm_ops = {
@@ -941,6 +1059,7 @@ const struct sof_ipc_pcm_ops ipc4_pcm_ops = {
        .dai_link_fixup = sof_ipc4_pcm_dai_link_fixup,
        .pcm_setup = sof_ipc4_pcm_setup,
        .pcm_free = sof_ipc4_pcm_free,
+       .pointer = sof_ipc4_pcm_pointer,
        .delay = sof_ipc4_pcm_delay,
        .ipc_first_on_start = true,
        .platform_stop_during_hw_free = true,
index f3b908b093f9562ddeb6932b4f1743a27c2b3c09..afed618a15f061a8588466490ee38ea19a80bc3d 100644 (file)
@@ -92,20 +92,6 @@ struct sof_ipc4_fw_data {
        struct mutex pipeline_state_mutex; /* protect pipeline triggers, ref counts and states */
 };
 
-/**
- * struct sof_ipc4_timestamp_info - IPC4 timestamp info
- * @host_copier: the host copier of the pcm stream
- * @dai_copier: the dai copier of the pcm stream
- * @stream_start_offset: reported by fw in memory window
- * @llp_offset: llp offset in memory window
- */
-struct sof_ipc4_timestamp_info {
-       struct sof_ipc4_copier *host_copier;
-       struct sof_ipc4_copier *dai_copier;
-       u64 stream_start_offset;
-       u32 llp_offset;
-};
-
 extern const struct sof_ipc_fw_loader_ops ipc4_loader_ops;
 extern const struct sof_ipc_tplg_ops ipc4_tplg_ops;
 extern const struct sof_ipc_tplg_control_ops tplg_ipc4_control_ops;
index f28edd9830c1b3e25961add70dff87a489cfa119..5cca058421260978dd18e992b09dfff58b44bbdb 100644 (file)
@@ -412,8 +412,9 @@ static int sof_ipc4_widget_setup_pcm(struct snd_sof_widget *swidget)
        struct sof_ipc4_available_audio_format *available_fmt;
        struct snd_soc_component *scomp = swidget->scomp;
        struct sof_ipc4_copier *ipc4_copier;
+       struct snd_sof_pcm *spcm;
        int node_type = 0;
-       int ret;
+       int ret, dir;
 
        ipc4_copier = kzalloc(sizeof(*ipc4_copier), GFP_KERNEL);
        if (!ipc4_copier)
@@ -447,6 +448,25 @@ static int sof_ipc4_widget_setup_pcm(struct snd_sof_widget *swidget)
        }
        dev_dbg(scomp->dev, "host copier '%s' node_type %u\n", swidget->widget->name, node_type);
 
+       spcm = snd_sof_find_spcm_comp(scomp, swidget->comp_id, &dir);
+       if (!spcm)
+               goto skip_gtw_cfg;
+
+       if (dir == SNDRV_PCM_STREAM_PLAYBACK) {
+               struct snd_sof_pcm_stream *sps = &spcm->stream[dir];
+
+               sof_update_ipc_object(scomp, &sps->dsp_max_burst_size_in_ms,
+                                     SOF_COPIER_DEEP_BUFFER_TOKENS,
+                                     swidget->tuples,
+                                     swidget->num_tuples, sizeof(u32), 1);
+               /* Set default DMA buffer size if it is not specified in topology */
+               if (!sps->dsp_max_burst_size_in_ms)
+                       sps->dsp_max_burst_size_in_ms = SOF_IPC4_MIN_DMA_BUFFER_SIZE;
+       } else {
+               /* Capture data is copied from DSP to host in 1ms bursts */
+               spcm->stream[dir].dsp_max_burst_size_in_ms = 1;
+       }
+
 skip_gtw_cfg:
        ipc4_copier->gtw_attr = kzalloc(sizeof(*ipc4_copier->gtw_attr), GFP_KERNEL);
        if (!ipc4_copier->gtw_attr) {
index 6cf21e829e07272ccf4c7005f74f9ae61403d39b..3cd748e13460916517d9533c48d2172d556fc344 100644 (file)
@@ -523,12 +523,26 @@ static inline int snd_sof_pcm_platform_ack(struct snd_sof_dev *sdev,
        return 0;
 }
 
-static inline u64 snd_sof_pcm_get_stream_position(struct snd_sof_dev *sdev,
-                                                 struct snd_soc_component *component,
-                                                 struct snd_pcm_substream *substream)
+static inline u64
+snd_sof_pcm_get_dai_frame_counter(struct snd_sof_dev *sdev,
+                                 struct snd_soc_component *component,
+                                 struct snd_pcm_substream *substream)
 {
-       if (sof_ops(sdev) && sof_ops(sdev)->get_stream_position)
-               return sof_ops(sdev)->get_stream_position(sdev, component, substream);
+       if (sof_ops(sdev) && sof_ops(sdev)->get_dai_frame_counter)
+               return sof_ops(sdev)->get_dai_frame_counter(sdev, component,
+                                                           substream);
+
+       return 0;
+}
+
+static inline u64
+snd_sof_pcm_get_host_byte_counter(struct snd_sof_dev *sdev,
+                                 struct snd_soc_component *component,
+                                 struct snd_pcm_substream *substream)
+{
+       if (sof_ops(sdev) && sof_ops(sdev)->get_host_byte_counter)
+               return sof_ops(sdev)->get_host_byte_counter(sdev, component,
+                                                           substream);
 
        return 0;
 }
index 33d576b1764783ab3468591703e0c97106893e9e..f03cee94bce62642e3c419d4f956a2011ea4dd3f 100644 (file)
@@ -388,13 +388,21 @@ static snd_pcm_uframes_t sof_pcm_pointer(struct snd_soc_component *component,
 {
        struct snd_soc_pcm_runtime *rtd = snd_soc_substream_to_rtd(substream);
        struct snd_sof_dev *sdev = snd_soc_component_get_drvdata(component);
+       const struct sof_ipc_pcm_ops *pcm_ops = sof_ipc_get_ops(sdev, pcm);
        struct snd_sof_pcm *spcm;
        snd_pcm_uframes_t host, dai;
+       int ret = -EOPNOTSUPP;
 
        /* nothing to do for BE */
        if (rtd->dai_link->no_pcm)
                return 0;
 
+       if (pcm_ops && pcm_ops->pointer)
+               ret = pcm_ops->pointer(component, substream, &host);
+
+       if (ret != -EOPNOTSUPP)
+               return ret ? ret : host;
+
        /* use dsp ops pointer callback directly if set */
        if (sof_ops(sdev)->pcm_pointer)
                return sof_ops(sdev)->pcm_pointer(sdev, substream);
index 9ea2ac5adac79ee322f82060b908ce529cd9c43b..86bbb531e142c72be1ca5d710c466d16c9058734 100644 (file)
@@ -103,7 +103,10 @@ struct snd_sof_dai_config_data {
  *            additional memory in the SOF PCM stream structure
  * @pcm_free: Function pointer for PCM free that can be used for freeing any
  *            additional memory in the SOF PCM stream structure
- * @delay: Function pointer for pcm delay calculation
+ * @pointer: Function pointer for pcm pointer
+ *          Note: the @pointer callback may return -EOPNOTSUPP which should be
+ *                handled in the same way as if the callback is not provided
+ * @delay: Function pointer for pcm delay reporting
  * @reset_hw_params_during_stop: Flag indicating whether the hw_params should be reset during the
  *                              STOP pcm trigger
  * @ipc_first_on_start: Send IPC before invoking platform trigger during
@@ -124,6 +127,9 @@ struct sof_ipc_pcm_ops {
        int (*dai_link_fixup)(struct snd_soc_pcm_runtime *rtd, struct snd_pcm_hw_params *params);
        int (*pcm_setup)(struct snd_sof_dev *sdev, struct snd_sof_pcm *spcm);
        void (*pcm_free)(struct snd_sof_dev *sdev, struct snd_sof_pcm *spcm);
+       int (*pointer)(struct snd_soc_component *component,
+                      struct snd_pcm_substream *substream,
+                      snd_pcm_uframes_t *pointer);
        snd_pcm_sframes_t (*delay)(struct snd_soc_component *component,
                                   struct snd_pcm_substream *substream);
        bool reset_hw_params_during_stop;
@@ -322,6 +328,7 @@ struct snd_sof_pcm_stream {
        struct work_struct period_elapsed_work;
        struct snd_soc_dapm_widget_list *list; /* list of connected DAPM widgets */
        bool d0i3_compatible; /* DSP can be in D0I3 when this pcm is opened */
+       unsigned int dsp_max_burst_size_in_ms; /* The maximum size of the host DMA burst in ms */
        /*
         * flag to indicate that the DSP pipelines should be kept
         * active or not while suspending the stream
index d453a4ce3b219d601813310c22cbf11029a08a77..d3c436f826046bca9f385b429d6d7e1639600f63 100644 (file)
@@ -262,13 +262,25 @@ struct snd_sof_dsp_ops {
        int (*pcm_ack)(struct snd_sof_dev *sdev, struct snd_pcm_substream *substream); /* optional */
 
        /*
-        * optional callback to retrieve the link DMA position for the substream
-        * when the position is not reported in the shared SRAM windows but
-        * instead from a host-accessible hardware counter.
+        * optional callback to retrieve the number of frames left/arrived from/to
+        * the DSP on the DAI side (link/codec/DMIC/etc).
+        *
+        * The callback is used when the firmware does not provide this information
+        * via the shared SRAM window and it can be retrieved by the host.
         */
-       u64 (*get_stream_position)(struct snd_sof_dev *sdev,
-                                  struct snd_soc_component *component,
-                                  struct snd_pcm_substream *substream); /* optional */
+       u64 (*get_dai_frame_counter)(struct snd_sof_dev *sdev,
+                                    struct snd_soc_component *component,
+                                    struct snd_pcm_substream *substream); /* optional */
+
+       /*
+        * Optional callback to retrieve the number of bytes left/arrived from/to
+        * the DSP on the host side (bytes between host ALSA buffer and DSP).
+        *
+        * The callback is needed for ALSA delay reporting.
+        */
+       u64 (*get_host_byte_counter)(struct snd_sof_dev *sdev,
+                                    struct snd_soc_component *component,
+                                    struct snd_pcm_substream *substream); /* optional */
 
        /* host read DSP stream data */
        int (*ipc_msg_data)(struct snd_sof_dev *sdev,
index b67617b68e509d2c86d78058f7796a64aab00f41..f4437015d43a7500b809a303f175b211662d500f 100644 (file)
@@ -202,7 +202,7 @@ int line6_send_raw_message_async(struct usb_line6 *line6, const char *buffer,
        struct urb *urb;
 
        /* create message: */
-       msg = kmalloc(sizeof(struct message), GFP_ATOMIC);
+       msg = kzalloc(sizeof(struct message), GFP_ATOMIC);
        if (msg == NULL)
                return -ENOMEM;
 
@@ -688,7 +688,7 @@ static int line6_init_cap_control(struct usb_line6 *line6)
        int ret;
 
        /* initialize USB buffers: */
-       line6->buffer_listen = kmalloc(LINE6_BUFSIZE_LISTEN, GFP_KERNEL);
+       line6->buffer_listen = kzalloc(LINE6_BUFSIZE_LISTEN, GFP_KERNEL);
        if (!line6->buffer_listen)
                return -ENOMEM;
 
@@ -697,7 +697,7 @@ static int line6_init_cap_control(struct usb_line6 *line6)
                return -ENOMEM;
 
        if (line6->properties->capabilities & LINE6_CAP_CONTROL_MIDI) {
-               line6->buffer_message = kmalloc(LINE6_MIDI_MESSAGE_MAXLEN, GFP_KERNEL);
+               line6->buffer_message = kzalloc(LINE6_MIDI_MESSAGE_MAXLEN, GFP_KERNEL);
                if (!line6->buffer_message)
                        return -ENOMEM;
 
index 4b0673bf52c2e615017bf2b94da1f6fc4392e532..07cfad817d53908f2325505d2b9cb644a808a689 100644 (file)
@@ -8,6 +8,7 @@
 #include <linux/build_bug.h>
 #include <linux/compiler.h>
 #include <linux/math.h>
+#include <linux/panic.h>
 #include <endian.h>
 #include <byteswap.h>
 
index f3c82ab5b14cd77819030096b81e0b67cba0df1d..7d73da0980473fd3fdbdcd88e9e041077d5a2df3 100644 (file)
@@ -37,4 +37,9 @@ static inline void totalram_pages_add(long count)
 {
 }
 
+static inline int early_pfn_to_nid(unsigned long pfn)
+{
+       return 0;
+}
+
 #endif
diff --git a/tools/include/linux/panic.h b/tools/include/linux/panic.h
new file mode 100644 (file)
index 0000000..9c8f17a
--- /dev/null
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _TOOLS_LINUX_PANIC_H
+#define _TOOLS_LINUX_PANIC_H
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+static inline void panic(const char *fmt, ...)
+{
+       va_list argp;
+
+       va_start(argp, fmt);
+       vfprintf(stderr, fmt, argp);
+       va_end(argp);
+       exit(-1);
+}
+
+#endif
index 8f08c3fd498d5b81185519728fc1c28a8a0d4d5f..0d3672e5d9ed1553a720f3b0b52ca71f81fbc2d9 100644 (file)
@@ -67,6 +67,10 @@ The column name "all" can be used to enable all disabled-by-default built-in cou
 .PP
 \fB--quiet\fP Do not decode and print the system configuration header information.
 .PP
+\fB--no-msr\fP Disable all the uses of the MSR driver.
+.PP
+\fB--no-perf\fP Disable all the uses of the perf API.
+.PP
 \fB--interval seconds\fP overrides the default 5.0 second measurement interval.
 .PP
 \fB--num_iterations num\fP number of the measurement iterations.
@@ -125,9 +129,17 @@ The system configuration dump (if --quiet is not used) is followed by statistics
 .PP
 \fBPkgTmp\fP Degrees Celsius reported by the per-package Package Thermal Monitor.
 .PP
-\fBGFX%rc6\fP The percentage of time the GPU is in the "render C6" state, rc6, during the measurement interval. From /sys/class/drm/card0/power/rc6_residency_ms.
+\fBGFX%rc6\fP The percentage of time the GPU is in the "render C6" state, rc6, during the measurement interval. From /sys/class/drm/card0/power/rc6_residency_ms or /sys/class/drm/card0/gt/gt0/rc6_residency_ms or /sys/class/drm/card0/device/tile0/gtN/gtidle/idle_residency_ms depending on the graphics driver being used.
 .PP
-\fBGFXMHz\fP Instantaneous snapshot of what sysfs presents at the end of the measurement interval. From /sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz.
+\fBGFXMHz\fP Instantaneous snapshot of what sysfs presents at the end of the measurement interval. From /sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz or /sys/class/drm/card0/gt_cur_freq_mhz or /sys/class/drm/card0/gt/gt0/rps_cur_freq_mhz or /sys/class/drm/card0/device/tile0/gtN/freq0/cur_freq depending on the graphics driver being used.
+.PP
+\fBGFXAMHz\fP Instantaneous snapshot of what sysfs presents at the end of the measurement interval. From /sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz or /sys/class/drm/card0/gt_act_freq_mhz or /sys/class/drm/card0/gt/gt0/rps_act_freq_mhz or /sys/class/drm/card0/device/tile0/gtN/freq0/act_freq depending on the graphics driver being used.
+.PP
+\fBSAM%mc6\fP The percentage of time the SA Media is in the "module C6" state, mc6, during the measurement interval. From /sys/class/drm/card0/gt/gt1/rc6_residency_ms or /sys/class/drm/card0/device/tile0/gtN/gtidle/idle_residency_ms depending on the graphics driver being used.
+.PP
+\fBSAMMHz\fP Instantaneous snapshot of what sysfs presents at the end of the measurement interval. From /sys/class/drm/card0/gt/gt1/rps_cur_freq_mhz or /sys/class/drm/card0/device/tile0/gtN/freq0/cur_freq depending on the graphics driver being used.
+.PP
+\fBSAMAMHz\fP Instantaneous snapshot of what sysfs presents at the end of the measurement interval. From /sys/class/drm/card0/gt/gt1/rps_act_freq_mhz or /sys/class/drm/card0/device/tile0/gtN/freq0/act_freq depending on the graphics driver being used.
 .PP
 \fBPkg%pc2, Pkg%pc3, Pkg%pc6, Pkg%pc7\fP percentage residency in hardware package idle states.  These numbers are from hardware residency counters.
 .PP
@@ -370,7 +382,7 @@ below the processor's base frequency.
 
 Busy% = MPERF_delta/TSC_delta
 
-Bzy_MHz = TSC_delta/APERF_delta/MPERF_delta/measurement_interval
+Bzy_MHz = TSC_delta*APERF_delta/MPERF_delta/measurement_interval
 
 Note that these calculations depend on TSC_delta, so they
 are not reliable during intervals when TSC_MHz is not running at the base frequency.
index 7a334377f92b978fa642a0071b19f33d7e6fe74e..98256468e24806acfc0daee374d0cf9877e92131 100644 (file)
@@ -3,7 +3,7 @@
  * turbostat -- show CPU frequency and C-state residency
  * on modern Intel and AMD processors.
  *
- * Copyright (c) 2023 Intel Corporation.
+ * Copyright (c) 2024 Intel Corporation.
  * Len Brown <len.brown@intel.com>
  */
 
@@ -36,6 +36,8 @@
 #include <linux/perf_event.h>
 #include <asm/unistd.h>
 #include <stdbool.h>
+#include <assert.h>
+#include <linux/kernel.h>
 
 #define UNUSED(x) (void)(x)
 
 #define        NAME_BYTES 20
 #define PATH_BYTES 128
 
+#define MAX_NOFILE 0x8000
+
 enum counter_scope { SCOPE_CPU, SCOPE_CORE, SCOPE_PACKAGE };
 enum counter_type { COUNTER_ITEMS, COUNTER_CYCLES, COUNTER_SECONDS, COUNTER_USEC };
 enum counter_format { FORMAT_RAW, FORMAT_DELTA, FORMAT_PERCENT };
+enum amperf_source { AMPERF_SOURCE_PERF, AMPERF_SOURCE_MSR };
+enum rapl_source { RAPL_SOURCE_NONE, RAPL_SOURCE_PERF, RAPL_SOURCE_MSR };
 
 struct msr_counter {
        unsigned int msr_num;
@@ -127,6 +133,9 @@ struct msr_counter bic[] = {
        { 0x0, "IPC", "", 0, 0, 0, NULL, 0 },
        { 0x0, "CoreThr", "", 0, 0, 0, NULL, 0 },
        { 0x0, "UncMHz", "", 0, 0, 0, NULL, 0 },
+       { 0x0, "SAM%mc6", "", 0, 0, 0, NULL, 0 },
+       { 0x0, "SAMMHz", "", 0, 0, 0, NULL, 0 },
+       { 0x0, "SAMAMHz", "", 0, 0, 0, NULL, 0 },
 };
 
 #define MAX_BIC (sizeof(bic) / sizeof(struct msr_counter))
@@ -185,11 +194,14 @@ struct msr_counter bic[] = {
 #define        BIC_IPC         (1ULL << 52)
 #define        BIC_CORE_THROT_CNT      (1ULL << 53)
 #define        BIC_UNCORE_MHZ          (1ULL << 54)
+#define        BIC_SAM_mc6             (1ULL << 55)
+#define        BIC_SAMMHz              (1ULL << 56)
+#define        BIC_SAMACTMHz           (1ULL << 57)
 
 #define BIC_TOPOLOGY (BIC_Package | BIC_Node | BIC_CoreCnt | BIC_PkgCnt | BIC_Core | BIC_CPU | BIC_Die )
 #define BIC_THERMAL_PWR ( BIC_CoreTmp | BIC_PkgTmp | BIC_PkgWatt | BIC_CorWatt | BIC_GFXWatt | BIC_RAMWatt | BIC_PKG__ | BIC_RAM__)
-#define BIC_FREQUENCY ( BIC_Avg_MHz | BIC_Busy | BIC_Bzy_MHz | BIC_TSC_MHz | BIC_GFXMHz | BIC_GFXACTMHz | BIC_UNCORE_MHZ)
-#define BIC_IDLE ( BIC_sysfs | BIC_CPU_c1 | BIC_CPU_c3 | BIC_CPU_c6 | BIC_CPU_c7 | BIC_GFX_rc6 | BIC_Pkgpc2 | BIC_Pkgpc3 | BIC_Pkgpc6 | BIC_Pkgpc7 | BIC_Pkgpc8 | BIC_Pkgpc9 | BIC_Pkgpc10 | BIC_CPU_LPI | BIC_SYS_LPI | BIC_Mod_c6 | BIC_Totl_c0 | BIC_Any_c0 | BIC_GFX_c0 | BIC_CPUGFX)
+#define BIC_FREQUENCY (BIC_Avg_MHz | BIC_Busy | BIC_Bzy_MHz | BIC_TSC_MHz | BIC_GFXMHz | BIC_GFXACTMHz | BIC_SAMMHz | BIC_SAMACTMHz | BIC_UNCORE_MHZ)
+#define BIC_IDLE (BIC_sysfs | BIC_CPU_c1 | BIC_CPU_c3 | BIC_CPU_c6 | BIC_CPU_c7 | BIC_GFX_rc6 | BIC_Pkgpc2 | BIC_Pkgpc3 | BIC_Pkgpc6 | BIC_Pkgpc7 | BIC_Pkgpc8 | BIC_Pkgpc9 | BIC_Pkgpc10 | BIC_CPU_LPI | BIC_SYS_LPI | BIC_Mod_c6 | BIC_Totl_c0 | BIC_Any_c0 | BIC_GFX_c0 | BIC_CPUGFX | BIC_SAM_mc6)
 #define BIC_OTHER ( BIC_IRQ | BIC_SMI | BIC_ThreadC | BIC_CoreTmp | BIC_IPC)
 
 #define BIC_DISABLED_BY_DEFAULT        (BIC_USEC | BIC_TOD | BIC_APIC | BIC_X2APIC)
@@ -204,10 +216,13 @@ unsigned long long bic_present = BIC_USEC | BIC_TOD | BIC_sysfs | BIC_APIC | BIC
 #define BIC_NOT_PRESENT(COUNTER_BIT) (bic_present &= ~COUNTER_BIT)
 #define BIC_IS_ENABLED(COUNTER_BIT) (bic_enabled & COUNTER_BIT)
 
+struct amperf_group_fd;
+
 char *proc_stat = "/proc/stat";
 FILE *outf;
 int *fd_percpu;
 int *fd_instr_count_percpu;
+struct amperf_group_fd *fd_amperf_percpu;      /* File descriptors for perf group with APERF and MPERF counters. */
 struct timeval interval_tv = { 5, 0 };
 struct timespec interval_ts = { 5, 0 };
 
@@ -242,11 +257,8 @@ char *output_buffer, *outp;
 unsigned int do_dts;
 unsigned int do_ptm;
 unsigned int do_ipc;
-unsigned long long gfx_cur_rc6_ms;
 unsigned long long cpuidle_cur_cpu_lpi_us;
 unsigned long long cpuidle_cur_sys_lpi_us;
-unsigned int gfx_cur_mhz;
-unsigned int gfx_act_mhz;
 unsigned int tj_max;
 unsigned int tj_max_override;
 double rapl_power_units, rapl_time_units;
@@ -263,6 +275,28 @@ unsigned int has_hwp_epp;  /* IA32_HWP_REQUEST[bits 31:24] */
 unsigned int has_hwp_pkg;      /* IA32_HWP_REQUEST_PKG */
 unsigned int first_counter_read = 1;
 int ignore_stdin;
+bool no_msr;
+bool no_perf;
+enum amperf_source amperf_source;
+
+enum gfx_sysfs_idx {
+       GFX_rc6,
+       GFX_MHz,
+       GFX_ACTMHz,
+       SAM_mc6,
+       SAM_MHz,
+       SAM_ACTMHz,
+       GFX_MAX
+};
+
+struct gfx_sysfs_info {
+       const char *path;
+       FILE *fp;
+       unsigned int val;
+       unsigned long long val_ull;
+};
+
+static struct gfx_sysfs_info gfx_info[GFX_MAX];
 
 int get_msr(int cpu, off_t offset, unsigned long long *msr);
 
@@ -652,6 +686,7 @@ static const struct platform_features icx_features = {
        .bclk_freq = BCLK_100MHZ,
        .supported_cstates = CC1 | CC6 | PC2 | PC6,
        .cst_limit = CST_LIMIT_ICX,
+       .has_msr_core_c1_res = 1,
        .has_irtl_msrs = 1,
        .has_cst_prewake_bit = 1,
        .trl_msrs = TRL_BASE | TRL_CORECOUNT,
@@ -948,6 +983,175 @@ size_t cpu_present_setsize, cpu_effective_setsize, cpu_allowed_setsize, cpu_affi
 #define MAX_ADDED_THREAD_COUNTERS 24
 #define BITMASK_SIZE 32
 
+/* Indexes used to map data read from perf and MSRs into global variables */
+enum rapl_rci_index {
+       RAPL_RCI_INDEX_ENERGY_PKG = 0,
+       RAPL_RCI_INDEX_ENERGY_CORES = 1,
+       RAPL_RCI_INDEX_DRAM = 2,
+       RAPL_RCI_INDEX_GFX = 3,
+       RAPL_RCI_INDEX_PKG_PERF_STATUS = 4,
+       RAPL_RCI_INDEX_DRAM_PERF_STATUS = 5,
+       RAPL_RCI_INDEX_CORE_ENERGY = 6,
+       NUM_RAPL_COUNTERS,
+};
+
+enum rapl_unit {
+       RAPL_UNIT_INVALID,
+       RAPL_UNIT_JOULES,
+       RAPL_UNIT_WATTS,
+};
+
+struct rapl_counter_info_t {
+       unsigned long long data[NUM_RAPL_COUNTERS];
+       enum rapl_source source[NUM_RAPL_COUNTERS];
+       unsigned long long flags[NUM_RAPL_COUNTERS];
+       double scale[NUM_RAPL_COUNTERS];
+       enum rapl_unit unit[NUM_RAPL_COUNTERS];
+
+       union {
+               /* Active when source == RAPL_SOURCE_MSR */
+               struct {
+                       unsigned long long msr[NUM_RAPL_COUNTERS];
+                       unsigned long long msr_mask[NUM_RAPL_COUNTERS];
+                       int msr_shift[NUM_RAPL_COUNTERS];
+               };
+       };
+
+       int fd_perf;
+};
+
+/* struct rapl_counter_info_t for each RAPL domain */
+struct rapl_counter_info_t *rapl_counter_info_perdomain;
+
+#define RAPL_COUNTER_FLAG_USE_MSR_SUM (1u << 1)
+
+struct rapl_counter_arch_info {
+       int feature_mask;       /* Mask for testing if the counter is supported on host */
+       const char *perf_subsys;
+       const char *perf_name;
+       unsigned long long msr;
+       unsigned long long msr_mask;
+        int msr_shift;          /* Positive means shift right, negative means shift left */
+       double *platform_rapl_msr_scale;        /* Scale applied to values read by MSR (platform dependent, filled at runtime) */
+       unsigned int rci_index; /* Maps data from perf counters to global variables */
+       unsigned long long bic;
+       double compat_scale;    /* Some counters require constant scaling to be in the same range as other, similar ones */
+       unsigned long long flags;
+};
+
+static const struct rapl_counter_arch_info rapl_counter_arch_infos[] = {
+       {
+        .feature_mask = RAPL_PKG,
+        .perf_subsys = "power",
+        .perf_name = "energy-pkg",
+        .msr = MSR_PKG_ENERGY_STATUS,
+        .msr_mask = 0xFFFFFFFFFFFFFFFF,
+        .msr_shift = 0,
+        .platform_rapl_msr_scale = &rapl_energy_units,
+        .rci_index = RAPL_RCI_INDEX_ENERGY_PKG,
+        .bic = BIC_PkgWatt | BIC_Pkg_J,
+        .compat_scale = 1.0,
+        .flags = RAPL_COUNTER_FLAG_USE_MSR_SUM,
+         },
+       {
+        .feature_mask = RAPL_AMD_F17H,
+        .perf_subsys = "power",
+        .perf_name = "energy-pkg",
+        .msr = MSR_PKG_ENERGY_STAT,
+        .msr_mask = 0xFFFFFFFFFFFFFFFF,
+        .msr_shift = 0,
+        .platform_rapl_msr_scale = &rapl_energy_units,
+        .rci_index = RAPL_RCI_INDEX_ENERGY_PKG,
+        .bic = BIC_PkgWatt | BIC_Pkg_J,
+        .compat_scale = 1.0,
+        .flags = RAPL_COUNTER_FLAG_USE_MSR_SUM,
+         },
+       {
+        .feature_mask = RAPL_CORE_ENERGY_STATUS,
+        .perf_subsys = "power",
+        .perf_name = "energy-cores",
+        .msr = MSR_PP0_ENERGY_STATUS,
+        .msr_mask = 0xFFFFFFFFFFFFFFFF,
+        .msr_shift = 0,
+        .platform_rapl_msr_scale = &rapl_energy_units,
+        .rci_index = RAPL_RCI_INDEX_ENERGY_CORES,
+        .bic = BIC_CorWatt | BIC_Cor_J,
+        .compat_scale = 1.0,
+        .flags = RAPL_COUNTER_FLAG_USE_MSR_SUM,
+         },
+       {
+        .feature_mask = RAPL_DRAM,
+        .perf_subsys = "power",
+        .perf_name = "energy-ram",
+        .msr = MSR_DRAM_ENERGY_STATUS,
+        .msr_mask = 0xFFFFFFFFFFFFFFFF,
+        .msr_shift = 0,
+        .platform_rapl_msr_scale = &rapl_dram_energy_units,
+        .rci_index = RAPL_RCI_INDEX_DRAM,
+        .bic = BIC_RAMWatt | BIC_RAM_J,
+        .compat_scale = 1.0,
+        .flags = RAPL_COUNTER_FLAG_USE_MSR_SUM,
+         },
+       {
+        .feature_mask = RAPL_GFX,
+        .perf_subsys = "power",
+        .perf_name = "energy-gpu",
+        .msr = MSR_PP1_ENERGY_STATUS,
+        .msr_mask = 0xFFFFFFFFFFFFFFFF,
+        .msr_shift = 0,
+        .platform_rapl_msr_scale = &rapl_energy_units,
+        .rci_index = RAPL_RCI_INDEX_GFX,
+        .bic = BIC_GFXWatt | BIC_GFX_J,
+        .compat_scale = 1.0,
+        .flags = RAPL_COUNTER_FLAG_USE_MSR_SUM,
+         },
+       {
+        .feature_mask = RAPL_PKG_PERF_STATUS,
+        .perf_subsys = NULL,
+        .perf_name = NULL,
+        .msr = MSR_PKG_PERF_STATUS,
+        .msr_mask = 0xFFFFFFFFFFFFFFFF,
+        .msr_shift = 0,
+        .platform_rapl_msr_scale = &rapl_time_units,
+        .rci_index = RAPL_RCI_INDEX_PKG_PERF_STATUS,
+        .bic = BIC_PKG__,
+        .compat_scale = 100.0,
+        .flags = RAPL_COUNTER_FLAG_USE_MSR_SUM,
+         },
+       {
+        .feature_mask = RAPL_DRAM_PERF_STATUS,
+        .perf_subsys = NULL,
+        .perf_name = NULL,
+        .msr = MSR_DRAM_PERF_STATUS,
+        .msr_mask = 0xFFFFFFFFFFFFFFFF,
+        .msr_shift = 0,
+        .platform_rapl_msr_scale = &rapl_time_units,
+        .rci_index = RAPL_RCI_INDEX_DRAM_PERF_STATUS,
+        .bic = BIC_RAM__,
+        .compat_scale = 100.0,
+        .flags = RAPL_COUNTER_FLAG_USE_MSR_SUM,
+         },
+       {
+        .feature_mask = RAPL_AMD_F17H,
+        .perf_subsys = NULL,
+        .perf_name = NULL,
+        .msr = MSR_CORE_ENERGY_STAT,
+        .msr_mask = 0xFFFFFFFF,
+        .msr_shift = 0,
+        .platform_rapl_msr_scale = &rapl_energy_units,
+        .rci_index = RAPL_RCI_INDEX_CORE_ENERGY,
+        .bic = BIC_CorWatt | BIC_Cor_J,
+        .compat_scale = 1.0,
+        .flags = 0,
+         },
+};
+
+struct rapl_counter {
+       unsigned long long raw_value;
+       enum rapl_unit unit;
+       double scale;
+};
+
 struct thread_data {
        struct timeval tv_begin;
        struct timeval tv_end;
@@ -974,7 +1178,7 @@ struct core_data {
        unsigned long long c7;
        unsigned long long mc6_us;      /* duplicate as per-core for now, even though per module */
        unsigned int core_temp_c;
-       unsigned int core_energy;       /* MSR_CORE_ENERGY_STAT */
+       struct rapl_counter core_energy;        /* MSR_CORE_ENERGY_STAT */
        unsigned int core_id;
        unsigned long long core_throt_cnt;
        unsigned long long counter[MAX_ADDED_COUNTERS];
@@ -989,8 +1193,8 @@ struct pkg_data {
        unsigned long long pc8;
        unsigned long long pc9;
        unsigned long long pc10;
-       unsigned long long cpu_lpi;
-       unsigned long long sys_lpi;
+       long long cpu_lpi;
+       long long sys_lpi;
        unsigned long long pkg_wtd_core_c0;
        unsigned long long pkg_any_core_c0;
        unsigned long long pkg_any_gfxe_c0;
@@ -998,13 +1202,16 @@ struct pkg_data {
        long long gfx_rc6_ms;
        unsigned int gfx_mhz;
        unsigned int gfx_act_mhz;
+       long long sam_mc6_ms;
+       unsigned int sam_mhz;
+       unsigned int sam_act_mhz;
        unsigned int package_id;
-       unsigned long long energy_pkg;  /* MSR_PKG_ENERGY_STATUS */
-       unsigned long long energy_dram; /* MSR_DRAM_ENERGY_STATUS */
-       unsigned long long energy_cores;        /* MSR_PP0_ENERGY_STATUS */
-       unsigned long long energy_gfx;  /* MSR_PP1_ENERGY_STATUS */
-       unsigned long long rapl_pkg_perf_status;        /* MSR_PKG_PERF_STATUS */
-       unsigned long long rapl_dram_perf_status;       /* MSR_DRAM_PERF_STATUS */
+       struct rapl_counter energy_pkg; /* MSR_PKG_ENERGY_STATUS */
+       struct rapl_counter energy_dram;        /* MSR_DRAM_ENERGY_STATUS */
+       struct rapl_counter energy_cores;       /* MSR_PP0_ENERGY_STATUS */
+       struct rapl_counter energy_gfx; /* MSR_PP1_ENERGY_STATUS */
+       struct rapl_counter rapl_pkg_perf_status;       /* MSR_PKG_PERF_STATUS */
+       struct rapl_counter rapl_dram_perf_status;      /* MSR_DRAM_PERF_STATUS */
        unsigned int pkg_temp_c;
        unsigned int uncore_mhz;
        unsigned long long counter[MAX_ADDED_COUNTERS];
@@ -1150,6 +1357,38 @@ struct sys_counters {
        struct msr_counter *pp;
 } sys;
 
+void free_sys_counters(void)
+{
+       struct msr_counter *p = sys.tp, *pnext = NULL;
+
+       while (p) {
+               pnext = p->next;
+               free(p);
+               p = pnext;
+       }
+
+       p = sys.cp, pnext = NULL;
+       while (p) {
+               pnext = p->next;
+               free(p);
+               p = pnext;
+       }
+
+       p = sys.pp, pnext = NULL;
+       while (p) {
+               pnext = p->next;
+               free(p);
+               p = pnext;
+       }
+
+       sys.added_thread_counters = 0;
+       sys.added_core_counters = 0;
+       sys.added_package_counters = 0;
+       sys.tp = NULL;
+       sys.cp = NULL;
+       sys.pp = NULL;
+}
+
 struct system_summary {
        struct thread_data threads;
        struct core_data cores;
@@ -1280,34 +1519,60 @@ int get_msr_fd(int cpu)
        sprintf(pathname, "/dev/cpu/%d/msr", cpu);
        fd = open(pathname, O_RDONLY);
        if (fd < 0)
-               err(-1, "%s open failed, try chown or chmod +r /dev/cpu/*/msr, or run as root", pathname);
+               err(-1, "%s open failed, try chown or chmod +r /dev/cpu/*/msr, "
+                   "or run with --no-msr, or run as root", pathname);
 
        fd_percpu[cpu] = fd;
 
        return fd;
 }
 
+static void bic_disable_msr_access(void)
+{
+       const unsigned long bic_msrs =
+           BIC_SMI |
+           BIC_CPU_c1 |
+           BIC_CPU_c3 |
+           BIC_CPU_c6 |
+           BIC_CPU_c7 |
+           BIC_Mod_c6 |
+           BIC_CoreTmp |
+           BIC_Totl_c0 |
+           BIC_Any_c0 |
+           BIC_GFX_c0 |
+           BIC_CPUGFX |
+           BIC_Pkgpc2 | BIC_Pkgpc3 | BIC_Pkgpc6 | BIC_Pkgpc7 | BIC_Pkgpc8 | BIC_Pkgpc9 | BIC_Pkgpc10 | BIC_PkgTmp;
+
+       bic_enabled &= ~bic_msrs;
+
+       free_sys_counters();
+}
+
 static long perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int group_fd, unsigned long flags)
 {
+       assert(!no_perf);
+
        return syscall(__NR_perf_event_open, hw_event, pid, cpu, group_fd, flags);
 }
 
-static int perf_instr_count_open(int cpu_num)
+static long open_perf_counter(int cpu, unsigned int type, unsigned int config, int group_fd, __u64 read_format)
 {
-       struct perf_event_attr pea;
-       int fd;
+       struct perf_event_attr attr;
+       const pid_t pid = -1;
+       const unsigned long flags = 0;
 
-       memset(&pea, 0, sizeof(struct perf_event_attr));
-       pea.type = PERF_TYPE_HARDWARE;
-       pea.size = sizeof(struct perf_event_attr);
-       pea.config = PERF_COUNT_HW_INSTRUCTIONS;
+       assert(!no_perf);
 
-       /* counter for cpu_num, including user + kernel and all processes */
-       fd = perf_event_open(&pea, -1, cpu_num, -1, 0);
-       if (fd == -1) {
-               warnx("capget(CAP_PERFMON) failed, try \"# setcap cap_sys_admin=ep %s\"", progname);
-               BIC_NOT_PRESENT(BIC_IPC);
-       }
+       memset(&attr, 0, sizeof(struct perf_event_attr));
+
+       attr.type = type;
+       attr.size = sizeof(struct perf_event_attr);
+       attr.config = config;
+       attr.disabled = 0;
+       attr.sample_type = PERF_SAMPLE_IDENTIFIER;
+       attr.read_format = read_format;
+
+       const int fd = perf_event_open(&attr, pid, cpu, group_fd, flags);
 
        return fd;
 }
@@ -1317,7 +1582,7 @@ int get_instr_count_fd(int cpu)
        if (fd_instr_count_percpu[cpu])
                return fd_instr_count_percpu[cpu];
 
-       fd_instr_count_percpu[cpu] = perf_instr_count_open(cpu);
+       fd_instr_count_percpu[cpu] = open_perf_counter(cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS, -1, 0);
 
        return fd_instr_count_percpu[cpu];
 }
@@ -1326,6 +1591,8 @@ int get_msr(int cpu, off_t offset, unsigned long long *msr)
 {
        ssize_t retval;
 
+       assert(!no_msr);
+
        retval = pread(get_msr_fd(cpu), msr, sizeof(*msr), offset);
 
        if (retval != sizeof *msr)
@@ -1334,6 +1601,21 @@ int get_msr(int cpu, off_t offset, unsigned long long *msr)
        return 0;
 }
 
+int probe_msr(int cpu, off_t offset)
+{
+       ssize_t retval;
+       unsigned long long dummy;
+
+       assert(!no_msr);
+
+       retval = pread(get_msr_fd(cpu), &dummy, sizeof(dummy), offset);
+
+       if (retval != sizeof(dummy))
+               return 1;
+
+       return 0;
+}
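The probe_msr() helper above tests whether an MSR exists by attempting a throwaway pread() at the register's offset on the per-CPU MSR device and checking only whether the full 8 bytes came back. A minimal standalone sketch of the same probe pattern (the `probe_offset` helper is hypothetical, standing in for probe_msr() with an ordinary file descriptor instead of /dev/cpu/*/msr):

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Return 0 if a full 8-byte read at 'offset' succeeds, 1 otherwise --
 * the same convention probe_msr() uses to report MSR presence. */
int probe_offset(int fd, off_t offset)
{
	unsigned long long dummy;

	if (pread(fd, &dummy, sizeof(dummy), offset) != sizeof(dummy))
		return 1;
	return 0;
}
```

The read value is deliberately discarded; only the success of the read matters, which is why the real helper can run before any counter bookkeeping is set up.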
+
 #define MAX_DEFERRED 16
 char *deferred_add_names[MAX_DEFERRED];
 char *deferred_skip_names[MAX_DEFERRED];
@@ -1369,6 +1651,8 @@ void help(void)
                "               Override default 5-second measurement interval\n"
                "  -J, --Joules displays energy in Joules instead of Watts\n"
                "  -l, --list   list column headers only\n"
+               "  -M, --no-msr Disable all uses of the MSR driver\n"
+               "  -P, --no-perf Disable all uses of the perf API\n"
                "  -n, --num_iterations num\n"
                "               number of the measurement iterations\n"
                "  -N, --header_iterations num\n"
@@ -1573,6 +1857,15 @@ void print_header(char *delim)
        if (DO_BIC(BIC_GFXACTMHz))
                outp += sprintf(outp, "%sGFXAMHz", (printed++ ? delim : ""));
 
+       if (DO_BIC(BIC_SAM_mc6))
+               outp += sprintf(outp, "%sSAM%%mc6", (printed++ ? delim : ""));
+
+       if (DO_BIC(BIC_SAMMHz))
+               outp += sprintf(outp, "%sSAMMHz", (printed++ ? delim : ""));
+
+       if (DO_BIC(BIC_SAMACTMHz))
+               outp += sprintf(outp, "%sSAMAMHz", (printed++ ? delim : ""));
+
        if (DO_BIC(BIC_Totl_c0))
                outp += sprintf(outp, "%sTotl%%C0", (printed++ ? delim : ""));
        if (DO_BIC(BIC_Any_c0))
@@ -1671,26 +1964,35 @@ int dump_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p
                        outp += sprintf(outp, "SMI: %d\n", t->smi_count);
 
                for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
-                       outp += sprintf(outp, "tADDED [%d] msr0x%x: %08llX\n", i, mp->msr_num, t->counter[i]);
+                       outp +=
+                           sprintf(outp, "tADDED [%d] %8s msr0x%x: %08llX %s\n", i, mp->name, mp->msr_num,
+                                   t->counter[i], mp->path);
                }
        }
 
-       if (c) {
+       if (c && is_cpu_first_thread_in_core(t, c, p)) {
                outp += sprintf(outp, "core: %d\n", c->core_id);
                outp += sprintf(outp, "c3: %016llX\n", c->c3);
                outp += sprintf(outp, "c6: %016llX\n", c->c6);
                outp += sprintf(outp, "c7: %016llX\n", c->c7);
                outp += sprintf(outp, "DTS: %dC\n", c->core_temp_c);
                outp += sprintf(outp, "cpu_throt_count: %016llX\n", c->core_throt_cnt);
-               outp += sprintf(outp, "Joules: %0X\n", c->core_energy);
+
+               const unsigned long long energy_value = c->core_energy.raw_value * c->core_energy.scale;
+               const double energy_scale = c->core_energy.scale;
+
+               if (c->core_energy.unit == RAPL_UNIT_JOULES)
+                       outp += sprintf(outp, "Joules: %0llX (scale: %lf)\n", energy_value, energy_scale);
 
                for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
-                       outp += sprintf(outp, "cADDED [%d] msr0x%x: %08llX\n", i, mp->msr_num, c->counter[i]);
+                       outp +=
+                           sprintf(outp, "cADDED [%d] %8s msr0x%x: %08llX %s\n", i, mp->name, mp->msr_num,
+                                   c->counter[i], mp->path);
                }
                outp += sprintf(outp, "mc6_us: %016llX\n", c->mc6_us);
        }
 
-       if (p) {
+       if (p && is_cpu_first_core_in_package(t, c, p)) {
                outp += sprintf(outp, "package: %d\n", p->package_id);
 
                outp += sprintf(outp, "Weighted cores: %016llX\n", p->pkg_wtd_core_c0);
@@ -1710,16 +2012,18 @@ int dump_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p
                outp += sprintf(outp, "pc10: %016llX\n", p->pc10);
                outp += sprintf(outp, "cpu_lpi: %016llX\n", p->cpu_lpi);
                outp += sprintf(outp, "sys_lpi: %016llX\n", p->sys_lpi);
-               outp += sprintf(outp, "Joules PKG: %0llX\n", p->energy_pkg);
-               outp += sprintf(outp, "Joules COR: %0llX\n", p->energy_cores);
-               outp += sprintf(outp, "Joules GFX: %0llX\n", p->energy_gfx);
-               outp += sprintf(outp, "Joules RAM: %0llX\n", p->energy_dram);
-               outp += sprintf(outp, "Throttle PKG: %0llX\n", p->rapl_pkg_perf_status);
-               outp += sprintf(outp, "Throttle RAM: %0llX\n", p->rapl_dram_perf_status);
+               outp += sprintf(outp, "Joules PKG: %0llX\n", p->energy_pkg.raw_value);
+               outp += sprintf(outp, "Joules COR: %0llX\n", p->energy_cores.raw_value);
+               outp += sprintf(outp, "Joules GFX: %0llX\n", p->energy_gfx.raw_value);
+               outp += sprintf(outp, "Joules RAM: %0llX\n", p->energy_dram.raw_value);
+               outp += sprintf(outp, "Throttle PKG: %0llX\n", p->rapl_pkg_perf_status.raw_value);
+               outp += sprintf(outp, "Throttle RAM: %0llX\n", p->rapl_dram_perf_status.raw_value);
                outp += sprintf(outp, "PTM: %dC\n", p->pkg_temp_c);
 
                for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) {
-                       outp += sprintf(outp, "pADDED [%d] msr0x%x: %08llX\n", i, mp->msr_num, p->counter[i]);
+                       outp +=
+                           sprintf(outp, "pADDED [%d] %8s msr0x%x: %08llX %s\n", i, mp->name, mp->msr_num,
+                                   p->counter[i], mp->path);
                }
        }
 
@@ -1728,6 +2032,23 @@ int dump_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p
        return 0;
 }
 
+double rapl_counter_get_value(const struct rapl_counter *c, enum rapl_unit desired_unit, double interval)
+{
+       assert(desired_unit != RAPL_UNIT_INVALID);
+
+       /*
+        * For now we don't expect anything other than joules,
+        * so just simplify the logic.
+        */
+       assert(c->unit == RAPL_UNIT_JOULES);
+
+       const double scaled = c->raw_value * c->scale;
+
+       if (desired_unit == RAPL_UNIT_WATTS)
+               return scaled / interval;
+       return scaled;
+}
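rapl_counter_get_value() converts a raw RAPL reading into Joules (raw * scale) or Watts (raw * scale / interval). The arithmetic in isolation, with the struct and enum reduced to what the math needs (the names `rapl_to_unit` and the trimmed struct are illustrative, not the patch's exact definitions):

```c
#include <assert.h>

enum rapl_unit { RAPL_UNIT_INVALID, RAPL_UNIT_JOULES, RAPL_UNIT_WATTS };

struct rapl_counter {
	unsigned long long raw_value;
	enum rapl_unit unit;
	double scale;
};

/* Same conversion as rapl_counter_get_value(): scale the raw energy
 * reading into Joules, then divide by the measurement interval when
 * power (Watts) is requested. */
double rapl_to_unit(const struct rapl_counter *c, enum rapl_unit want, double interval)
{
	double scaled = c->raw_value * c->scale;

	if (want == RAPL_UNIT_WATTS)
		return scaled / interval;
	return scaled;
}
```

Keeping raw_value and scale separate until display time is what lets the delta and accumulate paths work on integer raw values and defer floating-point conversion to format_counters().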
+
 /*
  * column formatting convention & formats
  */
@@ -1921,9 +2242,11 @@ int format_counters(struct thread_data *t, struct core_data *c, struct pkg_data
 
        if (DO_BIC(BIC_CorWatt) && platform->has_per_core_rapl)
                outp +=
-                   sprintf(outp, fmt8, (printed++ ? delim : ""), c->core_energy * rapl_energy_units / interval_float);
+                   sprintf(outp, fmt8, (printed++ ? delim : ""),
+                           rapl_counter_get_value(&c->core_energy, RAPL_UNIT_WATTS, interval_float));
        if (DO_BIC(BIC_Cor_J) && platform->has_per_core_rapl)
-               outp += sprintf(outp, fmt8, (printed++ ? delim : ""), c->core_energy * rapl_energy_units);
+               outp += sprintf(outp, fmt8, (printed++ ? delim : ""),
+                               rapl_counter_get_value(&c->core_energy, RAPL_UNIT_JOULES, interval_float));
 
        /* print per-package data only for 1st core in package */
        if (!is_cpu_first_core_in_package(t, c, p))
@@ -1951,6 +2274,24 @@ int format_counters(struct thread_data *t, struct core_data *c, struct pkg_data
        if (DO_BIC(BIC_GFXACTMHz))
                outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), p->gfx_act_mhz);
 
+       /* SAMmc6 */
+       if (DO_BIC(BIC_SAM_mc6)) {
+               if (p->sam_mc6_ms == -1) {      /* detect GFX counter reset */
+                       outp += sprintf(outp, "%s**.**", (printed++ ? delim : ""));
+               } else {
+                       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""),
+                                       p->sam_mc6_ms / 10.0 / interval_float);
+               }
+       }
+
+       /* SAMMHz */
+       if (DO_BIC(BIC_SAMMHz))
+               outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), p->sam_mhz);
+
+       /* SAMACTMHz */
+       if (DO_BIC(BIC_SAMACTMHz))
+               outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), p->sam_act_mhz);
+
        /* Totl%C0, Any%C0 GFX%C0 CPUGFX% */
        if (DO_BIC(BIC_Totl_c0))
                outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pkg_wtd_core_c0 / tsc);
@@ -1976,43 +2317,59 @@ int format_counters(struct thread_data *t, struct core_data *c, struct pkg_data
        if (DO_BIC(BIC_Pkgpc10))
                outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pc10 / tsc);
 
-       if (DO_BIC(BIC_CPU_LPI))
-               outp +=
-                   sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->cpu_lpi / 1000000.0 / interval_float);
-       if (DO_BIC(BIC_SYS_LPI))
-               outp +=
-                   sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->sys_lpi / 1000000.0 / interval_float);
+       if (DO_BIC(BIC_CPU_LPI)) {
+               if (p->cpu_lpi >= 0)
+                       outp +=
+                           sprintf(outp, "%s%.2f", (printed++ ? delim : ""),
+                                   100.0 * p->cpu_lpi / 1000000.0 / interval_float);
+               else
+                       outp += sprintf(outp, "%s(neg)", (printed++ ? delim : ""));
+       }
+       if (DO_BIC(BIC_SYS_LPI)) {
+               if (p->sys_lpi >= 0)
+                       outp +=
+                           sprintf(outp, "%s%.2f", (printed++ ? delim : ""),
+                                   100.0 * p->sys_lpi / 1000000.0 / interval_float);
+               else
+                       outp += sprintf(outp, "%s(neg)", (printed++ ? delim : ""));
+       }
 
        if (DO_BIC(BIC_PkgWatt))
                outp +=
-                   sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_pkg * rapl_energy_units / interval_float);
-
+                   sprintf(outp, fmt8, (printed++ ? delim : ""),
+                           rapl_counter_get_value(&p->energy_pkg, RAPL_UNIT_WATTS, interval_float));
        if (DO_BIC(BIC_CorWatt) && !platform->has_per_core_rapl)
                outp +=
-                   sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_cores * rapl_energy_units / interval_float);
+                   sprintf(outp, fmt8, (printed++ ? delim : ""),
+                           rapl_counter_get_value(&p->energy_cores, RAPL_UNIT_WATTS, interval_float));
        if (DO_BIC(BIC_GFXWatt))
                outp +=
-                   sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_gfx * rapl_energy_units / interval_float);
+                   sprintf(outp, fmt8, (printed++ ? delim : ""),
+                           rapl_counter_get_value(&p->energy_gfx, RAPL_UNIT_WATTS, interval_float));
        if (DO_BIC(BIC_RAMWatt))
                outp +=
                    sprintf(outp, fmt8, (printed++ ? delim : ""),
-                           p->energy_dram * rapl_dram_energy_units / interval_float);
+                           rapl_counter_get_value(&p->energy_dram, RAPL_UNIT_WATTS, interval_float));
        if (DO_BIC(BIC_Pkg_J))
-               outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_pkg * rapl_energy_units);
+               outp += sprintf(outp, fmt8, (printed++ ? delim : ""),
+                               rapl_counter_get_value(&p->energy_pkg, RAPL_UNIT_JOULES, interval_float));
        if (DO_BIC(BIC_Cor_J) && !platform->has_per_core_rapl)
-               outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_cores * rapl_energy_units);
+               outp += sprintf(outp, fmt8, (printed++ ? delim : ""),
+                               rapl_counter_get_value(&p->energy_cores, RAPL_UNIT_JOULES, interval_float));
        if (DO_BIC(BIC_GFX_J))
-               outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_gfx * rapl_energy_units);
+               outp += sprintf(outp, fmt8, (printed++ ? delim : ""),
+                               rapl_counter_get_value(&p->energy_gfx, RAPL_UNIT_JOULES, interval_float));
        if (DO_BIC(BIC_RAM_J))
-               outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_dram * rapl_dram_energy_units);
+               outp += sprintf(outp, fmt8, (printed++ ? delim : ""),
+                               rapl_counter_get_value(&p->energy_dram, RAPL_UNIT_JOULES, interval_float));
        if (DO_BIC(BIC_PKG__))
                outp +=
                    sprintf(outp, fmt8, (printed++ ? delim : ""),
-                           100.0 * p->rapl_pkg_perf_status * rapl_time_units / interval_float);
+                           rapl_counter_get_value(&p->rapl_pkg_perf_status, RAPL_UNIT_WATTS, interval_float));
        if (DO_BIC(BIC_RAM__))
                outp +=
                    sprintf(outp, fmt8, (printed++ ? delim : ""),
-                           100.0 * p->rapl_dram_perf_status * rapl_time_units / interval_float);
+                           rapl_counter_get_value(&p->rapl_dram_perf_status, RAPL_UNIT_WATTS, interval_float));
        /* UncMHz */
        if (DO_BIC(BIC_UNCORE_MHZ))
                outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), p->uncore_mhz);
@@ -2121,12 +2478,22 @@ int delta_package(struct pkg_data *new, struct pkg_data *old)
        old->gfx_mhz = new->gfx_mhz;
        old->gfx_act_mhz = new->gfx_act_mhz;
 
-       old->energy_pkg = new->energy_pkg - old->energy_pkg;
-       old->energy_cores = new->energy_cores - old->energy_cores;
-       old->energy_gfx = new->energy_gfx - old->energy_gfx;
-       old->energy_dram = new->energy_dram - old->energy_dram;
-       old->rapl_pkg_perf_status = new->rapl_pkg_perf_status - old->rapl_pkg_perf_status;
-       old->rapl_dram_perf_status = new->rapl_dram_perf_status - old->rapl_dram_perf_status;
+       /* flag an error when mc6 counter resets/wraps */
+       if (old->sam_mc6_ms > new->sam_mc6_ms)
+               old->sam_mc6_ms = -1;
+       else
+               old->sam_mc6_ms = new->sam_mc6_ms - old->sam_mc6_ms;
+
+       old->sam_mhz = new->sam_mhz;
+       old->sam_act_mhz = new->sam_act_mhz;
+
+       old->energy_pkg.raw_value = new->energy_pkg.raw_value - old->energy_pkg.raw_value;
+       old->energy_cores.raw_value = new->energy_cores.raw_value - old->energy_cores.raw_value;
+       old->energy_gfx.raw_value = new->energy_gfx.raw_value - old->energy_gfx.raw_value;
+       old->energy_dram.raw_value = new->energy_dram.raw_value - old->energy_dram.raw_value;
+       old->rapl_pkg_perf_status.raw_value = new->rapl_pkg_perf_status.raw_value - old->rapl_pkg_perf_status.raw_value;
+       old->rapl_dram_perf_status.raw_value =
+           new->rapl_dram_perf_status.raw_value - old->rapl_dram_perf_status.raw_value;
 
        for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) {
                if (mp->format == FORMAT_RAW)
@@ -2150,7 +2517,7 @@ void delta_core(struct core_data *new, struct core_data *old)
        old->core_throt_cnt = new->core_throt_cnt;
        old->mc6_us = new->mc6_us - old->mc6_us;
 
-       DELTA_WRAP32(new->core_energy, old->core_energy);
+       DELTA_WRAP32(new->core_energy.raw_value, old->core_energy.raw_value);
 
        for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
                if (mp->format == FORMAT_RAW)
@@ -2277,6 +2644,13 @@ int delta_cpu(struct thread_data *t, struct core_data *c,
        return retval;
 }
 
+void rapl_counter_clear(struct rapl_counter *c)
+{
+       c->raw_value = 0;
+       c->scale = 0.0;
+       c->unit = RAPL_UNIT_INVALID;
+}
+
 void clear_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
 {
        int i;
@@ -2304,7 +2678,7 @@ void clear_counters(struct thread_data *t, struct core_data *c, struct pkg_data
        c->c7 = 0;
        c->mc6_us = 0;
        c->core_temp_c = 0;
-       c->core_energy = 0;
+       rapl_counter_clear(&c->core_energy);
        c->core_throt_cnt = 0;
 
        p->pkg_wtd_core_c0 = 0;
@@ -2325,18 +2699,21 @@ void clear_counters(struct thread_data *t, struct core_data *c, struct pkg_data
        p->cpu_lpi = 0;
        p->sys_lpi = 0;
 
-       p->energy_pkg = 0;
-       p->energy_dram = 0;
-       p->energy_cores = 0;
-       p->energy_gfx = 0;
-       p->rapl_pkg_perf_status = 0;
-       p->rapl_dram_perf_status = 0;
+       rapl_counter_clear(&p->energy_pkg);
+       rapl_counter_clear(&p->energy_dram);
+       rapl_counter_clear(&p->energy_cores);
+       rapl_counter_clear(&p->energy_gfx);
+       rapl_counter_clear(&p->rapl_pkg_perf_status);
+       rapl_counter_clear(&p->rapl_dram_perf_status);
        p->pkg_temp_c = 0;
 
        p->gfx_rc6_ms = 0;
        p->uncore_mhz = 0;
        p->gfx_mhz = 0;
        p->gfx_act_mhz = 0;
+       p->sam_mc6_ms = 0;
+       p->sam_mhz = 0;
+       p->sam_act_mhz = 0;
        for (i = 0, mp = sys.tp; mp; i++, mp = mp->next)
                t->counter[i] = 0;
 
@@ -2347,6 +2724,20 @@ void clear_counters(struct thread_data *t, struct core_data *c, struct pkg_data
                p->counter[i] = 0;
 }
 
+void rapl_counter_accumulate(struct rapl_counter *dst, const struct rapl_counter *src)
+{
+       /* Copy unit and scale from src if dst is not initialized */
+       if (dst->unit == RAPL_UNIT_INVALID) {
+               dst->unit = src->unit;
+               dst->scale = src->scale;
+       }
+
+       assert(dst->unit == src->unit);
+       assert(dst->scale == src->scale);
+
+       dst->raw_value += src->raw_value;
+}
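rapl_counter_accumulate() adopts the unit and scale from the first initialized source, then insists every later source matches before summing raw values. A self-contained sketch of that contract (the `accumulate` name and trimmed struct are illustrative):

```c
#include <assert.h>

enum rapl_unit { RAPL_UNIT_INVALID, RAPL_UNIT_JOULES };

struct rapl_counter {
	unsigned long long raw_value;
	enum rapl_unit unit;
	double scale;
};

/* Mirror of rapl_counter_accumulate(): adopt unit/scale from the
 * first source when dst is still RAPL_UNIT_INVALID (as left by
 * rapl_counter_clear()), then require every later source to match
 * before adding its raw value. */
void accumulate(struct rapl_counter *dst, const struct rapl_counter *src)
{
	if (dst->unit == RAPL_UNIT_INVALID) {
		dst->unit = src->unit;
		dst->scale = src->scale;
	}

	assert(dst->unit == src->unit);
	assert(dst->scale == src->scale);

	dst->raw_value += src->raw_value;
}
```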
+
 int sum_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
 {
        int i;
@@ -2393,7 +2784,7 @@ int sum_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
        average.cores.core_temp_c = MAX(average.cores.core_temp_c, c->core_temp_c);
        average.cores.core_throt_cnt = MAX(average.cores.core_throt_cnt, c->core_throt_cnt);
 
-       average.cores.core_energy += c->core_energy;
+       rapl_counter_accumulate(&average.cores.core_energy, &c->core_energy);
 
        for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
                if (mp->format == FORMAT_RAW)
@@ -2428,25 +2819,29 @@ int sum_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
        average.packages.cpu_lpi = p->cpu_lpi;
        average.packages.sys_lpi = p->sys_lpi;
 
-       average.packages.energy_pkg += p->energy_pkg;
-       average.packages.energy_dram += p->energy_dram;
-       average.packages.energy_cores += p->energy_cores;
-       average.packages.energy_gfx += p->energy_gfx;
+       rapl_counter_accumulate(&average.packages.energy_pkg, &p->energy_pkg);
+       rapl_counter_accumulate(&average.packages.energy_dram, &p->energy_dram);
+       rapl_counter_accumulate(&average.packages.energy_cores, &p->energy_cores);
+       rapl_counter_accumulate(&average.packages.energy_gfx, &p->energy_gfx);
 
        average.packages.gfx_rc6_ms = p->gfx_rc6_ms;
        average.packages.uncore_mhz = p->uncore_mhz;
        average.packages.gfx_mhz = p->gfx_mhz;
        average.packages.gfx_act_mhz = p->gfx_act_mhz;
+       average.packages.sam_mc6_ms = p->sam_mc6_ms;
+       average.packages.sam_mhz = p->sam_mhz;
+       average.packages.sam_act_mhz = p->sam_act_mhz;
 
        average.packages.pkg_temp_c = MAX(average.packages.pkg_temp_c, p->pkg_temp_c);
 
-       average.packages.rapl_pkg_perf_status += p->rapl_pkg_perf_status;
-       average.packages.rapl_dram_perf_status += p->rapl_dram_perf_status;
+       rapl_counter_accumulate(&average.packages.rapl_pkg_perf_status, &p->rapl_pkg_perf_status);
+       rapl_counter_accumulate(&average.packages.rapl_dram_perf_status, &p->rapl_dram_perf_status);
 
        for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) {
-               if (mp->format == FORMAT_RAW)
-                       continue;
-               average.packages.counter[i] += p->counter[i];
+               if ((mp->format == FORMAT_RAW) && (topo.num_packages == 0))
+                       average.packages.counter[i] = p->counter[i];
+               else
+                       average.packages.counter[i] += p->counter[i];
        }
        return 0;
 }
@@ -2578,6 +2973,7 @@ unsigned long long snapshot_sysfs_counter(char *path)
 int get_mp(int cpu, struct msr_counter *mp, unsigned long long *counterp)
 {
        if (mp->msr_num != 0) {
+               assert(!no_msr);
                if (get_msr(cpu, mp->msr_num, counterp))
                        return -1;
        } else {
@@ -2599,7 +2995,7 @@ unsigned long long get_uncore_mhz(int package, int die)
 {
        char path[128];
 
-       sprintf(path, "/sys/devices/system/cpu/intel_uncore_frequency/package_0%d_die_0%d/current_freq_khz", package,
+       sprintf(path, "/sys/devices/system/cpu/intel_uncore_frequency/package_%02d_die_%02d/current_freq_khz", package,
                die);
 
        return (snapshot_sysfs_counter(path) / 1000);
@@ -2627,6 +3023,9 @@ int get_epb(int cpu)
        return epb;
 
 msr_fallback:
+       if (no_msr)
+               return -1;
+
        get_msr(cpu, MSR_IA32_ENERGY_PERF_BIAS, &msr);
 
        return msr & 0xf;
@@ -2700,187 +3099,495 @@ int get_core_throt_cnt(int cpu, unsigned long long *cnt)
        return 0;
 }
 
-/*
- * get_counters(...)
- * migrate to cpu
- * acquire and record local counters for that cpu
- */
-int get_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
+struct amperf_group_fd {
+       int aperf;              /* Also the group descriptor */
+       int mperf;
+};
+
+static int read_perf_counter_info(const char *const path, const char *const parse_format, void *value_ptr)
 {
-       int cpu = t->cpu_id;
-       unsigned long long msr;
-       int aperf_mperf_retry_count = 0;
-       struct msr_counter *mp;
-       int i;
+       int fdmt;
+       int bytes_read;
+       char buf[64];
+       int ret = -1;
 
-       if (cpu_migrate(cpu)) {
-               fprintf(outf, "get_counters: Could not migrate to CPU %d\n", cpu);
-               return -1;
+       fdmt = open(path, O_RDONLY, 0);
+       if (fdmt == -1) {
+               if (debug)
+                       fprintf(stderr, "Failed to parse perf counter info %s\n", path);
+               ret = -1;
+               goto cleanup_and_exit;
        }
 
-       gettimeofday(&t->tv_begin, (struct timezone *)NULL);
+       bytes_read = read(fdmt, buf, sizeof(buf) - 1);
+       if (bytes_read <= 0 || bytes_read >= (int)sizeof(buf)) {
+               if (debug)
+                       fprintf(stderr, "Failed to parse perf counter info %s\n", path);
+               ret = -1;
+               goto cleanup_and_exit;
+       }
 
-       if (first_counter_read)
-               get_apic_id(t);
-retry:
-       t->tsc = rdtsc();       /* we are running on local CPU of interest */
+       buf[bytes_read] = '\0';
 
-       if (DO_BIC(BIC_Avg_MHz) || DO_BIC(BIC_Busy) || DO_BIC(BIC_Bzy_MHz) || DO_BIC(BIC_IPC)
-           || soft_c1_residency_display(BIC_Avg_MHz)) {
-               unsigned long long tsc_before, tsc_between, tsc_after, aperf_time, mperf_time;
+       if (sscanf(buf, parse_format, value_ptr) != 1) {
+               if (debug)
+                       fprintf(stderr, "Failed to parse perf counter info %s\n", path);
+               ret = -1;
+               goto cleanup_and_exit;
+       }
 
-               /*
-                * The TSC, APERF and MPERF must be read together for
-                * APERF/MPERF and MPERF/TSC to give accurate results.
-                *
-                * Unfortunately, APERF and MPERF are read by
-                * individual system call, so delays may occur
-                * between them.  If the time to read them
-                * varies by a large amount, we re-read them.
-                */
+       ret = 0;
 
-               /*
-                * This initial dummy APERF read has been seen to
-                * reduce jitter in the subsequent reads.
-                */
+cleanup_and_exit:
+       close(fdmt);
+       return ret;
+}
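read_perf_counter_info() reads one short sysfs attribute into a buffer and parses it with a caller-supplied scanf format: "%u" for the PMU type file, "event=%x" for the event encoding files. The parsing step in isolation (the `parse_event_config` helper is hypothetical, showing only the sscanf convention the callers above rely on):

```c
#include <assert.h>
#include <stdio.h>

/* Parse a sysfs event attribute string the way read_aperf_config()
 * and read_rapl_config() do: "event=%x" extracts the hex event id
 * from text such as "event=e8\n". Returns 0 on success, -1 if the
 * string does not match the expected format. */
int parse_event_config(const char *buf, unsigned int *config)
{
	if (sscanf(buf, "event=%x", config) != 1)
		return -1;
	return 0;
}
```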
 
-               if (get_msr(cpu, MSR_IA32_APERF, &t->aperf))
-                       return -3;
+static unsigned int read_perf_counter_info_n(const char *const path, const char *const parse_format)
+{
+       unsigned int v;
+       int status;
 
-               t->tsc = rdtsc();       /* re-read close to APERF */
+       status = read_perf_counter_info(path, parse_format, &v);
+       if (status)
+               v = -1;
 
-               tsc_before = t->tsc;
+       return v;
+}
 
-               if (get_msr(cpu, MSR_IA32_APERF, &t->aperf))
-                       return -3;
+static unsigned int read_msr_type(void)
+{
+       const char *const path = "/sys/bus/event_source/devices/msr/type";
+       const char *const format = "%u";
 
-               tsc_between = rdtsc();
+       return read_perf_counter_info_n(path, format);
+}
 
-               if (get_msr(cpu, MSR_IA32_MPERF, &t->mperf))
-                       return -4;
+static unsigned int read_aperf_config(void)
+{
+       const char *const path = "/sys/bus/event_source/devices/msr/events/aperf";
+       const char *const format = "event=%x";
 
-               tsc_after = rdtsc();
+       return read_perf_counter_info_n(path, format);
+}
 
-               aperf_time = tsc_between - tsc_before;
-               mperf_time = tsc_after - tsc_between;
+static unsigned int read_mperf_config(void)
+{
+       const char *const path = "/sys/bus/event_source/devices/msr/events/mperf";
+       const char *const format = "event=%x";
 
-               /*
-                * If the system call latency to read APERF and MPERF
-                * differ by more than 2x, then try again.
-                */
-               if ((aperf_time > (2 * mperf_time)) || (mperf_time > (2 * aperf_time))) {
-                       aperf_mperf_retry_count++;
-                       if (aperf_mperf_retry_count < 5)
-                               goto retry;
-                       else
-                               warnx("cpu%d jitter %lld %lld", cpu, aperf_time, mperf_time);
-               }
-               aperf_mperf_retry_count = 0;
+       return read_perf_counter_info_n(path, format);
+}
 
-               t->aperf = t->aperf * aperf_mperf_multiplier;
-               t->mperf = t->mperf * aperf_mperf_multiplier;
-       }
+static unsigned int read_perf_type(const char *subsys)
+{
+       const char *const path_format = "/sys/bus/event_source/devices/%s/type";
+       const char *const format = "%u";
+       char path[128];
 
-       if (DO_BIC(BIC_IPC))
-               if (read(get_instr_count_fd(cpu), &t->instr_count, sizeof(long long)) != sizeof(long long))
-                       return -4;
+       snprintf(path, sizeof(path), path_format, subsys);
 
-       if (DO_BIC(BIC_IRQ))
-               t->irq_count = irqs_per_cpu[cpu];
-       if (DO_BIC(BIC_SMI)) {
-               if (get_msr(cpu, MSR_SMI_COUNT, &msr))
-                       return -5;
-               t->smi_count = msr & 0xFFFFFFFF;
-       }
-       if (DO_BIC(BIC_CPU_c1) && platform->has_msr_core_c1_res) {
-               if (get_msr(cpu, MSR_CORE_C1_RES, &t->c1))
-                       return -6;
-       }
+       return read_perf_counter_info_n(path, format);
+}
 
-       for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
-               if (get_mp(cpu, mp, &t->counter[i]))
-                       return -10;
-       }
+static unsigned int read_rapl_config(const char *subsys, const char *event_name)
+{
+       const char *const path_format = "/sys/bus/event_source/devices/%s/events/%s";
+       const char *const format = "event=%x";
+       char path[128];
 
-       /* collect core counters only for 1st thread in core */
-       if (!is_cpu_first_thread_in_core(t, c, p))
-               goto done;
+       snprintf(path, sizeof(path), path_format, subsys, event_name);
 
-       if (DO_BIC(BIC_CPU_c3) || soft_c1_residency_display(BIC_CPU_c3)) {
-               if (get_msr(cpu, MSR_CORE_C3_RESIDENCY, &c->c3))
-                       return -6;
-       }
+       return read_perf_counter_info_n(path, format);
+}
 
-       if ((DO_BIC(BIC_CPU_c6) || soft_c1_residency_display(BIC_CPU_c6)) && !platform->has_msr_knl_core_c6_residency) {
-               if (get_msr(cpu, MSR_CORE_C6_RESIDENCY, &c->c6))
-                       return -7;
-       } else if (platform->has_msr_knl_core_c6_residency && soft_c1_residency_display(BIC_CPU_c6)) {
-               if (get_msr(cpu, MSR_KNL_CORE_C6_RESIDENCY, &c->c6))
-                       return -7;
-       }
+static unsigned int read_perf_rapl_unit(const char *subsys, const char *event_name)
+{
+       const char *const path_format = "/sys/bus/event_source/devices/%s/events/%s.unit";
+       const char *const format = "%s";
+       char path[128];
+       char unit_buffer[16];
 
-       if (DO_BIC(BIC_CPU_c7) || soft_c1_residency_display(BIC_CPU_c7)) {
-               if (get_msr(cpu, MSR_CORE_C7_RESIDENCY, &c->c7))
-                       return -8;
-               else if (t->is_atom) {
-                       /*
-                        * For Atom CPUs that have core cstates deeper than c6,
-                        * MSR_CORE_C6_RESIDENCY returns residency of cc6 and deeper.
-                        * Subtract CC7 (and deeper cstates) residency to get
-                        * accurate cc6 residency.
-                        */

-                       c->c6 -= c->c7;
-               }
-       }
+       snprintf(path, sizeof(path), path_format, subsys, event_name);
 
-       if (DO_BIC(BIC_Mod_c6))
-               if (get_msr(cpu, MSR_MODULE_C6_RES_MS, &c->mc6_us))
-                       return -8;
+       read_perf_counter_info(path, format, &unit_buffer);
+       if (strcmp("Joules", unit_buffer) == 0)
+               return RAPL_UNIT_JOULES;
 
-       if (DO_BIC(BIC_CoreTmp)) {
-               if (get_msr(cpu, MSR_IA32_THERM_STATUS, &msr))
-                       return -9;
-               c->core_temp_c = tj_max - ((msr >> 16) & 0x7F);
-       }
+       return RAPL_UNIT_INVALID;
+}
 
-       if (DO_BIC(BIC_CORE_THROT_CNT))
-               get_core_throt_cnt(cpu, &c->core_throt_cnt);
+static double read_perf_rapl_scale(const char *subsys, const char *event_name)
+{
+       const char *const path_format = "/sys/bus/event_source/devices/%s/events/%s.scale";
+       const char *const format = "%lf";
+       char path[128];
+       double scale;
 
-       if (platform->rapl_msrs & RAPL_AMD_F17H) {
-               if (get_msr(cpu, MSR_CORE_ENERGY_STAT, &msr))
-                       return -14;
-               c->core_energy = msr & 0xFFFFFFFF;
-       }
+       snprintf(path, sizeof(path), path_format, subsys, event_name);
 
-       for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
-               if (get_mp(cpu, mp, &c->counter[i]))
-                       return -10;
-       }
+       if (read_perf_counter_info(path, format, &scale))
+               return 0.0;
 
-       /* collect package counters only for 1st core in package */
-       if (!is_cpu_first_core_in_package(t, c, p))
-               goto done;
+       return scale;
+}
 
-       if (DO_BIC(BIC_Totl_c0)) {
-               if (get_msr(cpu, MSR_PKG_WEIGHTED_CORE_C0_RES, &p->pkg_wtd_core_c0))
-                       return -10;
-       }
-       if (DO_BIC(BIC_Any_c0)) {
-               if (get_msr(cpu, MSR_PKG_ANY_CORE_C0_RES, &p->pkg_any_core_c0))
-                       return -11;
-       }
-       if (DO_BIC(BIC_GFX_c0)) {
-               if (get_msr(cpu, MSR_PKG_ANY_GFXE_C0_RES, &p->pkg_any_gfxe_c0))
-                       return -12;
-       }
-       if (DO_BIC(BIC_CPUGFX)) {
-               if (get_msr(cpu, MSR_PKG_BOTH_CORE_GFXE_C0_RES, &p->pkg_both_core_gfxe_c0))
-                       return -13;
-       }
-       if (DO_BIC(BIC_Pkgpc3))
-               if (get_msr(cpu, MSR_PKG_C3_RESIDENCY, &p->pc3))
-                       return -9;
-       if (DO_BIC(BIC_Pkgpc6)) {
+static struct amperf_group_fd open_amperf_fd(int cpu)
+{
+       const unsigned int msr_type = read_msr_type();
+       const unsigned int aperf_config = read_aperf_config();
+       const unsigned int mperf_config = read_mperf_config();
+       struct amperf_group_fd fds = {.aperf = -1, .mperf = -1 };
+
+       fds.aperf = open_perf_counter(cpu, msr_type, aperf_config, -1, PERF_FORMAT_GROUP);
+       fds.mperf = open_perf_counter(cpu, msr_type, mperf_config, fds.aperf, PERF_FORMAT_GROUP);
+
+       return fds;
+}
+
+static int get_amperf_fd(int cpu)
+{
+       assert(fd_amperf_percpu);
+
+       if (fd_amperf_percpu[cpu].aperf)
+               return fd_amperf_percpu[cpu].aperf;
+
+       fd_amperf_percpu[cpu] = open_amperf_fd(cpu);
+
+       return fd_amperf_percpu[cpu].aperf;
+}
+
+/* Read APERF, MPERF and TSC using the perf API. */
+static int read_aperf_mperf_tsc_perf(struct thread_data *t, int cpu)
+{
+       union {
+               struct {
+                       unsigned long nr_entries;
+                       unsigned long aperf;
+                       unsigned long mperf;
+               };
+
+               unsigned long as_array[3];
+       } cnt;
+
+       const int fd_amperf = get_amperf_fd(cpu);
+
+       /*
+        * Read the TSC with rdtsc, because we want the absolute value and not
+        * the offset from the start of the counter.
+        */
+       t->tsc = rdtsc();
+
+       const int n = read(fd_amperf, &cnt.as_array[0], sizeof(cnt.as_array));
+
+       if (n != sizeof(cnt.as_array))
+               return -2;
+
+       t->aperf = cnt.aperf * aperf_mperf_multiplier;
+       t->mperf = cnt.mperf * aperf_mperf_multiplier;
+
+       return 0;
+}
+
+/* Read APERF, MPERF and TSC using the MSR driver and rdtsc instruction. */
+static int read_aperf_mperf_tsc_msr(struct thread_data *t, int cpu)
+{
+       unsigned long long tsc_before, tsc_between, tsc_after, aperf_time, mperf_time;
+       int aperf_mperf_retry_count = 0;
+
+       /*
+        * The TSC, APERF and MPERF must be read together for
+        * APERF/MPERF and MPERF/TSC to give accurate results.
+        *
+        * Unfortunately, APERF and MPERF are read by
+        * individual system calls, so delays may occur
+        * between them.  If the time to read them
+        * varies by a large amount, we re-read them.
+        */
+
+       /*
+        * This initial dummy APERF read has been seen to
+        * reduce jitter in the subsequent reads.
+        */
+
+       if (get_msr(cpu, MSR_IA32_APERF, &t->aperf))
+               return -3;
+
+retry:
+       t->tsc = rdtsc();       /* re-read close to APERF */
+
+       tsc_before = t->tsc;
+
+       if (get_msr(cpu, MSR_IA32_APERF, &t->aperf))
+               return -3;
+
+       tsc_between = rdtsc();
+
+       if (get_msr(cpu, MSR_IA32_MPERF, &t->mperf))
+               return -4;
+
+       tsc_after = rdtsc();
+
+       aperf_time = tsc_between - tsc_before;
+       mperf_time = tsc_after - tsc_between;
+
+       /*
+        * If the system call latencies to read APERF and MPERF
+        * differ by more than 2x, then try again.
+        */
+       if ((aperf_time > (2 * mperf_time)) || (mperf_time > (2 * aperf_time))) {
+               aperf_mperf_retry_count++;
+               if (aperf_mperf_retry_count < 5)
+                       goto retry;
+               else
+                       warnx("cpu%d jitter %lld %lld", cpu, aperf_time, mperf_time);
+       }
+       aperf_mperf_retry_count = 0;
+
+       t->aperf = t->aperf * aperf_mperf_multiplier;
+       t->mperf = t->mperf * aperf_mperf_multiplier;
+
+       return 0;
+}
+
+size_t rapl_counter_info_count_perf(const struct rapl_counter_info_t *rci)
+{
+       size_t ret = 0;
+
+       for (int i = 0; i < NUM_RAPL_COUNTERS; ++i)
+               if (rci->source[i] == RAPL_SOURCE_PERF)
+                       ++ret;
+
+       return ret;
+}
+
+void write_rapl_counter(struct rapl_counter *rc, struct rapl_counter_info_t *rci, unsigned int idx)
+{
+       rc->raw_value = rci->data[idx];
+       rc->unit = rci->unit[idx];
+       rc->scale = rci->scale[idx];
+}
+
+int get_rapl_counters(int cpu, int domain, struct core_data *c, struct pkg_data *p)
+{
+       unsigned long long perf_data[NUM_RAPL_COUNTERS + 1];
+       struct rapl_counter_info_t *rci = &rapl_counter_info_perdomain[domain];
+
+       if (debug)
+               fprintf(stderr, "%s: cpu%d domain%d\n", __func__, cpu, domain);
+
+       assert(rapl_counter_info_perdomain);
+
+       /*
+        * If we have any perf counters to read, read them all now, in bulk
+        */
+       if (rci->fd_perf != -1) {
+               size_t num_perf_counters = rapl_counter_info_count_perf(rci);
+               const ssize_t expected_read_size = (num_perf_counters + 1) * sizeof(unsigned long long);
+               const ssize_t actual_read_size = read(rci->fd_perf, &perf_data[0], sizeof(perf_data));
+
+               if (actual_read_size != expected_read_size)
+                       err(-1, "%s: failed to read perf_data (%zd %zd)", __func__, expected_read_size,
+                           actual_read_size);
+       }
+
+       for (unsigned int i = 0, pi = 1; i < NUM_RAPL_COUNTERS; ++i) {
+               switch (rci->source[i]) {
+               case RAPL_SOURCE_NONE:
+                       break;
+
+               case RAPL_SOURCE_PERF:
+                       assert(pi < ARRAY_SIZE(perf_data));
+                       assert(rci->fd_perf != -1);
+
+                       if (debug)
+                               fprintf(stderr, "Reading rapl counter via perf at %u (%llu %e %lf)\n",
+                                       i, perf_data[pi], rci->scale[i], perf_data[pi] * rci->scale[i]);
+
+                       rci->data[i] = perf_data[pi];
+
+                       ++pi;
+                       break;
+
+               case RAPL_SOURCE_MSR:
+                       if (debug)
+                               fprintf(stderr, "Reading rapl counter via msr at %u\n", i);
+
+                       assert(!no_msr);
+                       if (rci->flags[i] & RAPL_COUNTER_FLAG_USE_MSR_SUM) {
+                               if (get_msr_sum(cpu, rci->msr[i], &rci->data[i]))
+                                       return -13 - i;
+                       } else {
+                               if (get_msr(cpu, rci->msr[i], &rci->data[i]))
+                                       return -13 - i;
+                       }
+
+                       rci->data[i] &= rci->msr_mask[i];
+                       if (rci->msr_shift[i] >= 0)
+                               rci->data[i] >>= abs(rci->msr_shift[i]);
+                       else
+                               rci->data[i] <<= abs(rci->msr_shift[i]);
+
+                       break;
+               }
+       }
+
+       _Static_assert(NUM_RAPL_COUNTERS == 7, "NUM_RAPL_COUNTERS must match the write_rapl_counter() calls below");
+       write_rapl_counter(&p->energy_pkg, rci, RAPL_RCI_INDEX_ENERGY_PKG);
+       write_rapl_counter(&p->energy_cores, rci, RAPL_RCI_INDEX_ENERGY_CORES);
+       write_rapl_counter(&p->energy_dram, rci, RAPL_RCI_INDEX_DRAM);
+       write_rapl_counter(&p->energy_gfx, rci, RAPL_RCI_INDEX_GFX);
+       write_rapl_counter(&p->rapl_pkg_perf_status, rci, RAPL_RCI_INDEX_PKG_PERF_STATUS);
+       write_rapl_counter(&p->rapl_dram_perf_status, rci, RAPL_RCI_INDEX_DRAM_PERF_STATUS);
+       write_rapl_counter(&c->core_energy, rci, RAPL_RCI_INDEX_CORE_ENERGY);
+
+       return 0;
+}
+
+/*
+ * get_counters(...)
+ * migrate to cpu
+ * acquire and record local counters for that cpu
+ */
+int get_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
+{
+       int cpu = t->cpu_id;
+       unsigned long long msr;
+       struct msr_counter *mp;
+       int i;
+       int status;
+
+       if (cpu_migrate(cpu)) {
+               fprintf(outf, "%s: Could not migrate to CPU %d\n", __func__, cpu);
+               return -1;
+       }
+
+       gettimeofday(&t->tv_begin, (struct timezone *)NULL);
+
+       if (first_counter_read)
+               get_apic_id(t);
+
+       t->tsc = rdtsc();       /* we are running on local CPU of interest */
+
+       if (DO_BIC(BIC_Avg_MHz) || DO_BIC(BIC_Busy) || DO_BIC(BIC_Bzy_MHz) || DO_BIC(BIC_IPC)
+           || soft_c1_residency_display(BIC_Avg_MHz)) {
+               int status = -1;
+
+               assert(!no_perf || !no_msr);
+
+               switch (amperf_source) {
+               case AMPERF_SOURCE_PERF:
+                       status = read_aperf_mperf_tsc_perf(t, cpu);
+                       break;
+               case AMPERF_SOURCE_MSR:
+                       status = read_aperf_mperf_tsc_msr(t, cpu);
+                       break;
+               }
+
+               if (status != 0)
+                       return status;
+       }
+
+       if (DO_BIC(BIC_IPC))
+               if (read(get_instr_count_fd(cpu), &t->instr_count, sizeof(long long)) != sizeof(long long))
+                       return -4;
+
+       if (DO_BIC(BIC_IRQ))
+               t->irq_count = irqs_per_cpu[cpu];
+       if (DO_BIC(BIC_SMI)) {
+               if (get_msr(cpu, MSR_SMI_COUNT, &msr))
+                       return -5;
+               t->smi_count = msr & 0xFFFFFFFF;
+       }
+       if (DO_BIC(BIC_CPU_c1) && platform->has_msr_core_c1_res) {
+               if (get_msr(cpu, MSR_CORE_C1_RES, &t->c1))
+                       return -6;
+       }
+
+       for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
+               if (get_mp(cpu, mp, &t->counter[i]))
+                       return -10;
+       }
+
+       /* collect core counters only for 1st thread in core */
+       if (!is_cpu_first_thread_in_core(t, c, p))
+               goto done;
+
+       if (platform->has_per_core_rapl) {
+               status = get_rapl_counters(cpu, c->core_id, c, p);
+               if (status != 0)
+                       return status;
+       }
+
+       if (DO_BIC(BIC_CPU_c3) || soft_c1_residency_display(BIC_CPU_c3)) {
+               if (get_msr(cpu, MSR_CORE_C3_RESIDENCY, &c->c3))
+                       return -6;
+       }
+
+       if ((DO_BIC(BIC_CPU_c6) || soft_c1_residency_display(BIC_CPU_c6)) && !platform->has_msr_knl_core_c6_residency) {
+               if (get_msr(cpu, MSR_CORE_C6_RESIDENCY, &c->c6))
+                       return -7;
+       } else if (platform->has_msr_knl_core_c6_residency && soft_c1_residency_display(BIC_CPU_c6)) {
+               if (get_msr(cpu, MSR_KNL_CORE_C6_RESIDENCY, &c->c6))
+                       return -7;
+       }
+
+       if (DO_BIC(BIC_CPU_c7) || soft_c1_residency_display(BIC_CPU_c7)) {
+               if (get_msr(cpu, MSR_CORE_C7_RESIDENCY, &c->c7))
+                       return -8;
+               else if (t->is_atom) {
+                       /*
+                        * For Atom CPUs that have core cstates deeper than c6,
+                        * MSR_CORE_C6_RESIDENCY returns the residency of cc6 and deeper.
+                        * Subtract the CC7 (and deeper cstates) residency to get
+                        * accurate cc6 residency.
+                        */
+                       c->c6 -= c->c7;
+               }
+       }
+
+       if (DO_BIC(BIC_Mod_c6))
+               if (get_msr(cpu, MSR_MODULE_C6_RES_MS, &c->mc6_us))
+                       return -8;
+
+       if (DO_BIC(BIC_CoreTmp)) {
+               if (get_msr(cpu, MSR_IA32_THERM_STATUS, &msr))
+                       return -9;
+               c->core_temp_c = tj_max - ((msr >> 16) & 0x7F);
+       }
+
+       if (DO_BIC(BIC_CORE_THROT_CNT))
+               get_core_throt_cnt(cpu, &c->core_throt_cnt);
+
+       for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
+               if (get_mp(cpu, mp, &c->counter[i]))
+                       return -10;
+       }
+
+       /* collect package counters only for 1st core in package */
+       if (!is_cpu_first_core_in_package(t, c, p))
+               goto done;
+
+       if (DO_BIC(BIC_Totl_c0)) {
+               if (get_msr(cpu, MSR_PKG_WEIGHTED_CORE_C0_RES, &p->pkg_wtd_core_c0))
+                       return -10;
+       }
+       if (DO_BIC(BIC_Any_c0)) {
+               if (get_msr(cpu, MSR_PKG_ANY_CORE_C0_RES, &p->pkg_any_core_c0))
+                       return -11;
+       }
+       if (DO_BIC(BIC_GFX_c0)) {
+               if (get_msr(cpu, MSR_PKG_ANY_GFXE_C0_RES, &p->pkg_any_gfxe_c0))
+                       return -12;
+       }
+       if (DO_BIC(BIC_CPUGFX)) {
+               if (get_msr(cpu, MSR_PKG_BOTH_CORE_GFXE_C0_RES, &p->pkg_both_core_gfxe_c0))
+                       return -13;
+       }
+       if (DO_BIC(BIC_Pkgpc3))
+               if (get_msr(cpu, MSR_PKG_C3_RESIDENCY, &p->pc3))
+                       return -9;
+       if (DO_BIC(BIC_Pkgpc6)) {
                if (platform->has_msr_atom_pkg_c6_residency) {
                        if (get_msr(cpu, MSR_ATOM_PKG_C6_RESIDENCY, &p->pc6))
                                return -10;
@@ -2911,59 +3618,39 @@ retry:
        if (DO_BIC(BIC_SYS_LPI))
                p->sys_lpi = cpuidle_cur_sys_lpi_us;
 
-       if (platform->rapl_msrs & RAPL_PKG) {
-               if (get_msr_sum(cpu, MSR_PKG_ENERGY_STATUS, &msr))
-                       return -13;
-               p->energy_pkg = msr;
-       }
-       if (platform->rapl_msrs & RAPL_CORE_ENERGY_STATUS) {
-               if (get_msr_sum(cpu, MSR_PP0_ENERGY_STATUS, &msr))
-                       return -14;
-               p->energy_cores = msr;
-       }
-       if (platform->rapl_msrs & RAPL_DRAM) {
-               if (get_msr_sum(cpu, MSR_DRAM_ENERGY_STATUS, &msr))
-                       return -15;
-               p->energy_dram = msr;
-       }
-       if (platform->rapl_msrs & RAPL_GFX) {
-               if (get_msr_sum(cpu, MSR_PP1_ENERGY_STATUS, &msr))
-                       return -16;
-               p->energy_gfx = msr;
-       }
-       if (platform->rapl_msrs & RAPL_PKG_PERF_STATUS) {
-               if (get_msr_sum(cpu, MSR_PKG_PERF_STATUS, &msr))
-                       return -16;
-               p->rapl_pkg_perf_status = msr;
-       }
-       if (platform->rapl_msrs & RAPL_DRAM_PERF_STATUS) {
-               if (get_msr_sum(cpu, MSR_DRAM_PERF_STATUS, &msr))
-                       return -16;
-               p->rapl_dram_perf_status = msr;
-       }
-       if (platform->rapl_msrs & RAPL_AMD_F17H) {
-               if (get_msr_sum(cpu, MSR_PKG_ENERGY_STAT, &msr))
-                       return -13;
-               p->energy_pkg = msr;
+       if (!platform->has_per_core_rapl) {
+               status = get_rapl_counters(cpu, p->package_id, c, p);
+               if (status != 0)
+                       return status;
        }
+
        if (DO_BIC(BIC_PkgTmp)) {
                if (get_msr(cpu, MSR_IA32_PACKAGE_THERM_STATUS, &msr))
                        return -17;
                p->pkg_temp_c = tj_max - ((msr >> 16) & 0x7F);
        }
 
-       if (DO_BIC(BIC_GFX_rc6))
-               p->gfx_rc6_ms = gfx_cur_rc6_ms;
-
        /* n.b. assume die0 uncore frequency applies to whole package */
        if (DO_BIC(BIC_UNCORE_MHZ))
                p->uncore_mhz = get_uncore_mhz(p->package_id, 0);
 
+       if (DO_BIC(BIC_GFX_rc6))
+               p->gfx_rc6_ms = gfx_info[GFX_rc6].val_ull;
+
        if (DO_BIC(BIC_GFXMHz))
-               p->gfx_mhz = gfx_cur_mhz;
+               p->gfx_mhz = gfx_info[GFX_MHz].val;
 
        if (DO_BIC(BIC_GFXACTMHz))
-               p->gfx_act_mhz = gfx_act_mhz;
+               p->gfx_act_mhz = gfx_info[GFX_ACTMHz].val;
+
+       if (DO_BIC(BIC_SAM_mc6))
+               p->sam_mc6_ms = gfx_info[SAM_mc6].val_ull;
+
+       if (DO_BIC(BIC_SAMMHz))
+               p->sam_mhz = gfx_info[SAM_MHz].val;
+
+       if (DO_BIC(BIC_SAMACTMHz))
+               p->sam_act_mhz = gfx_info[SAM_ACTMHz].val;
 
        for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) {
                if (get_mp(cpu, mp, &p->counter[i]))
@@ -3053,7 +3740,7 @@ void probe_cst_limit(void)
        unsigned long long msr;
        int *pkg_cstate_limits;
 
-       if (!platform->has_nhm_msrs)
+       if (!platform->has_nhm_msrs || no_msr)
                return;
 
        switch (platform->cst_limit) {
@@ -3097,7 +3784,7 @@ static void dump_platform_info(void)
        unsigned long long msr;
        unsigned int ratio;
 
-       if (!platform->has_nhm_msrs)
+       if (!platform->has_nhm_msrs || no_msr)
                return;
 
        get_msr(base_cpu, MSR_PLATFORM_INFO, &msr);
@@ -3115,7 +3802,7 @@ static void dump_power_ctl(void)
 {
        unsigned long long msr;
 
-       if (!platform->has_nhm_msrs)
+       if (!platform->has_nhm_msrs || no_msr)
                return;
 
        get_msr(base_cpu, MSR_IA32_POWER_CTL, &msr);
@@ -3321,7 +4008,7 @@ static void dump_cst_cfg(void)
 {
        unsigned long long msr;
 
-       if (!platform->has_nhm_msrs)
+       if (!platform->has_nhm_msrs || no_msr)
                return;
 
        get_msr(base_cpu, MSR_PKG_CST_CONFIG_CONTROL, &msr);
@@ -3393,7 +4080,7 @@ void print_irtl(void)
 {
        unsigned long long msr;
 
-       if (!platform->has_irtl_msrs)
+       if (!platform->has_irtl_msrs || no_msr)
                return;
 
        if (platform->supported_cstates & PC3) {
@@ -3443,12 +4130,64 @@ void free_fd_percpu(void)
 {
        int i;
 
+       if (!fd_percpu)
+               return;
+
        for (i = 0; i < topo.max_cpu_num + 1; ++i) {
                if (fd_percpu[i] != 0)
                        close(fd_percpu[i]);
        }
 
        free(fd_percpu);
+       fd_percpu = NULL;
+}
+
+void free_fd_amperf_percpu(void)
+{
+       int i;
+
+       if (!fd_amperf_percpu)
+               return;
+
+       for (i = 0; i < topo.max_cpu_num + 1; ++i) {
+               if (fd_amperf_percpu[i].mperf != 0)
+                       close(fd_amperf_percpu[i].mperf);
+
+               if (fd_amperf_percpu[i].aperf != 0)
+                       close(fd_amperf_percpu[i].aperf);
+       }
+
+       free(fd_amperf_percpu);
+       fd_amperf_percpu = NULL;
+}
+
+void free_fd_instr_count_percpu(void)
+{
+       if (!fd_instr_count_percpu)
+               return;
+
+       for (int i = 0; i < topo.max_cpu_num + 1; ++i) {
+               if (fd_instr_count_percpu[i] != 0)
+                       close(fd_instr_count_percpu[i]);
+       }
+
+       free(fd_instr_count_percpu);
+       fd_instr_count_percpu = NULL;
+}
+
+void free_fd_rapl_percpu(void)
+{
+       if (!rapl_counter_info_perdomain)
+               return;
+
+       const int num_domains = platform->has_per_core_rapl ? topo.num_cores : topo.num_packages;
+
+       for (int domain_id = 0; domain_id < num_domains; ++domain_id) {
+               if (rapl_counter_info_perdomain[domain_id].fd_perf != -1)
+                       close(rapl_counter_info_perdomain[domain_id].fd_perf);
+       }
+
+       free(rapl_counter_info_perdomain);
 }
 
 void free_all_buffers(void)
@@ -3492,6 +4231,9 @@ void free_all_buffers(void)
        outp = NULL;
 
        free_fd_percpu();
+       free_fd_instr_count_percpu();
+       free_fd_amperf_percpu();
+       free_fd_rapl_percpu();
 
        free(irq_column_2_cpu);
        free(irqs_per_cpu);
@@ -3825,11 +4567,17 @@ static void update_effective_set(bool startup)
                err(1, "%s: cpu str malformat %s\n", PATH_EFFECTIVE_CPUS, cpu_effective_str);
 }
 
+void linux_perf_init(void);
+void rapl_perf_init(void);
+
 void re_initialize(void)
 {
        free_all_buffers();
        setup_all_buffers(false);
-       fprintf(outf, "turbostat: re-initialized with num_cpus %d, allowed_cpus %d\n", topo.num_cpus, topo.allowed_cpus);
+       linux_perf_init();
+       rapl_perf_init();
+       fprintf(outf, "turbostat: re-initialized with num_cpus %d, allowed_cpus %d\n", topo.num_cpus,
+               topo.allowed_cpus);
 }
 
 void set_max_cpu_num(void)
@@ -3940,85 +4688,43 @@ int snapshot_proc_interrupts(void)
 }
 
 /*
- * snapshot_gfx_rc6_ms()
+ * snapshot_graphics()
  *
- * record snapshot of
- * /sys/class/drm/card0/power/rc6_residency_ms
+ * record snapshot of specified graphics sysfs knob
  *
  * return 1 if config change requires a restart, else return 0
  */
-int snapshot_gfx_rc6_ms(void)
+int snapshot_graphics(int idx)
 {
        FILE *fp;
        int retval;
 
-       fp = fopen_or_die("/sys/class/drm/card0/power/rc6_residency_ms", "r");
-
-       retval = fscanf(fp, "%lld", &gfx_cur_rc6_ms);
-       if (retval != 1)
-               err(1, "GFX rc6");
-
-       fclose(fp);
-
-       return 0;
-}
-
-/*
- * snapshot_gfx_mhz()
- *
- * fall back to /sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz
- * when /sys/class/drm/card0/gt_cur_freq_mhz is not available.
- *
- * return 1 if config change requires a restart, else return 0
- */
-int snapshot_gfx_mhz(void)
-{
-       static FILE *fp;
-       int retval;
-
-       if (fp == NULL) {
-               fp = fopen("/sys/class/drm/card0/gt_cur_freq_mhz", "r");
-               if (!fp)
-                       fp = fopen_or_die("/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz", "r");
-       } else {
-               rewind(fp);
-               fflush(fp);
-       }
-
-       retval = fscanf(fp, "%d", &gfx_cur_mhz);
-       if (retval != 1)
-               err(1, "GFX MHz");
-
-       return 0;
-}
-
-/*
- * snapshot_gfx_cur_mhz()
- *
- * fall back to /sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz
- * when /sys/class/drm/card0/gt_act_freq_mhz is not available.
- *
- * return 1 if config change requires a restart, else return 0
- */
-int snapshot_gfx_act_mhz(void)
-{
-       static FILE *fp;
-       int retval;
-
-       if (fp == NULL) {
-               fp = fopen("/sys/class/drm/card0/gt_act_freq_mhz", "r");
-               if (!fp)
-                       fp = fopen_or_die("/sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz", "r");
-       } else {
-               rewind(fp);
-               fflush(fp);
+       switch (idx) {
+       case GFX_rc6:
+       case SAM_mc6:
+               fp = fopen_or_die(gfx_info[idx].path, "r");
+               retval = fscanf(fp, "%lld", &gfx_info[idx].val_ull);
+               if (retval != 1)
+                       err(1, "rc6");
+               fclose(fp);
+               return 0;
+       case GFX_MHz:
+       case GFX_ACTMHz:
+       case SAM_MHz:
+       case SAM_ACTMHz:
+               if (gfx_info[idx].fp == NULL) {
+                       gfx_info[idx].fp = fopen_or_die(gfx_info[idx].path, "r");
+               } else {
+                       rewind(gfx_info[idx].fp);
+                       fflush(gfx_info[idx].fp);
+               }
+               retval = fscanf(gfx_info[idx].fp, "%d", &gfx_info[idx].val);
+               if (retval != 1)
+                       err(1, "MHz");
+               return 0;
+       default:
+               return -EINVAL;
        }
-
-       retval = fscanf(fp, "%d", &gfx_act_mhz);
-       if (retval != 1)
-               err(1, "GFX ACT MHz");
-
-       return 0;
 }
 
 /*
@@ -4083,13 +4789,22 @@ int snapshot_proc_sysfs_files(void)
                        return 1;
 
        if (DO_BIC(BIC_GFX_rc6))
-               snapshot_gfx_rc6_ms();
+               snapshot_graphics(GFX_rc6);
 
        if (DO_BIC(BIC_GFXMHz))
-               snapshot_gfx_mhz();
+               snapshot_graphics(GFX_MHz);
 
        if (DO_BIC(BIC_GFXACTMHz))
-               snapshot_gfx_act_mhz();
+               snapshot_graphics(GFX_ACTMHz);
+
+       if (DO_BIC(BIC_SAM_mc6))
+               snapshot_graphics(SAM_mc6);
+
+       if (DO_BIC(BIC_SAMMHz))
+               snapshot_graphics(SAM_MHz);
+
+       if (DO_BIC(BIC_SAMACTMHz))
+               snapshot_graphics(SAM_ACTMHz);
 
        if (DO_BIC(BIC_CPU_LPI))
                snapshot_cpu_lpi_us();
@@ -4173,6 +4888,8 @@ int get_msr_sum(int cpu, off_t offset, unsigned long long *msr)
        int ret, idx;
        unsigned long long msr_cur, msr_last;
 
+       assert(!no_msr);
+
        if (!per_cpu_msr_sum)
                return 1;
 
@@ -4201,6 +4918,8 @@ static int update_msr_sum(struct thread_data *t, struct core_data *c, struct pkg
        UNUSED(c);
        UNUSED(p);
 
+       assert(!no_msr);
+
        for (i = IDX_PKG_ENERGY; i < IDX_COUNT; i++) {
                unsigned long long msr_cur, msr_last;
                off_t offset;
@@ -4280,7 +4999,8 @@ release_msr:
 
 /*
  * set_my_sched_priority(pri)
- * return previous
+ * return previous priority on success
+ * return value < -20 on failure
  */
 int set_my_sched_priority(int priority)
 {
@@ -4290,16 +5010,16 @@ int set_my_sched_priority(int priority)
        errno = 0;
        original_priority = getpriority(PRIO_PROCESS, 0);
        if (errno && (original_priority == -1))
-               err(errno, "getpriority");
+               return -21;
 
        retval = setpriority(PRIO_PROCESS, 0, priority);
        if (retval)
-               errx(retval, "capget(CAP_SYS_NICE) failed,try \"# setcap cap_sys_nice=ep %s\"", progname);
+               return -21;
 
        errno = 0;
        retval = getpriority(PRIO_PROCESS, 0);
        if (retval != priority)
-               err(retval, "getpriority(%d) != setpriority(%d)", retval, priority);
+               return -21;
 
        return original_priority;
 }
@@ -4314,6 +5034,9 @@ void turbostat_loop()
 
        /*
         * elevate own priority for interval mode
+        *
+        * ignore on error - we probably don't have permission to set it, but
+        * it's not a big deal
         */
        set_my_sched_priority(-20);
 
@@ -4399,10 +5122,13 @@ void check_dev_msr()
        struct stat sb;
        char pathname[32];
 
+       if (no_msr)
+               return;
+
        sprintf(pathname, "/dev/cpu/%d/msr", base_cpu);
        if (stat(pathname, &sb))
                if (system("/sbin/modprobe msr > /dev/null 2>&1"))
-                       err(-5, "no /dev/cpu/0/msr, Try \"# modprobe msr\" ");
+                       no_msr = 1;
 }
 
 /*
@@ -4414,47 +5140,51 @@ int check_for_cap_sys_rawio(void)
 {
        cap_t caps;
        cap_flag_value_t cap_flag_value;
+       int ret = 0;
 
        caps = cap_get_proc();
        if (caps == NULL)
-               err(-6, "cap_get_proc\n");
+               return 1;
 
-       if (cap_get_flag(caps, CAP_SYS_RAWIO, CAP_EFFECTIVE, &cap_flag_value))
-               err(-6, "cap_get\n");
+       if (cap_get_flag(caps, CAP_SYS_RAWIO, CAP_EFFECTIVE, &cap_flag_value)) {
+               ret = 1;
+               goto free_and_exit;
+       }
 
        if (cap_flag_value != CAP_SET) {
-               warnx("capget(CAP_SYS_RAWIO) failed," " try \"# setcap cap_sys_rawio=ep %s\"", progname);
-               return 1;
+               ret = 1;
+               goto free_and_exit;
        }
 
+free_and_exit:
        if (cap_free(caps) == -1)
                err(-6, "cap_free\n");
 
-       return 0;
+       return ret;
 }
 
-void check_permissions(void)
+void check_msr_permission(void)
 {
-       int do_exit = 0;
+       int failed = 0;
        char pathname[32];
 
+       if (no_msr)
+               return;
+
        /* check for CAP_SYS_RAWIO */
-       do_exit += check_for_cap_sys_rawio();
+       failed += check_for_cap_sys_rawio();
 
        /* test file permissions */
        sprintf(pathname, "/dev/cpu/%d/msr", base_cpu);
        if (euidaccess(pathname, R_OK)) {
-               do_exit++;
-               warn("/dev/cpu/0/msr open failed, try chown or chmod +r /dev/cpu/*/msr");
+               failed++;
        }
 
-       /* if all else fails, thell them to be root */
-       if (do_exit)
-               if (getuid() != 0)
-                       warnx("... or simply run as root");
-
-       if (do_exit)
-               exit(-6);
+       if (failed) {
+               warnx("Failed to access %s. Some of the counters may not be available\n"
+                     "\tRun as root to enable them or use %s to disable the access explicitly", pathname, "--no-msr");
+               no_msr = 1;
+       }
 }
 
 void probe_bclk(void)
@@ -4462,7 +5192,7 @@ void probe_bclk(void)
        unsigned long long msr;
        unsigned int base_ratio;
 
-       if (!platform->has_nhm_msrs)
+       if (!platform->has_nhm_msrs || no_msr)
                return;
 
        if (platform->bclk_freq == BCLK_100MHZ)
@@ -4502,7 +5232,7 @@ static void dump_turbo_ratio_info(void)
        if (!has_turbo)
                return;
 
-       if (!platform->has_nhm_msrs)
+       if (!platform->has_nhm_msrs || no_msr)
                return;
 
        if (platform->trl_msrs & TRL_LIMIT2)
@@ -4567,20 +5297,15 @@ static void dump_sysfs_file(char *path)
 static void probe_intel_uncore_frequency(void)
 {
        int i, j;
-       char path[128];
+       char path[256];
 
        if (!genuine_intel)
                return;
 
-       if (access("/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00", R_OK))
-               return;
-
-       /* Cluster level sysfs not supported yet. */
-       if (!access("/sys/devices/system/cpu/intel_uncore_frequency/uncore00", R_OK))
-               return;
+       if (access("/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/current_freq_khz", R_OK))
+               goto probe_cluster;
 
-       if (!access("/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/current_freq_khz", R_OK))
-               BIC_PRESENT(BIC_UNCORE_MHZ);
+       BIC_PRESENT(BIC_UNCORE_MHZ);
 
        if (quiet)
                return;
@@ -4588,40 +5313,178 @@ static void probe_intel_uncore_frequency(void)
        for (i = 0; i < topo.num_packages; ++i) {
                for (j = 0; j < topo.num_die; ++j) {
                        int k, l;
+                       char path_base[128];
 
-                       sprintf(path, "/sys/devices/system/cpu/intel_uncore_frequency/package_0%d_die_0%d/min_freq_khz",
-                               i, j);
+                       sprintf(path_base, "/sys/devices/system/cpu/intel_uncore_frequency/package_%02d_die_%02d", i,
+                               j);
+
+                       sprintf(path, "%s/min_freq_khz", path_base);
                        k = read_sysfs_int(path);
-                       sprintf(path, "/sys/devices/system/cpu/intel_uncore_frequency/package_0%d_die_0%d/max_freq_khz",
-                               i, j);
+                       sprintf(path, "%s/max_freq_khz", path_base);
                        l = read_sysfs_int(path);
-                       fprintf(outf, "Uncore Frequency pkg%d die%d: %d - %d MHz ", i, j, k / 1000, l / 1000);
+                       fprintf(outf, "Uncore Frequency package%d die%d: %d - %d MHz ", i, j, k / 1000, l / 1000);
 
-                       sprintf(path,
-                               "/sys/devices/system/cpu/intel_uncore_frequency/package_0%d_die_0%d/initial_min_freq_khz",
-                               i, j);
+                       sprintf(path, "%s/initial_min_freq_khz", path_base);
                        k = read_sysfs_int(path);
-                       sprintf(path,
-                               "/sys/devices/system/cpu/intel_uncore_frequency/package_0%d_die_0%d/initial_max_freq_khz",
-                               i, j);
+                       sprintf(path, "%s/initial_max_freq_khz", path_base);
                        l = read_sysfs_int(path);
-                       fprintf(outf, "(%d - %d MHz)\n", k / 1000, l / 1000);
+                       fprintf(outf, "(%d - %d MHz)", k / 1000, l / 1000);
+
+                       sprintf(path, "%s/current_freq_khz", path_base);
+                       k = read_sysfs_int(path);
+                       fprintf(outf, " %d MHz\n", k / 1000);
                }
        }
+       return;
+
+probe_cluster:
+       if (access("/sys/devices/system/cpu/intel_uncore_frequency/uncore00/current_freq_khz", R_OK))
+               return;
+
+       if (quiet)
+               return;
+
+       for (i = 0;; ++i) {
+               int k, l;
+               char path_base[128];
+               int package_id, domain_id, cluster_id;
+
+               sprintf(path_base, "/sys/devices/system/cpu/intel_uncore_frequency/uncore%02d", i);
+
+               if (access(path_base, R_OK))
+                       break;
+
+               sprintf(path, "%s/package_id", path_base);
+               package_id = read_sysfs_int(path);
+
+               sprintf(path, "%s/domain_id", path_base);
+               domain_id = read_sysfs_int(path);
+
+               sprintf(path, "%s/fabric_cluster_id", path_base);
+               cluster_id = read_sysfs_int(path);
+
+               sprintf(path, "%s/min_freq_khz", path_base);
+               k = read_sysfs_int(path);
+               sprintf(path, "%s/max_freq_khz", path_base);
+               l = read_sysfs_int(path);
+               fprintf(outf, "Uncore Frequency package%d domain%d cluster%d: %d - %d MHz ", package_id, domain_id,
+                       cluster_id, k / 1000, l / 1000);
+
+               sprintf(path, "%s/initial_min_freq_khz", path_base);
+               k = read_sysfs_int(path);
+               sprintf(path, "%s/initial_max_freq_khz", path_base);
+               l = read_sysfs_int(path);
+               fprintf(outf, "(%d - %d MHz)", k / 1000, l / 1000);
+
+               sprintf(path, "%s/current_freq_khz", path_base);
+               k = read_sysfs_int(path);
+               fprintf(outf, " %d MHz\n", k / 1000);
+       }
 }
 
 static void probe_graphics(void)
 {
+       /* Xe graphics sysfs knobs */
+       if (!access("/sys/class/drm/card0/device/tile0/gt0/gtidle/idle_residency_ms", R_OK)) {
+               FILE *fp;
+               char buf[8];
+               bool gt0_is_gt;
+               int idx;
+
+               fp = fopen("/sys/class/drm/card0/device/tile0/gt0/gtidle/name", "r");
+               if (!fp)
+                       goto next;
+
+               if (!fread(buf, sizeof(char), 7, fp)) {
+                       fclose(fp);
+                       goto next;
+               }
+               fclose(fp);
+
+               if (!strncmp(buf, "gt0-rc", strlen("gt0-rc")))
+                       gt0_is_gt = true;
+               else if (!strncmp(buf, "gt0-mc", strlen("gt0-mc")))
+                       gt0_is_gt = false;
+               else
+                       goto next;
+
+               idx = gt0_is_gt ? GFX_rc6 : SAM_mc6;
+               gfx_info[idx].path = "/sys/class/drm/card0/device/tile0/gt0/gtidle/idle_residency_ms";
+
+               idx = gt0_is_gt ? GFX_MHz : SAM_MHz;
+               if (!access("/sys/class/drm/card0/device/tile0/gt0/freq0/cur_freq", R_OK))
+                       gfx_info[idx].path = "/sys/class/drm/card0/device/tile0/gt0/freq0/cur_freq";
+
+               idx = gt0_is_gt ? GFX_ACTMHz : SAM_ACTMHz;
+               if (!access("/sys/class/drm/card0/device/tile0/gt0/freq0/act_freq", R_OK))
+                       gfx_info[idx].path = "/sys/class/drm/card0/device/tile0/gt0/freq0/act_freq";
+
+               idx = gt0_is_gt ? SAM_mc6 : GFX_rc6;
+               if (!access("/sys/class/drm/card0/device/tile0/gt1/gtidle/idle_residency_ms", R_OK))
+                       gfx_info[idx].path = "/sys/class/drm/card0/device/tile0/gt1/gtidle/idle_residency_ms";
+
+               idx = gt0_is_gt ? SAM_MHz : GFX_MHz;
+               if (!access("/sys/class/drm/card0/device/tile0/gt1/freq0/cur_freq", R_OK))
+                       gfx_info[idx].path = "/sys/class/drm/card0/device/tile0/gt1/freq0/cur_freq";
+
+               idx = gt0_is_gt ? SAM_ACTMHz : GFX_ACTMHz;
+               if (!access("/sys/class/drm/card0/device/tile0/gt1/freq0/act_freq", R_OK))
+                       gfx_info[idx].path = "/sys/class/drm/card0/device/tile0/gt1/freq0/act_freq";
+
+               goto end;
+       }
+
+next:
+       /* New i915 graphics sysfs knobs */
+       if (!access("/sys/class/drm/card0/gt/gt0/rc6_residency_ms", R_OK)) {
+               gfx_info[GFX_rc6].path = "/sys/class/drm/card0/gt/gt0/rc6_residency_ms";
+
+               if (!access("/sys/class/drm/card0/gt/gt0/rps_cur_freq_mhz", R_OK))
+                       gfx_info[GFX_MHz].path = "/sys/class/drm/card0/gt/gt0/rps_cur_freq_mhz";
+
+               if (!access("/sys/class/drm/card0/gt/gt0/rps_act_freq_mhz", R_OK))
+                       gfx_info[GFX_ACTMHz].path = "/sys/class/drm/card0/gt/gt0/rps_act_freq_mhz";
+
+               if (!access("/sys/class/drm/card0/gt/gt1/rc6_residency_ms", R_OK))
+                       gfx_info[SAM_mc6].path = "/sys/class/drm/card0/gt/gt1/rc6_residency_ms";
+
+               if (!access("/sys/class/drm/card0/gt/gt1/rps_cur_freq_mhz", R_OK))
+                       gfx_info[SAM_MHz].path = "/sys/class/drm/card0/gt/gt1/rps_cur_freq_mhz";
+
+               if (!access("/sys/class/drm/card0/gt/gt1/rps_act_freq_mhz", R_OK))
+                       gfx_info[SAM_ACTMHz].path = "/sys/class/drm/card0/gt/gt1/rps_act_freq_mhz";
+
+               goto end;
+       }
+
+       /* Fall back to traditional i915 graphics sysfs knobs */
        if (!access("/sys/class/drm/card0/power/rc6_residency_ms", R_OK))
-               BIC_PRESENT(BIC_GFX_rc6);
+               gfx_info[GFX_rc6].path = "/sys/class/drm/card0/power/rc6_residency_ms";
+
+       if (!access("/sys/class/drm/card0/gt_cur_freq_mhz", R_OK))
+               gfx_info[GFX_MHz].path = "/sys/class/drm/card0/gt_cur_freq_mhz";
+       else if (!access("/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz", R_OK))
+               gfx_info[GFX_MHz].path = "/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz";
 
-       if (!access("/sys/class/drm/card0/gt_cur_freq_mhz", R_OK) ||
-           !access("/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz", R_OK))
-               BIC_PRESENT(BIC_GFXMHz);
 
-       if (!access("/sys/class/drm/card0/gt_act_freq_mhz", R_OK) ||
-           !access("/sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz", R_OK))
+       if (!access("/sys/class/drm/card0/gt_act_freq_mhz", R_OK))
+               gfx_info[GFX_ACTMHz].path = "/sys/class/drm/card0/gt_act_freq_mhz";
+       else if (!access("/sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz", R_OK))
+               gfx_info[GFX_ACTMHz].path = "/sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz";
+
+end:
+       if (gfx_info[GFX_rc6].path)
+               BIC_PRESENT(BIC_GFX_rc6);
+       if (gfx_info[GFX_MHz].path)
+               BIC_PRESENT(BIC_GFXMHz);
+       if (gfx_info[GFX_ACTMHz].path)
                BIC_PRESENT(BIC_GFXACTMHz);
+       if (gfx_info[SAM_mc6].path)
+               BIC_PRESENT(BIC_SAM_mc6);
+       if (gfx_info[SAM_MHz].path)
+               BIC_PRESENT(BIC_SAMMHz);
+       if (gfx_info[SAM_ACTMHz].path)
+               BIC_PRESENT(BIC_SAMACTMHz);
 }
 
 static void dump_sysfs_cstate_config(void)
@@ -4783,6 +5646,9 @@ int print_hwp(struct thread_data *t, struct core_data *c, struct pkg_data *p)
        UNUSED(c);
        UNUSED(p);
 
+       if (no_msr)
+               return 0;
+
        if (!has_hwp)
                return 0;
 
@@ -4869,6 +5735,9 @@ int print_perf_limit(struct thread_data *t, struct core_data *c, struct pkg_data
        UNUSED(c);
        UNUSED(p);
 
+       if (no_msr)
+               return 0;
+
        cpu = t->cpu_id;
 
        /* per-package */
@@ -4983,31 +5852,18 @@ void rapl_probe_intel(void)
        unsigned long long msr;
        unsigned int time_unit;
        double tdp;
+       const unsigned long long bic_watt_bits = BIC_PkgWatt | BIC_CorWatt | BIC_RAMWatt | BIC_GFXWatt;
+       const unsigned long long bic_joules_bits = BIC_Pkg_J | BIC_Cor_J | BIC_RAM_J | BIC_GFX_J;
 
-       if (rapl_joules) {
-               if (platform->rapl_msrs & RAPL_PKG_ENERGY_STATUS)
-                       BIC_PRESENT(BIC_Pkg_J);
-               if (platform->rapl_msrs & RAPL_CORE_ENERGY_STATUS)
-                       BIC_PRESENT(BIC_Cor_J);
-               if (platform->rapl_msrs & RAPL_DRAM_ENERGY_STATUS)
-                       BIC_PRESENT(BIC_RAM_J);
-               if (platform->rapl_msrs & RAPL_GFX_ENERGY_STATUS)
-                       BIC_PRESENT(BIC_GFX_J);
-       } else {
-               if (platform->rapl_msrs & RAPL_PKG_ENERGY_STATUS)
-                       BIC_PRESENT(BIC_PkgWatt);
-               if (platform->rapl_msrs & RAPL_CORE_ENERGY_STATUS)
-                       BIC_PRESENT(BIC_CorWatt);
-               if (platform->rapl_msrs & RAPL_DRAM_ENERGY_STATUS)
-                       BIC_PRESENT(BIC_RAMWatt);
-               if (platform->rapl_msrs & RAPL_GFX_ENERGY_STATUS)
-                       BIC_PRESENT(BIC_GFXWatt);
-       }
+       if (rapl_joules)
+               bic_enabled &= ~bic_watt_bits;
+       else
+               bic_enabled &= ~bic_joules_bits;
 
-       if (platform->rapl_msrs & RAPL_PKG_PERF_STATUS)
-               BIC_PRESENT(BIC_PKG__);
-       if (platform->rapl_msrs & RAPL_DRAM_PERF_STATUS)
-               BIC_PRESENT(BIC_RAM__);
+       if (!(platform->rapl_msrs & RAPL_PKG_PERF_STATUS))
+               bic_enabled &= ~BIC_PKG__;
+       if (!(platform->rapl_msrs & RAPL_DRAM_PERF_STATUS))
+               bic_enabled &= ~BIC_RAM__;
 
        /* units on package 0, verify later other packages match */
        if (get_msr(base_cpu, MSR_RAPL_POWER_UNIT, &msr))
@@ -5041,14 +5897,13 @@ void rapl_probe_amd(void)
 {
        unsigned long long msr;
        double tdp;
+       const unsigned long long bic_watt_bits = BIC_PkgWatt | BIC_CorWatt;
+       const unsigned long long bic_joules_bits = BIC_Pkg_J | BIC_Cor_J;
 
-       if (rapl_joules) {
-               BIC_PRESENT(BIC_Pkg_J);
-               BIC_PRESENT(BIC_Cor_J);
-       } else {
-               BIC_PRESENT(BIC_PkgWatt);
-               BIC_PRESENT(BIC_CorWatt);
-       }
+       if (rapl_joules)
+               bic_enabled &= ~bic_watt_bits;
+       else
+               bic_enabled &= ~bic_joules_bits;
 
        if (get_msr(base_cpu, MSR_RAPL_PWR_UNIT, &msr))
                return;
@@ -5202,7 +6057,7 @@ int print_rapl(struct thread_data *t, struct core_data *c, struct pkg_data *p)
  */
 void probe_rapl(void)
 {
-       if (!platform->rapl_msrs)
+       if (!platform->rapl_msrs || no_msr)
                return;
 
        if (genuine_intel)
@@ -5258,7 +6113,7 @@ int set_temperature_target(struct thread_data *t, struct core_data *c, struct pk
        }
 
        /* Temperature Target MSR is Nehalem and newer only */
-       if (!platform->has_nhm_msrs)
+       if (!platform->has_nhm_msrs || no_msr)
                goto guess;
 
        if (get_msr(base_cpu, MSR_IA32_TEMPERATURE_TARGET, &msr))
@@ -5305,6 +6160,9 @@ int print_thermal(struct thread_data *t, struct core_data *c, struct pkg_data *p
        UNUSED(c);
        UNUSED(p);
 
+       if (no_msr)
+               return 0;
+
        if (!(do_dts || do_ptm))
                return 0;
 
@@ -5402,6 +6260,9 @@ void decode_feature_control_msr(void)
 {
        unsigned long long msr;
 
+       if (no_msr)
+               return;
+
        if (!get_msr(base_cpu, MSR_IA32_FEAT_CTL, &msr))
                fprintf(outf, "cpu%d: MSR_IA32_FEATURE_CONTROL: 0x%08llx (%sLocked %s)\n",
                        base_cpu, msr, msr & FEAT_CTL_LOCKED ? "" : "UN-", msr & (1 << 18) ? "SGX" : "");
@@ -5411,6 +6272,9 @@ void decode_misc_enable_msr(void)
 {
        unsigned long long msr;
 
+       if (no_msr)
+               return;
+
        if (!genuine_intel)
                return;
 
@@ -5428,6 +6292,9 @@ void decode_misc_feature_control(void)
 {
        unsigned long long msr;
 
+       if (no_msr)
+               return;
+
        if (!platform->has_msr_misc_feature_control)
                return;
 
@@ -5449,6 +6316,9 @@ void decode_misc_pwr_mgmt_msr(void)
 {
        unsigned long long msr;
 
+       if (no_msr)
+               return;
+
        if (!platform->has_msr_misc_pwr_mgmt)
                return;
 
@@ -5468,6 +6338,9 @@ void decode_c6_demotion_policy_msr(void)
 {
        unsigned long long msr;
 
+       if (no_msr)
+               return;
+
        if (!platform->has_msr_c6_demotion_policy_config)
                return;
 
@@ -5489,7 +6362,8 @@ void print_dev_latency(void)
 
        fd = open(path, O_RDONLY);
        if (fd < 0) {
-               warnx("capget(CAP_SYS_ADMIN) failed, try \"# setcap cap_sys_admin=ep %s\"", progname);
+               if (debug)
+                       warnx("Read %s failed", path);
                return;
        }
 
@@ -5504,23 +6378,260 @@ void print_dev_latency(void)
        close(fd);
 }
 
+static int has_instr_count_access(void)
+{
+       int fd;
+       int has_access;
+
+       if (no_perf)
+               return 0;
+
+       fd = open_perf_counter(base_cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS, -1, 0);
+       has_access = fd != -1;
+
+       if (fd != -1)
+               close(fd);
+
+       if (!has_access)
+               warnx("Failed to access %s. Some of the counters may not be available\n"
+                     "\tRun as root to enable them or use %s to disable the access explicitly",
+                     "instructions retired perf counter", "--no-perf");
+
+       return has_access;
+}
+
+bool is_aperf_access_required(void)
+{
+       return BIC_IS_ENABLED(BIC_Avg_MHz)
+           || BIC_IS_ENABLED(BIC_Busy)
+           || BIC_IS_ENABLED(BIC_Bzy_MHz)
+           || BIC_IS_ENABLED(BIC_IPC);
+}
+
+int add_rapl_perf_counter_(int cpu, struct rapl_counter_info_t *rci, const struct rapl_counter_arch_info *cai,
+                          double *scale_, enum rapl_unit *unit_)
+{
+       if (no_perf)
+               return -1;
+
+       const double scale = read_perf_rapl_scale(cai->perf_subsys, cai->perf_name);
+
+       if (scale == 0.0)
+               return -1;
+
+       const enum rapl_unit unit = read_perf_rapl_unit(cai->perf_subsys, cai->perf_name);
+
+       if (unit == RAPL_UNIT_INVALID)
+               return -1;
+
+       const unsigned int rapl_type = read_perf_type(cai->perf_subsys);
+       const unsigned int rapl_energy_pkg_config = read_rapl_config(cai->perf_subsys, cai->perf_name);
+
+       const int fd_counter =
+           open_perf_counter(cpu, rapl_type, rapl_energy_pkg_config, rci->fd_perf, PERF_FORMAT_GROUP);
+       if (fd_counter == -1)
+               return -1;
+
+       /* If it's the first counter opened, make it a group descriptor */
+       if (rci->fd_perf == -1)
+               rci->fd_perf = fd_counter;
+
+       *scale_ = scale;
+       *unit_ = unit;
+       return fd_counter;
+}
+
+int add_rapl_perf_counter(int cpu, struct rapl_counter_info_t *rci, const struct rapl_counter_arch_info *cai,
+                         double *scale, enum rapl_unit *unit)
+{
+       int ret = add_rapl_perf_counter_(cpu, rci, cai, scale, unit);
+
+       if (debug)
+               fprintf(stderr, "%s: %d (cpu: %d)\n", __func__, ret, cpu);
+
+       return ret;
+}
+
 /*
  * Linux-perf manages the HW instructions-retired counter
  * by enabling when requested, and hiding rollover
  */
 void linux_perf_init(void)
 {
-       if (!BIC_IS_ENABLED(BIC_IPC))
-               return;
-
        if (access("/proc/sys/kernel/perf_event_paranoid", F_OK))
                return;
 
-       fd_instr_count_percpu = calloc(topo.max_cpu_num + 1, sizeof(int));
-       if (fd_instr_count_percpu == NULL)
-               err(-1, "calloc fd_instr_count_percpu");
+       if (BIC_IS_ENABLED(BIC_IPC) && has_aperf) {
+               fd_instr_count_percpu = calloc(topo.max_cpu_num + 1, sizeof(int));
+               if (fd_instr_count_percpu == NULL)
+                       err(-1, "calloc fd_instr_count_percpu");
+       }
+
+       const bool aperf_required = is_aperf_access_required();
+
+       if (aperf_required && has_aperf && amperf_source == AMPERF_SOURCE_PERF) {
+               fd_amperf_percpu = calloc(topo.max_cpu_num + 1, sizeof(*fd_amperf_percpu));
+               if (fd_amperf_percpu == NULL)
+                       err(-1, "calloc fd_amperf_percpu");
+       }
+}
+
+void rapl_perf_init(void)
+{
+       const int num_domains = platform->has_per_core_rapl ? topo.num_cores : topo.num_packages;
+       bool *domain_visited = calloc(num_domains, sizeof(bool));
+
+       rapl_counter_info_perdomain = calloc(num_domains, sizeof(*rapl_counter_info_perdomain));
+       if (rapl_counter_info_perdomain == NULL)
+               err(-1, "calloc rapl_counter_info_perdomain");
+
+       /*
+        * Initialize rapl_counter_info_perdomain
+        */
+       for (int domain_id = 0; domain_id < num_domains; ++domain_id) {
+               struct rapl_counter_info_t *rci = &rapl_counter_info_perdomain[domain_id];
+
+               rci->fd_perf = -1;
+               for (size_t i = 0; i < NUM_RAPL_COUNTERS; ++i) {
+                       rci->data[i] = 0;
+                       rci->source[i] = RAPL_SOURCE_NONE;
+               }
+       }
+
+       /*
+        * Open/probe the counters
+        * If can't get it via perf, fallback to MSR
+        */
+       for (size_t i = 0; i < ARRAY_SIZE(rapl_counter_arch_infos); ++i) {
+
+               const struct rapl_counter_arch_info *const cai = &rapl_counter_arch_infos[i];
+               bool has_counter = 0;
+               double scale;
+               enum rapl_unit unit;
+               int next_domain;
+
+               memset(domain_visited, 0, num_domains * sizeof(*domain_visited));
+
+               for (int cpu = 0; cpu < topo.max_cpu_num + 1; ++cpu) {
+
+                       if (cpu_is_not_allowed(cpu))
+                               continue;
+
+                       /* Skip already seen and handled RAPL domains */
+                       next_domain =
+                           platform->has_per_core_rapl ? cpus[cpu].physical_core_id : cpus[cpu].physical_package_id;
+
+                       if (domain_visited[next_domain])
+                               continue;
+
+                       domain_visited[next_domain] = 1;
+
+                       struct rapl_counter_info_t *rci = &rapl_counter_info_perdomain[next_domain];
+
+                       /* Check if the counter is enabled and accessible */
+                       if (BIC_IS_ENABLED(cai->bic) && (platform->rapl_msrs & cai->feature_mask)) {
+
+                               /* Use perf API for this counter */
+                               if (!no_perf && cai->perf_name
+                                   && add_rapl_perf_counter(cpu, rci, cai, &scale, &unit) != -1) {
+                                       rci->source[cai->rci_index] = RAPL_SOURCE_PERF;
+                                       rci->scale[cai->rci_index] = scale * cai->compat_scale;
+                                       rci->unit[cai->rci_index] = unit;
+                                       rci->flags[cai->rci_index] = cai->flags;
+
+                                       /* Use MSR for this counter */
+                               } else if (!no_msr && cai->msr && probe_msr(cpu, cai->msr) == 0) {
+                                       rci->source[cai->rci_index] = RAPL_SOURCE_MSR;
+                                       rci->msr[cai->rci_index] = cai->msr;
+                                       rci->msr_mask[cai->rci_index] = cai->msr_mask;
+                                       rci->msr_shift[cai->rci_index] = cai->msr_shift;
+                                       rci->unit[cai->rci_index] = RAPL_UNIT_JOULES;
+                                       rci->scale[cai->rci_index] = *cai->platform_rapl_msr_scale * cai->compat_scale;
+                                       rci->flags[cai->rci_index] = cai->flags;
+                               }
+                       }
+
+                       if (rci->source[cai->rci_index] != RAPL_SOURCE_NONE)
+                               has_counter = 1;
+               }
+
+               /* If any CPU has access to the counter, make it present */
+               if (has_counter)
+                       BIC_PRESENT(cai->bic);
+       }
+
+       free(domain_visited);
+}
+
+static int has_amperf_access_via_msr(void)
+{
+       if (no_msr)
+               return 0;
+
+       if (probe_msr(base_cpu, MSR_IA32_APERF))
+               return 0;
+
+       if (probe_msr(base_cpu, MSR_IA32_MPERF))
+               return 0;
+
+       return 1;
+}
+
+static int has_amperf_access_via_perf(void)
+{
+       struct amperf_group_fd fds;
+
+       /*
+        * Cache the last result, so we don't warn the user multiple times
+        *
+        * Negative means cached, no access
+        * Zero means not cached
+        * Positive means cached, has access
+        */
+       static int has_access_cached;
+
+       if (no_perf)
+               return 0;
+
+       if (has_access_cached != 0)
+               return has_access_cached > 0;
+
+       fds = open_amperf_fd(base_cpu);
+       has_access_cached = (fds.aperf != -1) && (fds.mperf != -1);
+
+       if (fds.aperf == -1)
+               warnx("Failed to access %s. Some of the counters may not be available\n"
+                     "\tRun as root to enable them or use %s to disable the access explicitly",
+                     "APERF perf counter", "--no-perf");
+       else
+               close(fds.aperf);
+
+       if (fds.mperf == -1)
+               warnx("Failed to access %s. Some of the counters may not be available\n"
+                     "\tRun as root to enable them or use %s to disable the access explicitly",
+                     "MPERF perf counter", "--no-perf");
+       else
+               close(fds.mperf);
+
+       if (has_access_cached == 0)
+               has_access_cached = -1;
+
+       return has_access_cached > 0;
+}
+
+/* Check if we can access APERF and MPERF */
+static int has_amperf_access(void)
+{
+       if (!is_aperf_access_required())
+               return 0;
+
+       if (!no_msr && has_amperf_access_via_msr())
+               return 1;
+
+       if (!no_perf && has_amperf_access_via_perf())
+               return 1;
 
-       BIC_PRESENT(BIC_IPC);
+       return 0;
 }
 
 void probe_cstates(void)
@@ -5563,7 +6674,7 @@ void probe_cstates(void)
        if (platform->has_msr_module_c6_res_ms)
                BIC_PRESENT(BIC_Mod_c6);
 
-       if (platform->has_ext_cst_msrs) {
+       if (platform->has_ext_cst_msrs && !no_msr) {
                BIC_PRESENT(BIC_Totl_c0);
                BIC_PRESENT(BIC_Any_c0);
                BIC_PRESENT(BIC_GFX_c0);
@@ -5623,6 +6734,7 @@ void process_cpuid()
        unsigned int eax, ebx, ecx, edx;
        unsigned int fms, family, model, stepping, ecx_flags, edx_flags;
        unsigned long long ucode_patch = 0;
+       bool ucode_patch_valid = false;
 
        eax = ebx = ecx = edx = 0;
 
@@ -5650,8 +6762,12 @@ void process_cpuid()
        ecx_flags = ecx;
        edx_flags = edx;
 
-       if (get_msr(sched_getcpu(), MSR_IA32_UCODE_REV, &ucode_patch))
-               warnx("get_msr(UCODE)");
+       if (!no_msr) {
+               if (get_msr(sched_getcpu(), MSR_IA32_UCODE_REV, &ucode_patch))
+                       warnx("get_msr(UCODE)");
+               else
+                       ucode_patch_valid = true;
+       }
 
        /*
         * check max extended function levels of CPUID.
@@ -5662,9 +6778,12 @@ void process_cpuid()
        __cpuid(0x80000000, max_extended_level, ebx, ecx, edx);
 
        if (!quiet) {
-               fprintf(outf, "CPUID(1): family:model:stepping 0x%x:%x:%x (%d:%d:%d) microcode 0x%x\n",
-                       family, model, stepping, family, model, stepping,
-                       (unsigned int)((ucode_patch >> 32) & 0xFFFFFFFF));
+               fprintf(outf, "CPUID(1): family:model:stepping 0x%x:%x:%x (%d:%d:%d)",
+                       family, model, stepping, family, model, stepping);
+               if (ucode_patch_valid)
+                       fprintf(outf, " microcode 0x%x", (unsigned int)((ucode_patch >> 32) & 0xFFFFFFFF));
+               fputc('\n', outf);
+
                fprintf(outf, "CPUID(0x80000000): max_extended_levels: 0x%x\n", max_extended_level);
                fprintf(outf, "CPUID(1): %s %s %s %s %s %s %s %s %s %s\n",
                        ecx_flags & (1 << 0) ? "SSE3" : "-",
@@ -5700,10 +6819,11 @@ void process_cpuid()
 
        __cpuid(0x6, eax, ebx, ecx, edx);
        has_aperf = ecx & (1 << 0);
-       if (has_aperf) {
+       if (has_aperf && has_amperf_access()) {
                BIC_PRESENT(BIC_Avg_MHz);
                BIC_PRESENT(BIC_Busy);
                BIC_PRESENT(BIC_Bzy_MHz);
+               BIC_PRESENT(BIC_IPC);
        }
        do_dts = eax & (1 << 0);
        if (do_dts)
@@ -5786,6 +6906,15 @@ void process_cpuid()
                base_mhz = max_mhz = bus_mhz = edx = 0;
 
                __cpuid(0x16, base_mhz, max_mhz, bus_mhz, edx);
+
+               bclk = bus_mhz;
+
+               base_hz = base_mhz * 1000000;
+               has_base_hz = 1;
+
+               if (platform->enable_tsc_tweak)
+                       tsc_tweak = base_hz / tsc_hz;
+
                if (!quiet)
                        fprintf(outf, "CPUID(0x16): base_mhz: %d max_mhz: %d bus_mhz: %d\n",
                                base_mhz, max_mhz, bus_mhz);
@@ -5814,7 +6943,7 @@ void probe_pm_features(void)
 
        probe_thermal();
 
-       if (platform->has_nhm_msrs)
+       if (platform->has_nhm_msrs && !no_msr)
                BIC_PRESENT(BIC_SMI);
 
        if (!quiet)
@@ -6142,6 +7271,7 @@ void topology_update(void)
        topo.allowed_packages = 0;
        for_all_cpus(update_topo, ODD_COUNTERS);
 }
+
 void setup_all_buffers(bool startup)
 {
        topology_probe(startup);
@@ -6169,21 +7299,129 @@ void set_base_cpu(void)
        err(-ENODEV, "No valid cpus found");
 }
 
+static void set_amperf_source(void)
+{
+       amperf_source = AMPERF_SOURCE_PERF;
+
+       const bool aperf_required = is_aperf_access_required();
+
+       if (no_perf || !aperf_required || !has_amperf_access_via_perf())
+               amperf_source = AMPERF_SOURCE_MSR;
+
+       if (quiet || !debug)
+               return;
+
+       fprintf(outf, "aperf/mperf source preference: %s\n", amperf_source == AMPERF_SOURCE_MSR ? "msr" : "perf");
+}
+
+bool has_added_counters(void)
+{
+       /*
+        * It only makes sense to call this after the command line is parsed,
+        * otherwise sys structure is not populated.
+        */
+
+       return sys.added_core_counters || sys.added_thread_counters || sys.added_package_counters;
+}
+
+bool is_msr_access_required(void)
+{
+       if (no_msr)
+               return false;
+
+       if (has_added_counters())
+               return true;
+
+       return BIC_IS_ENABLED(BIC_SMI)
+           || BIC_IS_ENABLED(BIC_CPU_c1)
+           || BIC_IS_ENABLED(BIC_CPU_c3)
+           || BIC_IS_ENABLED(BIC_CPU_c6)
+           || BIC_IS_ENABLED(BIC_CPU_c7)
+           || BIC_IS_ENABLED(BIC_Mod_c6)
+           || BIC_IS_ENABLED(BIC_CoreTmp)
+           || BIC_IS_ENABLED(BIC_Totl_c0)
+           || BIC_IS_ENABLED(BIC_Any_c0)
+           || BIC_IS_ENABLED(BIC_GFX_c0)
+           || BIC_IS_ENABLED(BIC_CPUGFX)
+           || BIC_IS_ENABLED(BIC_Pkgpc3)
+           || BIC_IS_ENABLED(BIC_Pkgpc6)
+           || BIC_IS_ENABLED(BIC_Pkgpc2)
+           || BIC_IS_ENABLED(BIC_Pkgpc7)
+           || BIC_IS_ENABLED(BIC_Pkgpc8)
+           || BIC_IS_ENABLED(BIC_Pkgpc9)
+           || BIC_IS_ENABLED(BIC_Pkgpc10)
+           /* TODO: Multiplex access with perf */
+           || BIC_IS_ENABLED(BIC_PkgWatt)
+           || BIC_IS_ENABLED(BIC_CorWatt)
+           || BIC_IS_ENABLED(BIC_GFXWatt)
+           || BIC_IS_ENABLED(BIC_RAMWatt)
+           || BIC_IS_ENABLED(BIC_Pkg_J)
+           || BIC_IS_ENABLED(BIC_Cor_J)
+           || BIC_IS_ENABLED(BIC_GFX_J)
+           || BIC_IS_ENABLED(BIC_RAM_J)
+           || BIC_IS_ENABLED(BIC_PKG__)
+           || BIC_IS_ENABLED(BIC_RAM__)
+           || BIC_IS_ENABLED(BIC_PkgTmp)
+           || (is_aperf_access_required() && !has_amperf_access_via_perf());
+}
+
+void check_msr_access(void)
+{
+       if (!is_msr_access_required())
+               no_msr = 1;
+
+       check_dev_msr();
+       check_msr_permission();
+
+       if (no_msr)
+               bic_disable_msr_access();
+}
+
+void check_perf_access(void)
+{
+       const bool instr_count_required = BIC_IS_ENABLED(BIC_IPC);
+
+       if (no_perf || !instr_count_required || !has_instr_count_access())
+               bic_enabled &= ~BIC_IPC;
+
+       const bool aperf_required = is_aperf_access_required();
+
+       if (!aperf_required || !has_amperf_access()) {
+               bic_enabled &= ~BIC_Avg_MHz;
+               bic_enabled &= ~BIC_Busy;
+               bic_enabled &= ~BIC_Bzy_MHz;
+               bic_enabled &= ~BIC_IPC;
+       }
+}
+
 void turbostat_init()
 {
        setup_all_buffers(true);
        set_base_cpu();
-       check_dev_msr();
-       check_permissions();
+       check_msr_access();
+       check_perf_access();
        process_cpuid();
        probe_pm_features();
+       set_amperf_source();
        linux_perf_init();
+       rapl_perf_init();
 
        for_all_cpus(get_cpu_type, ODD_COUNTERS);
        for_all_cpus(get_cpu_type, EVEN_COUNTERS);
 
        if (DO_BIC(BIC_IPC))
                (void)get_instr_count_fd(base_cpu);
+
+       /*
+        * If TSC tweak is needed, but couldn't get it,
+        * disable more BICs, since it can't be reported accurately.
+        */
+       if (platform->enable_tsc_tweak && !has_base_hz) {
+               bic_enabled &= ~BIC_Busy;
+               bic_enabled &= ~BIC_Bzy_MHz;
+       }
 }
 
 int fork_it(char **argv)
@@ -6259,7 +7497,7 @@ int get_and_dump_counters(void)
 
 void print_version()
 {
-       fprintf(outf, "turbostat version 2023.11.07 - Len Brown <lenb@kernel.org>\n");
+       fprintf(outf, "turbostat version 2024.04.08 - Len Brown <lenb@kernel.org>\n");
 }
 
 #define COMMAND_LINE_SIZE 2048
@@ -6291,6 +7529,9 @@ int add_counter(unsigned int msr_num, char *path, char *name,
 {
        struct msr_counter *msrp;
 
+       if (no_msr && msr_num)
+               errx(1, "Requested MSR counter 0x%x, but in --no-msr mode", msr_num);
+
        msrp = calloc(1, sizeof(struct msr_counter));
        if (msrp == NULL) {
                perror("calloc");
@@ -6595,6 +7836,8 @@ void cmdline(int argc, char **argv)
                { "list", no_argument, 0, 'l' },
                { "out", required_argument, 0, 'o' },
                { "quiet", no_argument, 0, 'q' },
+               { "no-msr", no_argument, 0, 'M' },
+               { "no-perf", no_argument, 0, 'P' },
                { "show", required_argument, 0, 's' },
                { "Summary", no_argument, 0, 'S' },
                { "TCC", required_argument, 0, 'T' },
@@ -6604,7 +7847,25 @@ void cmdline(int argc, char **argv)
 
        progname = argv[0];
 
-       while ((opt = getopt_long_only(argc, argv, "+C:c:Dde:hi:Jn:o:qST:v", long_options, &option_index)) != -1) {
+       /*
+        * Parse some options early, because they may make other options invalid,
+        * like adding the MSR counter with --add and at the same time using --no-msr.
+        */
+       while ((opt = getopt_long_only(argc, argv, "MP", long_options, &option_index)) != -1) {
+               switch (opt) {
+               case 'M':
+                       no_msr = 1;
+                       break;
+               case 'P':
+                       no_perf = 1;
+                       break;
+               default:
+                       break;
+               }
+       }
+       optind = 0;
+
+       while ((opt = getopt_long_only(argc, argv, "+C:c:Dde:hi:Jn:o:qMST:v", long_options, &option_index)) != -1) {
                switch (opt) {
                case 'a':
                        parse_add_command(optarg);
@@ -6662,6 +7923,10 @@ void cmdline(int argc, char **argv)
                case 'q':
                        quiet = 1;
                        break;
+               case 'M':
+               case 'P':
+                       /* Parsed earlier */
+                       break;
                case 'n':
                        num_iterations = strtod(optarg, NULL);
 
@@ -6704,6 +7969,22 @@ void cmdline(int argc, char **argv)
        }
 }
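The cmdline() hunks above use a two-pass parse: a first getopt_long_only() loop that recognizes only -M/-P, then an optind reset and the full parse, so that handlers which run during the full pass (such as --add) can already consult no_msr/no_perf. A rough Python analogue of the same pattern, with a hypothetical parse_cmdline() helper and a made-up --add guard mirroring the errx() check in add_counter():

```python
import argparse

def parse_cmdline(argv):
    # Pass 1: recognize only the early flags, ignore everything else
    # (equivalent to the first getopt_long_only() loop over "MP").
    early = argparse.ArgumentParser(add_help=False)
    early.add_argument('--no-msr', action='store_true')
    early.add_argument('--no-perf', action='store_true')
    flags, _rest = early.parse_known_args(argv)

    # Pass 2: the full parse (equivalent to rewinding with optind = 0).
    full = argparse.ArgumentParser()
    full.add_argument('--no-msr', action='store_true')
    full.add_argument('--no-perf', action='store_true')
    full.add_argument('--add', dest='counters', action='append', default=[])
    args = full.parse_args(argv)

    # Mirror of the errx() guard in add_counter(): an MSR counter is
    # invalid once --no-msr is in effect, and flags.no_msr is already
    # valid here thanks to the early pass.
    for counter in args.counters:
        if flags.no_msr and counter.startswith('msr'):
            raise SystemExit(f'Requested MSR counter {counter}, but in --no-msr mode')
    return args

args = parse_cmdline(['--no-msr', '--add', 'perf/cstate_core/c1-residency'])
print(args.no_msr, args.counters)
```

The counter names here are illustrative only; turbostat's real --add syntax is defined in parse_add_command().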
 
+void set_rlimit(void)
+{
+       struct rlimit limit;
+
+       if (getrlimit(RLIMIT_NOFILE, &limit) < 0)
+               err(1, "Failed to get rlimit");
+
+       if (limit.rlim_max < MAX_NOFILE)
+               limit.rlim_max = MAX_NOFILE;
+       if (limit.rlim_cur < MAX_NOFILE)
+               limit.rlim_cur = MAX_NOFILE;
+
+       if (setrlimit(RLIMIT_NOFILE, &limit) < 0)
+               err(1, "Failed to set rlimit");
+}
+
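set_rlimit() raises RLIMIT_NOFILE, presumably because the new perf path can hold a file descriptor per counter per CPU. A minimal Python sketch of the same raise-if-below pattern (the MAX_NOFILE value here is an assumption, not turbostat's actual constant):

```python
import resource

MAX_NOFILE = 0x8000  # assumed target; turbostat's real constant is in turbostat.c

def set_rlimit():
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Same shape as the C code: bump each limit only if it is below the
    # target.  Raising the hard limit requires privilege, which is why
    # the patch calls set_rlimit() only when getuid() == 0.
    resource.setrlimit(resource.RLIMIT_NOFILE,
                       (max(soft, MAX_NOFILE), max(hard, MAX_NOFILE)))

# Unprivileged processes may still raise the soft limit up to the hard limit:
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```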
 int main(int argc, char **argv)
 {
        int fd, ret;
@@ -6729,9 +8010,13 @@ skip_cgroup_setting:
 
        probe_sysfs();
 
+       if (!getuid())
+               set_rlimit();
+
        turbostat_init();
 
-       msr_sum_record();
+       if (!no_msr)
+               msr_sum_record();
 
        /* dump counters and exit */
        if (dump_only)
index c02990bbd56f4cf1cb5ea878f8fa76c4b6057c8d..9007c420d52c5201c40284f4f91cd7687f9d7188 100644 (file)
@@ -3,7 +3,7 @@
 #include <stdbool.h>
 #include <sys/mman.h>
 #include <err.h>
-#include <string.h> /* ffsl() */
+#include <strings.h> /* ffsl() */
 #include <unistd.h> /* _SC_PAGESIZE */
 
 #define BIT_ULL(nr)                   (1ULL << (nr))
diff --git a/tools/testing/selftests/turbostat/defcolumns.py b/tools/testing/selftests/turbostat/defcolumns.py
new file mode 100755 (executable)
index 0000000..d9b0420
--- /dev/null
@@ -0,0 +1,60 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+import subprocess
+from shutil import which
+
+turbostat = which('turbostat')
+if turbostat is None:
+       print('Could not find turbostat binary')
+       exit(1)
+
+timeout = which('timeout')
+if timeout is None:
+       print('Could not find timeout binary')
+       exit(1)
+
+proc_turbostat = subprocess.run([turbostat, '--list'], capture_output = True)
+if proc_turbostat.returncode != 0:
+       print(f'turbostat failed with {proc_turbostat.returncode}')
+       exit(1)
+
+#
+# By default --list also reports the "usec", "Time_Of_Day_Seconds", "APIC"
+# and "X2APIC" columns, which are only visible when running with --debug.
+#
+expected_columns_debug = proc_turbostat.stdout.replace(b',', b'\t').strip()
+expected_columns = expected_columns_debug.replace(b'usec\t', b'').replace(b'Time_Of_Day_Seconds\t', b'').replace(b'X2APIC\t', b'').replace(b'APIC\t', b'')
+
+#
+# Run turbostat with a 0.250 s interval; timeout sends SIGINT after 1 second
+# (and SIGKILL 3 seconds later if turbostat has not exited)
+#
+timeout_argv = [timeout, '--preserve-status', '-s', 'SIGINT', '-k', '3', '1s']
+turbostat_argv = [turbostat, '-i', '0.250']
+
+print(f'Running turbostat with {turbostat_argv=}... ', end = '', flush = True)
+proc_turbostat = subprocess.run(timeout_argv + turbostat_argv, capture_output = True)
+if proc_turbostat.returncode != 0:
+       print(f'turbostat failed with {proc_turbostat.returncode}')
+       exit(1)
+actual_columns = proc_turbostat.stdout.split(b'\n')[0]
+if expected_columns != actual_columns:
+       print(f'turbostat column check failed\n{expected_columns=}\n{actual_columns=}')
+       exit(1)
+print('OK')
+
+#
+# Same, but with --debug
+#
+turbostat_argv.append('--debug')
+
+print(f'Running turbostat with {turbostat_argv=}... ', end = '', flush = True)
+proc_turbostat = subprocess.run(timeout_argv + turbostat_argv, capture_output = True)
+if proc_turbostat.returncode != 0:
+       print(f'turbostat failed with {proc_turbostat.returncode}')
+       exit(1)
+actual_columns = proc_turbostat.stdout.split(b'\n')[0]
+if expected_columns_debug != actual_columns:
+       print(f'turbostat column check failed\n{expected_columns_debug=}\n{actual_columns=}')
+       exit(1)
+print('OK')
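The column normalization at the top of the selftest can be illustrated on a made-up header (real `turbostat --list` output varies by machine); note that `X2APIC` must be stripped before `APIC`, since `APIC\t` is a substring of `X2APIC\t`:

```python
# Made-up sample of `turbostat --list` output, for illustration only.
sample = b'usec,Time_Of_Day_Seconds,Core,CPU,APIC,X2APIC,Busy%\n'

# --list is comma-separated; the live header printed while sampling is
# tab-separated, so convert before comparing.
debug_columns = sample.replace(b',', b'\t').strip()

# Drop the columns that only appear with --debug, longest-match first.
columns = debug_columns
for col in (b'usec\t', b'Time_Of_Day_Seconds\t', b'X2APIC\t', b'APIC\t'):
    columns = columns.replace(col, b'')

assert debug_columns == b'usec\tTime_Of_Day_Seconds\tCore\tCPU\tAPIC\tX2APIC\tBusy%'
assert columns == b'Core\tCPU\tBusy%'
print(columns.decode())
```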