Peter Oruba <peter.oruba@amd.com>
Pratyush Anand <pratyush.anand@gmail.com> <pratyush.anand@st.com>
Praveen BP <praveenbp@ti.com>
+Punit Agrawal <punitagrawal@gmail.com> <punit.agrawal@arm.com>
Qais Yousef <qsyousef@gmail.com> <qais.yousef@imgtec.com>
Oleksij Rempel <linux@rempel-privat.de> <bug-track@fisher-privat.net>
Oleksij Rempel <linux@rempel-privat.de> <external.Oleksij.Rempel@de.bosch.com>
D: Soundblaster driver fixes, ISAPnP quirk
S: California, USA
+N: Jarkko Lavinen
+E: jarkko.lavinen@nokia.com
+D: OMAP MMC support
+
N: Jonathan Layes
D: ARPD support
S: North Little Rock, Arkansas 72115
S: USA
+N: Christopher Li
+E: sparse@chrisli.org
+D: Sparse maintainer 2009 - 2018
+
N: Stephan Linz
E: linz@mazet.de
E: Stephan.Linz@gmx.de
S: Victoria 3163
S: Australia
+N: Eric Miao
+E: eric.y.miao@gmail.com
+D: MMP support
+
N: Pauline Middelink
E: middelin@polyware.nl
D: General low-level bug fixes, /proc fixes, identd support
S: Bellevue, Washington 98007
S: USA
+N: Haojian Zhuang
+E: haojian.zhuang@gmail.com
+D: MMP support
+
N: Richard Zidlicky
E: rz@linux-m68k.org, rdzidlic@geocities.com
W: http://www.geocities.com/rdzidlic
0-| / \/ \/
+---0----1----2----3----4----5----6------------> time (s)
- 2. To make the LED go instantly from one brigntess value to another,
- we should use use zero-time lengths (the brightness must be same as
+ 2. To make the LED go instantly from one brightness value to another,
+ we should use zero-time lengths (the brightness must be same as
the previous tuple's). So the format should be:
"brightness_1 duration_1 brightness_1 0 brightness_2 duration_2
brightness_2 0 ...". For example:
-What: /sys/class/net/<iface>/tagging
+What: /sys/class/net/<iface>/dsa/tagging
Date: August 2018
KernelVersion: 4.20
Contact: netdev@vger.kernel.org
wbc_init_bio(@wbc, @bio)
Should be called for each bio carrying writeback data and
- associates the bio with the inode's owner cgroup and the
- corresponding request queue. This must be called after
- a queue (device) has been associated with the bio and
- before submission.
+ associates the bio with the inode's owner cgroup. Can be
+ called anytime between bio allocation and submission.
wbc_account_io(@wbc, @page, @bytes)
Should be called for each data segment being written out.
the writeback session is holding shared resources, e.g. a journal
entry, may lead to priority inversion. There is no one easy solution
for the problem. Filesystems can try to work around specific problem
-cases by skipping wbc_init_bio() or using bio_associate_create_blkg()
+cases by skipping wbc_init_bio() or using bio_associate_blkcg()
directly.
causing system reset or hang due to sending
INIT from AP to BSP.
- disable_counter_freezing [HW]
+ perf_v4_pmi= [X86,INTEL]
+ Format: <bool>
Disable Intel PMU counter freezing feature.
The feature only exists starting from
Arch Perfmon v4 (Skylake and newer).
earlyprintk=serial[,0x...[,baudrate]]
earlyprintk=ttySn[,baudrate]
earlyprintk=dbgp[debugController#]
- earlyprintk=pciserial,bus:device.function[,baudrate]
+ earlyprintk=pciserial[,force],bus:device.function[,baudrate]
earlyprintk=xdbc[xhciController#]
earlyprintk is useful when the kernel crashes before
The sclp output can only be used on s390.
+ The optional "force" to "pciserial" enables use of a
+ PCI device even when its classcode is not of the
+ UART class.
+
edac_report= [HW,EDAC] Control how to report EDAC event
Format: {"on" | "off" | "force"}
on: enable EDAC to report H/W event. May be overridden
before loading.
See Documentation/blockdev/ramdisk.txt.
+ psi= [KNL] Enable or disable pressure stall information
+ tracking.
+ Format: <bool>
+
psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to
probe for; one of (bare|imps|exps|lifebook|any).
psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports
spectre_v2= [X86] Control mitigation of Spectre variant 2
(indirect branch speculation) vulnerability.
+ The default operation protects the kernel from
+ user space attacks.
- on - unconditionally enable
- off - unconditionally disable
+ on - unconditionally enable, implies
+ spectre_v2_user=on
+ off - unconditionally disable, implies
+ spectre_v2_user=off
auto - kernel detects whether your CPU model is
vulnerable
CONFIG_RETPOLINE configuration option, and the
compiler with which the kernel was built.
+ Selecting 'on' will also enable the mitigation
+ against user space to user space task attacks.
+
+ Selecting 'off' will disable both the kernel and
+ the user space protections.
+
Specific mitigations can also be selected manually:
retpoline - replace indirect branches
Not specifying this option is equivalent to
spectre_v2=auto.
+ spectre_v2_user=
+ [X86] Control mitigation of Spectre variant 2
+ (indirect branch speculation) vulnerability between
+ user space tasks
+
+ on - Unconditionally enable mitigations. Is
+ enforced by spectre_v2=on
+
+ off - Unconditionally disable mitigations. Is
+ enforced by spectre_v2=off
+
+ prctl - Indirect branch speculation is enabled,
+ but mitigation can be enabled via prctl
+ per thread. The mitigation control state
+ is inherited on fork.
+
+ prctl,ibpb
+ - Like "prctl" above, but only STIBP is
+ controlled per thread. IBPB is issued
+ always when switching between different user
+ space processes.
+
+ seccomp
+ - Same as "prctl" above, but all seccomp
+ threads will enable the mitigation unless
+ they explicitly opt out.
+
+ seccomp,ibpb
+ - Like "seccomp" above, but only STIBP is
+ controlled per thread. IBPB is issued
+ always when switching between different
+ user space processes.
+
+ auto - Kernel selects the mitigation depending on
+ the available CPU features and vulnerability.
+
+ Default mitigation:
+ If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"
+
+ Not specifying this option is equivalent to
+ spectre_v2_user=auto.
+
spec_store_bypass_disable=
[HW] Control Speculative Store Bypass (SSB) Disable mitigation
(Speculative Store Bypass vulnerability)
prevent spurious wakeup);
n = USB_QUIRK_DELAY_CTRL_MSG (Device needs a
pause after every control message);
+ o = USB_QUIRK_HUB_SLOW_RESET (Hub needs extra
+ delay after resetting its port);
Example: quirks=0781:5580:bk,0a5c:5834:gij
usbhid.mousepoll=
a governor ``sysfs`` interface to it. Next, the governor is started by
invoking its ``->start()`` callback.
-That callback it expected to register per-CPU utilization update callbacks for
+That callback is expected to register per-CPU utilization update callbacks for
all of the online CPUs belonging to the given policy with the CPU scheduler.
The utilization update callbacks will be invoked by the CPU scheduler on
important events, like task enqueue and dequeue, on every iteration of the
The security list is not a disclosure channel. For that, see Coordination
below.
-Once a robust fix has been developed, our preference is to release the
-fix in a timely fashion, treating it no differently than any of the other
-thousands of changes and fixes the Linux kernel project releases every
-month.
-
-However, at the request of the reporter, we will postpone releasing the
-fix for up to 5 business days after the date of the report or after the
-embargo has lifted; whichever comes first. The only exception to that
-rule is if the bug is publicly known, in which case the preference is to
-release the fix as soon as it's available.
+Once a robust fix has been developed, the release process starts. Fixes
+for publicly known bugs are released immediately.
+
+Although our preference is to release fixes for publicly undisclosed bugs
+as soon as they become available, this may be postponed at the request of
+the reporter or an affected party for up to 7 calendar days from the start
+of the release process, with an exceptional extension to 14 calendar days
+if it is agreed that the criticality of the bug requires more time. The
+only valid reason for deferring the publication of a fix is to accommodate
+the logistics of QA and large scale rollouts which require release
+coordination.
Whilst embargoed information may be shared with trusted individuals in
order to develop a fix, such information will not be published alongside
| ARM | Cortex-A73 | #858921 | ARM64_ERRATUM_858921 |
| ARM | Cortex-A55 | #1024718 | ARM64_ERRATUM_1024718 |
| ARM | Cortex-A76 | #1188873 | ARM64_ERRATUM_1188873 |
+| ARM | Cortex-A76 | #1286807 | ARM64_ERRATUM_1286807 |
| ARM | MMU-500 | #841119,#826419 | N/A |
| | | | |
| Cavium | ThunderX ITS | #22375, #24313 | CAVIUM_ERRATUM_22375 |
new entry and return the previous entry stored at that index. You can
use :c:func:`xa_erase` instead of calling :c:func:`xa_store` with a
``NULL`` entry. There is no difference between an entry that has never
-been stored to and one that has most recently had ``NULL`` stored to it.
+been stored to, one that has been erased and one that has most recently
+had ``NULL`` stored to it.
You can conditionally replace an entry at an index by using
:c:func:`xa_cmpxchg`. Like :c:func:`cmpxchg`, it will only succeed if
indices. Storing into one index may result in the entry retrieved by
some, but not all of the other indices changing.
+Sometimes you need to ensure that a subsequent call to :c:func:`xa_store`
+will not need to allocate memory. The :c:func:`xa_reserve` function
+will store a reserved entry at the indicated index. Users of the normal
+API will see this entry as containing ``NULL``. If you do not need to
+use the reserved entry, you can call :c:func:`xa_release` to remove the
+unused entry. If another user has stored to the entry in the meantime,
+:c:func:`xa_release` will do nothing; if instead you want the entry to
+become ``NULL``, you should use :c:func:`xa_erase`.
+
+If all entries in the array are ``NULL``, the :c:func:`xa_empty` function
+will return ``true``.
+
Finally, you can remove all entries from an XArray by calling
:c:func:`xa_destroy`. If the XArray entries are pointers, you may wish
to free the entries first. You can do this by iterating over all present
entries in the XArray using the :c:func:`xa_for_each` iterator.
-ID assignment
--------------
+Allocating XArrays
+------------------
+
+If you use :c:func:`DEFINE_XARRAY_ALLOC` to define the XArray, or
+initialise it by passing ``XA_FLAGS_ALLOC`` to :c:func:`xa_init_flags`,
+the XArray changes to track whether entries are in use or not.
You can call :c:func:`xa_alloc` to store the entry at any unused index
in the XArray. If you need to modify the array from interrupt context,
you can use :c:func:`xa_alloc_bh` or :c:func:`xa_alloc_irq` to disable
-interrupts while allocating the ID. Unlike :c:func:`xa_store`, allocating
-a ``NULL`` pointer does not delete an entry. Instead it reserves an
-entry like :c:func:`xa_reserve` and you can release it using either
-:c:func:`xa_erase` or :c:func:`xa_release`. To use ID assignment, the
-XArray must be defined with :c:func:`DEFINE_XARRAY_ALLOC`, or initialised
-by passing ``XA_FLAGS_ALLOC`` to :c:func:`xa_init_flags`,
+interrupts while allocating the ID.
+
+Using :c:func:`xa_store`, :c:func:`xa_cmpxchg` or :c:func:`xa_insert`
+will mark the entry as being allocated. Unlike a normal XArray, storing
+``NULL`` will mark the entry as being in use, like :c:func:`xa_reserve`.
+To free an entry, use :c:func:`xa_erase` (or :c:func:`xa_release` if
+you only want to free the entry if it's ``NULL``).
+
+You cannot use ``XA_MARK_0`` with an allocating XArray as this mark
+is used to track whether an entry is free or not. The other marks are
+available for your use.
Memory allocation
-----------------
Takes xa_lock internally:
* :c:func:`xa_store`
+ * :c:func:`xa_store_bh`
+ * :c:func:`xa_store_irq`
* :c:func:`xa_insert`
* :c:func:`xa_erase`
* :c:func:`xa_erase_bh`
* :c:func:`xa_alloc`
* :c:func:`xa_alloc_bh`
* :c:func:`xa_alloc_irq`
+ * :c:func:`xa_reserve`
+ * :c:func:`xa_reserve_bh`
+ * :c:func:`xa_reserve_irq`
* :c:func:`xa_destroy`
* :c:func:`xa_set_mark`
* :c:func:`xa_clear_mark`
* :c:func:`__xa_erase`
* :c:func:`__xa_cmpxchg`
* :c:func:`__xa_alloc`
+ * :c:func:`__xa_reserve`
* :c:func:`__xa_set_mark`
* :c:func:`__xa_clear_mark`
using :c:func:`xa_lock_irqsave` in both the interrupt handler and process
context, or :c:func:`xa_lock_irq` in process context and :c:func:`xa_lock`
in the interrupt handler. Some of the more common patterns have helper
-functions such as :c:func:`xa_erase_bh` and :c:func:`xa_erase_irq`.
+functions such as :c:func:`xa_store_bh`, :c:func:`xa_store_irq`,
+:c:func:`xa_erase_bh` and :c:func:`xa_erase_irq`.
Sometimes you need to protect access to the XArray with a mutex because
that lock sits above another mutex in the locking hierarchy. That does
- :c:func:`xa_is_zero`
- Zero entries appear as ``NULL`` through the Normal API, but occupy
an entry in the XArray which can be used to reserve the index for
- future use.
+ future use. This is used by allocating XArrays for allocated entries
+ which are ``NULL``.
Other internal entries may be added in the future. As far as possible, they
will be handled by :c:func:`xas_retry`.
This will give a fine grained information about all the CPU frequency
transitions. The cat output here is a two dimensional matrix, where an entry
<i,j> (row i, column j) represents the count of number of transitions from
-Freq_i to Freq_j. Freq_i is in descending order with increasing rows and
-Freq_j is in descending order with increasing columns. The output here also
-contains the actual freq values for each row and column for better readability.
+Freq_i to Freq_j. Freq_i rows and Freq_j columns follow the sorting order in
+which the driver has provided the frequency table initially to the cpufreq core
+and so can be sorted (ascending or descending) or unsorted. The output here
+also contains the actual freq values for each row and column for better
+readability.
If the transition table is bigger than PAGE_SIZE, reading this will
return an -EFBIG error.
void (*describe)(const struct key *key, struct seq_file *m);
void (*destroy)(void *payload);
+ int (*query)(const struct kernel_pkey_params *params,
+ struct kernel_pkey_query *info);
+ int (*eds_op)(struct kernel_pkey_params *params,
+ const void *in, void *out);
int (*verify_signature)(const struct key *key,
const struct public_key_signature *sig);
};
asymmetric key will look after freeing the fingerprint and releasing the
reference on the subtype module.
- (3) verify_signature().
+ (3) query().
- Optional. These are the entry points for the key usage operations.
- Currently there is only the one defined. If not set, the caller will be
- given -ENOTSUPP. The subtype may do anything it likes to implement an
- operation, including offloading to hardware.
+ Mandatory. This is a function for querying the capabilities of a key.
+
+ (4) eds_op().
+
+ Optional. This is the entry point for the encryption, decryption and
+ signature creation operations (which are distinguished by the operation ID
+ in the parameter struct). The subtype may do anything it likes to
+ implement an operation, including offloading to hardware.
+
+ (5) verify_signature().
+
+ Optional. This is the entry point for signature verification. The
+ subtype may do anything it likes to implement an operation, including
+ offloading to hardware.
==========================
- X.509 ASN.1 stream.
- Pointer to TPM key.
- Pointer to UEFI key.
+ - PKCS#8 private key [RFC 5208].
+ - PKCS#5 encrypted private key [RFC 2898].
During key instantiation each parser in the list is tried until one doesn't
return -EBADMSG.
===========================================
Example 1 (ARM 64-bit, 6-cpu system, two clusters):
-capacities-dmips-mhz are scaled w.r.t. 1024 (cpu@0 and cpu@1)
-supposing cluster0@max-freq=1100 and custer1@max-freq=850,
-final capacities are 1024 for cluster0 and 446 for cluster1
+The capacities-dmips-mhz or DMIPS/MHz values (scaled to 1024)
+are 1024 and 578 for cluster0 and cluster1. Further normalization
+is done by the operating system based on cluster0@max-freq=1100 and
+custer1@max-freq=850, final capacities are 1024 for cluster0 and
+446 for cluster1 (576*850/1100).
cpus {
#address-cells = <2>;
compatible = "renesas,r8a77470"
- RZ/G2M (R8A774A1)
compatible = "renesas,r8a774a1"
- - RZ/G2E (RA8774C0)
+ - RZ/G2E (R8A774C0)
compatible = "renesas,r8a774c0"
- R-Car M1A (R8A77781)
compatible = "renesas,r8a7778"
Configuration of common clocks, which affect multiple consumer devices can
be similarly specified in the clock provider node.
+
+==Protected clocks==
+
+Some platforms or firmwares may not fully expose all the clocks to the OS, such
+as in situations where those clks are used by drivers running in ARM secure
+execution levels. Such a configuration can be specified in device tree with the
+protected-clocks property in the form of a clock specifier list. This property should
+only be specified in the node that is providing the clocks being protected:
+
+ clock-controller@a000f000 {
+ compatible = "vendor,clk95;
+ reg = <0xa000f000 0x1000>
+ #clocks-cells = <1>;
+ ...
+ protected-clocks = <UART3_CLK>, <SPI5_CLK>;
+ };
+++ /dev/null
-Generic ARM big LITTLE cpufreq driver's DT glue
------------------------------------------------
-
-This is DT specific glue layer for generic cpufreq driver for big LITTLE
-systems.
-
-Both required and optional properties listed below must be defined
-under node /cpus/cpu@x. Where x is the first cpu inside a cluster.
-
-FIXME: Cpus should boot in the order specified in DT and all cpus for a cluster
-must be present contiguously. Generic DT driver will check only node 'x' for
-cpu:x.
-
-Required properties:
-- operating-points: Refer to Documentation/devicetree/bindings/opp/opp.txt
- for details
-
-Optional properties:
-- clock-latency: Specify the possible maximum transition latency for clock,
- in unit of nanoseconds.
-
-Examples:
-
-cpus {
- #address-cells = <1>;
- #size-cells = <0>;
-
- cpu@0 {
- compatible = "arm,cortex-a15";
- reg = <0>;
- next-level-cache = <&L2>;
- operating-points = <
- /* kHz uV */
- 792000 1100000
- 396000 950000
- 198000 850000
- >;
- clock-latency = <61036>; /* two CLK32 periods */
- };
-
- cpu@1 {
- compatible = "arm,cortex-a15";
- reg = <1>;
- next-level-cache = <&L2>;
- };
-
- cpu@100 {
- compatible = "arm,cortex-a7";
- reg = <100>;
- next-level-cache = <&L2>;
- operating-points = <
- /* kHz uV */
- 792000 950000
- 396000 750000
- 198000 450000
- >;
- clock-latency = <61036>; /* two CLK32 periods */
- };
-
- cpu@101 {
- compatible = "arm,cortex-a7";
- reg = <101>;
- next-level-cache = <&L2>;
- };
-};
--- /dev/null
+Innolux P120ZDG-BF1 12.02 inch eDP 2K display panel
+
+This binding is compatible with the simple-panel binding, which is specified
+in simple-panel.txt in this directory.
+
+Required properties:
+- compatible: should be "innolux,p120zdg-bf1"
+- power-supply: regulator to provide the supply voltage
+
+Optional properties:
+- enable-gpios: GPIO pin to enable or disable the panel
+- backlight: phandle of the backlight device attached to the panel
+- no-hpd: If HPD isn't hooked up; add this property.
+
+Example:
+ panel_edp: panel-edp {
+ compatible = "innolux,p120zdg-bf1";
+ enable-gpios = <&msmgpio 31 GPIO_ACTIVE_LOW>;
+ power-supply = <&pm8916_l2>;
+ backlight = <&backlight>;
+ no-hpd;
+ };
+++ /dev/null
-Innolux TV123WAM 12.3 inch eDP 2K display panel
-
-This binding is compatible with the simple-panel binding, which is specified
-in simple-panel.txt in this directory.
-
-Required properties:
-- compatible: should be "innolux,tv123wam"
-- power-supply: regulator to provide the supply voltage
-
-Optional properties:
-- enable-gpios: GPIO pin to enable or disable the panel
-- backlight: phandle of the backlight device attached to the panel
-
-Example:
- panel_edp: panel-edp {
- compatible = "innolux,tv123wam";
- enable-gpios = <&msmgpio 31 GPIO_ACTIVE_LOW>;
- power-supply = <&pm8916_l2>;
- backlight = <&backlight>;
- };
- ddc-i2c-bus: phandle of an I2C controller used for DDC EDID probing
- enable-gpios: GPIO pin to enable or disable the panel
- backlight: phandle of the backlight device attached to the panel
+- no-hpd: This panel is supposed to communicate that it's ready via HPD
+ (hot plug detect) signal, but the signal isn't hooked up so we should
+ hardcode the max delay from the panel spec when powering up the panel.
Example:
Required properties:
- compatible :
- "fsl,imx7ulp-lpi2c" for LPI2C compatible with the one integrated on i.MX7ULP soc
+ - "fsl,imx8qxp-lpi2c" for LPI2C compatible with the one integrated on i.MX8QXP soc
- reg : address and length of the lpi2c master registers
- interrupts : lpi2c interrupt
- clocks : lpi2c clock specifier
I2C for OMAP platforms
Required properties :
-- compatible : Must be "ti,omap2420-i2c", "ti,omap2430-i2c", "ti,omap3-i2c"
- or "ti,omap4-i2c"
+- compatible : Must be
+ "ti,omap2420-i2c" for OMAP2420 SoCs
+ "ti,omap2430-i2c" for OMAP2430 SoCs
+ "ti,omap3-i2c" for OMAP3 SoCs
+ "ti,omap4-i2c" for OMAP4+ SoCs
+ "ti,am654-i2c", "ti,omap4-i2c" for AM654 SoCs
- ti,hwmods : Must be "i2c<n>", n being the instance number (1-based)
- #address-cells = <1>;
- #size-cells = <0>;
a set of keys.
Required property:
-sysrq-reset-seq: array of Linux keycodes, one keycode per cell.
+keyset: array of Linux keycodes, one keycode per cell.
Optional property:
timeout-ms: duration keys must be pressed together in milliseconds before
+++ /dev/null
-device-tree bindings for rockchip VPU codec
-
-Rockchip (Video Processing Unit) present in various Rockchip platforms,
-such as RK3288 and RK3399.
-
-Required properties:
-- compatible: value should be one of the following
- "rockchip,rk3288-vpu";
- "rockchip,rk3399-vpu";
-- interrupts: encoding and decoding interrupt specifiers
-- interrupt-names: should be "vepu" and "vdpu"
-- clocks: phandle to VPU aclk, hclk clocks
-- clock-names: should be "aclk" and "hclk"
-- power-domains: phandle to power domain node
-- iommus: phandle to a iommu node
-
-Example:
-SoC-specific DT entry:
- vpu: video-codec@ff9a0000 {
- compatible = "rockchip,rk3288-vpu";
- reg = <0x0 0xff9a0000 0x0 0x800>;
- interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
- interrupt-names = "vepu", "vdpu";
- clocks = <&cru ACLK_VCODEC>, <&cru HCLK_VCODEC>;
- clock-names = "aclk", "hclk";
- power-domains = <&power RK3288_PD_VIDEO>;
- iommus = <&vpu_mmu>;
- };
reg = <1>;
clocks = <&clk32m>;
interrupt-parent = <&gpio4>;
- interrupts = <13 IRQ_TYPE_EDGE_RISING>;
+ interrupts = <13 IRQ_TYPE_LEVEL_HIGH>;
vdd-supply = <®5v0>;
xceiver-supply = <®5v0>;
};
- compatible: "renesas,can-r8a7743" if CAN controller is a part of R8A7743 SoC.
"renesas,can-r8a7744" if CAN controller is a part of R8A7744 SoC.
"renesas,can-r8a7745" if CAN controller is a part of R8A7745 SoC.
+ "renesas,can-r8a774a1" if CAN controller is a part of R8A774A1 SoC.
"renesas,can-r8a7778" if CAN controller is a part of R8A7778 SoC.
"renesas,can-r8a7779" if CAN controller is a part of R8A7779 SoC.
"renesas,can-r8a7790" if CAN controller is a part of R8A7790 SoC.
"renesas,can-r8a7794" if CAN controller is a part of R8A7794 SoC.
"renesas,can-r8a7795" if CAN controller is a part of R8A7795 SoC.
"renesas,can-r8a7796" if CAN controller is a part of R8A7796 SoC.
+ "renesas,can-r8a77965" if CAN controller is a part of R8A77965 SoC.
"renesas,rcar-gen1-can" for a generic R-Car Gen1 compatible device.
"renesas,rcar-gen2-can" for a generic R-Car Gen2 or RZ/G1
compatible device.
- "renesas,rcar-gen3-can" for a generic R-Car Gen3 compatible device.
+ "renesas,rcar-gen3-can" for a generic R-Car Gen3 or RZ/G2
+ compatible device.
When compatible with the generic version, nodes must list the
SoC-specific version corresponding to the platform first
followed by the generic version.
- reg: physical base address and size of the R-Car CAN register map.
- interrupts: interrupt specifier for the sole interrupt.
-- clocks: phandles and clock specifiers for 3 CAN clock inputs.
-- clock-names: 3 clock input name strings: "clkp1", "clkp2", "can_clk".
+- clocks: phandles and clock specifiers for 2 CAN clock inputs for RZ/G2
+ devices.
+ phandles and clock specifiers for 3 CAN clock inputs for every other
+ SoC.
+- clock-names: 2 clock input name strings for RZ/G2: "clkp1", "can_clk".
+ 3 clock input name strings for every other SoC: "clkp1", "clkp2",
+ "can_clk".
- pinctrl-0: pin control group to be used for this controller.
- pinctrl-names: must be "default".
-Required properties for "renesas,can-r8a7795" and "renesas,can-r8a7796"
-compatible:
-In R8A7795 and R8A7796 SoCs, "clkp2" can be CANFD clock. This is a div6 clock
-and can be used by both CAN and CAN FD controller at the same time. It needs to
-be scaled to maximum frequency if any of these controllers use it. This is done
+Required properties for R8A7795, R8A7796 and R8A77965:
+For the denoted SoCs, "clkp2" can be CANFD clock. This is a div6 clock and can
+be used by both CAN and CAN FD controller at the same time. It needs to be
+scaled to maximum frequency if any of these controllers use it. This is done
using the below properties:
- assigned-clocks: phandle of clkp2(CANFD) clock.
Optional properties:
- renesas,can-clock-select: R-Car CAN Clock Source Select. Valid values are:
<0x0> (default) : Peripheral clock (clkp1)
- <0x1> : Peripheral clock (clkp2)
- <0x3> : Externally input clock
+ <0x1> : Peripheral clock (clkp2) (not supported by
+ RZ/G2 devices)
+ <0x3> : External input clock
Example
-------
Current Binding
---------------
-Switches are true Linux devices and can be probes by any means. Once
+Switches are true Linux devices and can be probed by any means. Once
probed, they register to the DSA framework, passing a node
pointer. This node is expected to fulfil the following binding, and
may contain additional properties as required by the device it is
"ref" for 19.2 MHz ref clk,
"com_aux" for phy common block aux clock,
"ref_aux" for phy reference aux clock,
+
+ For "qcom,ipq8074-qmp-pcie-phy": no clocks are listed.
For "qcom,msm8996-qmp-pcie-phy" must contain:
"aux", "cfg_ahb", "ref".
For "qcom,msm8996-qmp-usb3-phy" must contain:
"aux", "cfg_ahb", "ref".
- For "qcom,qmp-v3-usb3-phy" must contain:
+ For "qcom,sdm845-qmp-usb3-phy" must contain:
+ "aux", "cfg_ahb", "ref", "com_aux".
+ For "qcom,sdm845-qmp-usb3-uni-phy" must contain:
"aux", "cfg_ahb", "ref", "com_aux".
+ For "qcom,sdm845-qmp-ufs-phy" must contain:
+ "ref", "ref_aux".
- resets: a list of phandles and reset controller specifier pairs,
one for each entry in reset-names.
- reset-names: "phy" for reset of phy block,
"common" for phy common block reset,
- "cfg" for phy's ahb cfg block reset (Optional).
+ "cfg" for phy's ahb cfg block reset.
+
+ For "qcom,ipq8074-qmp-pcie-phy" must contain:
+ "phy", "common".
For "qcom,msm8996-qmp-pcie-phy" must contain:
- "phy", "common", "cfg".
+ "phy", "common", "cfg".
For "qcom,msm8996-qmp-usb3-phy" must contain
- "phy", "common".
- For "qcom,ipq8074-qmp-pcie-phy" must contain:
- "phy", "common".
+ "phy", "common".
+ For "qcom,sdm845-qmp-usb3-phy" must contain:
+ "phy", "common".
+ For "qcom,sdm845-qmp-usb3-uni-phy" must contain:
+ "phy", "common".
+ For "qcom,sdm845-qmp-ufs-phy": no resets are listed.
- vdda-phy-supply: Phandle to a regulator supply to PHY core block.
- vdda-pll-supply: Phandle to 1.8V regulator supply to PHY refclk pll block.
- #phy-cells: must be 0
+Required properties child node of pcie and usb3 qmp phys:
- clocks: a list of phandles and clock-specifier pairs,
one for each entry in clock-names.
- - clock-names: Must contain following for pcie and usb qmp phys:
+ - clock-names: Must contain following:
"pipe<lane-number>" for pipe clock specific to each lane.
- clock-output-names: Name of the PHY clock that will be the parent for
the above pipe clock.
(or)
"pcie20_phy1_pipe_clk"
+Required properties for child node of PHYs with lane reset, AKA:
+ "qcom,msm8996-qmp-pcie-phy"
- resets: a list of phandles and reset controller specifier pairs,
one for each entry in reset-names.
- - reset-names: Must contain following for pcie qmp phys:
+ - reset-names: Must contain following:
"lane<lane-number>" for reset specific to each lane.
Example:
for da850 - compatible = "ti,da850-ecap", "ti,am3352-ecap", "ti,am33xx-ecap";
for dra746 - compatible = "ti,dra746-ecap", "ti,am3352-ecap";
for 66ak2g - compatible = "ti,k2g-ecap", "ti,am3352-ecap";
+ for am654 - compatible = "ti,am654-ecap", "ti,am3352-ecap";
- #pwm-cells: should be 3. See pwm.txt in this directory for a description of
the cells format. The PWM channel index ranges from 0 to 4. The only third
cell flag supported by this binding is PWM_POLARITY_INVERTED.
Required Properties:
- compatible: should be "renesas,pwm-rcar" and one of the following.
- "renesas,pwm-r8a7743": for RZ/G1M
+ - "renesas,pwm-r8a7744": for RZ/G1N
- "renesas,pwm-r8a7745": for RZ/G1E
+ - "renesas,pwm-r8a774a1": for RZ/G2M
- "renesas,pwm-r8a7778": for R-Car M1A
- "renesas,pwm-r8a7779": for R-Car H1
- "renesas,pwm-r8a7790": for R-Car H2
- "renesas,pwm-r8a7795": for R-Car H3
- "renesas,pwm-r8a7796": for R-Car M3-W
- "renesas,pwm-r8a77965": for R-Car M3-N
+ - "renesas,pwm-r8a77970": for R-Car V3M
+ - "renesas,pwm-r8a77980": for R-Car V3H
- "renesas,pwm-r8a77990": for R-Car E3
- "renesas,pwm-r8a77995": for R-Car D3
- reg: base address and length of the registers block for the PWM.
Required Properties:
- - compatible: should be one of the following.
+ - compatible: must contain one or more of the following:
- "renesas,tpu-r8a73a4": for R8A73A4 (R-Mobile APE6) compatible PWM controller.
- "renesas,tpu-r8a7740": for R8A7740 (R-Mobile A1) compatible PWM controller.
- "renesas,tpu-r8a7743": for R8A7743 (RZ/G1M) compatible PWM controller.
+ - "renesas,tpu-r8a7744": for R8A7744 (RZ/G1N) compatible PWM controller.
- "renesas,tpu-r8a7745": for R8A7745 (RZ/G1E) compatible PWM controller.
- "renesas,tpu-r8a7790": for R8A7790 (R-Car H2) compatible PWM controller.
- - "renesas,tpu": for generic R-Car and RZ/G1 TPU PWM controller.
+ - "renesas,tpu-r8a77970": for R8A77970 (R-Car V3M) compatible PWM
+ controller.
+ - "renesas,tpu-r8a77980": for R8A77980 (R-Car V3H) compatible PWM
+ controller.
+ - "renesas,tpu": for the generic TPU PWM controller; this is a fallback for
+ the entries listed above.
- reg: Base address and length of each memory resource used by the PWM
controller hardware module.
Required properties:
- compatible: should be "socionext,uniphier-scssi"
- reg: address and length of the spi master registers
- - #address-cells: must be <1>, see spi-bus.txt
- - #size-cells: must be <0>, see spi-bus.txt
- - clocks: A phandle to the clock for the device.
- - resets: A phandle to the reset control for the device.
+ - interrupts: a single interrupt specifier
+ - pinctrl-names: should be "default"
+ - pinctrl-0: pin control state for the default mode
+ - clocks: a phandle to the clock for the device
+ - resets: a phandle to the reset control for the device
Example:
spi0: spi@54006000 {
compatible = "socionext,uniphier-scssi";
reg = <0x54006000 0x100>;
- #address-cells = <1>;
- #size-cells = <0>;
+ interrupts = <0 39 4>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinctrl_spi0>;
clocks = <&peri_clk 11>;
resets = <&peri_rst 11>;
};
--- /dev/null
+=================
+gx6605s SOC Timer
+=================
+
+The timer is used in gx6605s soc as system timer and the driver
+contain clk event and clk source.
+
+==============================
+timer node bindings definition
+==============================
+
+ Description: Describes gx6605s SOC timer
+
+ PROPERTIES
+
+ - compatible
+ Usage: required
+ Value type: <string>
+ Definition: must be "csky,gx6605s-timer"
+ - reg
+ Usage: required
+ Value type: <u32 u32>
+ Definition: <phyaddr size> in soc from cpu view
+ - clocks
+ Usage: required
+ Value type: phandle + clock specifier cells
+ Definition: must be input clk node
+ - interrupt
+ Usage: required
+ Value type: <u32>
+ Definition: must be timer irq num defined by soc
+
+Examples:
+---------
+
+ timer0: timer@20a000 {
+ compatible = "csky,gx6605s-timer";
+ reg = <0x0020a000 0x400>;
+ clocks = <&dummy_apb_clk>;
+ interrupts = <10>;
+ interrupt-parent = <&intc>;
+ };
--- /dev/null
+============================
+C-SKY Multi-processors Timer
+============================
+
+C-SKY multi-processors timer is designed for C-SKY SMP system and the
+regs is accessed by cpu co-processor 4 registers with mtcr/mfcr.
+
+ - PTIM_CTLR "cr<0, 14>" Control reg to start reset timer.
+ - PTIM_TSR "cr<1, 14>" Interrupt cleanup status reg.
+ - PTIM_CCVR "cr<3, 14>" Current counter value reg.
+ - PTIM_LVR "cr<6, 14>" Window value reg to triger next event.
+
+==============================
+timer node bindings definition
+==============================
+
+ Description: Describes SMP timer
+
+ PROPERTIES
+
+ - compatible
+ Usage: required
+ Value type: <string>
+ Definition: must be "csky,mptimer"
+ - clocks
+ Usage: required
+ Value type: <node>
+ Definition: must be input clk node
+ - interrupts
+ Usage: required
+ Value type: <u32>
+ Definition: must be timer irq num defined by soc
+
+Examples:
+---------
+
+ timer: timer {
+ compatible = "csky,mptimer";
+ clocks = <&dummy_apb_clk>;
+ interrupts = <16>;
+ interrupt-parent = <&intc>;
+ };
"trusted." xattrs will require CAP_SYS_ADMIN. But it should be possible
for untrusted layers like from a pen drive.
+Note: redirect_dir={off|nofollow|follow(*)} conflicts with metacopy=on, and
+results in an error.
+
+(*) redirect_dir=follow only conflicts with metacopy=on if upperdir=... is
+given.
+
Sharing and copying layers
--------------------------
On success you get a new struct file sharing the mount/dentry with the
original, on failure - ERR_PTR().
--
+[mandatory]
+ ->clone_file_range() and ->dedupe_file_range have been replaced with
+ ->remap_file_range(). See Documentation/filesystems/vfs.txt for more
+ information.
+--
[recommended]
->lookup() instances doing an equivalent of
if (IS_ERR(inode))
--- /dev/null
+% UBIFS Authentication
+% sigma star gmbh
+% 2018
+
+# Introduction
+
+UBIFS utilizes the fscrypt framework to provide confidentiality for file
+contents and file names. This prevents attacks where an attacker is able to
+read contents of the filesystem on a single point in time. A classic example
+is a lost smartphone where the attacker is unable to read personal data stored
+on the device without the filesystem decryption key.
+
+At the current state, UBIFS encryption however does not prevent attacks where
+the attacker is able to modify the filesystem contents and the user uses the
+device afterwards. In such a scenario an attacker can modify filesystem
+contents arbitrarily without the user noticing. One example is to modify a
+binary to perform a malicious action when executed [DMC-CBC-ATTACK]. Since
+most of the filesystem metadata of UBIFS is stored in plain, this makes it
+fairly easy to swap files and replace their contents.
+
+Other full disk encryption systems like dm-crypt cover all filesystem metadata,
+which makes such kinds of attacks more complicated, but not impossible.
+Especially, if the attacker is given access to the device multiple points in
+time. For dm-crypt and other filesystems that build upon the Linux block IO
+layer, the dm-integrity or dm-verity subsystems [DM-INTEGRITY, DM-VERITY]
+can be used to get full data authentication at the block layer.
+These can also be combined with dm-crypt [CRYPTSETUP2].
+
+This document describes an approach to get file contents _and_ full metadata
+authentication for UBIFS. Since UBIFS uses fscrypt for file contents and file
+name encryption, the authentication system could be tied into fscrypt such that
+existing features like key derivation can be utilized. It should however also
+be possible to use UBIFS authentication without using encryption.
+
+
+## MTD, UBI & UBIFS
+
+On Linux, the MTD (Memory Technology Devices) subsystem provides a uniform
+interface to access raw flash devices. One of the more prominent subsystems that
+work on top of MTD is UBI (Unsorted Block Images). It provides volume management
+for flash devices and is thus somewhat similar to LVM for block devices. In
+addition, it deals with flash-specific wear-leveling and transparent I/O error
+handling. UBI offers logical erase blocks (LEBs) to the layers on top of it
+and maps them transparently to physical erase blocks (PEBs) on the flash.
+
+UBIFS is a filesystem for raw flash which operates on top of UBI. Thus, wear
+leveling and some flash specifics are left to UBI, while UBIFS focuses on
+scalability, performance and recoverability.
+
+
+
+ +------------+ +*******+ +-----------+ +-----+
+ | | * UBIFS * | UBI-BLOCK | | ... |
+ | JFFS/JFFS2 | +*******+ +-----------+ +-----+
+ | | +-----------------------------+ +-----------+ +-----+
+ | | | UBI | | MTD-BLOCK | | ... |
+ +------------+ +-----------------------------+ +-----------+ +-----+
+ +------------------------------------------------------------------+
+ | MEMORY TECHNOLOGY DEVICES (MTD) |
+ +------------------------------------------------------------------+
+ +-----------------------------+ +--------------------------+ +-----+
+ | NAND DRIVERS | | NOR DRIVERS | | ... |
+ +-----------------------------+ +--------------------------+ +-----+
+
+ Figure 1: Linux kernel subsystems for dealing with raw flash
+
+
+
+Internally, UBIFS maintains multiple data structures which are persisted on
+the flash:
+
+- *Index*: an on-flash B+ tree where the leaf nodes contain filesystem data
+- *Journal*: an additional data structure to collect FS changes before updating
+ the on-flash index and reduce flash wear.
+- *Tree Node Cache (TNC)*: an in-memory B+ tree that reflects the current FS
+ state to avoid frequent flash reads. It is basically the in-memory
+ representation of the index, but contains additional attributes.
+- *LEB property tree (LPT)*: an on-flash B+ tree for free space accounting per
+ UBI LEB.
+
+In the remainder of this section we will cover the on-flash UBIFS data
+structures in more detail. The TNC is of less importance here since it is never
+persisted onto the flash directly. More details on UBIFS can also be found in
+[UBIFS-WP].
+
+
+### UBIFS Index & Tree Node Cache
+
+Basic on-flash UBIFS entities are called *nodes*. UBIFS knows different types
+of nodes. Eg. data nodes (`struct ubifs_data_node`) which store chunks of file
+contents or inode nodes (`struct ubifs_ino_node`) which represent VFS inodes.
+Almost all types of nodes share a common header (`ubifs_ch`) containing basic
+information like node type, node length, a sequence number, etc. (see
+`fs/ubifs/ubifs-media.h`in kernel source). Exceptions are entries of the LPT
+and some less important node types like padding nodes which are used to pad
+unusable content at the end of LEBs.
+
+To avoid re-writing the whole B+ tree on every single change, it is implemented
+as *wandering tree*, where only the changed nodes are re-written and previous
+versions of them are obsoleted without erasing them right away. As a result,
+the index is not stored in a single place on the flash, but *wanders* around
+and there are obsolete parts on the flash as long as the LEB containing them is
+not reused by UBIFS. To find the most recent version of the index, UBIFS stores
+a special node called *master node* into UBI LEB 1 which always points to the
+most recent root node of the UBIFS index. For recoverability, the master node
+is additionally duplicated to LEB 2. Mounting UBIFS is thus a simple read of
+LEB 1 and 2 to get the current master node and from there get the location of
+the most recent on-flash index.
+
+The TNC is the in-memory representation of the on-flash index. It contains some
+additional runtime attributes per node which are not persisted. One of these is
+a dirty-flag which marks nodes that have to be persisted the next time the
+index is written onto the flash. The TNC acts as a write-back cache and all
+modifications of the on-flash index are done through the TNC. Like other caches,
+the TNC does not have to mirror the full index into memory, but reads parts of
+it from flash whenever needed. A *commit* is the UBIFS operation of updating the
+on-flash filesystem structures like the index. On every commit, the TNC nodes
+marked as dirty are written to the flash to update the persisted index.
+
+
+### Journal
+
+To avoid wearing out the flash, the index is only persisted (*commited*) when
+certain conditions are met (eg. `fsync(2)`). The journal is used to record
+any changes (in form of inode nodes, data nodes etc.) between commits
+of the index. During mount, the journal is read from the flash and replayed
+onto the TNC (which will be created on-demand from the on-flash index).
+
+UBIFS reserves a bunch of LEBs just for the journal called *log area*. The
+amount of log area LEBs is configured on filesystem creation (using
+`mkfs.ubifs`) and stored in the superblock node. The log area contains only
+two types of nodes: *reference nodes* and *commit start nodes*. A commit start
+node is written whenever an index commit is performed. Reference nodes are
+written on every journal update. Each reference node points to the position of
+other nodes (inode nodes, data nodes etc.) on the flash that are part of this
+journal entry. These nodes are called *buds* and describe the actual filesystem
+changes including their data.
+
+The log area is maintained as a ring. Whenever the journal is almost full,
+a commit is initiated. This also writes a commit start node so that during
+mount, UBIFS will seek for the most recent commit start node and just replay
+every reference node after that. Every reference node before the commit start
+node will be ignored as they are already part of the on-flash index.
+
+When writing a journal entry, UBIFS first ensures that enough space is
+available to write the reference node and buds part of this entry. Then, the
+reference node is written and afterwards the buds describing the file changes.
+On replay, UBIFS will record every reference node and inspect the location of
+the referenced LEBs to discover the buds. If these are corrupt or missing,
+UBIFS will attempt to recover them by re-reading the LEB. This is however only
+done for the last referenced LEB of the journal. Only this can become corrupt
+because of a power cut. If the recovery fails, UBIFS will not mount. An error
+for every other LEB will directly cause UBIFS to fail the mount operation.
+
+
+ | ---- LOG AREA ---- | ---------- MAIN AREA ------------ |
+
+ -----+------+-----+--------+---- ------+-----+-----+---------------
+ \ | | | | / / | | | \
+ / CS | REF | REF | | \ \ DENT | INO | INO | /
+ \ | | | | / / | | | \
+ ----+------+-----+--------+--- -------+-----+-----+----------------
+ | | ^ ^
+ | | | |
+ +------------------------+ |
+ | |
+ +-------------------------------+
+
+
+ Figure 2: UBIFS flash layout of log area with commit start nodes
+ (CS) and reference nodes (REF) pointing to main area
+ containing their buds
+
+
+### LEB Property Tree/Table
+
+The LEB property tree is used to store per-LEB information. This includes the
+LEB type and amount of free and *dirty* (old, obsolete content) space [1] on
+the LEB. The type is important, because UBIFS never mixes index nodes with data
+nodes on a single LEB and thus each LEB has a specific purpose. This again is
+useful for free space calculations. See [UBIFS-WP] for more details.
+
+The LEB property tree again is a B+ tree, but it is much smaller than the
+index. Due to its smaller size it is always written as one chunk on every
+commit. Thus, saving the LPT is an atomic operation.
+
+
+[1] Since LEBs can only be appended and never overwritten, there is a
+difference between free space ie. the remaining space left on the LEB to be
+written to without erasing it and previously written content that is obsolete
+but can't be overwritten without erasing the full LEB.
+
+
+# UBIFS Authentication
+
+This chapter introduces UBIFS authentication which enables UBIFS to verify
+the authenticity and integrity of metadata and file contents stored on flash.
+
+
+## Threat Model
+
+UBIFS authentication enables detection of offline data modification. While it
+does not prevent it, it enables (trusted) code to check the integrity and
+authenticity of on-flash file contents and filesystem metadata. This covers
+attacks where file contents are swapped.
+
+UBIFS authentication will not protect against rollback of full flash contents.
+Ie. an attacker can still dump the flash and restore it at a later time without
+detection. It will also not protect against partial rollback of individual
+index commits. That means that an attacker is able to partially undo changes.
+This is possible because UBIFS does not immediately overwrites obsolete
+versions of the index tree or the journal, but instead marks them as obsolete
+and garbage collection erases them at a later time. An attacker can use this by
+erasing parts of the current tree and restoring old versions that are still on
+the flash and have not yet been erased. This is possible, because every commit
+will always write a new version of the index root node and the master node
+without overwriting the previous version. This is further helped by the
+wear-leveling operations of UBI which copies contents from one physical
+eraseblock to another and does not atomically erase the first eraseblock.
+
+UBIFS authentication does not cover attacks where an attacker is able to
+execute code on the device after the authentication key was provided.
+Additional measures like secure boot and trusted boot have to be taken to
+ensure that only trusted code is executed on a device.
+
+
+## Authentication
+
+To be able to fully trust data read from flash, all UBIFS data structures
+stored on flash are authenticated. That is:
+
+- The index which includes file contents, file metadata like extended
+ attributes, file length etc.
+- The journal which also contains file contents and metadata by recording changes
+ to the filesystem
+- The LPT which stores UBI LEB metadata which UBIFS uses for free space accounting
+
+
+### Index Authentication
+
+Through UBIFS' concept of a wandering tree, it already takes care of only
+updating and persisting changed parts from leaf node up to the root node
+of the full B+ tree. This enables us to augment the index nodes of the tree
+with a hash over each node's child nodes. As a result, the index basically also
+a Merkle tree. Since the leaf nodes of the index contain the actual filesystem
+data, the hashes of their parent index nodes thus cover all the file contents
+and file metadata. When a file changes, the UBIFS index is updated accordingly
+from the leaf nodes up to the root node including the master node. This process
+can be hooked to recompute the hash only for each changed node at the same time.
+Whenever a file is read, UBIFS can verify the hashes from each leaf node up to
+the root node to ensure the node's integrity.
+
+To ensure the authenticity of the whole index, the UBIFS master node stores a
+keyed hash (HMAC) over its own contents and a hash of the root node of the index
+tree. As mentioned above, the master node is always written to the flash whenever
+the index is persisted (ie. on index commit).
+
+Using this approach only UBIFS index nodes and the master node are changed to
+include a hash. All other types of nodes will remain unchanged. This reduces
+the storage overhead which is precious for users of UBIFS (ie. embedded
+devices).
+
+
+ +---------------+
+ | Master Node |
+ | (hash) |
+ +---------------+
+ |
+ v
+ +-------------------+
+ | Index Node #1 |
+ | |
+ | branch0 branchn |
+ | (hash) (hash) |
+ +-------------------+
+ | ... | (fanout: 8)
+ | |
+ +-------+ +------+
+ | |
+ v v
+ +-------------------+ +-------------------+
+ | Index Node #2 | | Index Node #3 |
+ | | | |
+ | branch0 branchn | | branch0 branchn |
+ | (hash) (hash) | | (hash) (hash) |
+ +-------------------+ +-------------------+
+ | ... | ... |
+ v v v
+ +-----------+ +----------+ +-----------+
+ | Data Node | | INO Node | | DENT Node |
+ +-----------+ +----------+ +-----------+
+
+
+ Figure 3: Coverage areas of index node hash and master node HMAC
+
+
+
+The most important part for robustness and power-cut safety is to atomically
+persist the hash and file contents. Here the existing UBIFS logic for how
+changed nodes are persisted is already designed for this purpose such that
+UBIFS can safely recover if a power-cut occurs while persisting. Adding
+hashes to index nodes does not change this since each hash will be persisted
+atomically together with its respective node.
+
+
+### Journal Authentication
+
+The journal is authenticated too. Since the journal is continuously written
+it is necessary to also add authentication information frequently to the
+journal so that in case of a powercut not too much data can't be authenticated.
+This is done by creating a continuous hash beginning from the commit start node
+over the previous reference nodes, the current reference node, and the bud
+nodes. From time to time whenever it is suitable authentication nodes are added
+between the bud nodes. This new node type contains a HMAC over the current state
+of the hash chain. That way a journal can be authenticated up to the last
+authentication node. The tail of the journal which may not have a authentication
+node cannot be authenticated and is skipped during journal replay.
+
+We get this picture for journal authentication:
+
+ ,,,,,,,,
+ ,......,...........................................
+ ,. CS , hash1.----. hash2.----.
+ ,. | , . |hmac . |hmac
+ ,. v , . v . v
+ ,.REF#0,-> bud -> bud -> bud.-> auth -> bud -> bud.-> auth ...
+ ,..|...,...........................................
+ , | ,
+ , | ,,,,,,,,,,,,,,,
+ . | hash3,----.
+ , | , |hmac
+ , v , v
+ , REF#1 -> bud -> bud,-> auth ...
+ ,,,|,,,,,,,,,,,,,,,,,,
+ v
+ REF#2 -> ...
+ |
+ V
+ ...
+
+Since the hash also includes the reference nodes an attacker cannot reorder or
+skip any journal heads for replay. An attacker can only remove bud nodes or
+reference nodes from the end of the journal, effectively rewinding the
+filesystem at maximum back to the last commit.
+
+The location of the log area is stored in the master node. Since the master
+node is authenticated with a HMAC as described above, it is not possible to
+tamper with that without detection. The size of the log area is specified when
+the filesystem is created using `mkfs.ubifs` and stored in the superblock node.
+To avoid tampering with this and other values stored there, a HMAC is added to
+the superblock struct. The superblock node is stored in LEB 0 and is only
+modified on feature flag or similar changes, but never on file changes.
+
+
+### LPT Authentication
+
+The location of the LPT root node on the flash is stored in the UBIFS master
+node. Since the LPT is written and read atomically on every commit, there is
+no need to authenticate individual nodes of the tree. It suffices to
+protect the integrity of the full LPT by a simple hash stored in the master
+node. Since the master node itself is authenticated, the LPTs authenticity can
+be verified by verifying the authenticity of the master node and comparing the
+LTP hash stored there with the hash computed from the read on-flash LPT.
+
+
+## Key Management
+
+For simplicity, UBIFS authentication uses a single key to compute the HMACs
+of superblock, master, commit start and reference nodes. This key has to be
+available on creation of the filesystem (`mkfs.ubifs`) to authenticate the
+superblock node. Further, it has to be available on mount of the filesystem
+to verify authenticated nodes and generate new HMACs for changes.
+
+UBIFS authentication is intended to operate side-by-side with UBIFS encryption
+(fscrypt) to provide confidentiality and authenticity. Since UBIFS encryption
+has a different approach of encryption policies per directory, there can be
+multiple fscrypt master keys and there might be folders without encryption.
+UBIFS authentication on the other hand has an all-or-nothing approach in the
+sense that it either authenticates everything of the filesystem or nothing.
+Because of this and because UBIFS authentication should also be usable without
+encryption, it does not share the same master key with fscrypt, but manages
+a dedicated authentication key.
+
+The API for providing the authentication key has yet to be defined, but the
+key can eg. be provided by userspace through a keyring similar to the way it
+is currently done in fscrypt. It should however be noted that the current
+fscrypt approach has shown its flaws and the userspace API will eventually
+change [FSCRYPT-POLICY2].
+
+Nevertheless, it will be possible for a user to provide a single passphrase
+or key in userspace that covers UBIFS authentication and encryption. This can
+be solved by the corresponding userspace tools which derive a second key for
+authentication in addition to the derived fscrypt master key used for
+encryption.
+
+To be able to check if the proper key is available on mount, the UBIFS
+superblock node will additionally store a hash of the authentication key. This
+approach is similar to the approach proposed for fscrypt encryption policy v2
+[FSCRYPT-POLICY2].
+
+
+# Future Extensions
+
+In certain cases where a vendor wants to provide an authenticated filesystem
+image to customers, it should be possible to do so without sharing the secret
+UBIFS authentication key. Instead, in addition the each HMAC a digital
+signature could be stored where the vendor shares the public key alongside the
+filesystem image. In case this filesystem has to be modified afterwards,
+UBIFS can exchange all digital signatures with HMACs on first mount similar
+to the way the IMA/EVM subsystem deals with such situations. The HMAC key
+will then have to be provided beforehand in the normal way.
+
+
+# References
+
+[CRYPTSETUP2] http://www.saout.de/pipermail/dm-crypt/2017-November/005745.html
+
+[DMC-CBC-ATTACK] http://www.jakoblell.com/blog/2013/12/22/practical-malleability-attack-against-cbc-encrypted-luks-partitions/
+
+[DM-INTEGRITY] https://www.kernel.org/doc/Documentation/device-mapper/dm-integrity.txt
+
+[DM-VERITY] https://www.kernel.org/doc/Documentation/device-mapper/verity.txt
+
+[FSCRYPT-POLICY2] https://www.spinics.net/lists/linux-ext4/msg58710.html
+
+[UBIFS-WP] http://www.linux-mtd.infradead.org/doc/ubifs_whitepaper.pdf
compr=none override default compressor and set it to "none"
compr=lzo override default compressor and set it to "lzo"
compr=zlib override default compressor and set it to "zlib"
+auth_key= specify the key used for authenticating the filesystem.
+ Passing this option makes authentication mandatory.
+ The passed key must be present in the kernel keyring
+ and must be of type 'logon'
+auth_hash_name= The hash algorithm used for authentication. Used for
+ both hashing and for creating HMACs. Typical values
+ include "sha256" or "sha512"
Quick usage instructions
unsigned (*mmap_capabilities)(struct file *);
#endif
ssize_t (*copy_file_range)(struct file *, loff_t, struct file *, loff_t, size_t, unsigned int);
- int (*clone_file_range)(struct file *, loff_t, struct file *, loff_t, u64);
- int (*dedupe_file_range)(struct file *, loff_t, struct file *, loff_t, u64);
+ loff_t (*remap_file_range)(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t len, unsigned int remap_flags);
int (*fadvise)(struct file *, loff_t, loff_t, int);
};
copy_file_range: called by the copy_file_range(2) system call.
- clone_file_range: called by the ioctl(2) system call for FICLONERANGE and
- FICLONE commands.
-
- dedupe_file_range: called by the ioctl(2) system call for FIDEDUPERANGE
- command.
+ remap_file_range: called by the ioctl(2) system call for FICLONERANGE and
+ FICLONE and FIDEDUPERANGE commands to remap file ranges. An
+ implementation should remap len bytes at pos_in of the source file into
+ the dest file at pos_out. Implementations must handle callers passing
+ in len == 0; this means "remap to the end of the source file". The
+ return value should the number of bytes remapped, or the usual
+ negative error code if errors occurred before any bytes were remapped.
+ The remap_flags parameter accepts REMAP_FILE_* flags. If
+ REMAP_FILE_DEDUP is set then the implementation must only remap if the
+ requested file ranges have identical contents. If REMAP_CAN_SHORTEN is
+ set, the caller is ok with the implementation shortening the request
+ length to satisfy alignment or EOF requirements (or any other reason).
fadvise: possibly called by the fadvise64() system call.
--- /dev/null
+Kernel driver i2c-nvidia-gpu
+
+Datasheet: not publicly available.
+
+Authors:
+ Ajay Gupta <ajayg@nvidia.com>
+
+Description
+-----------
+
+i2c-nvidia-gpu is a driver for I2C controller included in NVIDIA Turing
+and later GPUs and it is used to communicate with Type-C controller on GPUs.
+
+If your 'lspci -v' listing shows something like the following,
+
+01:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1ad9 (rev a1)
+
+then this driver should support the I2C controller of your GPU.
* REL_WHEEL, REL_HWHEEL:
- These codes are used for vertical and horizontal scroll wheels,
- respectively.
+ respectively. The value is the number of detents moved on the wheel, the
+ physical size of which varies by device. For high-resolution wheels
+ this may be an approximation based on the high-resolution scroll events,
+ see REL_WHEEL_HI_RES. These event codes are legacy codes and
+ REL_WHEEL_HI_RES and REL_HWHEEL_HI_RES should be preferred where
+ available.
+
+* REL_WHEEL_HI_RES, REL_HWHEEL_HI_RES:
+
+ - High-resolution scroll wheel data. The accumulated value 120 represents
+ movement by one detent. For devices that do not provide high-resolution
+ scrolling, the value is always a multiple of 120. For devices with
+ high-resolution scrolling, the value may be a fraction of 120.
+
+ If a vertical scroll wheel supports high-resolution scrolling, this code
+ will be emitted in addition to REL_WHEEL or REL_HWHEEL. The REL_WHEEL
+ and REL_HWHEEL may be an approximation based on the high-resolution
+ scroll events. There is no guarantee that the high-resolution data
+ is a multiple of 120 at the time of an emulated REL_WHEEL or REL_HWHEEL
+ event.
EV_ABS
------
The third parameter may be a text as in this example, but it may also
be an expanded variable or a macro.
- cc-fullversion
- cc-fullversion is useful when the exact version of gcc is needed.
- One typical use-case is when a specific GCC version is broken.
- cc-fullversion points out a more specific version than cc-version does.
-
- Example:
- #arch/powerpc/Makefile
- $(Q)if test "$(cc-fullversion)" = "040200" ; then \
- echo -n '*** GCC-4.2.0 cannot compile the 64-bit powerpc ' ; \
- false ; \
- fi
-
- In this example for a specific GCC version the build will error out
- explaining to the user why it stops.
-
cc-cross-prefix
cc-cross-prefix is used to check if there exists a $(CC) in path with
one of the listed prefixes. The first prefix where there exist a
-.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
+.. This file is dual-licensed: you can use it either under the terms
+.. of the GPL or the GFDL 1.1+ license, at your option. Note that this
+.. dual licensing only applies to this file, and not this project as a
+.. whole.
+..
+.. a) This file is free software; you can redistribute it and/or
+.. modify it under the terms of the GNU General Public License as
+.. published by the Free Software Foundation; either version 2 of
+.. the License, or (at your option) any later version.
+..
+.. This file is distributed in the hope that it will be useful,
+.. but WITHOUT ANY WARRANTY; without even the implied warranty of
+.. MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+.. GNU General Public License for more details.
+..
+.. Or, alternatively,
+..
+.. b) Permission is granted to copy, distribute and/or modify this
+.. document under the terms of the GNU Free Documentation License,
+.. Version 1.1 or any later version published by the Free Software
+.. Foundation, with no Invariant Sections, no Front-Cover Texts
+.. and no Back-Cover Texts. A copy of the license is included at
+.. Documentation/media/uapi/fdl-appendix.rst.
+..
+.. TODO: replace it to GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
.. _media_ioc_request_alloc:
-.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
+.. This file is dual-licensed: you can use it either under the terms
+.. of the GPL or the GFDL 1.1+ license, at your option. Note that this
+.. dual licensing only applies to this file, and not this project as a
+.. whole.
+..
+.. a) This file is free software; you can redistribute it and/or
+.. modify it under the terms of the GNU General Public License as
+.. published by the Free Software Foundation; either version 2 of
+.. the License, or (at your option) any later version.
+..
+.. This file is distributed in the hope that it will be useful,
+.. but WITHOUT ANY WARRANTY; without even the implied warranty of
+.. MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+.. GNU General Public License for more details.
+..
+.. Or, alternatively,
+..
+.. b) Permission is granted to copy, distribute and/or modify this
+.. document under the terms of the GNU Free Documentation License,
+.. Version 1.1 or any later version published by the Free Software
+.. Foundation, with no Invariant Sections, no Front-Cover Texts
+.. and no Back-Cover Texts. A copy of the license is included at
+.. Documentation/media/uapi/fdl-appendix.rst.
+..
+.. TODO: replace it to GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
.. _media_request_ioc_queue:
-.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
+.. This file is dual-licensed: you can use it either under the terms
+.. of the GPL or the GFDL 1.1+ license, at your option. Note that this
+.. dual licensing only applies to this file, and not this project as a
+.. whole.
+..
+.. a) This file is free software; you can redistribute it and/or
+.. modify it under the terms of the GNU General Public License as
+.. published by the Free Software Foundation; either version 2 of
+.. the License, or (at your option) any later version.
+..
+.. This file is distributed in the hope that it will be useful,
+.. but WITHOUT ANY WARRANTY; without even the implied warranty of
+.. MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+.. GNU General Public License for more details.
+..
+.. Or, alternatively,
+..
+.. b) Permission is granted to copy, distribute and/or modify this
+.. document under the terms of the GNU Free Documentation License,
+.. Version 1.1 or any later version published by the Free Software
+.. Foundation, with no Invariant Sections, no Front-Cover Texts
+.. and no Back-Cover Texts. A copy of the license is included at
+.. Documentation/media/uapi/fdl-appendix.rst.
+..
+.. TODO: replace it to GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
.. _media_request_ioc_reinit:
-.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
+.. This file is dual-licensed: you can use it either under the terms
+.. of the GPL or the GFDL 1.1+ license, at your option. Note that this
+.. dual licensing only applies to this file, and not this project as a
+.. whole.
+..
+.. a) This file is free software; you can redistribute it and/or
+.. modify it under the terms of the GNU General Public License as
+.. published by the Free Software Foundation; either version 2 of
+.. the License, or (at your option) any later version.
+..
+.. This file is distributed in the hope that it will be useful,
+.. but WITHOUT ANY WARRANTY; without even the implied warranty of
+.. MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+.. GNU General Public License for more details.
+..
+.. Or, alternatively,
+..
+.. b) Permission is granted to copy, distribute and/or modify this
+.. document under the terms of the GNU Free Documentation License,
+.. Version 1.1 or any later version published by the Free Software
+.. Foundation, with no Invariant Sections, no Front-Cover Texts
+.. and no Back-Cover Texts. A copy of the license is included at
+.. Documentation/media/uapi/fdl-appendix.rst.
+..
+.. TODO: replace it to GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
.. _media-request-api:
-.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
+.. This file is dual-licensed: you can use it either under the terms
+.. of the GPL or the GFDL 1.1+ license, at your option. Note that this
+.. dual licensing only applies to this file, and not this project as a
+.. whole.
+..
+.. a) This file is free software; you can redistribute it and/or
+.. modify it under the terms of the GNU General Public License as
+.. published by the Free Software Foundation; either version 2 of
+.. the License, or (at your option) any later version.
+..
+.. This file is distributed in the hope that it will be useful,
+.. but WITHOUT ANY WARRANTY; without even the implied warranty of
+.. MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+.. GNU General Public License for more details.
+..
+.. Or, alternatively,
+..
+.. b) Permission is granted to copy, distribute and/or modify this
+.. document under the terms of the GNU Free Documentation License,
+.. Version 1.1 or any later version published by the Free Software
+.. Foundation, with no Invariant Sections, no Front-Cover Texts
+.. and no Back-Cover Texts. A copy of the license is included at
+.. Documentation/media/uapi/fdl-appendix.rst.
+..
+.. TODO: replace it to GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
.. _request-func-close:
-.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
+.. This file is dual-licensed: you can use it either under the terms
+.. of the GPL or the GFDL 1.1+ license, at your option. Note that this
+.. dual licensing only applies to this file, and not this project as a
+.. whole.
+..
+.. a) This file is free software; you can redistribute it and/or
+.. modify it under the terms of the GNU General Public License as
+.. published by the Free Software Foundation; either version 2 of
+.. the License, or (at your option) any later version.
+..
+.. This file is distributed in the hope that it will be useful,
+.. but WITHOUT ANY WARRANTY; without even the implied warranty of
+.. MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+.. GNU General Public License for more details.
+..
+.. Or, alternatively,
+..
+.. b) Permission is granted to copy, distribute and/or modify this
+.. document under the terms of the GNU Free Documentation License,
+.. Version 1.1 or any later version published by the Free Software
+.. Foundation, with no Invariant Sections, no Front-Cover Texts
+.. and no Back-Cover Texts. A copy of the license is included at
+.. Documentation/media/uapi/fdl-appendix.rst.
+..
+.. TODO: replace it to GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
.. _request-func-ioctl:
-.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
+.. This file is dual-licensed: you can use it either under the terms
+.. of the GPL or the GFDL 1.1+ license, at your option. Note that this
+.. dual licensing only applies to this file, and not this project as a
+.. whole.
+..
+.. a) This file is free software; you can redistribute it and/or
+.. modify it under the terms of the GNU General Public License as
+.. published by the Free Software Foundation; either version 2 of
+.. the License, or (at your option) any later version.
+..
+.. This file is distributed in the hope that it will be useful,
+.. but WITHOUT ANY WARRANTY; without even the implied warranty of
+.. MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+.. GNU General Public License for more details.
+..
+.. Or, alternatively,
+..
+.. b) Permission is granted to copy, distribute and/or modify this
+.. document under the terms of the GNU Free Documentation License,
+.. Version 1.1 or any later version published by the Free Software
+.. Foundation, with no Invariant Sections, no Front-Cover Texts
+.. and no Back-Cover Texts. A copy of the license is included at
+.. Documentation/media/uapi/fdl-appendix.rst.
+..
+.. TODO: replace it to GPL-2.0 OR GFDL-1.1-or-later WITH no-invariant-sections
.. _request-func-poll:
the desired operation. Both drivers and applications must set the remainder of
the :c:type:`v4l2_format` structure to 0.
-.. _v4l2-meta-format:
+.. c:type:: v4l2_meta_format
.. tabularcolumns:: |p{1.4cm}|p{2.2cm}|p{13.9cm}|
- ``sdr``
- Definition of a data format, see :ref:`pixfmt`, used by SDR
capture and output devices.
+ * -
+ - struct :c:type:`v4l2_meta_format`
+ - ``meta``
+ - Definition of a metadata format, see :ref:`meta-formats`, used by
+ metadata capture devices.
* -
- __u8
- ``raw_data``\ [200]
The driver is enabled via the standard kernel configuration system,
using the make command::
- make oldconfig/silentoldconfig/menuconfig/etc.
+ make oldconfig/menuconfig/etc.
The driver is located in the menu structure at:
By default it's enabled with a non-zero value. 0 disables F-RTO.
+tcp_fwmark_accept - BOOLEAN
+ If set, incoming connections to listening sockets that do not have a
+ socket mark will set the mark of the accepting socket to the fwmark of
+ the incoming SYN packet. This will cause all packets on that connection
+ (starting from the first SYNACK) to be sent with that fwmark. The
+ listening socket's mark is unchanged. Listening sockets that already
+ have a fwmark set via setsockopt(SOL_SOCKET, SO_MARK, ...) are
+ unaffected.
+
+ Default: 0
+
tcp_invalid_ratelimit - INTEGER
Limit the maximal rate for sending duplicate acknowledgments
in response to incoming TCP packets that are for an existing
u32 rxrpc_kernel_check_life(struct socket *sock,
struct rxrpc_call *call);
+ void rxrpc_kernel_probe_life(struct socket *sock,
+ struct rxrpc_call *call);
- This returns a number that is updated when ACKs are received from the peer
- (notably including PING RESPONSE ACKs which we can elicit by sending PING
- ACKs to see if the call still exists on the server). The caller should
- compare the numbers of two calls to see if the call is still alive after
- waiting for a suitable interval.
+ The first function returns a number that is updated when ACKs are received
+ from the peer (notably including PING RESPONSE ACKs which we can elicit by
+ sending PING ACKs to see if the call still exists on the server). The
+ caller should compare the numbers of two calls to see if the call is still
+ alive after waiting for a suitable interval.
This allows the caller to work out if the server is still contactable and
if the call is still alive on the server whilst waiting for the server to
process a client operation.
- This function may transmit a PING ACK.
+ The second function causes a ping ACK to be transmitted to try to provoke
+ the peer into responding, which would then cause the value returned by the
+ first function to change. Note that this must be called in TASK_RUNNING
+ state.
(*) Get reply timestamp.
code-of-conduct-interpretation
development-process
submitting-patches
+ programming-language
coding-style
maintainer-pgp-guide
email-clients
--- /dev/null
+.. _programming_language:
+
+Programming Language
+====================
+
+The kernel is written in the C programming language [c-language]_.
+More precisely, the kernel is typically compiled with ``gcc`` [gcc]_
+under ``-std=gnu89`` [gcc-c-dialect-options]_: the GNU dialect of ISO C90
+(including some C99 features).
+
+This dialect contains many extensions to the language [gnu-extensions]_,
+and many of them are used within the kernel as a matter of course.
+
+There is some support for compiling the kernel with ``clang`` [clang]_
+and ``icc`` [icc]_ for several of the architectures, although at the time
+of writing it is not completed, requiring third-party patches.
+
+Attributes
+----------
+
+One of the common extensions used throughout the kernel are attributes
+[gcc-attribute-syntax]_. Attributes allow to introduce
+implementation-defined semantics to language entities (like variables,
+functions or types) without having to make significant syntactic changes
+to the language (e.g. adding a new keyword) [n2049]_.
+
+In some cases, attributes are optional (i.e. a compiler not supporting them
+should still produce proper code, even if it is slower or does not perform
+as many compile-time checks/diagnostics).
+
+The kernel defines pseudo-keywords (e.g. ``__pure``) instead of using
+directly the GNU attribute syntax (e.g. ``__attribute__((__pure__))``)
+in order to feature detect which ones can be used and/or to shorten the code.
+
+Please refer to ``include/linux/compiler_attributes.h`` for more information.
+
+.. [c-language] http://www.open-std.org/jtc1/sc22/wg14/www/standards
+.. [gcc] https://gcc.gnu.org
+.. [clang] https://clang.llvm.org
+.. [icc] https://software.intel.com/en-us/c-compilers
+.. [gcc-c-dialect-options] https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html
+.. [gnu-extensions] https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html
+.. [gcc-attribute-syntax] https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
+.. [n2049] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2049.pdf
+
and either the buffer length or the OtherInfo length exceeds the
allowed length.
+
* Restrict keyring linkage::
long keyctl(KEYCTL_RESTRICT_KEYRING, key_serial_t keyring,
applicable to the asymmetric key type.
+ * Query an asymmetric key::
+
+ long keyctl(KEYCTL_PKEY_QUERY,
+ key_serial_t key_id, unsigned long reserved,
+ struct keyctl_pkey_query *info);
+
+ Get information about an asymmetric key. The information is returned in
+ the keyctl_pkey_query struct::
+
+ __u32 supported_ops;
+ __u32 key_size;
+ __u16 max_data_size;
+ __u16 max_sig_size;
+ __u16 max_enc_size;
+ __u16 max_dec_size;
+ __u32 __spare[10];
+
+ ``supported_ops`` contains a bit mask of flags indicating which ops are
+ supported. This is constructed from a bitwise-OR of::
+
+ KEYCTL_SUPPORTS_{ENCRYPT,DECRYPT,SIGN,VERIFY}
+
+ ``key_size`` indicated the size of the key in bits.
+
+ ``max_*_size`` indicate the maximum sizes in bytes of a blob of data to be
+ signed, a signature blob, a blob to be encrypted and a blob to be
+ decrypted.
+
+ ``__spare[]`` must be set to 0. This is intended for future use to hand
+ over one or more passphrases needed unlock a key.
+
+ If successful, 0 is returned. If the key is not an asymmetric key,
+ EOPNOTSUPP is returned.
+
+
+ * Encrypt, decrypt, sign or verify a blob using an asymmetric key::
+
+ long keyctl(KEYCTL_PKEY_ENCRYPT,
+ const struct keyctl_pkey_params *params,
+ const char *info,
+ const void *in,
+ void *out);
+
+ long keyctl(KEYCTL_PKEY_DECRYPT,
+ const struct keyctl_pkey_params *params,
+ const char *info,
+ const void *in,
+ void *out);
+
+ long keyctl(KEYCTL_PKEY_SIGN,
+ const struct keyctl_pkey_params *params,
+ const char *info,
+ const void *in,
+ void *out);
+
+ long keyctl(KEYCTL_PKEY_VERIFY,
+ const struct keyctl_pkey_params *params,
+ const char *info,
+ const void *in,
+ const void *in2);
+
+ Use an asymmetric key to perform a public-key cryptographic operation a
+ blob of data. For encryption and verification, the asymmetric key may
+ only need the public parts to be available, but for decryption and signing
+ the private parts are required also.
+
+ The parameter block pointed to by params contains a number of integer
+ values::
+
+ __s32 key_id;
+ __u32 in_len;
+ __u32 out_len;
+ __u32 in2_len;
+
+ ``key_id`` is the ID of the asymmetric key to be used. ``in_len`` and
+ ``in2_len`` indicate the amount of data in the in and in2 buffers and
+ ``out_len`` indicates the size of the out buffer as appropriate for the
+ above operations.
+
+ For a given operation, the in and out buffers are used as follows::
+
+ Operation ID in,in_len out,out_len in2,in2_len
+ ======================= =============== =============== ===============
+ KEYCTL_PKEY_ENCRYPT Raw data Encrypted data -
+ KEYCTL_PKEY_DECRYPT Encrypted data Raw data -
+ KEYCTL_PKEY_SIGN Raw data Signature -
+ KEYCTL_PKEY_VERIFY Raw data - Signature
+
+ ``info`` is a string of key=value pairs that supply supplementary
+ information. These include:
+
+ ``enc=<encoding>`` The encoding of the encrypted/signature blob. This
+ can be "pkcs1" for RSASSA-PKCS1-v1.5 or
+ RSAES-PKCS1-v1.5; "pss" for "RSASSA-PSS"; "oaep" for
+ "RSAES-OAEP". If omitted or is "raw", the raw output
+ of the encryption function is specified.
+
+ ``hash=<algo>`` If the data buffer contains the output of a hash
+ function and the encoding includes some indication of
+ which hash function was used, the hash function can be
+ specified with this, eg. "hash=sha256".
+
+ The ``__spare[]`` space in the parameter block must be set to 0. This is
+ intended, amongst other things, to allow the passing of passphrases
+ required to unlock a key.
+
+ If successful, encrypt, decrypt and sign all return the amount of data
+ written into the output buffer. Verification returns 0 on success.
+
+
Kernel Services
===============
attempted key link operation. If there is no match, -EINVAL is returned.
+ * ``int (*asym_eds_op)(struct kernel_pkey_params *params,
+ const void *in, void *out);``
+ ``int (*asym_verify_signature)(struct kernel_pkey_params *params,
+ const void *in, const void *in2);``
+
+ These methods are optional. If provided the first allows a key to be
+ used to encrypt, decrypt or sign a blob of data, and the second allows a
+ key to verify a signature.
+
+ In all cases, the following information is provided in the params block::
+
+ struct kernel_pkey_params {
+ struct key *key;
+ const char *encoding;
+ const char *hash_algo;
+ char *info;
+ __u32 in_len;
+ union {
+ __u32 out_len;
+ __u32 in2_len;
+ };
+ enum kernel_pkey_operation op : 8;
+ };
+
+ This includes the key to be used; a string indicating the encoding to use
+ (for instance, "pkcs1" may be used with an RSA key to indicate
+ RSASSA-PKCS1-v1.5 or RSAES-PKCS1-v1.5 encoding or "raw" if no encoding);
+ the name of the hash algorithm used to generate the data for a signature
+ (if appropriate); the sizes of the input and output (or second input)
+ buffers; and the ID of the operation to be performed.
+
+ For a given operation ID, the input and output buffers are used as
+ follows::
+
+ Operation ID in,in_len out,out_len in2,in2_len
+ ======================= =============== =============== ===============
+ kernel_pkey_encrypt Raw data Encrypted data -
+ kernel_pkey_decrypt Encrypted data Raw data -
+ kernel_pkey_sign Raw data Signature -
+ kernel_pkey_verify Raw data - Signature
+
+ asym_eds_op() deals with encryption, decryption and signature creation as
+ specified by params->op. Note that params->op is also set for
+ asym_verify_signature().
+
+ Encrypting and signature creation both take raw data in the input buffer
+ and return the encrypted result in the output buffer. Padding may have
+ been added if an encoding was set. In the case of signature creation,
+ depending on the encoding, the padding created may need to indicate the
+ digest algorithm - the name of which should be supplied in hash_algo.
+
+ Decryption takes encrypted data in the input buffer and returns the raw
+ data in the output buffer. Padding will get checked and stripped off if
+ an encoding was set.
+
+ Verification takes raw data in the input buffer and the signature in the
+ second input buffer and checks that the one matches the other. Padding
+ will be validated. Depending on the encoding, the digest algorithm used
+ to generate the raw data may need to be indicated in hash_algo.
+
+ If successful, asym_eds_op() should return the number of bytes written
+ into the output buffer. asym_verify_signature() should return 0.
+
+ A variety of errors may be returned, including EOPNOTSUPP if the operation
+ is not supported; EKEYREJECTED if verification fails; ENOPKG if the
+ required crypto isn't available.
+
+
+ * ``int (*asym_query)(const struct kernel_pkey_params *params,
+ struct kernel_pkey_query *info);``
+
+ This method is optional. If provided it allows information about the
+ public or asymmetric key held in the key to be determined.
+
+ The parameter block is as for asym_eds_op() and co. but in_len and out_len
+ are unused. The encoding and hash_algo fields should be used to reduce
+ the returned buffer/data sizes as appropriate.
+
+ If successful, the following information is filled in::
+
+ struct kernel_pkey_query {
+ __u32 supported_ops;
+ __u32 key_size;
+ __u16 max_data_size;
+ __u16 max_sig_size;
+ __u16 max_enc_size;
+ __u16 max_dec_size;
+ };
+
+ The supported_ops field will contain a bitmask indicating what operations
+ are supported by the key, including encryption of a blob, decryption of a
+ blob, signing a blob and verifying the signature on a blob. The following
+ constants are defined for this::
+
+ KEYCTL_SUPPORTS_{ENCRYPT,DECRYPT,SIGN,VERIFY}
+
+ The key_size field is the size of the key in bits. max_data_size and
+ max_sig_size are the maximum raw data and signature sizes for creation and
+ verification of a signature; max_enc_size and max_dec_size are the maximum
+ raw data and signature sizes for encryption and decryption. The
+ max_*_size fields are measured in bytes.
+
+ If successful, 0 will be returned. If the key doesn't support this,
+ EOPNOTSUPP will be returned.
+
+
Request-Key Callback Service
============================
Memory poisoning
----------------
-When releasing memory, it is best to poison the contents (clear stack on
-syscall return, wipe heap memory on a free), to avoid reuse attacks that
-rely on the old contents of memory. This frustrates many uninitialized
-variable attacks, stack content exposures, heap content exposures, and
-use-after-free attacks.
+When releasing memory, it is best to poison the contents, to avoid reuse
+attacks that rely on the old contents of memory. E.g., clear stack on a
+syscall return (``CONFIG_GCC_PLUGIN_STACKLEAK``), wipe heap memory on a
+free. This frustrates many uninitialized variable attacks, stack content
+exposures, heap content exposures, and use-after-free attacks.
Destination tracking
--------------------
- shmmni
- softlockup_all_cpu_backtrace
- soft_watchdog
+- stack_erasing
- stop-a [ SPARC only ]
- sysrq ==> Documentation/admin-guide/sysrq.rst
- sysctl_writes_strict
==============================================================
+stack_erasing
+
+This parameter can be used to control kernel stack erasing at the end
+of syscalls for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
+
+That erasing reduces the information which kernel stack leak bugs
+can reveal and blocks some uninitialized stack variable attacks.
+The tradeoff is the performance impact: on a single CPU system kernel
+compilation sees a 1% slowdown, other systems and workloads may vary.
+
+ 0: kernel stack erasing is disabled, STACKLEAK_METRICS are not updated.
+
+ 1: kernel stack erasing is enabled (default), it is performed before
+ returning to the userspace at the end of syscalls.
+
+==============================================================
+
tainted:
Non-zero if the kernel has been tainted. Numeric values, which can be
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE, 0, 0);
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE, 0, 0);
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_FORCE_DISABLE, 0, 0);
+
+- PR_SPEC_INDIR_BRANCH: Indirect Branch Speculation in User Processes
+ (Mitigate Spectre V2 style attacks against user processes)
+
+ Invocations:
+ * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);
+ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);
+ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);
+ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
to struct boot_params for loading bzImage and ramdisk
above 4G in 64bit.
-Protocol 2.13: (Kernel 3.14) Support 32- and 64-bit flags being set in
- xloadflags to support booting a 64-bit kernel from 32-bit
- EFI
-
-Protocol 2.14: (Kernel 4.20) Added acpi_rsdp_addr holding the physical
- address of the ACPI RSDP table.
- The bootloader updates version with:
- 0x8000 | min(kernel-version, bootloader-version)
- kernel-version being the protocol version supported by
- the kernel and bootloader-version the protocol version
- supported by the bootloader.
-
**** MEMORY LAYOUT
The traditional memory map for the kernel loader, used for Image or
0258/8 2.10+ pref_address Preferred loading address
0260/4 2.10+ init_size Linear memory required during initialization
0264/4 2.11+ handover_offset Offset of handover entry point
-0268/8 2.14+ acpi_rsdp_addr Physical address of RSDP table
(1) For backwards compatibility, if the setup_sects field contains 0, the
real value is 4.
Contains the magic number "HdrS" (0x53726448).
Field name: version
-Type: modify
+Type: read
Offset/size: 0x206/2
Protocol: 2.00+
e.g. 0x0204 for version 2.04, and 0x0a11 for a hypothetical version
10.17.
- Up to protocol version 2.13 this information is only read by the
- bootloader. From protocol version 2.14 onwards the bootloader will
- write the used protocol version or-ed with 0x8000 to the field. The
- used protocol version will be the minimum of the supported protocol
- versions of the bootloader and the kernel.
-
Field name: realmode_swtch
Type: modify (optional)
Offset/size: 0x208/4
See EFI HANDOVER PROTOCOL below for more details.
-Field name: acpi_rsdp_addr
-Type: write
-Offset/size: 0x268/8
-Protocol: 2.14+
-
- This field can be set by the boot loader to tell the kernel the
- physical address of the ACPI RSDP table.
-
- A value of 0 indicates the kernel should fall back to the standard
- methods to locate the RSDP.
-
**** THE IMAGE CHECKSUM
____________________________________________________________|___________________________________________________________
| | | |
ffff800000000000 | -128 TB | ffff87ffffffffff | 8 TB | ... guard hole, also reserved for hypervisor
- ffff880000000000 | -120 TB | ffffc7ffffffffff | 64 TB | direct mapping of all physical memory (page_offset_base)
- ffffc80000000000 | -56 TB | ffffc8ffffffffff | 1 TB | ... unused hole
+ ffff880000000000 | -120 TB | ffff887fffffffff | 0.5 TB | LDT remap for PTI
+ ffff888000000000 | -119.5 TB | ffffc87fffffffff | 64 TB | direct mapping of all physical memory (page_offset_base)
+ ffffc88000000000 | -55.5 TB | ffffc8ffffffffff | 0.5 TB | ... unused hole
ffffc90000000000 | -55 TB | ffffe8ffffffffff | 32 TB | vmalloc/ioremap space (vmalloc_base)
ffffe90000000000 | -23 TB | ffffe9ffffffffff | 1 TB | ... unused hole
ffffea0000000000 | -22 TB | ffffeaffffffffff | 1 TB | virtual memory map (vmemmap_base)
ffffeb0000000000 | -21 TB | ffffebffffffffff | 1 TB | ... unused hole
ffffec0000000000 | -20 TB | fffffbffffffffff | 16 TB | KASAN shadow memory
- fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
- | | | | vaddr_end for KASLR
- fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
- fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | LDT remap for PTI
- ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
__________________|____________|__________________|_________|____________________________________________________________
|
- | Identical layout to the 47-bit one from here on:
+ | Identical layout to the 56-bit one from here on:
____________________________________________________________|____________________________________________________________
| | | |
+ fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
+ | | | | vaddr_end for KASLR
+ fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
+ fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
+ ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space
ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole
__________________|____________|__________________|_________|___________________________________________________________
| | | |
0000800000000000 | +64 PB | ffff7fffffffffff | ~16K PB | ... huge, still almost 64 bits wide hole of non-canonical
- | | | | virtual memory addresses up to the -128 TB
+ | | | | virtual memory addresses up to the -64 PB
| | | | starting offset of kernel mappings.
__________________|____________|__________________|_________|___________________________________________________________
|
____________________________________________________________|___________________________________________________________
| | | |
ff00000000000000 | -64 PB | ff0fffffffffffff | 4 PB | ... guard hole, also reserved for hypervisor
- ff10000000000000 | -60 PB | ff8fffffffffffff | 32 PB | direct mapping of all physical memory (page_offset_base)
- ff90000000000000 | -28 PB | ff9fffffffffffff | 4 PB | LDT remap for PTI
+ ff10000000000000 | -60 PB | ff10ffffffffffff | 0.25 PB | LDT remap for PTI
+ ff11000000000000 | -59.75 PB | ff90ffffffffffff | 32 PB | direct mapping of all physical memory (page_offset_base)
+ ff91000000000000 | -27.75 PB | ff9fffffffffffff | 3.75 PB | ... unused hole
ffa0000000000000 | -24 PB | ffd1ffffffffffff | 12.5 PB | vmalloc/ioremap space (vmalloc_base)
ffd2000000000000 | -11.5 PB | ffd3ffffffffffff | 0.5 PB | ... unused hole
ffd4000000000000 | -11 PB | ffd5ffffffffffff | 0.5 PB | virtual memory map (vmemmap_base)
ffd6000000000000 | -10.5 PB | ffdeffffffffffff | 2.25 PB | ... unused hole
ffdf000000000000 | -8.25 PB | fffffdffffffffff | ~8 PB | KASAN shadow memory
- fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
- | | | | vaddr_end for KASLR
- fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
- fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
- ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
__________________|____________|__________________|_________|____________________________________________________________
|
| Identical layout to the 47-bit one from here on:
____________________________________________________________|____________________________________________________________
| | | |
+ fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
+ | | | | vaddr_end for KASLR
+ fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
+ fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
+ ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space
ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole
Be very careful vs. KASLR when changing anything here. The KASLR address
range must not overlap with anything except the KASAN shadow area, which is
correct as KASAN disables KASLR.
+
+For both 4- and 5-level layouts, the STACKLEAK_POISON value in the last 2MB
+hole: ffffffffffff4111
0C8/004 ALL ext_cmd_line_ptr cmd_line_ptr high 32bits
140/080 ALL edid_info Video mode setup (struct edid_info)
1C0/020 ALL efi_info EFI 32 information (struct efi_info)
-1E0/004 ALL alk_mem_k Alternative mem check, in KB
+1E0/004 ALL alt_mem_k Alternative mem check, in KB
1E4/004 ALL scratch Scratch field for the kernel setup code
1E8/001 ALL e820_entries Number of entries in e820_table (below)
1E9/001 ALL eddbuf_entries Number of entries in eddbuf (below)
8169 10/100/1000 GIGABIT ETHERNET DRIVER
M: Realtek linux nic maintainers <nic_swsd@realtek.com>
+M: Heiner Kallweit <hkallweit1@gmail.com>
L: netdev@vger.kernel.org
S: Maintained
F: drivers/net/ethernet/realtek/r8169.c
F: include/dt-bindings/reset/altr,rst-mgr-a10sr.h
ALTERA TRIPLE SPEED ETHERNET DRIVER
-M: Vince Bridgers <vbridger@opensource.altera.com>
+M: Thor Thayer <thor.thayer@linux.intel.com>
L: netdev@vger.kernel.org
L: nios2-dev@lists.rocketboards.org (moderated for non-subscribers)
S: Maintained
F: drivers/clocksource/timer-prima2.c
F: drivers/clocksource/timer-atlas7.c
N: [^a-z]sirf
+X: drivers/gnss
ARM/EBSA110 MACHINE SUPPORT
M: Russell King <linux@armlinux.org.uk>
M: Matthias Brugger <matthias.bgg@gmail.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
L: linux-mediatek@lists.infradead.org (moderated for non-subscribers)
+W: https://mtk.bcnfs.org/
+C: irc://chat.freenode.net/linux-mediatek
S: Maintained
F: arch/arm/boot/dts/mt6*
F: arch/arm/boot/dts/mt7*
F: arch/arm/boot/dts/mt8*
F: arch/arm/mach-mediatek/
F: arch/arm64/boot/dts/mediatek/
+F: drivers/soc/mediatek/
N: mtk
+N: mt[678]
K: mediatek
ARM/Mediatek USB3 PHY DRIVER
M: Andy Gross <andy.gross@linaro.org>
M: David Brown <david.brown@linaro.org>
L: linux-arm-msm@vger.kernel.org
-L: linux-soc@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/soc/qcom/
F: arch/arm/boot/dts/qcom-*.dts
ATHEROS ATH5K WIRELESS DRIVER
M: Jiri Slaby <jirislaby@gmail.com>
M: Nick Kossifidis <mickflemm@gmail.com>
-M: "Luis R. Rodriguez" <mcgrof@do-not-panic.com>
+M: Luis Chamberlain <mcgrof@kernel.org>
L: linux-wireless@vger.kernel.org
W: http://wireless.kernel.org/en/users/Drivers/ath5k
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
Q: https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
S: Supported
-F: arch/x86/net/bpf_jit*
+F: arch/*/net/*
F: Documentation/networking/filter.txt
F: Documentation/bpf/
F: include/linux/bpf*
F: tools/lib/bpf/
F: tools/testing/selftests/bpf/
+BPF JIT for ARM
+M: Shubham Bansal <illusionist.neo@gmail.com>
+L: netdev@vger.kernel.org
+S: Maintained
+F: arch/arm/net/
+
+BPF JIT for ARM64
+M: Daniel Borkmann <daniel@iogearbox.net>
+M: Alexei Starovoitov <ast@kernel.org>
+M: Zi Shen Lim <zlim.lnx@gmail.com>
+L: netdev@vger.kernel.org
+S: Supported
+F: arch/arm64/net/
+
+BPF JIT for MIPS (32-BIT AND 64-BIT)
+M: Paul Burton <paul.burton@mips.com>
+L: netdev@vger.kernel.org
+S: Maintained
+F: arch/mips/net/
+
+BPF JIT for NFP NICs
+M: Jakub Kicinski <jakub.kicinski@netronome.com>
+L: netdev@vger.kernel.org
+S: Supported
+F: drivers/net/ethernet/netronome/nfp/bpf/
+
+BPF JIT for POWERPC (32-BIT AND 64-BIT)
+M: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
+M: Sandipan Das <sandipan@linux.ibm.com>
+L: netdev@vger.kernel.org
+S: Maintained
+F: arch/powerpc/net/
+
+BPF JIT for S390
+M: Martin Schwidefsky <schwidefsky@de.ibm.com>
+M: Heiko Carstens <heiko.carstens@de.ibm.com>
+L: netdev@vger.kernel.org
+S: Maintained
+F: arch/s390/net/
+X: arch/s390/net/pnet.c
+
+BPF JIT for SPARC (32-BIT AND 64-BIT)
+M: David S. Miller <davem@davemloft.net>
+L: netdev@vger.kernel.org
+S: Maintained
+F: arch/sparc/net/
+
+BPF JIT for X86 32-BIT
+M: Wang YanQing <udknight@gmail.com>
+L: netdev@vger.kernel.org
+S: Maintained
+F: arch/x86/net/bpf_jit_comp32.c
+
+BPF JIT for X86 64-BIT
+M: Alexei Starovoitov <ast@kernel.org>
+M: Daniel Borkmann <daniel@iogearbox.net>
+L: netdev@vger.kernel.org
+S: Supported
+F: arch/x86/net/
+X: arch/x86/net/bpf_jit_comp32.c
+
BROADCOM B44 10/100 ETHERNET DRIVER
M: Michael Chan <michael.chan@broadcom.com>
L: netdev@vger.kernel.org
BROADCOM BCM47XX MIPS ARCHITECTURE
M: Hauke Mehrtens <hauke@hauke-m.de>
M: Rafał Miłecki <zajec5@gmail.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/mips/brcm/
F: arch/mips/bcm47xx/*
BROADCOM BCM5301X ARM ARCHITECTURE
M: Hauke Mehrtens <hauke@hauke-m.de>
M: Rafał Miłecki <zajec5@gmail.com>
-M: Jon Mason <jonmason@broadcom.com>
M: bcm-kernel-feedback-list@broadcom.com
L: linux-arm-kernel@lists.infradead.org
S: Maintained
BROADCOM BMIPS MIPS ARCHITECTURE
M: Kevin Cernekee <cernekee@gmail.com>
M: Florian Fainelli <f.fainelli@gmail.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
T: git git://github.com/broadcom/stblinux.git
S: Maintained
F: arch/mips/bmips/*
BROADCOM IPROC ARM ARCHITECTURE
M: Ray Jui <rjui@broadcom.com>
M: Scott Branden <sbranden@broadcom.com>
-M: Jon Mason <jonmason@broadcom.com>
M: bcm-kernel-feedback-list@broadcom.com
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
T: git git://github.com/broadcom/cygnus-linux.git
BROADCOM NVRAM DRIVER
M: Rafał Miłecki <zajec5@gmail.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: drivers/firmware/broadcom/*
F: sound/pci/oxygen/
C-SKY ARCHITECTURE
-M: Guo Ren <ren_guo@c-sky.com>
+M: Guo Ren <guoren@kernel.org>
T: git https://github.com/c-sky/csky-linux.git
S: Supported
F: arch/csky/
F: Documentation/devicetree/bindings/csky/
+F: drivers/irqchip/irq-csky-*
+F: Documentation/devicetree/bindings/interrupt-controller/csky,*
+F: drivers/clocksource/timer-gx6605s.c
+F: drivers/clocksource/timer-mp-csky.c
+F: Documentation/devicetree/bindings/timer/csky,*
K: csky
N: csky
F: include/net/caif/
F: net/caif/
+CAKE QDISC
+M: Toke Høiland-Jørgensen <toke@toke.dk>
+L: cake@lists.bufferbloat.net (moderated for non-subscribers)
+S: Maintained
+F: net/sched/sch_cake.c
+
CALGARY x86-64 IOMMU
M: Muli Ben-Yehuda <mulix@mulix.org>
M: Jon Mason <jdmason@kudzu.us>
S: Maintained
F: drivers/platform/x86/compal-laptop.c
+COMPILER ATTRIBUTES
+M: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
+S: Maintained
+F: include/linux/compiler_attributes.h
+
CONEXANT ACCESSRUNNER USB DRIVER
L: accessrunner-general@lists.sourceforge.net
W: http://accessrunner.sourceforge.net/
DECSTATION PLATFORM SUPPORT
M: "Maciej W. Rozycki" <macro@linux-mips.org>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
W: http://www.linux-mips.org/wiki/DECstation
S: Maintained
F: arch/mips/dec/
M: Ralf Baechle <ralf@linux-mips.org>
M: David Daney <david.daney@cavium.com>
L: linux-edac@vger.kernel.org
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Supported
F: drivers/edac/octeon_edac*
ETHERNET PHY LIBRARY
M: Andrew Lunn <andrew@lunn.ch>
M: Florian Fainelli <f.fainelli@gmail.com>
+M: Heiner Kallweit <hkallweit1@gmail.com>
L: netdev@vger.kernel.org
S: Maintained
F: Documentation/ABI/testing/sysfs-bus-mdio
F: tools/firewire/
FIRMWARE LOADER (request_firmware)
-M: Luis R. Rodriguez <mcgrof@kernel.org>
+M: Luis Chamberlain <mcgrof@kernel.org>
L: linux-kernel@vger.kernel.org
S: Maintained
F: Documentation/firmware_class/
S: Maintained
F: drivers/i2c/busses/i2c-cpm.c
+FREESCALE IMX LPI2C DRIVER
+M: Dong Aisheng <aisheng.dong@nxp.com>
+L: linux-i2c@vger.kernel.org
+L: linux-imx@nxp.com
+S: Maintained
+F: drivers/i2c/busses/i2c-imx-lpi2c.c
+F: Documentation/devicetree/bindings/i2c/i2c-imx-lpi2c.txt
+
FREESCALE IMX / MXC FEC DRIVER
M: Fugang Duan <fugang.duan@nxp.com>
L: netdev@vger.kernel.org
GNSS SUBSYSTEM
M: Johan Hovold <johan@kernel.org>
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/johan/gnss.git
S: Maintained
F: Documentation/ABI/testing/sysfs-class-gnss
F: Documentation/devicetree/bindings/gnss/
GPIO SUBSYSTEM
M: Linus Walleij <linus.walleij@linaro.org>
+M: Bartosz Golaszewski <bgolaszewski@baylibre.com>
L: linux-gpio@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git
S: Maintained
S: Maintained
F: drivers/i2c/i2c-core-acpi.c
+I2C CONTROLLER DRIVER FOR NVIDIA GPU
+M: Ajay Gupta <ajayg@nvidia.com>
+L: linux-i2c@vger.kernel.org
+S: Maintained
+F: Documentation/i2c/busses/i2c-nvidia-gpu
+F: drivers/i2c/busses/i2c-nvidia-gpu.c
+
I2C MUXES
M: Peter Rosin <peda@axentia.se>
L: linux-i2c@vger.kernel.org
F: Documentation/fb/intelfb.txt
F: drivers/video/fbdev/intelfb/
+INTEL GPIO DRIVERS
+M: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
+L: linux-gpio@vger.kernel.org
+S: Maintained
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-gpio-intel.git
+F: drivers/gpio/gpio-ich.c
+F: drivers/gpio/gpio-intel-mid.c
+F: drivers/gpio/gpio-lynxpoint.c
+F: drivers/gpio/gpio-merrifield.c
+F: drivers/gpio/gpio-ml-ioh.c
+F: drivers/gpio/gpio-pch.c
+F: drivers/gpio/gpio-sch.c
+F: drivers/gpio/gpio-sodaville.c
+
INTEL GVT-g DRIVERS (Intel GPU Virtualization)
M: Zhenyu Wang <zhenyuw@linux.intel.com>
M: Zhi Wang <zhi.a.wang@intel.com>
S: Supported
F: drivers/gpu/drm/i915/gvt/
-INTEL PMIC GPIO DRIVER
-R: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
-S: Maintained
-F: drivers/gpio/gpio-*cove.c
-F: drivers/gpio/gpio-msic.c
-
INTEL HID EVENT DRIVER
M: Alex Hung <alex.hung@canonical.com>
L: platform-driver-x86@vger.kernel.org
S: Supported
F: drivers/platform/x86/intel_menlow.c
-INTEL MERRIFIELD GPIO DRIVER
-M: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
-L: linux-gpio@vger.kernel.org
-S: Maintained
-F: drivers/gpio/gpio-merrifield.c
-
INTEL MIC DRIVERS (mic)
M: Sudeep Dutt <sudeep.dutt@intel.com>
M: Ashutosh Dixit <ashutosh.dixit@intel.com>
F: arch/x86/include/asm/intel_pmc_ipc.h
F: arch/x86/include/asm/intel_punit_ipc.h
+INTEL PMIC GPIO DRIVERS
+M: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
+S: Maintained
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-gpio-intel.git
+F: drivers/gpio/gpio-*cove.c
+F: drivers/gpio/gpio-msic.c
+
INTEL MULTIFUNCTION PMIC DEVICE DRIVERS
R: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
S: Maintained
IOC3 ETHERNET DRIVER
M: Ralf Baechle <ralf@linux-mips.org>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: drivers/net/ethernet/sgi/ioc3-eth.c
F: Documentation/dev-tools/kselftest*
KERNEL USERMODE HELPER
-M: "Luis R. Rodriguez" <mcgrof@kernel.org>
+M: Luis Chamberlain <mcgrof@kernel.org>
L: linux-kernel@vger.kernel.org
S: Maintained
F: kernel/umh.c
KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)
M: James Hogan <jhogan@kernel.org>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Supported
F: arch/mips/include/uapi/asm/kvm*
F: arch/mips/include/asm/kvm*
F: mm/kmemleak-test.c
KMOD KERNEL MODULE LOADER - USERMODE HELPER
-M: "Luis R. Rodriguez" <mcgrof@kernel.org>
+M: Luis Chamberlain <mcgrof@kernel.org>
L: linux-kernel@vger.kernel.org
S: Maintained
F: kernel/kmod.c
LANTIQ MIPS ARCHITECTURE
M: John Crispin <john@phrozen.org>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: arch/mips/lantiq
F: drivers/soc/lantiq
LIBATA PATA ARASAN COMPACT FLASH CONTROLLER
M: Viresh Kumar <vireshk@kernel.org>
L: linux-ide@vger.kernel.org
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
S: Maintained
F: include/linux/pata_arasan_cf_data.h
F: drivers/ata/pata_arasan_cf.c
LIBATA PATA FARADAY FTIDE010 AND GEMINI SATA BRIDGE DRIVERS
M: Linus Walleij <linus.walleij@linaro.org>
L: linux-ide@vger.kernel.org
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
S: Maintained
F: drivers/ata/pata_ftide010.c
F: drivers/ata/sata_gemini.c
LIBATA SATA PROMISE TX2/TX4 CONTROLLER DRIVER
M: Mikael Pettersson <mikpelinux@gmail.com>
L: linux-ide@vger.kernel.org
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
S: Maintained
F: drivers/ata/sata_promise.*
MARDUK (CREATOR CI40) DEVICE TREE SUPPORT
M: Rahul Bedarkar <rahulbedarkar89@gmail.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: arch/mips/boot/dts/img/pistachio_marduk.dts
MICROSEMI MIPS SOCS
M: Alexandre Belloni <alexandre.belloni@bootlin.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: arch/mips/generic/board-ocelot.c
F: arch/mips/configs/generic/board-ocelot.config
M: Ralf Baechle <ralf@linux-mips.org>
M: Paul Burton <paul.burton@mips.com>
M: James Hogan <jhogan@kernel.org>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
W: http://www.linux-mips.org/
T: git git://git.linux-mips.org/pub/scm/ralf/linux.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git
MIPS BOSTON DEVELOPMENT BOARD
M: Paul Burton <paul.burton@mips.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/clock/img,boston-clock.txt
F: arch/mips/boot/dts/img/boston.dts
MIPS GENERIC PLATFORM
M: Paul Burton <paul.burton@mips.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Supported
F: Documentation/devicetree/bindings/power/mti,mips-cpc.txt
F: arch/mips/generic/
MIPS/LOONGSON1 ARCHITECTURE
M: Keguang Zhang <keguang.zhang@gmail.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: arch/mips/loongson32/
F: arch/mips/include/asm/mach-loongson32/
MIPS/LOONGSON2 ARCHITECTURE
M: Jiaxun Yang <jiaxun.yang@flygoat.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: arch/mips/loongson64/fuloong-2e/
F: arch/mips/loongson64/lemote-2f/
MIPS/LOONGSON3 ARCHITECTURE
M: Huacai Chen <chenhc@lemote.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: arch/mips/loongson64/
F: arch/mips/include/asm/mach-loongson64/
MIPS RINT INSTRUCTION EMULATION
M: Aleksandar Markovic <aleksandar.markovic@mips.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Supported
F: arch/mips/math-emu/sp_rint.c
F: arch/mips/math-emu/dp_rint.c
F: drivers/media/radio/radio-miropcm20*
MMP SUPPORT
-M: Eric Miao <eric.y.miao@gmail.com>
-M: Haojian Zhuang <haojian.zhuang@gmail.com>
+R: Lubomir Rintel <lkundrak@v3.sk>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
-T: git git://github.com/hzhuang1/linux.git
-T: git git://git.linaro.org/people/ycmiao/pxa-linux.git
-S: Maintained
+S: Odd Fixes
F: arch/arm/boot/dts/mmp*
F: arch/arm/mach-mmp/
S: Maintained
F: arch/arm/mach-omap2/omap_hwmod.*
+OMAP I2C DRIVER
+M: Vignesh R <vigneshr@ti.com>
+L: linux-omap@vger.kernel.org
+L: linux-i2c@vger.kernel.org
+S: Maintained
+F: Documentation/devicetree/bindings/i2c/i2c-omap.txt
+F: drivers/i2c/busses/i2c-omap.c
+
OMAP IMAGING SUBSYSTEM (OMAP3 ISP and OMAP4 ISS)
M: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
L: linux-media@vger.kernel.org
F: drivers/staging/media/omap4iss/
OMAP MMC SUPPORT
-M: Jarkko Lavinen <jarkko.lavinen@nokia.com>
+M: Aaro Koskinen <aaro.koskinen@iki.fi>
L: linux-omap@vger.kernel.org
-S: Maintained
+S: Odd Fixes
F: drivers/mmc/host/omap.c
OMAP POWER MANAGEMENT SUPPORT
ONION OMEGA2+ BOARD
M: Harvey Hunt <harveyhuntnexus@gmail.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: arch/mips/boot/dts/ralink/omega2p.dts
PIN CONTROLLER - INTEL
M: Mika Westerberg <mika.westerberg@linux.intel.com>
M: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/pinctrl/intel.git
S: Maintained
F: drivers/pinctrl/intel/
PISTACHIO SOC SUPPORT
M: James Hartley <james.hartley@sondrel.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Odd Fixes
F: arch/mips/pistachio/
F: arch/mips/include/asm/mach-pistachio/
F: include/linux/printk.h
PRISM54 WIRELESS DRIVER
-M: "Luis R. Rodriguez" <mcgrof@gmail.com>
+M: Luis Chamberlain <mcgrof@kernel.org>
L: linux-wireless@vger.kernel.org
W: http://wireless.kernel.org/en/users/Drivers/p54
S: Obsolete
F: fs/proc/
F: include/linux/proc_fs.h
F: tools/testing/selftests/proc/
+F: Documentation/filesystems/proc.txt
PROC SYSCTL
-M: "Luis R. Rodriguez" <mcgrof@kernel.org>
+M: Luis Chamberlain <mcgrof@kernel.org>
M: Kees Cook <keescook@chromium.org>
L: linux-kernel@vger.kernel.org
L: linux-fsdevel@vger.kernel.org
RALINK MIPS ARCHITECTURE
M: John Crispin <john@phrozen.org>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: arch/mips/ralink
RANCHU VIRTUAL BOARD FOR MIPS
M: Miodrag Dinic <miodrag.dinic@mips.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Supported
F: arch/mips/generic/board-ranchu.c
F: arch/mips/configs/generic/board-ranchu.config
F: include/linux/raid/
F: include/uapi/linux/raid/
+SOCIONEXT (SNI) AVE NETWORK DRIVER
+M: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
+L: netdev@vger.kernel.org
+S: Maintained
+F: drivers/net/ethernet/socionext/sni_ave.c
+F: Documentation/devicetree/bindings/net/socionext,uniphier-ave4.txt
+
SOCIONEXT (SNI) NETSEC NETWORK DRIVER
M: Jassi Brar <jaswinder.singh@linaro.org>
L: netdev@vger.kernel.org
F: Documentation/devicetree/bindings/sound/
F: Documentation/sound/soc/
F: sound/soc/
+F: include/dt-bindings/sound/
F: include/sound/soc*
SOUNDWIRE SUBSYSTEM
F: drivers/tty/vcc.c
SPARSE CHECKER
-M: "Christopher Li" <sparse@chrisli.org>
+M: "Luc Van Oostenryck" <luc.vanoostenryck@gmail.com>
L: linux-sparse@vger.kernel.org
W: https://sparse.wiki.kernel.org/
T: git git://git.kernel.org/pub/scm/devel/sparse/sparse.git
-T: git git://git.kernel.org/pub/scm/devel/sparse/chrisl/sparse.git
S: Maintained
F: include/linux/compiler.h
STABLE BRANCH
M: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+M: Sasha Levin <sashal@kernel.org>
L: stable@vger.kernel.org
S: Supported
F: Documentation/process/stable-kernel-rules.rst
TURBOCHANNEL SUBSYSTEM
M: "Maciej W. Rozycki" <macro@linux-mips.org>
M: Ralf Baechle <ralf@linux-mips.org>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
Q: http://patchwork.linux-mips.org/project/linux-mips/list/
S: Maintained
F: drivers/tc/
F: net/vmw_vsock/virtio_transport.c
F: drivers/net/vsockmon.c
F: drivers/vhost/vsock.c
-F: drivers/vhost/vsock.h
F: tools/testing/vsock/
VIRTIO CONSOLE DRIVER
VOCORE VOCORE2 BOARD
M: Harvey Hunt <harveyhuntnexus@gmail.com>
-L: linux-mips@linux-mips.org
+L: linux-mips@vger.kernel.org
S: Maintained
F: arch/mips/boot/dts/ralink/vocore2.dts
# SPDX-License-Identifier: GPL-2.0
VERSION = 4
-PATCHLEVEL = 19
+PATCHLEVEL = 20
SUBLEVEL = 0
-EXTRAVERSION =
-NAME = "People's Front"
+EXTRAVERSION = -rc6
+NAME = Shy Crocodile
# *DOCUMENTATION*
# To see a list of typical targets execute "make help"
$(Q)$(CONFIG_SHELL) $(srctree)/scripts/mkmakefile $(srctree)
endif
-ifeq ($(cc-name),clang)
+ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
ifneq ($(CROSS_COMPILE),)
CLANG_TARGET := --target=$(notdir $(CROSS_COMPILE:%-=%))
GCC_TOOLCHAIN_DIR := $(dir $(shell which $(LD)))
KBUILD_CFLAGS += $(stackp-flags-y)
-ifeq ($(cc-name),clang)
+ifdef CONFIG_CC_IS_CLANG
KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,)
KBUILD_CFLAGS += $(call cc-disable-warning, format-invalid-specifier)
KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
See Documentation/userspace-api/seccomp_filter.rst for details.
+config HAVE_ARCH_STACKLEAK
+ bool
+ help
+ An architecture should select this if it has the code which
+ fills the used part of the kernel stack with the STACKLEAK_POISON
+ value before returning from system calls.
+
config HAVE_STACKPROTECTOR
bool
help
})
#define user_termios_to_kernel_termios(k, u) \
- copy_from_user(k, u, sizeof(struct termios))
+ copy_from_user(k, u, sizeof(struct termios2))
#define kernel_termios_to_user_termios(u, k) \
+ copy_to_user(u, k, sizeof(struct termios2))
+
+#define user_termios_to_kernel_termios_1(k, u) \
+ copy_from_user(k, u, sizeof(struct termios))
+
+#define kernel_termios_to_user_termios_1(u, k) \
copy_to_user(u, k, sizeof(struct termios))
#endif /* _ALPHA_TERMIOS_H */
#define TCXONC _IO('t', 30)
#define TCFLSH _IO('t', 31)
+#define TCGETS2 _IOR('T', 42, struct termios2)
+#define TCSETS2 _IOW('T', 43, struct termios2)
+#define TCSETSW2 _IOW('T', 44, struct termios2)
+#define TCSETSF2 _IOW('T', 45, struct termios2)
+
#define TIOCSWINSZ _IOW('t', 103, struct winsize)
#define TIOCGWINSZ _IOR('t', 104, struct winsize)
#define TIOCSTART _IO('t', 110) /* start output, like ^Q */
speed_t c_ospeed; /* output speed */
};
+/* Alpha has identical termios and termios2 */
+
+struct termios2 {
+ tcflag_t c_iflag; /* input mode flags */
+ tcflag_t c_oflag; /* output mode flags */
+ tcflag_t c_cflag; /* control mode flags */
+ tcflag_t c_lflag; /* local mode flags */
+ cc_t c_cc[NCCS]; /* control characters */
+ cc_t c_line; /* line discipline (== c_cc[19]) */
+ speed_t c_ispeed; /* input speed */
+ speed_t c_ospeed; /* output speed */
+};
+
/* Alpha has matching termios and ktermios */
struct ktermios {
#define B3000000 00034
#define B3500000 00035
#define B4000000 00036
+#define BOTHER 00037
#define CSIZE 00001400
#define CS5 00000000
#define CMSPAR 010000000000 /* mark or space (stick) parity */
#define CRTSCTS 020000000000 /* flow control */
+#define CIBAUD 07600000
+#define IBSHIFT 16
+
/* c_lflag bits */
#define ISIG 0x00000080
#define ICANON 0x00000100
choice
prompt "ARC Instruction Set"
- default ISA_ARCOMPACT
+ default ISA_ARCV2
config ISA_ARCOMPACT
bool "ARCompact ISA"
config CPU_BIG_ENDIAN
bool "Enable Big Endian Mode"
- default n
help
Build kernel for Big Endian Mode of ARC CPU
config SMP
bool "Symmetric Multi-Processing"
- default n
select ARC_MCIP if ISA_ARCV2
help
This enables support for systems with more than one CPU.
config ARC_CACHE_VIPT_ALIASING
bool "Support VIPT Aliasing D$"
depends on ARC_HAS_DCACHE && ISA_ARCOMPACT
- default n
endif #ARC_CACHE
bool "Use ICCM"
help
Single Cycle RAMS to store Fast Path Code
- default n
config ARC_ICCM_SZ
int "ICCM Size in KB"
bool "Use DCCM"
help
Single Cycle RAMS to store Fast Path Data
- default n
config ARC_DCCM_SZ
int "DCCM Size in KB"
config ARC_COMPACT_IRQ_LEVELS
bool "Setup Timer IRQ as high Priority"
- default n
# if SMP, LV2 enabled ONLY if ARC implementation has LV2 re-entrancy
depends on !SMP
config ARC_FPU_SAVE_RESTORE
bool "Enable FPU state persistence across context switch"
- default n
help
Double Precision Floating Point unit had dedicated regs which
need to be saved/restored across context-switch.
config ARC_HAS_PAE40
bool "Support for the 40-bit Physical Address Extension"
- default n
depends on ISA_ARCV2
select HIGHMEM
select PHYS_ADDR_T_64BIT
config ARC_METAWARE_HLINK
bool "Support for Metaware debugger assisted Host access"
- default n
help
This options allows a Linux userland apps to directly access
host file system (open/creat/read/write etc) with help from
config ARC_DBG_TLB_PARANOIA
bool "Paranoia Checks in Low Level TLB Handlers"
- default n
endif
config ARC_UBOOT_SUPPORT
bool "Support uboot arg Handling"
- default n
help
ARC Linux by default checks for uboot provided args as pointers to
external cmdline or DTB. This however breaks in absence of uboot,
# published by the Free Software Foundation.
#
-KBUILD_DEFCONFIG := nsim_700_defconfig
+KBUILD_DEFCONFIG := nsim_hs_defconfig
cflags-y += -fno-common -pipe -fno-builtin -mmedium-calls -D__linux__
cflags-$(CONFIG_ISA_ARCOMPACT) += -mA7
bus-width = <4>;
dma-coherent;
};
+
+ gpio: gpio@3000 {
+ compatible = "snps,dw-apb-gpio";
+ reg = <0x3000 0x20>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ gpio_port_a: gpio-controller@0 {
+ compatible = "snps,dw-apb-gpio-port";
+ gpio-controller;
+ #gpio-cells = <2>;
+ snps,nr-gpios = <24>;
+ reg = <0>;
+ };
+ };
};
memory@80000000 {
# CONFIG_VM_EVENT_COUNTERS is not set
# CONFIG_SLUB_DEBUG is not set
# CONFIG_COMPAT_BRK is not set
+CONFIG_ISA_ARCOMPACT=y
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_MODULE_UNLOAD=y
CONFIG_NTFS_FS=y
CONFIG_TMPFS=y
CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
CONFIG_NTFS_FS=y
CONFIG_TMPFS=y
CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
CONFIG_NTFS_FS=y
CONFIG_TMPFS=y
CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
CONFIG_SERIAL_8250_DW=y
CONFIG_SERIAL_OF_PLATFORM=y
# CONFIG_HW_RANDOM is not set
+CONFIG_GPIOLIB=y
+CONFIG_GPIO_SYSFS=y
+CONFIG_GPIO_DWAPB=y
# CONFIG_HWMON is not set
CONFIG_DRM=y
# CONFIG_DRM_FBDEV_EMULATION is not set
CONFIG_VFAT_FS=y
CONFIG_TMPFS=y
CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
# CONFIG_COMPAT_BRK is not set
+CONFIG_ISA_ARCOMPACT=y
CONFIG_KPROBES=y
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_TMPFS=y
# CONFIG_MISC_FILESYSTEMS is not set
CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
CONFIG_ROOT_NFS=y
CONFIG_DEBUG_INFO=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
CONFIG_PERF_EVENTS=y
# CONFIG_SLUB_DEBUG is not set
# CONFIG_COMPAT_BRK is not set
+CONFIG_ISA_ARCOMPACT=y
CONFIG_KPROBES=y
CONFIG_MODULES=y
# CONFIG_LBDAF is not set
CONFIG_PERF_EVENTS=y
# CONFIG_SLUB_DEBUG is not set
# CONFIG_COMPAT_BRK is not set
+CONFIG_ISA_ARCOMPACT=y
CONFIG_KPROBES=y
CONFIG_MODULES=y
# CONFIG_LBDAF is not set
CONFIG_TMPFS=y
# CONFIG_MISC_FILESYSTEMS is not set
CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
# CONFIG_ENABLE_MUST_CHECK is not set
CONFIG_TMPFS=y
# CONFIG_MISC_FILESYSTEMS is not set
CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
# CONFIG_ENABLE_MUST_CHECK is not set
CONFIG_TMPFS=y
# CONFIG_MISC_FILESYSTEMS is not set
CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
# CONFIG_ENABLE_MUST_CHECK is not set
CONFIG_FTRACE=y
# CONFIG_AIO is not set
CONFIG_EMBEDDED=y
# CONFIG_COMPAT_BRK is not set
+CONFIG_ISA_ARCOMPACT=y
CONFIG_SLAB=y
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_TMPFS=y
CONFIG_JFFS2_FS=y
CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
CONFIG_TMPFS=y
CONFIG_JFFS2_FS=y
CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
/* IO coherency related Auxiliary registers */
#define ARC_REG_IO_COH_ENABLE 0x500
+#define ARC_IO_COH_ENABLE_BIT BIT(0)
#define ARC_REG_IO_COH_PARTIAL 0x501
+#define ARC_IO_COH_PARTIAL_BIT BIT(0)
#define ARC_REG_IO_COH_AP0_BASE 0x508
#define ARC_REG_IO_COH_AP0_SIZE 0x509
#include <linux/types.h>
#include <asm/byteorder.h>
#include <asm/page.h>
+#include <asm/unaligned.h>
#ifdef CONFIG_ISA_ARCV2
#include <asm/barrier.h>
return w;
}
+/*
+ * {read,write}s{b,w,l}() repeatedly access the same IO address in
+ * native endianness in 8-, 16-, 32-bit chunks {into,from} memory,
+ * @count times
+ */
+#define __raw_readsx(t,f) \
+static inline void __raw_reads##f(const volatile void __iomem *addr, \
+ void *ptr, unsigned int count) \
+{ \
+ bool is_aligned = ((unsigned long)ptr % ((t) / 8)) == 0; \
+ u##t *buf = ptr; \
+ \
+ if (!count) \
+ return; \
+ \
+ /* Some ARC CPU's don't support unaligned accesses */ \
+ if (is_aligned) { \
+ do { \
+ u##t x = __raw_read##f(addr); \
+ *buf++ = x; \
+ } while (--count); \
+ } else { \
+ do { \
+ u##t x = __raw_read##f(addr); \
+ put_unaligned(x, buf++); \
+ } while (--count); \
+ } \
+}
+
+#define __raw_readsb __raw_readsb
+__raw_readsx(8, b)
+#define __raw_readsw __raw_readsw
+__raw_readsx(16, w)
+#define __raw_readsl __raw_readsl
+__raw_readsx(32, l)
+
#define __raw_writeb __raw_writeb
static inline void __raw_writeb(u8 b, volatile void __iomem *addr)
{
}
+#define __raw_writesx(t,f) \
+static inline void __raw_writes##f(volatile void __iomem *addr, \
+ const void *ptr, unsigned int count) \
+{ \
+ bool is_aligned = ((unsigned long)ptr % ((t) / 8)) == 0; \
+ const u##t *buf = ptr; \
+ \
+ if (!count) \
+ return; \
+ \
+ /* Some ARC CPU's don't support unaligned accesses */ \
+ if (is_aligned) { \
+ do { \
+ __raw_write##f(*buf++, addr); \
+ } while (--count); \
+ } else { \
+ do { \
+ __raw_write##f(get_unaligned(buf++), addr); \
+ } while (--count); \
+ } \
+}
+
+#define __raw_writesb __raw_writesb
+__raw_writesx(8, b)
+#define __raw_writesw __raw_writesw
+__raw_writesx(16, w)
+#define __raw_writesl __raw_writesl
+__raw_writesx(32, l)
+
/*
* MMIO can also get buffered/optimized in micro-arch, so barriers needed
* Based on ARM model for the typical use case
#define readb(c) ({ u8 __v = readb_relaxed(c); __iormb(); __v; })
#define readw(c) ({ u16 __v = readw_relaxed(c); __iormb(); __v; })
#define readl(c) ({ u32 __v = readl_relaxed(c); __iormb(); __v; })
+#define readsb(p,d,l) ({ __raw_readsb(p,d,l); __iormb(); })
+#define readsw(p,d,l) ({ __raw_readsw(p,d,l); __iormb(); })
+#define readsl(p,d,l) ({ __raw_readsl(p,d,l); __iormb(); })
#define writeb(v,c) ({ __iowmb(); writeb_relaxed(v,c); })
#define writew(v,c) ({ __iowmb(); writew_relaxed(v,c); })
#define writel(v,c) ({ __iowmb(); writel_relaxed(v,c); })
+#define writesb(p,d,l) ({ __iowmb(); __raw_writesb(p,d,l); })
+#define writesw(p,d,l) ({ __iowmb(); __raw_writesw(p,d,l); })
+#define writesl(p,d,l) ({ __iowmb(); __raw_writesl(p,d,l); })
/*
* Relaxed API for drivers which can handle barrier ordering themselves
{
struct cpuinfo_arc *cpu = &cpuinfo_arc700[cpu_id];
struct bcr_identity *core = &cpu->core;
- int i, n = 0;
+ int i, n = 0, ua = 0;
FIX_PTR(cpu);
IS_AVAIL2(cpu->extn.rtc, "RTC [UP 64-bit] ", CONFIG_ARC_TIMERS_64BIT),
IS_AVAIL2(cpu->extn.gfrc, "GFRC [SMP 64-bit] ", CONFIG_ARC_TIMERS_64BIT));
- n += i = scnprintf(buf + n, len - n, "%s%s%s%s%s",
+#ifdef __ARC_UNALIGNED__
+ ua = 1;
+#endif
+ n += i = scnprintf(buf + n, len - n, "%s%s%s%s%s%s",
IS_AVAIL2(cpu->isa.atomic, "atomic ", CONFIG_ARC_HAS_LLSC),
IS_AVAIL2(cpu->isa.ldd, "ll64 ", CONFIG_ARC_HAS_LL64),
- IS_AVAIL1(cpu->isa.unalign, "unalign (not used)"));
+ IS_AVAIL1(cpu->isa.unalign, "unalign "), IS_USED_RUN(ua));
if (i)
n += scnprintf(buf + n, len - n, "\n\t\t: ");
{
unsigned int ioc_base, mem_sz;
+ /*
+ * If IOC was already enabled (due to bootloader) it technically needs to
+ * be reconfigured with aperture base,size corresponding to Linux memory map
+ * which will certainly be different than uboot's. But disabling and
+ * reenabling IOC when DMA might be potentially active is tricky business.
+ * To avoid random memory issues later, just panic here and ask user to
+ * upgrade bootloader to one which doesn't enable IOC
+ */
+ if (read_aux_reg(ARC_REG_IO_COH_ENABLE) & ARC_IO_COH_ENABLE_BIT)
+ panic("IOC already enabled, please upgrade bootloader!\n");
+
+ if (!ioc_enable)
+ return;
+
/*
* As for today we don't support both IOC and ZONE_HIGHMEM enabled
* simultaneously. This happens because as of today IOC aperture covers
panic("IOC Aperture start must be aligned to the size of the aperture");
write_aux_reg(ARC_REG_IO_COH_AP0_BASE, ioc_base >> 12);
- write_aux_reg(ARC_REG_IO_COH_PARTIAL, 1);
- write_aux_reg(ARC_REG_IO_COH_ENABLE, 1);
+ write_aux_reg(ARC_REG_IO_COH_PARTIAL, ARC_IO_COH_PARTIAL_BIT);
+ write_aux_reg(ARC_REG_IO_COH_ENABLE, ARC_IO_COH_ENABLE_BIT);
/* Re-enable L1 dcache */
__dc_enable();
if (is_isa_arcv2() && l2_line_sz && !slc_enable)
arc_slc_disable();
- if (is_isa_arcv2() && ioc_enable)
+ if (is_isa_arcv2() && ioc_exists)
arc_ioc_setup();
if (is_isa_arcv2() && l2_line_sz && slc_enable) {
struct vm_area_struct *vma = NULL;
struct task_struct *tsk = current;
struct mm_struct *mm = tsk->mm;
- int si_code;
+ int si_code = 0;
int ret;
vm_fault_t fault;
int write = regs->ecr_cause & ECR_C_PROTV_STORE; /* ST/EX */
vmmc-supply = <&vmmc_fixed>;
bus-width = <4>;
wp-gpios = <&gpio4 30 GPIO_ACTIVE_HIGH>; /* gpio_126 */
- cd-gpios = <&gpio4 31 GPIO_ACTIVE_HIGH>; /* gpio_127 */
+ cd-gpios = <&gpio4 31 GPIO_ACTIVE_LOW>; /* gpio_127 */
};
&mmc3 {
compatible = "ti,wl1271";
reg = <2>;
interrupt-parent = <&gpio6>;
- interrupts = <10 IRQ_TYPE_LEVEL_HIGH>; /* gpio_170 */
+ interrupts = <10 IRQ_TYPE_EDGE_RISING>; /* gpio_170 */
ref-clock-frequency = <26000000>;
tcxo-clock-frequency = <26000000>;
};
};
/* The voltage to the MMC card is hardwired at 3.3V */
- vmmc: fixedregulator@0 {
+ vmmc: regulator-vmmc {
compatible = "regulator-fixed";
regulator-name = "vmmc";
regulator-min-microvolt = <3300000>;
regulator-boot-on;
};
- veth: fixedregulator@0 {
+ veth: regulator-veth {
compatible = "regulator-fixed";
regulator-name = "veth";
regulator-min-microvolt = <3300000>;
};
/* The voltage to the MMC card is hardwired at 3.3V */
- vmmc: fixedregulator@0 {
+ vmmc: regulator-vmmc {
compatible = "regulator-fixed";
regulator-name = "vmmc";
regulator-min-microvolt = <3300000>;
regulator-boot-on;
};
- veth: fixedregulator@0 {
+ veth: regulator-veth {
compatible = "regulator-fixed";
regulator-name = "veth";
regulator-min-microvolt = <3300000>;
wifi_pwrseq: wifi-pwrseq {
compatible = "mmc-pwrseq-simple";
- reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
+ reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
};
};
wifi_pwrseq: wifi-pwrseq {
compatible = "mmc-pwrseq-simple";
- reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
+ reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
};
};
pinctrl-0 = <&pinctrl_i2c2>;
status = "okay";
- eeprom@50 {
- compatible = "atmel,24c04";
- pagesize = <16>;
- reg = <0x50>;
- };
-
hpa1: amp@60 {
compatible = "ti,tpa6130a2";
reg = <0x60>;
};
chosen {
- stdout-path = "&uart1:115200n8";
+ stdout-path = "serial0:115200n8";
};
memory@70000000 {
i2c1: i2c@21a0000 {
#address-cells = <1>;
#size-cells = <0>;
- compatible = "fs,imx6sll-i2c", "fsl,imx21-i2c";
+ compatible = "fsl,imx6sll-i2c", "fsl,imx21-i2c";
reg = <0x021a0000 0x4000>;
interrupts = <GIC_SPI 36 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX6SLL_CLK_I2C1>;
regulator-name = "enet_3v3";
regulator-min-microvolt = <3300000>;
regulator-max-microvolt = <3300000>;
- gpios = <&gpio2 6 GPIO_ACTIVE_LOW>;
+ gpio = <&gpio2 6 GPIO_ACTIVE_LOW>;
+ regulator-boot-on;
+ regulator-always-on;
};
reg_pcie_gpio: regulator-pcie-gpio {
phy-supply = <®_enet_3v3>;
phy-mode = "rgmii";
phy-handle = <ðphy1>;
+ phy-reset-gpios = <&gpio2 7 GPIO_ACTIVE_LOW>;
status = "okay";
mdio {
MX6SX_PAD_RGMII1_RD3__ENET1_RX_DATA_3 0x3081
MX6SX_PAD_RGMII1_RX_CTL__ENET1_RX_EN 0x3081
MX6SX_PAD_ENET2_RX_CLK__ENET2_REF_CLK_25M 0x91
+ /* phy reset */
+ MX6SX_PAD_ENET2_CRS__GPIO2_IO_7 0x10b0
>;
};
compatible = "regulator-fixed";
regulator-min-microvolt = <3300000>;
regulator-max-microvolt = <3300000>;
- clocks = <&clks IMX7D_CLKO2_ROOT_DIV>;
- clock-names = "slow";
regulator-name = "reg_wlan";
startup-delay-us = <70000>;
gpio = <&gpio4 21 GPIO_ACTIVE_HIGH>;
enable-active-high;
};
+
+ usdhc2_pwrseq: usdhc2_pwrseq {
+ compatible = "mmc-pwrseq-simple";
+ clocks = <&clks IMX7D_CLKO2_ROOT_DIV>;
+ clock-names = "ext_clock";
+ };
};
&adc1 {
bus-width = <4>;
non-removable;
vmmc-supply = <®_wlan>;
+ mmc-pwrseq = <&usdhc2_pwrseq>;
cap-power-off-card;
keep-power-in-suspend;
status = "okay";
regulator-min-microvolt = <1800000>;
regulator-max-microvolt = <1800000>;
};
+
+ usdhc2_pwrseq: usdhc2_pwrseq {
+ compatible = "mmc-pwrseq-simple";
+ clocks = <&clks IMX7D_CLKO2_ROOT_DIV>;
+ clock-names = "ext_clock";
+ };
+};
+
+&clks {
+ assigned-clocks = <&clks IMX7D_CLKO2_ROOT_SRC>,
+ <&clks IMX7D_CLKO2_ROOT_DIV>;
+ assigned-clock-parents = <&clks IMX7D_CKIL>;
+ assigned-clock-rates = <0>, <32768>;
};
&i2c4 {
&usdhc2 { /* Wifi SDIO */
pinctrl-names = "default";
- pinctrl-0 = <&pinctrl_usdhc2>;
+ pinctrl-0 = <&pinctrl_usdhc2 &pinctrl_wifi_clk>;
no-1-8-v;
non-removable;
keep-power-in-suspend;
wakeup-source;
vmmc-supply = <®_ap6212>;
+ mmc-pwrseq = <&usdhc2_pwrseq>;
status = "okay";
};
};
&iomuxc_lpsr {
+ pinctrl_wifi_clk: wificlkgrp {
+ fsl,pins = <
+ MX7D_PAD_LPSR_GPIO1_IO03__CCM_CLKO2 0x7d
+ >;
+ };
+
pinctrl_wdog: wdoggrp {
fsl,pins = <
MX7D_PAD_LPSR_GPIO1_IO00__WDOG1_WDOG_B 0x74
};
&mmc3 {
- interrupts-extended = <&intc 94 &omap3_pmx_core2 0x46>;
+ interrupts-extended = <&intc 94 &omap3_pmx_core 0x136>;
pinctrl-0 = <&mmc3_pins &wl127x_gpio>;
pinctrl-names = "default";
vmmc-supply = <&wl12xx_vmmc>;
* jumpering combinations for the long run.
*/
&mmc3 {
- interrupts-extended = <&intc 94 &omap3_pmx_core2 0x46>;
+ interrupts-extended = <&intc 94 &omap3_pmx_core 0x136>;
pinctrl-0 = <&mmc3_pins &mmc3_core2_pins>;
pinctrl-names = "default";
vmmc-supply = <&wl12xx_vmmc>;
#include "rk3288.dtsi"
/ {
- memory@0 {
+ /*
+ * The default coreboot on veyron devices ignores memory@0 nodes
+ * and would instead create another memory node.
+ */
+ memory {
device_type = "memory";
reg = <0x0 0x0 0x0 0x80000000>;
};
0x1 0x0 0x60000000 0x10000000
0x2 0x0 0x70000000 0x10000000
0x3 0x0 0x80000000 0x10000000>;
- clocks = <&mck>;
+ clocks = <&h32ck>;
status = "disabled";
nand_controller: nand-controller {
interrupts = <GIC_SPI 80 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&rcc HASH1>;
resets = <&rcc HASH1_R>;
- dmas = <&mdma1 31 0x10 0x1000A02 0x0 0x0 0x0>;
+ dmas = <&mdma1 31 0x10 0x1000A02 0x0 0x0>;
dma-names = "in";
dma-maxburst = <2>;
status = "disabled";
®_dldo3 {
regulator-always-on;
- regulator-min-microvolt = <2500000>;
- regulator-max-microvolt = <2500000>;
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
regulator-name = "vcc-pd";
};
compatible = "fsl,vf610m4";
chosen {
- bootargs = "console=ttyLP2,115200 clk_ignore_unused init=/linuxrc rw";
- stdout-path = "&uart2";
+ bootargs = "clk_ignore_unused init=/linuxrc rw";
+ stdout-path = "serial2:115200";
};
memory@8c000000 {
#include <linux/kernel.h>
extern unsigned int processor_id;
+struct proc_info_list *lookup_processor(u32 midr);
#ifdef CONFIG_CPU_CP15
#define read_cpuid(reg) \
#ifndef _ASM_PGTABLE_2LEVEL_H
#define _ASM_PGTABLE_2LEVEL_H
-#define __PAGETABLE_PMD_FOLDED
+#define __PAGETABLE_PMD_FOLDED 1
/*
* Hardware-wise, we have a two level page table structure, where the first
/*
* Don't change this structure - ASM code relies on it.
*/
-extern struct processor {
+struct processor {
/* MISC
* get data abort address/flags
*/
unsigned int suspend_size;
void (*do_suspend)(void *);
void (*do_resume)(void *);
-} processor;
+};
#ifndef MULTI_CPU
+static inline void init_proc_vtable(const struct processor *p)
+{
+}
+
extern void cpu_proc_init(void);
extern void cpu_proc_fin(void);
extern int cpu_do_idle(void);
extern void cpu_do_suspend(void *);
extern void cpu_do_resume(void *);
#else
-#define cpu_proc_init processor._proc_init
-#define cpu_proc_fin processor._proc_fin
-#define cpu_reset processor.reset
-#define cpu_do_idle processor._do_idle
-#define cpu_dcache_clean_area processor.dcache_clean_area
-#define cpu_set_pte_ext processor.set_pte_ext
-#define cpu_do_switch_mm processor.switch_mm
-/* These three are private to arch/arm/kernel/suspend.c */
-#define cpu_do_suspend processor.do_suspend
-#define cpu_do_resume processor.do_resume
+extern struct processor processor;
+#if defined(CONFIG_BIG_LITTLE) && defined(CONFIG_HARDEN_BRANCH_PREDICTOR)
+#include <linux/smp.h>
+/*
+ * This can't be a per-cpu variable because we need to access it before
+ * per-cpu has been initialised. We have a couple of functions that are
+ * called in a pre-emptible context, and so can't use smp_processor_id()
+ * there, hence PROC_TABLE(). We insist in init_proc_vtable() that the
+ * function pointers for these are identical across all CPUs.
+ */
+extern struct processor *cpu_vtable[];
+#define PROC_VTABLE(f) cpu_vtable[smp_processor_id()]->f
+#define PROC_TABLE(f) cpu_vtable[0]->f
+static inline void init_proc_vtable(const struct processor *p)
+{
+ unsigned int cpu = smp_processor_id();
+ *cpu_vtable[cpu] = *p;
+ WARN_ON_ONCE(cpu_vtable[cpu]->dcache_clean_area !=
+ cpu_vtable[0]->dcache_clean_area);
+ WARN_ON_ONCE(cpu_vtable[cpu]->set_pte_ext !=
+ cpu_vtable[0]->set_pte_ext);
+}
+#else
+#define PROC_VTABLE(f) processor.f
+#define PROC_TABLE(f) processor.f
+static inline void init_proc_vtable(const struct processor *p)
+{
+ processor = *p;
+}
+#endif
+
+#define cpu_proc_init PROC_VTABLE(_proc_init)
+#define cpu_check_bugs PROC_VTABLE(check_bugs)
+#define cpu_proc_fin PROC_VTABLE(_proc_fin)
+#define cpu_reset PROC_VTABLE(reset)
+#define cpu_do_idle PROC_VTABLE(_do_idle)
+#define cpu_dcache_clean_area PROC_TABLE(dcache_clean_area)
+#define cpu_set_pte_ext PROC_TABLE(set_pte_ext)
+#define cpu_do_switch_mm PROC_VTABLE(switch_mm)
+
+/* These two are private to arch/arm/kernel/suspend.c */
+#define cpu_do_suspend PROC_VTABLE(do_suspend)
+#define cpu_do_resume PROC_VTABLE(do_resume)
#endif
extern void cpu_resume(void);
void check_other_bugs(void)
{
#ifdef MULTI_CPU
- if (processor.check_bugs)
- processor.check_bugs();
+ if (cpu_check_bugs)
+ cpu_check_bugs();
#endif
}
unsigned long frame_pointer)
{
unsigned long return_hooker = (unsigned long) &return_to_handler;
- struct ftrace_graph_ent trace;
unsigned long old;
- int err;
if (unlikely(atomic_read(¤t->tracing_graph_pause)))
return;
old = *parent;
*parent = return_hooker;
- trace.func = self_addr;
- trace.depth = current->curr_ret_stack + 1;
-
- /* Only trace if the calling function expects to */
- if (!ftrace_graph_entry(&trace)) {
+ if (function_graph_enter(old, self_addr, frame_pointer, NULL))
*parent = old;
- return;
- }
-
- err = ftrace_push_return_trace(old, self_addr, &trace.depth,
- frame_pointer, NULL);
- if (err == -EBUSY) {
- *parent = old;
- return;
- }
}
#ifdef CONFIG_DYNAMIC_FTRACE
#endif
.size __mmap_switched_data, . - __mmap_switched_data
+ __FINIT
+ .text
+
/*
* This provides a C-API version of __lookup_processor_type
*/
ldmfd sp!, {r4 - r6, r9, pc}
ENDPROC(lookup_processor_type)
- __FINIT
- .text
-
/*
* Read processor ID register (CP#15, CR0), and look up in the linker-built
* supported processor list. Note that we can't use the absolute addresses
#ifdef MULTI_CPU
struct processor processor __ro_after_init;
+#if defined(CONFIG_BIG_LITTLE) && defined(CONFIG_HARDEN_BRANCH_PREDICTOR)
+struct processor *cpu_vtable[NR_CPUS] = {
+ [0] = &processor,
+};
+#endif
#endif
#ifdef MULTI_TLB
struct cpu_tlb_fns cpu_tlb __ro_after_init;
}
#endif
-static void __init setup_processor(void)
+/*
+ * locate processor in the list of supported processor types. The linker
+ * builds this table for us from the entries in arch/arm/mm/proc-*.S
+ */
+struct proc_info_list *lookup_processor(u32 midr)
{
- struct proc_info_list *list;
+ struct proc_info_list *list = lookup_processor_type(midr);
- /*
- * locate processor in the list of supported processor
- * types. The linker builds this table for us from the
- * entries in arch/arm/mm/proc-*.S
- */
- list = lookup_processor_type(read_cpuid_id());
if (!list) {
- pr_err("CPU configuration botched (ID %08x), unable to continue.\n",
- read_cpuid_id());
- while (1);
+ pr_err("CPU%u: configuration botched (ID %08x), CPU halted\n",
+ smp_processor_id(), midr);
+ while (1)
+ /* can't use cpu_relax() here as it may require MMU setup */;
}
+ return list;
+}
+
+static void __init setup_processor(void)
+{
+ unsigned int midr = read_cpuid_id();
+ struct proc_info_list *list = lookup_processor(midr);
+
cpu_name = list->cpu_name;
__cpu_architecture = __get_cpu_architecture();
-#ifdef MULTI_CPU
- processor = *list->proc;
-#endif
+ init_proc_vtable(list->proc);
#ifdef MULTI_TLB
cpu_tlb = *list->tlb;
#endif
#endif
pr_info("CPU: %s [%08x] revision %d (ARMv%s), cr=%08lx\n",
- cpu_name, read_cpuid_id(), read_cpuid_id() & 15,
+ list->cpu_name, midr, midr & 15,
proc_arch[cpu_architecture()], get_cr());
snprintf(init_utsname()->machine, __NEW_UTS_LEN + 1, "%s%c",
#include <asm/mmu_context.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
+#include <asm/procinfo.h>
#include <asm/processor.h>
#include <asm/sections.h>
#include <asm/tlbflush.h>
#endif
}
+#if defined(CONFIG_BIG_LITTLE) && defined(CONFIG_HARDEN_BRANCH_PREDICTOR)
+static int secondary_biglittle_prepare(unsigned int cpu)
+{
+ if (!cpu_vtable[cpu])
+ cpu_vtable[cpu] = kzalloc(sizeof(*cpu_vtable[cpu]), GFP_KERNEL);
+
+ return cpu_vtable[cpu] ? 0 : -ENOMEM;
+}
+
+static void secondary_biglittle_init(void)
+{
+ init_proc_vtable(lookup_processor(read_cpuid_id())->proc);
+}
+#else
+static int secondary_biglittle_prepare(unsigned int cpu)
+{
+ return 0;
+}
+
+static void secondary_biglittle_init(void)
+{
+}
+#endif
+
int __cpu_up(unsigned int cpu, struct task_struct *idle)
{
int ret;
if (!smp_ops.smp_boot_secondary)
return -ENOSYS;
+ ret = secondary_biglittle_prepare(cpu);
+ if (ret)
+ return ret;
+
/*
* We need to tell the secondary core where to find
* its stack and the page tables.
struct mm_struct *mm = &init_mm;
unsigned int cpu;
+ secondary_biglittle_init();
+
/*
* The identity mapping is uncached (strongly ordered), so
* switch away from it before attempting any exclusive accesses.
};
static struct davinci_gpio_platform_data da830_gpio_platform_data = {
- .ngpio = 128,
+ .no_auto_base = true,
+ .base = 0,
+ .ngpio = 128,
};
int __init da830_register_gpio(void)
}
static struct davinci_gpio_platform_data da850_gpio_platform_data = {
- .ngpio = 144,
+ .no_auto_base = true,
+ .base = 0,
+ .ngpio = 144,
};
int __init da850_register_gpio(void)
},
{ /* interrupt */
.start = IRQ_DA8XX_GPIO0,
+ .end = IRQ_DA8XX_GPIO0,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DA8XX_GPIO1,
+ .end = IRQ_DA8XX_GPIO1,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DA8XX_GPIO2,
+ .end = IRQ_DA8XX_GPIO2,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DA8XX_GPIO3,
+ .end = IRQ_DA8XX_GPIO3,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DA8XX_GPIO4,
+ .end = IRQ_DA8XX_GPIO4,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DA8XX_GPIO5,
+ .end = IRQ_DA8XX_GPIO5,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DA8XX_GPIO6,
+ .end = IRQ_DA8XX_GPIO6,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DA8XX_GPIO7,
+ .end = IRQ_DA8XX_GPIO7,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DA8XX_GPIO8,
.end = IRQ_DA8XX_GPIO8,
.flags = IORESOURCE_IRQ,
},
},
{ /* interrupt */
.start = IRQ_DM355_GPIOBNK0,
+ .end = IRQ_DM355_GPIOBNK0,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM355_GPIOBNK1,
+ .end = IRQ_DM355_GPIOBNK1,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM355_GPIOBNK2,
+ .end = IRQ_DM355_GPIOBNK2,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM355_GPIOBNK3,
+ .end = IRQ_DM355_GPIOBNK3,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM355_GPIOBNK4,
+ .end = IRQ_DM355_GPIOBNK4,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM355_GPIOBNK5,
+ .end = IRQ_DM355_GPIOBNK5,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM355_GPIOBNK6,
.end = IRQ_DM355_GPIOBNK6,
.flags = IORESOURCE_IRQ,
},
};
static struct davinci_gpio_platform_data dm355_gpio_platform_data = {
+ .no_auto_base = true,
+ .base = 0,
.ngpio = 104,
};
},
{ /* interrupt */
.start = IRQ_DM365_GPIO0,
+ .end = IRQ_DM365_GPIO0,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM365_GPIO1,
+ .end = IRQ_DM365_GPIO1,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM365_GPIO2,
+ .end = IRQ_DM365_GPIO2,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM365_GPIO3,
+ .end = IRQ_DM365_GPIO3,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM365_GPIO4,
+ .end = IRQ_DM365_GPIO4,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM365_GPIO5,
+ .end = IRQ_DM365_GPIO5,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM365_GPIO6,
+ .end = IRQ_DM365_GPIO6,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM365_GPIO7,
.end = IRQ_DM365_GPIO7,
.flags = IORESOURCE_IRQ,
},
};
static struct davinci_gpio_platform_data dm365_gpio_platform_data = {
+ .no_auto_base = true,
+ .base = 0,
.ngpio = 104,
.gpio_unbanked = 8,
};
},
{ /* interrupt */
.start = IRQ_GPIOBNK0,
+ .end = IRQ_GPIOBNK0,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_GPIOBNK1,
+ .end = IRQ_GPIOBNK1,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_GPIOBNK2,
+ .end = IRQ_GPIOBNK2,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_GPIOBNK3,
+ .end = IRQ_GPIOBNK3,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_GPIOBNK4,
.end = IRQ_GPIOBNK4,
.flags = IORESOURCE_IRQ,
},
};
static struct davinci_gpio_platform_data dm644_gpio_platform_data = {
+ .no_auto_base = true,
+ .base = 0,
.ngpio = 71,
};
},
{ /* interrupt */
.start = IRQ_DM646X_GPIOBNK0,
+ .end = IRQ_DM646X_GPIOBNK0,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM646X_GPIOBNK1,
+ .end = IRQ_DM646X_GPIOBNK1,
+ .flags = IORESOURCE_IRQ,
+ },
+ {
+ .start = IRQ_DM646X_GPIOBNK2,
.end = IRQ_DM646X_GPIOBNK2,
.flags = IORESOURCE_IRQ,
},
};
static struct davinci_gpio_platform_data dm646x_gpio_platform_data = {
+ .no_auto_base = true,
+ .base = 0,
.ngpio = 43,
};
* except for power up sw2iso which need to be
* larger than LDO ramp up time.
*/
- imx_gpc_set_arm_power_up_timing(2, 1);
+ imx_gpc_set_arm_power_up_timing(0xf, 1);
imx_gpc_set_arm_power_down_timing(1, 1);
return cpuidle_register(&imx6sx_cpuidle_driver, NULL);
#define cpu_is_pxa910() (0)
#endif
-#ifdef CONFIG_CPU_MMP2
+#if defined(CONFIG_CPU_MMP2) || defined(CONFIG_MACH_MMP2_DT)
static inline int cpu_is_mmp2(void)
{
- return (((read_cpuid_id() >> 8) & 0xff) == 0x58);
+ return (((read_cpuid_id() >> 8) & 0xff) == 0x58) &&
+ (((mmp_chip_id & 0xfff) == 0x410) ||
+ ((mmp_chip_id & 0xfff) == 0x610));
}
#else
#define cpu_is_mmp2() (0)
struct modem_private_data *priv = port->private_data;
int ret;
+ if (!priv)
+ return;
+
if (IS_ERR(priv->regulator))
return;
{
.membase = IOMEM(MODEM_VIRT),
.mapbase = MODEM_PHYS,
- .irq = -EINVAL, /* changed later */
+ .irq = IRQ_NOTCONNECTED, /* changed later */
.flags = UPF_BOOT_AUTOCONF,
.irqflags = IRQF_TRIGGER_RISING,
.iotype = UPIO_MEM,
/*
- * This function expects MODEM IRQ number already assigned to the port
- * and fails if it's not.
+ * This function expects MODEM IRQ number already assigned to the port.
* The MODEM device requires its RESET# pin kept high during probe.
* That requirement can be fulfilled in several ways:
* - with a descriptor of already functional modem_nreset regulator
if (!machine_is_ams_delta())
return -ENODEV;
- if (ams_delta_modem_ports[0].irq < 0)
- return ams_delta_modem_ports[0].irq;
-
omap_cfg_reg(M14_1510_GPIO2);
/* Initialize the modem_nreset regulator consumer before use */
return 0;
}
-#else
-static inline int omapdss_init_fbdev(void)
+
+static const char * const omapdss_compat_names[] __initconst = {
+ "ti,omap2-dss",
+ "ti,omap3-dss",
+ "ti,omap4-dss",
+ "ti,omap5-dss",
+ "ti,dra7-dss",
+};
+
+static struct device_node * __init omapdss_find_dss_of_node(void)
{
- return 0;
+ struct device_node *node;
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(omapdss_compat_names); ++i) {
+ node = of_find_compatible_node(NULL, NULL,
+ omapdss_compat_names[i]);
+ if (node)
+ return node;
+ }
+
+ return NULL;
}
+
+static int __init omapdss_init_of(void)
+{
+ int r;
+ struct device_node *node;
+ struct platform_device *pdev;
+
+ /* only create dss helper devices if dss is enabled in the .dts */
+
+ node = omapdss_find_dss_of_node();
+ if (!node)
+ return 0;
+
+ if (!of_device_is_available(node))
+ return 0;
+
+ pdev = of_find_device_by_node(node);
+
+ if (!pdev) {
+ pr_err("Unable to find DSS platform device\n");
+ return -ENODEV;
+ }
+
+ r = of_platform_populate(node, NULL, NULL, &pdev->dev);
+ if (r) {
+ pr_err("Unable to populate DSS submodule devices\n");
+ return r;
+ }
+
+ return omapdss_init_fbdev();
+}
+omap_device_initcall(omapdss_init_of);
#endif /* CONFIG_FB_OMAP2 */
static void dispc_disable_outputs(void)
return r;
}
-
-static const char * const omapdss_compat_names[] __initconst = {
- "ti,omap2-dss",
- "ti,omap3-dss",
- "ti,omap4-dss",
- "ti,omap5-dss",
- "ti,dra7-dss",
-};
-
-static struct device_node * __init omapdss_find_dss_of_node(void)
-{
- struct device_node *node;
- int i;
-
- for (i = 0; i < ARRAY_SIZE(omapdss_compat_names); ++i) {
- node = of_find_compatible_node(NULL, NULL,
- omapdss_compat_names[i]);
- if (node)
- return node;
- }
-
- return NULL;
-}
-
-static int __init omapdss_init_of(void)
-{
- int r;
- struct device_node *node;
- struct platform_device *pdev;
-
- /* only create dss helper devices if dss is enabled in the .dts */
-
- node = omapdss_find_dss_of_node();
- if (!node)
- return 0;
-
- if (!of_device_is_available(node))
- return 0;
-
- pdev = of_find_device_by_node(node);
-
- if (!pdev) {
- pr_err("Unable to find DSS platform device\n");
- return -ENODEV;
- }
-
- r = of_platform_populate(node, NULL, NULL, &pdev->dev);
- if (r) {
- pr_err("Unable to populate DSS submodule devices\n");
- return r;
- }
-
- return omapdss_init_fbdev();
-}
-omap_device_initcall(omapdss_init_of);
* to occur, WAKEUPENABLE bits must be set in the pad mux registers, and
* omap44xx_prm_reconfigure_io_chain() must be called. No return value.
*/
-static void __init omap44xx_prm_enable_io_wakeup(void)
+static void omap44xx_prm_enable_io_wakeup(void)
{
s32 inst = omap4_prmst_get_prm_dev_inst();
ALT_UP(W(nop))
#endif
mcrne p15, 0, r0, c7, c14, 1 @ clean & invalidate D / U line
+ addne r0, r0, r2
tst r1, r3
bic r1, r1, r3
mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D / U line
-1:
- mcr p15, 0, r0, c7, c6, 1 @ invalidate D / U line
- add r0, r0, r2
cmp r0, r1
+1:
+ mcrlo p15, 0, r0, c7, c6, 1 @ invalidate D / U line
+ addlo r0, r0, r2
+ cmplo r0, r1
blo 1b
dsb st
ret lr
/*
* dcimvac: Invalidate data cache line by MVA to PoC
*/
-.macro dcimvac, rt, tmp
- v7m_cacheop \rt, \tmp, V7M_SCB_DCIMVAC
+.irp c,,eq,ne,cs,cc,mi,pl,vs,vc,hi,ls,ge,lt,gt,le,hs,lo
+.macro dcimvac\c, rt, tmp
+ v7m_cacheop \rt, \tmp, V7M_SCB_DCIMVAC, \c
.endm
+.endr
/*
* dccmvau: Clean data cache line by MVA to PoU
tst r0, r3
bic r0, r0, r3
dccimvacne r0, r3
+ addne r0, r0, r2
subne r3, r2, #1 @ restore r3, corrupted by v7m's dccimvac
tst r1, r3
bic r1, r1, r3
dccimvacne r1, r3
-1:
- dcimvac r0, r3
- add r0, r0, r2
cmp r0, r1
+1:
+ dcimvaclo r0, r3
+ addlo r0, r0, r2
+ cmplo r0, r1
blo 1b
dsb st
ret lr
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs)
{
- int ret;
+ int ret = -ENXIO;
unsigned long nr_vma_pages = vma_pages(vma);
unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
unsigned long pfn = dma_to_pfn(dev, dma_addr);
.endm
.macro define_processor_functions name:req, dabort:req, pabort:req, nommu=0, suspend=0, bugs=0
+/*
+ * If we are building for big.Little with branch predictor hardening,
+ * we need the processor function tables to remain available after boot.
+ */
+#if 1 // defined(CONFIG_BIG_LITTLE) && defined(CONFIG_HARDEN_BRANCH_PREDICTOR)
+ .section ".rodata"
+#endif
.type \name\()_processor_functions, #object
.align 2
ENTRY(\name\()_processor_functions)
.endif
.size \name\()_processor_functions, . - \name\()_processor_functions
+#if 1 // defined(CONFIG_BIG_LITTLE) && defined(CONFIG_HARDEN_BRANCH_PREDICTOR)
+ .previous
+#endif
.endm
.macro define_cache_functions name:req
case ARM_CPU_PART_CORTEX_A17:
case ARM_CPU_PART_CORTEX_A73:
case ARM_CPU_PART_CORTEX_A75:
- if (processor.switch_mm != cpu_v7_bpiall_switch_mm)
- goto bl_error;
per_cpu(harden_branch_predictor_fn, cpu) =
harden_branch_predictor_bpiall;
spectre_v2_method = "BPIALL";
case ARM_CPU_PART_CORTEX_A15:
case ARM_CPU_PART_BRAHMA_B15:
- if (processor.switch_mm != cpu_v7_iciallu_switch_mm)
- goto bl_error;
per_cpu(harden_branch_predictor_fn, cpu) =
harden_branch_predictor_iciallu;
spectre_v2_method = "ICIALLU";
ARM_SMCCC_ARCH_WORKAROUND_1, &res);
if ((int)res.a0 != 0)
break;
- if (processor.switch_mm != cpu_v7_hvc_switch_mm && cpu)
- goto bl_error;
per_cpu(harden_branch_predictor_fn, cpu) =
call_hvc_arch_workaround_1;
- processor.switch_mm = cpu_v7_hvc_switch_mm;
+ cpu_do_switch_mm = cpu_v7_hvc_switch_mm;
spectre_v2_method = "hypervisor";
break;
ARM_SMCCC_ARCH_WORKAROUND_1, &res);
if ((int)res.a0 != 0)
break;
- if (processor.switch_mm != cpu_v7_smc_switch_mm && cpu)
- goto bl_error;
per_cpu(harden_branch_predictor_fn, cpu) =
call_smc_arch_workaround_1;
- processor.switch_mm = cpu_v7_smc_switch_mm;
+ cpu_do_switch_mm = cpu_v7_smc_switch_mm;
spectre_v2_method = "firmware";
break;
if (spectre_v2_method)
pr_info("CPU%u: Spectre v2: using %s workaround\n",
smp_processor_id(), spectre_v2_method);
- return;
-
-bl_error:
- pr_err("CPU%u: Spectre v2: incorrect context switching function, system vulnerable\n",
- cpu);
}
#else
static void cpu_v7_spectre_init(void)
hvc #0
ldmfd sp!, {r0 - r3}
b cpu_v7_switch_mm
-ENDPROC(cpu_v7_smc_switch_mm)
+ENDPROC(cpu_v7_hvc_switch_mm)
#endif
ENTRY(cpu_v7_iciallu_switch_mm)
mov r3, #0
unsigned int mpp_max, void __iomem *dev_bus)
{
unsigned int mpp_nr_regs = (1 + mpp_max/8);
- u32 mpp_ctrl[mpp_nr_regs];
+ u32 mpp_ctrl[8];
int i;
printk(KERN_DEBUG "initial MPP regs:");
+ if (mpp_nr_regs > ARRAY_SIZE(mpp_ctrl)) {
+ printk(KERN_ERR "orion_mpp_conf: invalid mpp_max\n");
+ return;
+ }
+
for (i = 0; i < mpp_nr_regs; i++) {
mpp_ctrl[i] = readl(mpp_ctrl_addr(i, dev_bus));
printk(" %08x", mpp_ctrl[i]);
}
/* Copy arch-dep-instance from template. */
- memcpy(code, &optprobe_template_entry,
+ memcpy(code, (unsigned char *)optprobe_template_entry,
TMPL_END_IDX * sizeof(kprobe_opcode_t));
/* Adjust buffer according to instruction. */
*/
ufp_exc->fpexc = hwstate->fpexc;
ufp_exc->fpinst = hwstate->fpinst;
- ufp_exc->fpinst2 = ufp_exc->fpinst2;
+ ufp_exc->fpinst2 = hwstate->fpinst2;
/* Ensure that VFP is disabled. */
vfp_flush_hwstate(thread);
If unsure, say Y.
+config ARM64_ERRATUM_1286807
+ bool "Cortex-A76: Modification of the translation table for a virtual address might lead to read-after-read ordering violation"
+ default y
+ select ARM64_WORKAROUND_REPEAT_TLBI
+ help
+ This option adds workaround for ARM Cortex-A76 erratum 1286807
+
+ On the affected Cortex-A76 cores (r0p0 to r3p0), if a virtual
+ address for a cacheable mapping of a location is being
+ accessed by a core while another core is remapping the virtual
+ address to a new physical page using the recommended
+ break-before-make sequence, then under very rare circumstances
+ TLBI+DSB completes before a read using the translation being
+ invalidated has been observed by other observers. The
+ workaround repeats the TLBI+DSB operation.
+
+ If unsure, say Y.
+
config CAVIUM_ERRATUM_22375
bool "Cavium erratum 22375, 24313"
default y
is unchanged. Work around the erratum by invalidating the walk cache
entries for the trampoline before entering the kernel proper.
+config ARM64_WORKAROUND_REPEAT_TLBI
+ bool
+ help
+ Enable the repeat TLBI workaround for Falkor erratum 1009 and
+ Cortex-A76 erratum 1286807.
+
config QCOM_FALKOR_ERRATUM_1009
bool "Falkor E1009: Prematurely complete a DSB after a TLBI"
default y
+ select ARM64_WORKAROUND_REPEAT_TLBI
help
On Falkor v1, the CPU may prematurely complete a DSB following a
TLBI xxIS invalidate maintenance operation. Repeat the TLBI operation
archclean:
$(Q)$(MAKE) $(clean)=$(boot)
+ifeq ($(KBUILD_EXTMOD),)
# We need to generate vdso-offsets.h before compiling certain files in kernel/.
# In order to do that, we should use the archprepare target, but we can't since
# asm-offsets.h is included in some files used to generate vdso-offsets.h, and
prepare: vdso_prepare
vdso_prepare: prepare0
$(Q)$(MAKE) $(build)=arch/arm64/kernel/vdso include/generated/vdso-offsets.h
+endif
define archhelp
echo '* Image.gz - Compressed kernel image (arch/$(ARCH)/boot/Image.gz)'
clock-names = "stmmaceth";
tx-fifo-depth = <16384>;
rx-fifo-depth = <16384>;
+ snps,multicast-filter-bins = <256>;
status = "disabled";
};
clock-names = "stmmaceth";
tx-fifo-depth = <16384>;
rx-fifo-depth = <16384>;
+ snps,multicast-filter-bins = <256>;
status = "disabled";
};
clock-names = "stmmaceth";
tx-fifo-depth = <16384>;
rx-fifo-depth = <16384>;
+ snps,multicast-filter-bins = <256>;
status = "disabled";
};
compatible = "arm,cortex-a72", "arm,armv8";
reg = <0x000>;
enable-method = "psci";
- cpu-idle-states = <&CPU_SLEEP_0>;
};
cpu1: cpu@1 {
device_type = "cpu";
compatible = "arm,cortex-a72", "arm,armv8";
reg = <0x001>;
enable-method = "psci";
- cpu-idle-states = <&CPU_SLEEP_0>;
};
cpu2: cpu@100 {
device_type = "cpu";
compatible = "arm,cortex-a72", "arm,armv8";
reg = <0x100>;
enable-method = "psci";
- cpu-idle-states = <&CPU_SLEEP_0>;
};
cpu3: cpu@101 {
device_type = "cpu";
compatible = "arm,cortex-a72", "arm,armv8";
reg = <0x101>;
enable-method = "psci";
- cpu-idle-states = <&CPU_SLEEP_0>;
};
};
};
method = "smc";
};
- cpus {
- #address-cells = <1>;
- #size-cells = <0>;
-
- idle_states {
- entry_method = "arm,pcsi";
-
- CPU_SLEEP_0: cpu-sleep-0 {
- compatible = "arm,idle-state";
- local-timer-stop;
- arm,psci-suspend-param = <0x0010000>;
- entry-latency-us = <80>;
- exit-latency-us = <160>;
- min-residency-us = <320>;
- };
-
- CLUSTER_SLEEP_0: cluster-sleep-0 {
- compatible = "arm,idle-state";
- local-timer-stop;
- arm,psci-suspend-param = <0x1010000>;
- entry-latency-us = <500>;
- exit-latency-us = <1000>;
- min-residency-us = <2500>;
- };
- };
- };
-
ap806 {
#address-cells = <2>;
#size-cells = <2>;
model = "Bananapi BPI-R64";
compatible = "bananapi,bpi-r64", "mediatek,mt7622";
+ aliases {
+ serial0 = &uart0;
+ };
+
chosen {
- bootargs = "earlycon=uart8250,mmio32,0x11002000 console=ttyS0,115200n1 swiotlb=512";
+ stdout-path = "serial0:115200n8";
+ bootargs = "earlycon=uart8250,mmio32,0x11002000 swiotlb=512";
};
cpus {
model = "MediaTek MT7622 RFB1 board";
compatible = "mediatek,mt7622-rfb1", "mediatek,mt7622";
+ aliases {
+ serial0 = &uart0;
+ };
+
chosen {
- bootargs = "earlycon=uart8250,mmio32,0x11002000 console=ttyS0,115200n1 swiotlb=512";
+ stdout-path = "serial0:115200n8";
+ bootargs = "earlycon=uart8250,mmio32,0x11002000 swiotlb=512";
};
cpus {
#reset-cells = <1>;
};
- timer: timer@10004000 {
- compatible = "mediatek,mt7622-timer",
- "mediatek,mt6577-timer";
- reg = <0 0x10004000 0 0x80>;
- interrupts = <GIC_SPI 152 IRQ_TYPE_LEVEL_LOW>;
- clocks = <&infracfg CLK_INFRA_APXGPT_PD>,
- <&topckgen CLK_TOP_RTC>;
- clock-names = "system-clk", "rtc-clk";
- };
-
scpsys: scpsys@10006000 {
compatible = "mediatek,mt7622-scpsys",
"syscon";
};
};
};
+
+&tlmm {
+ gpio-reserved-ranges = <0 4>, <81 4>;
+};
};
};
+&gcc {
+ protected-clocks = <GCC_QSPI_CORE_CLK>,
+ <GCC_QSPI_CORE_CLK_SRC>,
+ <GCC_QSPI_CNOC_PERIPH_AHB_CLK>;
+};
+
&i2c10 {
status = "okay";
clock-frequency = <400000>;
status = "okay";
};
+&tlmm {
+ gpio-reserved-ranges = <0 4>, <81 4>;
+};
+
&uart9 {
status = "okay";
};
clock-names = "fck", "brg_int", "scif_clk";
dmas = <&dmac1 0x35>, <&dmac1 0x34>,
<&dmac2 0x35>, <&dmac2 0x34>;
- dma-names = "tx", "rx";
+ dma-names = "tx", "rx", "tx", "rx";
power-domains = <&sysc R8A7795_PD_ALWAYS_ON>;
resets = <&cpg 518>;
status = "disabled";
aliases {
serial0 = &scif0;
- ethernet0 = &avb;
+ ethernet0 = &gether;
};
chosen {
};
};
-&avb {
- pinctrl-0 = <&avb_pins>;
- pinctrl-names = "default";
-
- phy-mode = "rgmii-id";
- phy-handle = <&phy0>;
- renesas,no-ether-link;
- status = "okay";
-
- phy0: ethernet-phy@0 {
- rxc-skew-ps = <1500>;
- reg = <0>;
- interrupt-parent = <&gpio1>;
- interrupts = <17 IRQ_TYPE_LEVEL_LOW>;
- };
-};
-
&canfd {
pinctrl-0 = <&canfd0_pins>;
pinctrl-names = "default";
clock-frequency = <32768>;
};
+&gether {
+ pinctrl-0 = <&gether_pins>;
+ pinctrl-names = "default";
+
+ phy-mode = "rgmii-id";
+ phy-handle = <&phy0>;
+ renesas,no-ether-link;
+ status = "okay";
+
+ phy0: ethernet-phy@0 {
+ rxc-skew-ps = <1500>;
+ reg = <0>;
+ interrupt-parent = <&gpio4>;
+ interrupts = <23 IRQ_TYPE_LEVEL_LOW>;
+ };
+};
+
&i2c0 {
pinctrl-0 = <&i2c0_pins>;
pinctrl-names = "default";
};
&pfc {
- avb_pins: avb {
- groups = "avb_mdio", "avb_rgmii";
- function = "avb";
- };
-
canfd0_pins: canfd0 {
groups = "canfd0_data_a";
function = "canfd0";
};
+ gether_pins: gether {
+ groups = "gether_mdio_a", "gether_rgmii",
+ "gether_txcrefclk", "gether_txcrefclk_mega";
+ function = "gether";
+ };
+
i2c0_pins: i2c0 {
groups = "i2c0";
function = "i2c0";
};
&pcie0 {
- ep-gpios = <&gpio4 RK_PC6 GPIO_ACTIVE_LOW>;
+ ep-gpios = <&gpio4 RK_PC6 GPIO_ACTIVE_HIGH>;
num-lanes = <4>;
pinctrl-names = "default";
pinctrl-0 = <&pcie_clkreqn_cpm>;
regulator-always-on;
vin-supply = <&vcc_sys>;
};
-
- vdd_log: vdd-log {
- compatible = "pwm-regulator";
- pwms = <&pwm2 0 25000 0>;
- regulator-name = "vdd_log";
- regulator-min-microvolt = <800000>;
- regulator-max-microvolt = <1400000>;
- regulator-always-on;
- regulator-boot-on;
- vin-supply = <&vcc_sys>;
- };
-
};
&cpu_l0 {
wkup_uart0: serial@42300000 {
compatible = "ti,am654-uart";
- reg = <0x00 0x42300000 0x00 0x100>;
+ reg = <0x42300000 0x100>;
reg-shift = <2>;
reg-io-width = <4>;
interrupts = <GIC_SPI 697 IRQ_TYPE_LEVEL_HIGH>;
CONFIG_SERIAL_MVEBU_UART=y
CONFIG_SERIAL_DEV_BUS=y
CONFIG_VIRTIO_CONSOLE=y
+CONFIG_IPMI_HANDLER=m
+CONFIG_IPMI_DEVICE_INTERFACE=m
+CONFIG_IPMI_SI=m
CONFIG_TCG_TPM=y
CONFIG_TCG_TIS_I2C_INFINEON=y
CONFIG_I2C_CHARDEV=y
{
return is_compat_task();
}
+
+#define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
+
+static inline bool arch_syscall_match_sym_name(const char *sym,
+ const char *name)
+{
+ /*
+ * Since all syscall functions have __arm64_ prefix, we must skip it.
+ * However, as we described above, we decided to ignore compat
+ * syscalls, so we don't care about __arm64_compat_ prefix here.
+ */
+ return !strcmp(sym + 8, name);
+}
#endif /* ifndef __ASSEMBLY__ */
#endif /* __ASM_FTRACE_H */
: [val] "Ir" (val)); \
break; \
default: \
+ ret = 0; \
BUILD_BUG(); \
} \
\
ret = READ_ONCE(*(u64 *)ptr);
break;
default:
+ ret = 0;
BUILD_BUG();
}
: [val] "r" (val));
break;
default:
+ ret = 0;
BUILD_BUG();
}
#define KERNEL_DS UL(-1)
#define USER_DS (TASK_SIZE_64 - 1)
+/*
+ * On arm64 systems, unaligned accesses by the CPU are cheap, and so there is
+ * no point in shifting all network buffers by 2 bytes just to make some IP
+ * header fields appear aligned in memory, potentially sacrificing some DMA
+ * performance on some platforms.
+ */
+#define NET_IP_ALIGN 0
+
#ifndef __ASSEMBLY__
#ifdef __KERNEL__
SCTLR_ELx_SA | SCTLR_ELx_I | SCTLR_ELx_WXN | \
SCTLR_ELx_DSSBS | ENDIAN_CLEAR_EL2 | SCTLR_EL2_RES0)
-#if (SCTLR_EL2_SET ^ SCTLR_EL2_CLEAR) != 0xffffffffffffffff
+#if (SCTLR_EL2_SET ^ SCTLR_EL2_CLEAR) != 0xffffffffffffffffUL
#error "Inconsistent SCTLR_EL2 set/clear bits"
#endif
SCTLR_EL1_UMA | SCTLR_ELx_WXN | ENDIAN_CLEAR_EL1 |\
SCTLR_ELx_DSSBS | SCTLR_EL1_NTWI | SCTLR_EL1_RES0)
-#if (SCTLR_EL1_SET ^ SCTLR_EL1_CLEAR) != 0xffffffffffffffff
+#if (SCTLR_EL1_SET ^ SCTLR_EL1_CLEAR) != 0xffffffffffffffffUL
#error "Inconsistent SCTLR_EL1 set/clear bits"
#endif
ALTERNATIVE("nop\n nop", \
"dsb ish\n tlbi " #op, \
ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_QCOM_FALKOR_ERRATUM_1009) \
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
: : )
#define __TLBI_1(op, arg) asm ("tlbi " #op ", %0\n" \
ALTERNATIVE("nop\n nop", \
"dsb ish\n tlbi " #op ", %0", \
ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_QCOM_FALKOR_ERRATUM_1009) \
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
: : "r" (arg))
#define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
#endif
+#ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI
+
+static const struct midr_range arm64_repeat_tlbi_cpus[] = {
+#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1009
+ MIDR_RANGE(MIDR_QCOM_FALKOR_V1, 0, 0, 0, 0),
+#endif
+#ifdef CONFIG_ARM64_ERRATUM_1286807
+ MIDR_RANGE(MIDR_CORTEX_A76, 0, 0, 3, 0),
+#endif
+ {},
+};
+
+#endif
+
const struct arm64_cpu_capabilities arm64_errata[] = {
#if defined(CONFIG_ARM64_ERRATUM_826319) || \
defined(CONFIG_ARM64_ERRATUM_827319) || \
.matches = is_kryo_midr,
},
#endif
-#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1009
+#ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI
{
- .desc = "Qualcomm Technologies Falkor erratum 1009",
+ .desc = "Qualcomm erratum 1009, ARM erratum 1286807",
.capability = ARM64_WORKAROUND_REPEAT_TLBI,
- ERRATA_MIDR_REV(MIDR_QCOM_FALKOR_V1, 0, 0),
+ ERRATA_MIDR_RANGE_LIST(arm64_repeat_tlbi_cpus),
},
#endif
#ifdef CONFIG_ARM64_ERRATUM_858921
.cpu_enable = cpu_enable_hw_dbm,
},
#endif
-#ifdef CONFIG_ARM64_SSBD
{
.desc = "CRC32 instructions",
.capability = ARM64_HAS_CRC32,
.field_pos = ID_AA64ISAR0_CRC32_SHIFT,
.min_field_value = 1,
},
+#ifdef CONFIG_ARM64_SSBD
{
.desc = "Speculative Store Bypassing Safe (SSBS)",
.capability = ARM64_SSBS,
/**
* elfcorehdr_read - read from ELF core header
* @buf: buffer where the data is placed
- * @csize: number of bytes to read
+ * @count: number of bytes to read
* @ppos: address in the memory
*
* This function reads @count bytes from elf core header which exists
{
unsigned long return_hooker = (unsigned long)&return_to_handler;
unsigned long old;
- struct ftrace_graph_ent trace;
- int err;
if (unlikely(atomic_read(¤t->tracing_graph_pause)))
return;
*/
old = *parent;
- trace.func = self_addr;
- trace.depth = current->curr_ret_stack + 1;
-
- /* Only trace if the calling function expects to */
- if (!ftrace_graph_entry(&trace))
- return;
-
- err = ftrace_push_return_trace(old, self_addr, &trace.depth,
- frame_pointer, NULL);
- if (err == -EBUSY)
- return;
- else
+ if (!function_graph_enter(old, self_addr, frame_pointer, NULL))
*parent = return_hooker;
}
}
memcpy((void *)dst, src_start, length);
- flush_icache_range(dst, dst + length);
+ __flush_icache_range(dst, dst + length);
pgdp = pgd_offset_raw(allocator(mask), dst_addr);
if (pgd_none(READ_ONCE(*pgdp))) {
#include <linux/slab.h>
#include <linux/stop_machine.h>
#include <linux/sched/debug.h>
+#include <linux/set_memory.h>
#include <linux/stringify.h>
+#include <linux/vmalloc.h>
#include <asm/traps.h>
#include <asm/ptrace.h>
#include <asm/cacheflush.h>
static void __kprobes
post_kprobe_handler(struct kprobe_ctlblk *, struct pt_regs *);
+static int __kprobes patch_text(kprobe_opcode_t *addr, u32 opcode)
+{
+ void *addrs[1];
+ u32 insns[1];
+
+ addrs[0] = addr;
+ insns[0] = opcode;
+
+ return aarch64_insn_patch_text(addrs, insns, 1);
+}
+
static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
{
/* prepare insn slot */
- p->ainsn.api.insn[0] = cpu_to_le32(p->opcode);
+ patch_text(p->ainsn.api.insn, p->opcode);
flush_icache_range((uintptr_t) (p->ainsn.api.insn),
(uintptr_t) (p->ainsn.api.insn) +
return 0;
}
-static int __kprobes patch_text(kprobe_opcode_t *addr, u32 opcode)
+void *alloc_insn_page(void)
{
- void *addrs[1];
- u32 insns[1];
+ void *page;
- addrs[0] = (void *)addr;
- insns[0] = (u32)opcode;
+ page = vmalloc_exec(PAGE_SIZE);
+ if (page)
+ set_memory_ro((unsigned long)page, 1);
- return aarch64_insn_patch_text(addrs, insns, 1);
+ return page;
}
/* arm kprobe: install breakpoint in text */
{
current->mm->context.flags = is_compat_task() ? MMCF_AARCH32 : 0;
}
-
-#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
-void __used stackleak_check_alloca(unsigned long size)
-{
- unsigned long stack_left;
- unsigned long current_sp = current_stack_pointer;
- struct stack_info info;
-
- BUG_ON(!on_accessible_stack(current, current_sp, &info));
-
- stack_left = current_sp - info.low;
-
- /*
- * There's a good chance we're almost out of stack space if this
- * is true. Using panic() over BUG() is more likely to give
- * reliable debugging output.
- */
- if (size >= stack_left)
- panic("alloca() over the kernel stack boundary\n");
-}
-EXPORT_SYMBOL(stackleak_check_alloca);
-#endif
arm64_memblock_init();
paging_init();
+ efi_apply_persistent_mem_reservations();
acpi_table_upgrade();
__dma_unmap_area(phys_to_virt(paddr), size, dir);
}
+#ifdef CONFIG_IOMMU_DMA
static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
struct page *page, size_t size)
{
return ret;
}
+#endif /* CONFIG_IOMMU_DMA */
static int __init atomic_pool_init(void)
{
high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
dma_contiguous_reserve(arm64_dma_phys_limit);
-
- memblock_allow_resize();
}
void __init bootmem_init(void)
memblock_free(__pa_symbol(init_pg_dir),
__pa_symbol(init_pg_end) - __pa_symbol(init_pg_dir));
+
+ memblock_allow_resize();
}
/*
* >0 - successfully JITed a 16-byte eBPF instruction.
* <0 - failed to JIT.
*/
-static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
+static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
+ bool extra_pass)
{
const u8 code = insn->code;
const u8 dst = bpf2a64[insn->dst_reg];
case BPF_JMP | BPF_CALL:
{
const u8 r0 = bpf2a64[BPF_REG_0];
- const u64 func = (u64)__bpf_call_base + imm;
+ bool func_addr_fixed;
+ u64 func_addr;
+ int ret;
- if (ctx->prog->is_func)
- emit_addr_mov_i64(tmp, func, ctx);
+ ret = bpf_jit_get_func_addr(ctx->prog, insn, extra_pass,
+ &func_addr, &func_addr_fixed);
+ if (ret < 0)
+ return ret;
+ if (func_addr_fixed)
+ /* We can use optimized emission here. */
+ emit_a64_mov_i64(tmp, func_addr, ctx);
else
- emit_a64_mov_i64(tmp, func, ctx);
+ emit_addr_mov_i64(tmp, func_addr, ctx);
emit(A64_BLR(tmp), ctx);
emit(A64_MOV(1, r0, A64_R(0)), ctx);
break;
return 0;
}
-static int build_body(struct jit_ctx *ctx)
+static int build_body(struct jit_ctx *ctx, bool extra_pass)
{
const struct bpf_prog *prog = ctx->prog;
int i;
const struct bpf_insn *insn = &prog->insnsi[i];
int ret;
- ret = build_insn(insn, ctx);
+ ret = build_insn(insn, ctx, extra_pass);
if (ret > 0) {
i++;
if (ctx->image == NULL)
/* 1. Initial fake pass to compute ctx->idx. */
/* Fake pass to fill in ctx->offset. */
- if (build_body(&ctx)) {
+ if (build_body(&ctx, extra_pass)) {
prog = orig_prog;
goto out_off;
}
build_prologue(&ctx, was_classic);
- if (build_body(&ctx)) {
+ if (build_body(&ctx, extra_pass)) {
bpf_jit_binary_free(header);
prog = orig_prog;
goto out_off;
-menu "C-SKY Debug Options"
-config CSKY_BUILTIN_DTB
- string "Use kernel builtin dtb"
- help
- User could define the dtb instead of the one which is passed from
- bootloader.
- Sometimes for debug, we want to use a built-in dtb and then we needn't
- modify bootloader at all.
-endmenu
+# dummy file, do not delete
$(shell $(CC) $(KBUILD_CFLAGS) $(KCFLAGS) -print-libgcc-file-name)
boot := arch/csky/boot
-ifneq '$(CONFIG_CSKY_BUILTIN_DTB)' '""'
core-y += $(boot)/dts/
-endif
all: zImage
-
-dtbs: scripts
- $(Q)$(MAKE) $(build)=$(boot)/dts
-
-%.dtb %.dtb.S %.dtb.o: scripts
- $(Q)$(MAKE) $(build)=$(boot)/dts $(boot)/dts/$@
-
-zImage Image uImage: vmlinux dtbs
+zImage Image uImage: vmlinux
$(Q)$(MAKE) $(build)=$(boot) $(boot)/$@
archclean:
$(Q)$(MAKE) $(clean)=$(boot)
- $(Q)$(MAKE) $(clean)=$(boot)/dts
- rm -rf arch/csky/include/generated
define archhelp
echo '* zImage - Compressed kernel image (arch/$(ARCH)/boot/zImage)'
dtstree := $(srctree)/$(src)
-ifneq '$(CONFIG_CSKY_BUILTIN_DTB)' '""'
-builtindtb-y := $(patsubst "%",%,$(CONFIG_CSKY_BUILTIN_DTB))
-dtb-y += $(builtindtb-y).dtb
-obj-y += $(builtindtb-y).dtb.o
-.SECONDARY: $(obj)/$(builtindtb-y).dtb.S
-else
dtb-y := $(patsubst $(dtstree)/%.dts,%.dtb, $(wildcard $(dtstree)/*.dts))
-endif
-
-always += $(dtb-y)
-clean-files += *.dtb *.dtb.S
static inline void tlbmiss_handler_setup_pgd(unsigned long pgd, bool kernel)
{
- pgd &= ~(1<<31);
+ pgd -= PAGE_OFFSET;
pgd += PHYS_OFFSET;
pgd |= 1;
setup_pgd(pgd, kernel);
static inline unsigned long tlb_get_pgd(void)
{
- return ((get_pgd()|(1<<31)) - PHYS_OFFSET) & ~1;
+ return ((get_pgd() - PHYS_OFFSET) & ~1) + PAGE_OFFSET;
}
#define cpu_context(cpu, mm) ((mm)->context.asid[cpu])
*/
extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
-#define node_distance(from,to) (numa_slit[(from) * MAX_NUMNODES + (to)])
+#define slit_distance(from,to) (numa_slit[(from) * MAX_NUMNODES + (to)])
+extern int __node_distance(int from, int to);
+#define node_distance(from,to) __node_distance(from, to)
extern int paddr_to_nid(unsigned long paddr);
if (!slit_table) {
for (i = 0; i < MAX_NUMNODES; i++)
for (j = 0; j < MAX_NUMNODES; j++)
- node_distance(i, j) = i == j ? LOCAL_DISTANCE :
- REMOTE_DISTANCE;
+ slit_distance(i, j) = i == j ?
+ LOCAL_DISTANCE : REMOTE_DISTANCE;
return;
}
if (!pxm_bit_test(j))
continue;
node_to = pxm_to_node(j);
- node_distance(node_from, node_to) =
+ slit_distance(node_from, node_to) =
slit_table->entry[i * slit_table->locality_count + j];
}
}
*/
u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
+int __node_distance(int from, int to)
+{
+ return slit_distance(from, to);
+}
+EXPORT_SYMBOL(__node_distance);
+
/* Identify which cnode a physical address resides on */
int
paddr_to_nid(unsigned long paddr)
*/
#ifdef CONFIG_SUN3
#define PTRS_PER_PTE 16
-#define __PAGETABLE_PMD_FOLDED
+#define __PAGETABLE_PMD_FOLDED 1
#define PTRS_PER_PMD 1
#define PTRS_PER_PGD 2048
#elif defined(CONFIG_COLDFIRE)
#define PTRS_PER_PTE 512
-#define __PAGETABLE_PMD_FOLDED
+#define __PAGETABLE_PMD_FOLDED 1
#define PTRS_PER_PMD 1
#define PTRS_PER_PGD 1024
#else
#include <asm-generic/4level-fixup.h>
-#define __PAGETABLE_PMD_FOLDED
+#define __PAGETABLE_PMD_FOLDED 1
#ifdef __KERNEL__
#ifndef __ASSEMBLY__
void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
{
unsigned long old;
- int faulted, err;
- struct ftrace_graph_ent trace;
+ int faulted;
unsigned long return_hooker = (unsigned long)
&return_to_handler;
return;
}
- err = ftrace_push_return_trace(old, self_addr, &trace.depth, 0, NULL);
- if (err == -EBUSY) {
+ if (function_graph_enter(old, self_addr, 0, NULL))
*parent = old;
- return;
- }
-
- trace.func = self_addr;
- /* Only trace if the calling function expects to */
- if (!ftrace_graph_entry(&trace)) {
- current->curr_ret_stack--;
- *parent = old;
- }
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
# clang's output will be based upon the build machine. So for clang we simply
# unconditionally specify -EB or -EL as appropriate.
#
-ifeq ($(cc-name),clang)
+ifdef CONFIG_CC_IS_CLANG
cflags-$(CONFIG_CPU_BIG_ENDIAN) += -EB
cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -EL
else
void (*cvmx_override_ipd_port_setup) (int ipd_port);
/* Port count per interface */
-static int interface_port_count[5];
+static int interface_port_count[9];
/**
* Return the number of interfaces the chip has. Each interface
CONFIG_RTC_DRV_DS1307=y
CONFIG_STAGING=y
CONFIG_OCTEON_ETHERNET=y
+CONFIG_OCTEON_USB=y
# CONFIG_IOMMU_SUPPORT is not set
CONFIG_RAS=y
CONFIG_EXT4_FS=y
#ifdef CONFIG_64BIT
case 4: case 5: case 6: case 7:
#ifdef CONFIG_MIPS32_O32
- if (test_thread_flag(TIF_32BIT_REGS))
+ if (test_tsk_thread_flag(task, TIF_32BIT_REGS))
return get_user(*arg, (int *)usp + n);
else
#endif
unsigned long fp)
{
unsigned long old_parent_ra;
- struct ftrace_graph_ent trace;
unsigned long return_hooker = (unsigned long)
&return_to_handler;
int faulted, insns;
if (unlikely(faulted))
goto out;
- if (ftrace_push_return_trace(old_parent_ra, self_ra, &trace.depth, fp,
- NULL) == -EBUSY) {
- *parent_ra_addr = old_parent_ra;
- return;
- }
-
/*
* Get the recorded ip of the current mcount calling site in the
* __mcount_loc section, which will be used to filter the function
*/
insns = core_kernel_text(self_ra) ? 2 : MCOUNT_OFFSET_INSNS + 1;
- trace.func = self_ra - (MCOUNT_INSN_SIZE * insns);
+ self_ra -= (MCOUNT_INSN_SIZE * insns);
- /* Only trace if the calling function expects to */
- if (!ftrace_graph_entry(&trace)) {
- current->curr_ret_stack--;
+ if (function_graph_enter(old_parent_ra, self_ra, fp, NULL))
*parent_ra_addr = old_parent_ra;
- }
return;
out:
ftrace_graph_stop();
/* call board setup routine */
plat_mem_setup();
+ memblock_set_bottom_up(true);
/*
* Make sure all kernel memory is in the maps. The "UP" and
unsigned long size = 0x200 + VECTORSPACING*64;
phys_addr_t ebase_pa;
- memblock_set_bottom_up(true);
ebase = (unsigned long)
memblock_alloc_from(size, 1 << fls(size), 0);
- memblock_set_bottom_up(false);
/*
* Try to ensure ebase resides in KSeg0 if possible.
if (board_ebase_setup)
board_ebase_setup();
per_cpu_trap_init(true);
+ memblock_set_bottom_up(false);
/*
* Copy the generic exception handlers to their final destination.
cpumask_clear(&__node_data[(node)]->cpumask);
}
}
+ max_low_pfn = PHYS_PFN(memblock_end_of_DRAM());
+
for (cpu = 0; cpu < loongson_sysconf.nr_cpus; cpu++) {
node = cpu / loongson_sysconf.cores_per_node;
if (node >= num_online_nodes())
void __init paging_init(void)
{
- unsigned node;
unsigned long zones_size[MAX_NR_ZONES] = {0, };
pagetable_init();
-
- for_each_online_node(node) {
- unsigned long start_pfn, end_pfn;
-
- get_pfn_range_for_nid(node, &start_pfn, &end_pfn);
-
- if (end_pfn > max_low_pfn)
- max_low_pfn = end_pfn;
- }
#ifdef CONFIG_ZONE_DMA32
zones_size[ZONE_DMA32] = MAX_DMA32_PFN;
#endif
void *ret;
ret = dma_direct_alloc_pages(dev, size, dma_handle, gfp, attrs);
- if (!ret && !(attrs & DMA_ATTR_NON_CONSISTENT)) {
+ if (ret && !(attrs & DMA_ATTR_NON_CONSISTENT)) {
dma_cache_wback_inv((unsigned long) ret, size);
ret = (void *)UNCAC_ADDR(ret);
}
};
static struct rt2880_pmx_func nd_sd_grp[] = {
FUNC("nand", MT7620_GPIO_MODE_NAND, 45, 15),
- FUNC("sd", MT7620_GPIO_MODE_SD, 45, 15)
+ FUNC("sd", MT7620_GPIO_MODE_SD, 47, 13)
};
static struct rt2880_pmx_group mt7620a_pinmux_data[] = {
mlreset();
szmem();
+ max_low_pfn = PHYS_PFN(memblock_end_of_DRAM());
for (node = 0; node < MAX_COMPACT_NODES; node++) {
if (node_online(node)) {
void __init paging_init(void)
{
unsigned long zones_size[MAX_NR_ZONES] = {0, };
- unsigned node;
pagetable_init();
-
- for_each_online_node(node) {
- unsigned long start_pfn, end_pfn;
-
- get_pfn_range_for_nid(node, &start_pfn, &end_pfn);
-
- if (end_pfn > max_low_pfn)
- max_low_pfn = end_pfn;
- }
zones_size[ZONE_NORMAL] = max_low_pfn;
free_area_init_nodes(zones_size);
}
$(filter -march=%,$(KBUILD_CFLAGS)) \
-D__VDSO__
-ifeq ($(cc-name),clang)
+ifdef CONFIG_CC_IS_CLANG
ccflags-vdso += $(filter --target=%,$(KBUILD_CFLAGS))
endif
#ifndef _ASMNDS32_PGTABLE_H
#define _ASMNDS32_PGTABLE_H
-#define __PAGETABLE_PMD_FOLDED
+#define __PAGETABLE_PMD_FOLDED 1
#include <asm-generic/4level-fixup.h>
#include <asm-generic/sizes.h>
unsigned long frame_pointer)
{
unsigned long return_hooker = (unsigned long)&return_to_handler;
- struct ftrace_graph_ent trace;
unsigned long old;
- int err;
if (unlikely(atomic_read(¤t->tracing_graph_pause)))
return;
old = *parent;
- trace.func = self_addr;
- trace.depth = current->curr_ret_stack + 1;
-
- /* Only trace if the calling function expects to */
- if (!ftrace_graph_entry(&trace))
- return;
-
- err = ftrace_push_return_trace(old, self_addr, &trace.depth,
- frame_pointer, NULL);
-
- if (err == -EBUSY)
- return;
-
- *parent = return_hooker;
+ if (!function_graph_enter(old, self_addr, frame_pointer, NULL))
+ *parent = return_hooker;
}
noinline void ftrace_graph_caller(void)
KBUILD_CFLAGS_KERNEL += -mlong-calls
endif
+# Without this, "ld -r" results in .text sections that are too big (> 0x40000)
+# for branches to reach stubs. And multiple .text sections trigger a warning
+# when creating the sysfs module information section.
+ifndef CONFIG_64BIT
+KBUILD_CFLAGS_MODULE += -ffunction-sections
+endif
+
# select which processor to optimise for
cflags-$(CONFIG_PA7000) += -march=1.1 -mschedule=7100
cflags-$(CONFIG_PA7200) += -march=1.1 -mschedule=7200
#if CONFIG_PGTABLE_LEVELS == 3
#define BITS_PER_PMD (PAGE_SHIFT + PMD_ORDER - BITS_PER_PMD_ENTRY)
#else
-#define __PAGETABLE_PMD_FOLDED
+#define __PAGETABLE_PMD_FOLDED 1
#define BITS_PER_PMD 0
#endif
#define PTRS_PER_PMD (1UL << BITS_PER_PMD)
volatile unsigned int *a;
a = __ldcw_align(x);
- /* Release with ordered store. */
- __asm__ __volatile__("stw,ma %0,0(%1)" : : "r"(1), "r"(a) : "memory");
+ mb();
+ *a = 1;
}
static inline int arch_spin_trylock(arch_spinlock_t *x)
unsigned long self_addr)
{
unsigned long old;
- struct ftrace_graph_ent trace;
extern int parisc_return_to_handler;
if (unlikely(ftrace_graph_is_dead()))
old = *parent;
- trace.func = self_addr;
- trace.depth = current->curr_ret_stack + 1;
-
- /* Only trace if the calling function expects to */
- if (!ftrace_graph_entry(&trace))
- return;
-
- if (ftrace_push_return_trace(old, self_addr, &trace.depth,
- 0, NULL) == -EBUSY)
- return;
-
- /* activate parisc_return_to_handler() as return point */
- *parent = (unsigned long) &parisc_return_to_handler;
+ if (!function_graph_enter(old, self_addr, 0, NULL))
+ /* activate parisc_return_to_handler() as return point */
+ *parent = (unsigned long) &parisc_return_to_handler;
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
sub,<> %r28, %r25, %r0
2: stw %r24, 0(%r26)
/* Free lock */
- stw,ma %r20, 0(%sr2,%r20)
+ sync
+ stw %r20, 0(%sr2,%r20)
#if ENABLE_LWS_DEBUG
/* Clear thread register indicator */
stw %r0, 4(%sr2,%r20)
3:
/* Error occurred on load or store */
/* Free lock */
- stw,ma %r20, 0(%sr2,%r20)
+ sync
+ stw %r20, 0(%sr2,%r20)
#if ENABLE_LWS_DEBUG
stw %r0, 4(%sr2,%r20)
#endif
cas2_end:
/* Free lock */
- stw,ma %r20, 0(%sr2,%r20)
+ sync
+ stw %r20, 0(%sr2,%r20)
/* Enable interrupts */
ssm PSW_SM_I, %r0
/* Return to userspace, set no error */
22:
/* Error occurred on load or store */
/* Free lock */
- stw,ma %r20, 0(%sr2,%r20)
+ sync
+ stw %r20, 0(%sr2,%r20)
ssm PSW_SM_I, %r0
ldo 1(%r0),%r28
b lws_exit
help
Freescale General-purpose Timers support
-# Yes MCA RS/6000s exist but Linux-PPC does not currently support any
-config MCA
- bool
-
# Platforms that what PCI turned unconditionally just do select PCI
# in their config node. Platforms that want to choose at config
# time should select PPC_PCI_CHOICE
bool "PCI support" if PPC_PCI_CHOICE
default y if !40x && !CPM2 && !PPC_8xx && !PPC_83xx \
&& !PPC_85xx && !PPC_86xx && !GAMECUBE_COMMON
- default PCI_QSPAN if PPC_8xx
select GENERIC_PCI_IOMAP
help
Find out whether your system includes a PCI bus. PCI is the name of
config PCI_SYSCALL
def_bool PCI
-config PCI_QSPAN
- bool "QSpan PCI"
- depends on PPC_8xx
- select PPC_I8259
- help
- Say Y here if you have a system based on a Motorola 8xx-series
- embedded processor with a QSPAN PCI interface, otherwise say N.
-
config PCI_8260
bool
depends on PCI && 8260
aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mabi=elfv2
endif
-ifneq ($(cc-name),clang)
+ifndef CONFIG_CC_IS_CLANG
cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mno-strict-align
endif
# Work around gcc code-gen bugs with -pg / -fno-omit-frame-pointer in gcc <= 4.8
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44199
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52828
-ifneq ($(cc-name),clang)
+ifndef CONFIG_CC_IS_CLANG
CC_FLAGS_FTRACE += $(call cc-ifversion, -lt, 0409, -mno-sched-epilog)
endif
endif
};
ethernet@f0000 {
- phy-handle = <&xg_cs4315_phy1>;
+ phy-handle = <&xg_cs4315_phy2>;
phy-connection-type = "xgmii";
};
ethernet@f2000 {
- phy-handle = <&xg_cs4315_phy2>;
+ phy-handle = <&xg_cs4315_phy1>;
phy-connection-type = "xgmii";
};
#address-cells = <1>;
#size-cells = <1>;
device_type = "soc";
- ranges = <0x0 0xff000000 0x4000>;
+ ranges = <0x0 0xff000000 0x28000>;
bus-frequency = <0>;
// Temporary -- will go away once kernel uses ranges for get_immrbase().
#size-cells = <0>;
};
};
+
+ crypto@20000 {
+ compatible = "fsl,sec1.2", "fsl,sec1.0";
+ reg = <0x20000 0x8000>;
+ interrupts = <1 1>;
+ interrupt-parent = <&PIC>;
+ fsl,num-channels = <1>;
+ fsl,channel-fifo-len = <24>;
+ fsl,exec-units-mask = <0x4c>;
+ fsl,descriptor-types-mask = <0x05000154>;
+ };
};
chosen {
int patch_instruction_site(s32 *addr, unsigned int instr);
int patch_branch_site(s32 *site, unsigned long target, int flags);
+static inline unsigned long patch_site_addr(s32 *site)
+{
+ return (unsigned long)site + *site;
+}
+
int instr_is_relative_branch(unsigned int instr);
int instr_is_relative_link_branch(unsigned int instr);
int instr_is_branch_to_addr(const unsigned int *instr, unsigned long addr);
* their hooks, a bitfield is reserved for use by the platform near the
* top of MMIO addresses (not PIO, those have to cope the hard way).
*
- * This bit field is 12 bits and is at the top of the IO virtual
- * addresses PCI_IO_INDIRECT_TOKEN_MASK.
+ * The highest address in the kernel virtual space are:
*
- * The kernel virtual space is thus:
+ * d0003fffffffffff # with Hash MMU
+ * c00fffffffffffff # with Radix MMU
*
- * 0xD000000000000000 : vmalloc
- * 0xD000080000000000 : PCI PHB IO space
- * 0xD000080080000000 : ioremap
- * 0xD0000fffffffffff : end of ioremap region
- *
- * Since the top 4 bits are reserved as the region ID, we use thus
- * the next 12 bits and keep 4 bits available for the future if the
- * virtual address space is ever to be extended.
+ * The top 4 bits are reserved as the region ID on hash, leaving us 8 bits
+ * that can be used for the field.
*
* The direct IO mapping operations will then mask off those bits
* before doing the actual access, though that only happen when
*/
#ifdef CONFIG_PPC_INDIRECT_MMIO
-#define PCI_IO_IND_TOKEN_MASK 0x0fff000000000000ul
-#define PCI_IO_IND_TOKEN_SHIFT 48
+#define PCI_IO_IND_TOKEN_SHIFT 52
+#define PCI_IO_IND_TOKEN_MASK (0xfful << PCI_IO_IND_TOKEN_SHIFT)
#define PCI_FIX_ADDR(addr) \
((PCI_IO_ADDR)(((unsigned long)(addr)) & ~PCI_IO_IND_TOKEN_MASK))
#define PCI_GET_ADDR_TOKEN(addr) \
* respectively NA for All or X for Supervisor and no access for User.
* Then we use the APG to say whether accesses are according to Page rules or
* "all Supervisor" rules (Access to all)
- * We also use the 2nd APG bit for _PAGE_ACCESSED when having SWAP:
- * When that bit is not set access is done iaw "all user"
- * which means no access iaw page rules.
- * Therefore, we define 4 APG groups. lsb is _PMD_USER, 2nd is _PAGE_ACCESSED
- * 0x => No access => 11 (all accesses performed as user iaw page definition)
- * 10 => No user => 01 (all accesses performed according to page definition)
- * 11 => User => 00 (all accesses performed as supervisor iaw page definition)
+ * Therefore, we define 2 APG groups. lsb is _PMD_USER
+ * 0 => No user => 01 (all accesses performed according to page definition)
+ * 1 => User => 00 (all accesses performed as supervisor iaw page definition)
* We define all 16 groups so that all other bits of APG can take any value
*/
-#ifdef CONFIG_SWAP
-#define MI_APG_INIT 0xf4f4f4f4
-#else
#define MI_APG_INIT 0x44444444
-#endif
/* The effective page number register. When read, contains the information
* about the last instruction TLB miss. When MI_RPN is written, bits in
* Supervisor and no access for user and NA for ALL.
* Then we use the APG to say whether accesses are according to Page rules or
* "all Supervisor" rules (Access to all)
- * We also use the 2nd APG bit for _PAGE_ACCESSED when having SWAP:
- * When that bit is not set access is done iaw "all user"
- * which means no access iaw page rules.
- * Therefore, we define 4 APG groups. lsb is _PMD_USER, 2nd is _PAGE_ACCESSED
- * 0x => No access => 11 (all accesses performed as user iaw page definition)
- * 10 => No user => 01 (all accesses performed according to page definition)
- * 11 => User => 00 (all accesses performed as supervisor iaw page definition)
+ * Therefore, we define 2 APG groups. lsb is _PMD_USER
+ * 0 => No user => 01 (all accesses performed according to page definition)
+ * 1 => User => 00 (all accesses performed as supervisor iaw page definition)
* We define all 16 groups so that all other bits of APG can take any value
*/
-#ifdef CONFIG_SWAP
-#define MD_APG_INIT 0xf4f4f4f4
-#else
#define MD_APG_INIT 0x44444444
-#endif
/* The effective page number register. When read, contains the information
* about the last instruction TLB miss. When MD_RPN is written, bits in
*/
#define SPRN_M_TW 799
-/* APGs */
-#define M_APG0 0x00000000
-#define M_APG1 0x00000020
-#define M_APG2 0x00000040
-#define M_APG3 0x00000060
-
#ifdef CONFIG_PPC_MM_SLICES
#include <asm/nohash/32/slice.h>
#define SLICE_ARRAY_SIZE (1 << (32 - SLICE_LOW_SHIFT - 1))
BUG();
}
+/* patch sites */
+extern s32 patch__itlbmiss_linmem_top;
+extern s32 patch__dtlbmiss_linmem_top, patch__dtlbmiss_immr_jmp;
+extern s32 patch__fixupdar_linmem_top;
+
+extern s32 patch__itlbmiss_exit_1, patch__itlbmiss_exit_2;
+extern s32 patch__dtlbmiss_exit_1, patch__dtlbmiss_exit_2, patch__dtlbmiss_exit_3;
+extern s32 patch__itlbmiss_perf, patch__dtlbmiss_perf;
+
#endif /* !__ASSEMBLY__ */
#if defined(CONFIG_PPC_4K_PAGES)
__PPC_RS(t) | __PPC_RA0(a) | __PPC_RB(b))
#define PPC_SLBFEE_DOT(t, b) stringify_in_c(.long PPC_INST_SLBFEE | \
__PPC_RT(t) | __PPC_RB(b))
+#define __PPC_SLBFEE_DOT(t, b) stringify_in_c(.long PPC_INST_SLBFEE | \
+ ___PPC_RT(t) | ___PPC_RB(b))
#define PPC_ICBT(c,a,b) stringify_in_c(.long PPC_INST_ICBT | \
__PPC_CT(c) | __PPC_RA0(a) | __PPC_RB(b))
/* PASemi instructions */
#ifdef CONFIG_PPC64
unsigned long ppr;
+ unsigned long __pad; /* Maintain 16 byte interrupt stack alignment */
#endif
};
#endif
#include <linux/spinlock.h>
#include <asm/page.h>
#include <linux/time.h>
+#include <linux/cpumask.h>
/*
* Definitions for talking to the RTAS on CHRP machines.
#include <asm/asm-offsets.h>
#include <asm/ptrace.h>
#include <asm/export.h>
+#include <asm/code-patching-asm.h>
#if CONFIG_TASK_SIZE <= 0x80000000 && CONFIG_PAGE_OFFSET >= 0x80000000
/* By simply checking Address >= 0x80000000, we know if its a kernel address */
cmpli cr0, r11, PAGE_OFFSET@h
#ifndef CONFIG_PIN_TLB_TEXT
/* It is assumed that kernel code fits into the first 8M page */
-_ENTRY(ITLBMiss_cmp)
- cmpli cr7, r11, (PAGE_OFFSET + 0x0800000)@h
+0: cmpli cr7, r11, (PAGE_OFFSET + 0x0800000)@h
+ patch_site 0b, patch__itlbmiss_linmem_top
#endif
#endif
#endif
#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE)
mtcr r12
#endif
-
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 31, _PAGE_ACCESSED >> 1
-#endif
/* Load the MI_TWC with the attributes for this "segment." */
mtspr SPRN_MI_TWC, r11 /* Set segment attributes */
+#ifdef CONFIG_SWAP
+ rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ and r11, r11, r10
+ rlwimi r10, r11, 0, _PAGE_PRESENT
+#endif
li r11, RPN_PATTERN | 0x200
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 20 and 23 must be clear.
mtspr SPRN_MI_RPN, r10 /* Update TLB entry */
/* Restore registers */
-_ENTRY(itlb_miss_exit_1)
- mfspr r10, SPRN_SPRG_SCRATCH0
+0: mfspr r10, SPRN_SPRG_SCRATCH0
mfspr r11, SPRN_SPRG_SCRATCH1
#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE)
mfspr r12, SPRN_SPRG_SCRATCH2
#endif
rfi
+ patch_site 0b, patch__itlbmiss_exit_1
+
#ifdef CONFIG_PERF_EVENTS
-_ENTRY(itlb_miss_perf)
- lis r10, (itlb_miss_counter - PAGE_OFFSET)@ha
+ patch_site 0f, patch__itlbmiss_perf
+0: lis r10, (itlb_miss_counter - PAGE_OFFSET)@ha
lwz r11, (itlb_miss_counter - PAGE_OFFSET)@l(r10)
addi r11, r11, 1
stw r11, (itlb_miss_counter - PAGE_OFFSET)@l(r10)
#ifndef CONFIG_PIN_TLB_IMMR
cmpli cr0, r11, VIRT_IMMR_BASE@h
#endif
-_ENTRY(DTLBMiss_cmp)
- cmpli cr7, r11, (PAGE_OFFSET + 0x1800000)@h
+0: cmpli cr7, r11, (PAGE_OFFSET + 0x1800000)@h
+ patch_site 0b, patch__dtlbmiss_linmem_top
#ifndef CONFIG_PIN_TLB_IMMR
-_ENTRY(DTLBMiss_jmp)
- beq- DTLBMissIMMR
+0: beq- DTLBMissIMMR
+ patch_site 0b, patch__dtlbmiss_immr_jmp
#endif
blt cr7, DTLBMissLinear
lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha
* above.
*/
rlwimi r11, r10, 0, _PAGE_GUARDED
-#ifdef CONFIG_SWAP
- /* _PAGE_ACCESSED has to be set. We use second APG bit for that, 0
- * on that bit will represent a Non Access group
- */
- rlwinm r11, r10, 31, _PAGE_ACCESSED >> 1
-#endif
mtspr SPRN_MD_TWC, r11
+ /* Both _PAGE_ACCESSED and _PAGE_PRESENT has to be set.
+ * We also need to know if the insn is a load/store, so:
+ * Clear _PAGE_PRESENT and load that which will
+ * trap into DTLB Error with store bit set accordinly.
+ */
+ /* PRESENT=0x1, ACCESSED=0x20
+ * r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
+ * r10 = (r10 & ~PRESENT) | r11;
+ */
+#ifdef CONFIG_SWAP
+ rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ and r11, r11, r10
+ rlwimi r10, r11, 0, _PAGE_PRESENT
+#endif
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 24, 25, 26, and 27 must be
* set. All other Linux PTE bits control the behavior
/* Restore registers */
mtspr SPRN_DAR, r11 /* Tag DAR */
-_ENTRY(dtlb_miss_exit_1)
- mfspr r10, SPRN_SPRG_SCRATCH0
+
+0: mfspr r10, SPRN_SPRG_SCRATCH0
mfspr r11, SPRN_SPRG_SCRATCH1
mfspr r12, SPRN_SPRG_SCRATCH2
rfi
+ patch_site 0b, patch__dtlbmiss_exit_1
+
#ifdef CONFIG_PERF_EVENTS
-_ENTRY(dtlb_miss_perf)
- lis r10, (dtlb_miss_counter - PAGE_OFFSET)@ha
+ patch_site 0f, patch__dtlbmiss_perf
+0: lis r10, (dtlb_miss_counter - PAGE_OFFSET)@ha
lwz r11, (dtlb_miss_counter - PAGE_OFFSET)@l(r10)
addi r11, r11, 1
stw r11, (dtlb_miss_counter - PAGE_OFFSET)@l(r10)
*/
DTLBMissIMMR:
mtcr r12
- /* Set 512k byte guarded page and mark it valid and accessed */
- li r10, MD_PS512K | MD_GUARDED | MD_SVALID | M_APG2
+ /* Set 512k byte guarded page and mark it valid */
+ li r10, MD_PS512K | MD_GUARDED | MD_SVALID
mtspr SPRN_MD_TWC, r10
mfspr r10, SPRN_IMMR /* Get current IMMR */
rlwinm r10, r10, 0, 0xfff80000 /* Get 512 kbytes boundary */
li r11, RPN_PATTERN
mtspr SPRN_DAR, r11 /* Tag DAR */
-_ENTRY(dtlb_miss_exit_2)
- mfspr r10, SPRN_SPRG_SCRATCH0
+
+0: mfspr r10, SPRN_SPRG_SCRATCH0
mfspr r11, SPRN_SPRG_SCRATCH1
mfspr r12, SPRN_SPRG_SCRATCH2
rfi
+ patch_site 0b, patch__dtlbmiss_exit_2
DTLBMissLinear:
mtcr r12
- /* Set 8M byte page and mark it valid and accessed */
- li r11, MD_PS8MEG | MD_SVALID | M_APG2
+ /* Set 8M byte page and mark it valid */
+ li r11, MD_PS8MEG | MD_SVALID
mtspr SPRN_MD_TWC, r11
rlwinm r10, r10, 0, 0x0f800000 /* 8xx supports max 256Mb RAM */
ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY | \
li r11, RPN_PATTERN
mtspr SPRN_DAR, r11 /* Tag DAR */
-_ENTRY(dtlb_miss_exit_3)
- mfspr r10, SPRN_SPRG_SCRATCH0
+
+0: mfspr r10, SPRN_SPRG_SCRATCH0
mfspr r11, SPRN_SPRG_SCRATCH1
mfspr r12, SPRN_SPRG_SCRATCH2
rfi
+ patch_site 0b, patch__dtlbmiss_exit_3
#ifndef CONFIG_PIN_TLB_TEXT
ITLBMissLinear:
mtcr r12
- /* Set 8M byte page and mark it valid,accessed */
- li r11, MI_PS8MEG | MI_SVALID | M_APG2
+ /* Set 8M byte page and mark it valid */
+ li r11, MI_PS8MEG | MI_SVALID
mtspr SPRN_MI_TWC, r11
rlwinm r10, r10, 0, 0x0f800000 /* 8xx supports max 256Mb RAM */
ori r10, r10, 0xf0 | MI_SPS16K | _PAGE_SH | _PAGE_DIRTY | \
_PAGE_PRESENT
mtspr SPRN_MI_RPN, r10 /* Update TLB entry */
-_ENTRY(itlb_miss_exit_2)
- mfspr r10, SPRN_SPRG_SCRATCH0
+0: mfspr r10, SPRN_SPRG_SCRATCH0
mfspr r11, SPRN_SPRG_SCRATCH1
mfspr r12, SPRN_SPRG_SCRATCH2
rfi
+ patch_site 0b, patch__itlbmiss_exit_2
#endif
/* This is the procedure to calculate the data EA for buggy dcbx,dcbi instructions
mfspr r11, SPRN_M_TW /* Get level 1 table */
blt+ 3f
rlwinm r11, r10, 16, 0xfff8
-_ENTRY(FixupDAR_cmp)
- cmpli cr7, r11, (PAGE_OFFSET + 0x1800000)@h
+
+0: cmpli cr7, r11, (PAGE_OFFSET + 0x1800000)@h
+ patch_site 0b, patch__fixupdar_linmem_top
+
/* create physical page address from effective address */
tophys(r11, r10)
blt- cr7, 201f
ori r8, r8, MI_EVALID /* Mark it valid */
mtspr SPRN_MI_EPN, r8
li r8, MI_PS8MEG /* Set 8M byte page */
- ori r8, r8, MI_SVALID | M_APG2 /* Make it valid, APG 2 */
+ ori r8, r8, MI_SVALID /* Make it valid */
mtspr SPRN_MI_TWC, r8
li r8, MI_BOOTINIT /* Create RPN for address 0 */
mtspr SPRN_MI_RPN, r8 /* Store TLB entry */
ori r8, r8, MD_EVALID /* Mark it valid */
mtspr SPRN_MD_EPN, r8
li r8, MD_PS512K | MD_GUARDED /* Set 512k byte page */
- ori r8, r8, MD_SVALID | M_APG2 /* Make it valid and accessed */
+ ori r8, r8, MD_SVALID /* Make it valid */
mtspr SPRN_MD_TWC, r8
mr r8, r9 /* Create paddr for TLB */
ori r8, r8, MI_BOOTINIT|0x2 /* Inhibit cache -- Cort */
if (tsk->thread.regs) {
preempt_disable();
BUG_ON(tsk != current);
- save_all(tsk);
-
#ifdef CONFIG_SPE
if (tsk->thread.regs->msr & MSR_SPE)
tsk->thread.spefscr = mfspr(SPRN_SPEFSCR);
#endif
+ save_all(tsk);
preempt_enable();
}
{
unsigned long pa;
+ BUILD_BUG_ON(STACK_INT_FRAME_SIZE % 16);
+
pa = memblock_alloc_base_nid(THREAD_SIZE, THREAD_SIZE, limit,
early_cpu_to_node(cpu), MEMBLOCK_NONE);
if (!pa) {
*/
unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip)
{
- struct ftrace_graph_ent trace;
unsigned long return_hooker;
if (unlikely(ftrace_graph_is_dead()))
return_hooker = ppc_function_entry(return_to_handler);
- trace.func = ip;
- trace.depth = current->curr_ret_stack + 1;
-
- /* Only trace if the calling function expects to */
- if (!ftrace_graph_entry(&trace))
- goto out;
-
- if (ftrace_push_return_trace(parent, ip, &trace.depth, 0,
- NULL) == -EBUSY)
- goto out;
-
- parent = return_hooker;
+ if (!function_graph_enter(parent, ip, 0, NULL))
+ parent = return_hooker;
out:
return parent;
}
ret = kvmhv_enter_nested_guest(vcpu);
if (ret == H_INTERRUPT) {
kvmppc_set_gpr(vcpu, 3, 0);
+ vcpu->arch.hcall_needed = 0;
return -EINTR;
}
break;
kvmppc_core_prepare_to_enter(vcpu);
return;
}
- dec_nsec = (vcpu->arch.dec_expires - now) * NSEC_PER_SEC
- / tb_ticks_per_sec;
+ dec_nsec = tb_to_ns(vcpu->arch.dec_expires - now);
hrtimer_start(&vcpu->arch.dec_timer, dec_nsec, HRTIMER_MODE_REL);
vcpu->arch.timer_running = 1;
}
dec_time = vcpu->arch.dec;
/*
- * Guest timebase ticks at the same frequency as host decrementer.
- * So use the host decrementer calculations for decrementer emulation.
+ * Guest timebase ticks at the same frequency as host timebase.
+ * So use the host timebase calculations for decrementer emulation.
*/
- dec_time = dec_time << decrementer_clockevent.shift;
- do_div(dec_time, decrementer_clockevent.mult);
+ dec_time = tb_to_ns(dec_time);
dec_nsec = do_div(dec_time, NSEC_PER_SEC);
hrtimer_start(&vcpu->arch.dec_timer,
ktime_set(dec_time, dec_nsec), HRTIMER_MODE_REL);
#undef TRACE_SYSTEM
#define TRACE_SYSTEM kvm
-#define TRACE_INCLUDE_PATH .
-#define TRACE_INCLUDE_FILE trace
/*
* Tracepoint for guest mode entry.
#endif /* _TRACE_KVM_H */
/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace
+
#include <trace/define_trace.h>
#undef TRACE_SYSTEM
#define TRACE_SYSTEM kvm_booke
-#define TRACE_INCLUDE_PATH .
-#define TRACE_INCLUDE_FILE trace_booke
#define kvm_trace_symbol_exit \
{0, "CRITICAL"}, \
#endif
/* This part must be outside protection */
+
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace_booke
+
#include <trace/define_trace.h>
#undef TRACE_SYSTEM
#define TRACE_SYSTEM kvm_hv
-#define TRACE_INCLUDE_PATH .
-#define TRACE_INCLUDE_FILE trace_hv
#define kvm_trace_symbol_hcall \
{H_REMOVE, "H_REMOVE"}, \
#endif /* _TRACE_KVM_HV_H */
/* This part must be outside protection */
+
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace_hv
+
#include <trace/define_trace.h>
#undef TRACE_SYSTEM
#define TRACE_SYSTEM kvm_pr
-#define TRACE_INCLUDE_PATH .
-#define TRACE_INCLUDE_FILE trace_pr
TRACE_EVENT(kvm_book3s_reenter,
TP_PROTO(int r, struct kvm_vcpu *vcpu),
#endif /* _TRACE_KVM_H */
/* This part must be outside protection */
+
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace_pr
+
#include <trace/define_trace.h>
*/
#include <linux/memblock.h>
+#include <linux/mmu_context.h>
#include <asm/fixmap.h>
#include <asm/code-patching.h>
for (; i < 32 && mem >= LARGE_PAGE_SIZE_8M; i++) {
mtspr(SPRN_MD_CTR, ctr | (i << 8));
mtspr(SPRN_MD_EPN, (unsigned long)__va(addr) | MD_EVALID);
- mtspr(SPRN_MD_TWC, MD_PS8MEG | MD_SVALID | M_APG2);
+ mtspr(SPRN_MD_TWC, MD_PS8MEG | MD_SVALID);
mtspr(SPRN_MD_RPN, addr | flags | _PAGE_PRESENT);
addr += LARGE_PAGE_SIZE_8M;
mem -= LARGE_PAGE_SIZE_8M;
map_kernel_page(v + offset, p + offset, PAGE_KERNEL_NCG);
}
-/* Address of instructions to patch */
-#ifndef CONFIG_PIN_TLB_IMMR
-extern unsigned int DTLBMiss_jmp;
-#endif
-extern unsigned int DTLBMiss_cmp, FixupDAR_cmp;
-#ifndef CONFIG_PIN_TLB_TEXT
-extern unsigned int ITLBMiss_cmp;
-#endif
-
-static void __init mmu_patch_cmp_limit(unsigned int *addr, unsigned long mapped)
+static void __init mmu_patch_cmp_limit(s32 *site, unsigned long mapped)
{
- unsigned int instr = *addr;
+ unsigned int instr = *(unsigned int *)patch_site_addr(site);
instr &= 0xffff0000;
instr |= (unsigned long)__va(mapped) >> 16;
- patch_instruction(addr, instr);
+ patch_instruction_site(site, instr);
}
unsigned long __init mmu_mapin_ram(unsigned long top)
mapped = 0;
mmu_mapin_immr();
#ifndef CONFIG_PIN_TLB_IMMR
- patch_instruction(&DTLBMiss_jmp, PPC_INST_NOP);
+ patch_instruction_site(&patch__dtlbmiss_immr_jmp, PPC_INST_NOP);
#endif
#ifndef CONFIG_PIN_TLB_TEXT
- mmu_patch_cmp_limit(&ITLBMiss_cmp, 0);
+ mmu_patch_cmp_limit(&patch__itlbmiss_linmem_top, 0);
#endif
} else {
mapped = top & ~(LARGE_PAGE_SIZE_8M - 1);
}
- mmu_patch_cmp_limit(&DTLBMiss_cmp, mapped);
- mmu_patch_cmp_limit(&FixupDAR_cmp, mapped);
+ mmu_patch_cmp_limit(&patch__dtlbmiss_linmem_top, mapped);
+ mmu_patch_cmp_limit(&patch__fixupdar_linmem_top, mapped);
/* If the size of RAM is not an exact power of two, we may not
* have covered RAM in its entirety with 8 MiB
switch (rc) {
case H_FUNCTION:
- printk(KERN_INFO
+ printk_once(KERN_INFO
"VPHN is not supported. Disabling polling...\n");
stop_topology_update();
break;
#include <asm/mmu.h>
#include <asm/mmu_context.h>
#include <asm/paca.h>
+#include <asm/ppc-opcode.h>
#include <asm/cputable.h>
#include <asm/cacheflush.h>
#include <asm/smp.h>
return __mk_vsid_data(get_kernel_vsid(ea, ssize), ssize, flags);
}
-static void assert_slb_exists(unsigned long ea)
+static void assert_slb_presence(bool present, unsigned long ea)
{
#ifdef CONFIG_DEBUG_VM
unsigned long tmp;
WARN_ON_ONCE(mfmsr() & MSR_EE);
- asm volatile("slbfee. %0, %1" : "=r"(tmp) : "r"(ea) : "cr0");
- WARN_ON(tmp == 0);
-#endif
-}
-
-static void assert_slb_notexists(unsigned long ea)
-{
-#ifdef CONFIG_DEBUG_VM
- unsigned long tmp;
+ if (!cpu_has_feature(CPU_FTR_ARCH_206))
+ return;
- WARN_ON_ONCE(mfmsr() & MSR_EE);
+ asm volatile(__PPC_SLBFEE_DOT(%0, %1) : "=r"(tmp) : "r"(ea) : "cr0");
- asm volatile("slbfee. %0, %1" : "=r"(tmp) : "r"(ea) : "cr0");
- WARN_ON(tmp != 0);
+ WARN_ON(present == (tmp == 0));
#endif
}
*/
slb_shadow_update(ea, ssize, flags, index);
- assert_slb_notexists(ea);
+ assert_slb_presence(false, ea);
asm volatile("slbmte %0,%1" :
: "r" (mk_vsid_data(ea, ssize, flags)),
"r" (mk_esid_data(ea, ssize, index))
"r" (be64_to_cpu(p->save_area[index].esid)));
}
- assert_slb_exists(local_paca->kstack);
+ assert_slb_presence(true, local_paca->kstack);
}
/*
:: "r" (be64_to_cpu(p->save_area[KSTACK_INDEX].vsid)),
"r" (be64_to_cpu(p->save_area[KSTACK_INDEX].esid))
: "memory");
- assert_slb_exists(get_paca()->kstack);
+ assert_slb_presence(true, get_paca()->kstack);
get_paca()->slb_cache_ptr = 0;
ea = (unsigned long)
get_paca()->slb_cache[i] << SID_SHIFT;
/*
- * Could assert_slb_exists here, but hypervisor
- * or machine check could have come in and
- * removed the entry at this point.
+ * Could assert_slb_presence(true) here, but
+ * hypervisor or machine check could have come
+ * in and removed the entry at this point.
*/
slbie_data = ea;
* User preloads should add isync afterwards in case the kernel
* accesses user memory before it returns to userspace with rfid.
*/
- assert_slb_notexists(ea);
+ assert_slb_presence(false, ea);
asm volatile("slbmte %0, %1" : : "r" (vsid_data), "r" (esid_data));
barrier();
return -EFAULT;
if (ea < H_VMALLOC_END)
- flags = get_paca()->vmalloc_sllp;
+ flags = local_paca->vmalloc_sllp;
else
flags = SLB_VSID_KERNEL | mmu_psize_defs[mmu_io_psize].sllp;
} else {
PPC_BLR();
}
-static void bpf_jit_emit_func_call(u32 *image, struct codegen_context *ctx, u64 func)
+static void bpf_jit_emit_func_call_hlp(u32 *image, struct codegen_context *ctx,
+ u64 func)
+{
+#ifdef PPC64_ELF_ABI_v1
+ /* func points to the function descriptor */
+ PPC_LI64(b2p[TMP_REG_2], func);
+ /* Load actual entry point from function descriptor */
+ PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_2], 0);
+ /* ... and move it to LR */
+ PPC_MTLR(b2p[TMP_REG_1]);
+ /*
+ * Load TOC from function descriptor at offset 8.
+ * We can clobber r2 since we get called through a
+ * function pointer (so caller will save/restore r2)
+ * and since we don't use a TOC ourself.
+ */
+ PPC_BPF_LL(2, b2p[TMP_REG_2], 8);
+#else
+ /* We can clobber r12 */
+ PPC_FUNC_ADDR(12, func);
+ PPC_MTLR(12);
+#endif
+ PPC_BLRL();
+}
+
+static void bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx,
+ u64 func)
{
unsigned int i, ctx_idx = ctx->idx;
{
const struct bpf_insn *insn = fp->insnsi;
int flen = fp->len;
- int i;
+ int i, ret;
/* Start of epilogue code - will only be valid 2nd pass onwards */
u32 exit_addr = addrs[flen];
u32 src_reg = b2p[insn[i].src_reg];
s16 off = insn[i].off;
s32 imm = insn[i].imm;
+ bool func_addr_fixed;
+ u64 func_addr;
u64 imm64;
- u8 *func;
u32 true_cond;
u32 tmp_idx;
case BPF_JMP | BPF_CALL:
ctx->seen |= SEEN_FUNC;
- /* bpf function call */
- if (insn[i].src_reg == BPF_PSEUDO_CALL)
- if (!extra_pass)
- func = NULL;
- else if (fp->aux->func && off < fp->aux->func_cnt)
- /* use the subprog id from the off
- * field to lookup the callee address
- */
- func = (u8 *) fp->aux->func[off]->bpf_func;
- else
- return -EINVAL;
- /* kernel helper call */
- else
- func = (u8 *) __bpf_call_base + imm;
-
- bpf_jit_emit_func_call(image, ctx, (u64)func);
+ ret = bpf_jit_get_func_addr(fp, &insn[i], extra_pass,
+ &func_addr, &func_addr_fixed);
+ if (ret < 0)
+ return ret;
+ if (func_addr_fixed)
+ bpf_jit_emit_func_call_hlp(image, ctx, func_addr);
+ else
+ bpf_jit_emit_func_call_rel(image, ctx, func_addr);
/* move return value from r3 to BPF_REG_0 */
PPC_MR(b2p[BPF_REG_0], 3);
break;
return 0;
}
+/* Fix the branch target addresses for subprog calls */
+static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image,
+ struct codegen_context *ctx, u32 *addrs)
+{
+ const struct bpf_insn *insn = fp->insnsi;
+ bool func_addr_fixed;
+ u64 func_addr;
+ u32 tmp_idx;
+ int i, ret;
+
+ for (i = 0; i < fp->len; i++) {
+ /*
+ * During the extra pass, only the branch target addresses for
+ * the subprog calls need to be fixed. All other instructions
+ * can left untouched.
+ *
+ * The JITed image length does not change because we already
+ * ensure that the JITed instruction sequence for these calls
+ * are of fixed length by padding them with NOPs.
+ */
+ if (insn[i].code == (BPF_JMP | BPF_CALL) &&
+ insn[i].src_reg == BPF_PSEUDO_CALL) {
+ ret = bpf_jit_get_func_addr(fp, &insn[i], true,
+ &func_addr,
+ &func_addr_fixed);
+ if (ret < 0)
+ return ret;
+
+ /*
+ * Save ctx->idx as this would currently point to the
+ * end of the JITed image and set it to the offset of
+ * the instruction sequence corresponding to the
+ * subprog call temporarily.
+ */
+ tmp_idx = ctx->idx;
+ ctx->idx = addrs[i] / 4;
+ bpf_jit_emit_func_call_rel(image, ctx, func_addr);
+
+ /*
+ * Restore ctx->idx here. This is safe as the length
+ * of the JITed sequence remains unchanged.
+ */
+ ctx->idx = tmp_idx;
+ }
+ }
+
+ return 0;
+}
+
struct powerpc64_jit_data {
struct bpf_binary_header *header;
u32 *addrs;
skip_init_ctx:
code_base = (u32 *)(image + FUNCTION_DESCR_SIZE);
+ if (extra_pass) {
+ /*
+ * Do not touch the prologue and epilogue as they will remain
+ * unchanged. Only fix the branch target address for subprog
+ * calls in the body.
+ *
+ * This does not change the offsets and lengths of the subprog
+ * call instruction sequences and hence, the size of the JITed
+ * image as well.
+ */
+ bpf_jit_fixup_subprog_calls(fp, code_base, &cgctx, addrs);
+
+ /* There is no need to perform the usual passes. */
+ goto skip_codegen_passes;
+ }
+
/* Code generation passes 1-2 */
for (pass = 1; pass < 3; pass++) {
/* Now build the prologue, body code & epilogue for real. */
proglen - (cgctx.idx * 4), cgctx.seen);
}
+skip_codegen_passes:
if (bpf_jit_enable > 1)
/*
* Note that we output the base address of the code_base
extern unsigned long itlb_miss_counter, dtlb_miss_counter;
extern atomic_t instruction_counter;
-extern unsigned int itlb_miss_perf, dtlb_miss_perf;
-extern unsigned int itlb_miss_exit_1, itlb_miss_exit_2;
-extern unsigned int dtlb_miss_exit_1, dtlb_miss_exit_2, dtlb_miss_exit_3;
static atomic_t insn_ctr_ref;
static atomic_t itlb_miss_ref;
break;
case PERF_8xx_ID_ITLB_LOAD_MISS:
if (atomic_inc_return(&itlb_miss_ref) == 1) {
- unsigned long target = (unsigned long)&itlb_miss_perf;
+ unsigned long target = patch_site_addr(&patch__itlbmiss_perf);
- patch_branch(&itlb_miss_exit_1, target, 0);
+ patch_branch_site(&patch__itlbmiss_exit_1, target, 0);
#ifndef CONFIG_PIN_TLB_TEXT
- patch_branch(&itlb_miss_exit_2, target, 0);
+ patch_branch_site(&patch__itlbmiss_exit_2, target, 0);
#endif
}
val = itlb_miss_counter;
break;
case PERF_8xx_ID_DTLB_LOAD_MISS:
if (atomic_inc_return(&dtlb_miss_ref) == 1) {
- unsigned long target = (unsigned long)&dtlb_miss_perf;
+ unsigned long target = patch_site_addr(&patch__dtlbmiss_perf);
- patch_branch(&dtlb_miss_exit_1, target, 0);
- patch_branch(&dtlb_miss_exit_2, target, 0);
- patch_branch(&dtlb_miss_exit_3, target, 0);
+ patch_branch_site(&patch__dtlbmiss_exit_1, target, 0);
+ patch_branch_site(&patch__dtlbmiss_exit_2, target, 0);
+ patch_branch_site(&patch__dtlbmiss_exit_3, target, 0);
}
val = dtlb_miss_counter;
break;
break;
case PERF_8xx_ID_ITLB_LOAD_MISS:
if (atomic_dec_return(&itlb_miss_ref) == 0) {
- patch_instruction(&itlb_miss_exit_1, insn);
+ patch_instruction_site(&patch__itlbmiss_exit_1, insn);
#ifndef CONFIG_PIN_TLB_TEXT
- patch_instruction(&itlb_miss_exit_2, insn);
+ patch_instruction_site(&patch__itlbmiss_exit_2, insn);
#endif
}
break;
case PERF_8xx_ID_DTLB_LOAD_MISS:
if (atomic_dec_return(&dtlb_miss_ref) == 0) {
- patch_instruction(&dtlb_miss_exit_1, insn);
- patch_instruction(&dtlb_miss_exit_2, insn);
- patch_instruction(&dtlb_miss_exit_3, insn);
+ patch_instruction_site(&patch__dtlbmiss_exit_1, insn);
+ patch_instruction_site(&patch__dtlbmiss_exit_2, insn);
+ patch_instruction_site(&patch__dtlbmiss_exit_3, insn);
}
break;
}
select 405EX
select PPC40x_SIMPLE
select PPC4xx_PCI_EXPRESS
+ select PCI
select PCI_MSI
select PPC4xx_MSI
help
depends on 44x
select PPC44x_SIMPLE
select APM821xx
+ select PCI
select PCI_MSI
select PPC4xx_MSI
select PPC4xx_PCI_EXPRESS
select SWIOTLB
select 476FPE
select PPC4xx_PCI_EXPRESS
+ select PCI
select PCI_MSI
select PPC4xx_HSTA_MSI
select I2C
}
EXPORT_SYMBOL(pnv_pci_get_npu_dev);
-#define NPU_DMA_OP_UNSUPPORTED() \
- dev_err_once(dev, "%s operation unsupported for NVLink devices\n", \
- __func__)
-
-static void *dma_npu_alloc(struct device *dev, size_t size,
- dma_addr_t *dma_handle, gfp_t flag,
- unsigned long attrs)
-{
- NPU_DMA_OP_UNSUPPORTED();
- return NULL;
-}
-
-static void dma_npu_free(struct device *dev, size_t size,
- void *vaddr, dma_addr_t dma_handle,
- unsigned long attrs)
-{
- NPU_DMA_OP_UNSUPPORTED();
-}
-
-static dma_addr_t dma_npu_map_page(struct device *dev, struct page *page,
- unsigned long offset, size_t size,
- enum dma_data_direction direction,
- unsigned long attrs)
-{
- NPU_DMA_OP_UNSUPPORTED();
- return 0;
-}
-
-static int dma_npu_map_sg(struct device *dev, struct scatterlist *sglist,
- int nelems, enum dma_data_direction direction,
- unsigned long attrs)
-{
- NPU_DMA_OP_UNSUPPORTED();
- return 0;
-}
-
-static int dma_npu_dma_supported(struct device *dev, u64 mask)
-{
- NPU_DMA_OP_UNSUPPORTED();
- return 0;
-}
-
-static u64 dma_npu_get_required_mask(struct device *dev)
-{
- NPU_DMA_OP_UNSUPPORTED();
- return 0;
-}
-
-static const struct dma_map_ops dma_npu_ops = {
- .map_page = dma_npu_map_page,
- .map_sg = dma_npu_map_sg,
- .alloc = dma_npu_alloc,
- .free = dma_npu_free,
- .dma_supported = dma_npu_dma_supported,
- .get_required_mask = dma_npu_get_required_mask,
-};
-
/*
* Returns the PE assoicated with the PCI device of the given
* NPU. Returns the linked pci device if pci_dev != NULL.
rc = pnv_npu_set_window(npe, 0, gpe->table_group.tables[0]);
/*
- * We don't initialise npu_pe->tce32_table as we always use
- * dma_npu_ops which are nops.
+ * NVLink devices use the same TCE table configuration as
+ * their parent device so drivers shouldn't be doing DMA
+ * operations directly on these devices.
*/
- set_dma_ops(&npe->pdev->dev, &dma_npu_ops);
+ set_dma_ops(&npe->pdev->dev, NULL);
}
/*
#include <linux/seq_file.h>
#include <linux/slab.h>
#include <linux/uaccess.h>
+#include <linux/hugetlb.h>
#include <asm/lppaca.h>
#include <asm/hvcall.h>
#include <asm/firmware.h>
#include <asm/vio.h>
#include <asm/mmu.h>
#include <asm/machdep.h>
+#include <asm/drmem.h>
#include "pseries.h"
seq_printf(m, "power_mode_data=%016lx\n", retbuf[0]);
}
+static void maxmem_data(struct seq_file *m)
+{
+ unsigned long maxmem = 0;
+
+ maxmem += drmem_info->n_lmbs * drmem_info->lmb_size;
+ maxmem += hugetlb_total_pages() * PAGE_SIZE;
+
+ seq_printf(m, "MaxMem=%ld\n", maxmem);
+}
+
static int pseries_lparcfg_data(struct seq_file *m, void *v)
{
int partition_potential_processors;
seq_printf(m, "slb_size=%d\n", mmu_slb_size);
#endif
parse_em_data(m);
+ maxmem_data(m);
return 0;
}
ORIG_CFLAGS := $(KBUILD_CFLAGS)
KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS))
+ifdef CONFIG_CC_IS_CLANG
+# clang stores addresses on the stack causing the frame size to blow
+# out. See https://github.com/ClangBuiltLinux/linux/issues/252
+KBUILD_CFLAGS += -Wframe-larger-than=4096
+endif
+
ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC)
obj-y += xmon.o nonstdio.o spr_access.o
# arch specific predefines for sparse
CHECKFLAGS += -D__riscv -D__riscv_xlen=$(BITS)
+# Default target when executing plain make
+boot := arch/riscv/boot
+KBUILD_IMAGE := $(boot)/Image.gz
+
head-y := arch/riscv/kernel/head.o
core-y += arch/riscv/kernel/ arch/riscv/mm/
libs-y += arch/riscv/lib/
-all: vmlinux
+PHONY += vdso_install
+vdso_install:
+ $(Q)$(MAKE) $(build)=arch/riscv/kernel/vdso $@
+
+all: Image.gz
+
+Image: vmlinux
+ $(Q)$(MAKE) $(build)=$(boot) $(boot)/$@
+
+Image.%: Image
+ $(Q)$(MAKE) $(build)=$(boot) $(boot)/$@
+
+zinstall install:
+ $(Q)$(MAKE) $(build)=$(boot) $@
--- /dev/null
+Image
+Image.gz
--- /dev/null
+#
+# arch/riscv/boot/Makefile
+#
+# This file is included by the global makefile so that you can add your own
+# architecture-specific flags and dependencies.
+#
+# This file is subject to the terms and conditions of the GNU General Public
+# License. See the file "COPYING" in the main directory of this archive
+# for more details.
+#
+# Copyright (C) 2018, Anup Patel.
+# Author: Anup Patel <anup@brainfault.org>
+#
+# Based on the ia64 and arm64 boot/Makefile.
+#
+
+OBJCOPYFLAGS_Image :=-O binary -R .note -R .note.gnu.build-id -R .comment -S
+
+targets := Image
+
+$(obj)/Image: vmlinux FORCE
+ $(call if_changed,objcopy)
+
+$(obj)/Image.gz: $(obj)/Image FORCE
+ $(call if_changed,gzip)
+
+install:
+ $(CONFIG_SHELL) $(srctree)/$(src)/install.sh $(KERNELRELEASE) \
+ $(obj)/Image System.map "$(INSTALL_PATH)"
+
+zinstall:
+ $(CONFIG_SHELL) $(srctree)/$(src)/install.sh $(KERNELRELEASE) \
+ $(obj)/Image.gz System.map "$(INSTALL_PATH)"
--- /dev/null
+#!/bin/sh
+#
+# arch/riscv/boot/install.sh
+#
+# This file is subject to the terms and conditions of the GNU General Public
+# License. See the file "COPYING" in the main directory of this archive
+# for more details.
+#
+# Copyright (C) 1995 by Linus Torvalds
+#
+# Adapted from code in arch/i386/boot/Makefile by H. Peter Anvin
+# Adapted from code in arch/i386/boot/install.sh by Russell King
+#
+# "make install" script for the RISC-V Linux port
+#
+# Arguments:
+# $1 - kernel version
+# $2 - kernel image file
+# $3 - kernel map file
+# $4 - default install path (blank if root directory)
+#
+
+verify () {
+ if [ ! -f "$1" ]; then
+ echo "" 1>&2
+ echo " *** Missing file: $1" 1>&2
+ echo ' *** You need to run "make" before "make install".' 1>&2
+ echo "" 1>&2
+ exit 1
+ fi
+}
+
+# Make sure the files actually exist
+verify "$2"
+verify "$3"
+
+# User may have a custom install script
+if [ -x ~/bin/${INSTALLKERNEL} ]; then exec ~/bin/${INSTALLKERNEL} "$@"; fi
+if [ -x /sbin/${INSTALLKERNEL} ]; then exec /sbin/${INSTALLKERNEL} "$@"; fi
+
+if [ "$(basename $2)" = "Image.gz" ]; then
+# Compressed install
+ echo "Installing compressed kernel"
+ base=vmlinuz
+else
+# Normal install
+ echo "Installing normal kernel"
+ base=vmlinux
+fi
+
+if [ -f $4/$base-$1 ]; then
+ mv $4/$base-$1 $4/$base-$1.old
+fi
+cat $2 > $4/$base-$1
+
+# Install system map file
+if [ -f $4/System.map-$1 ]; then
+ mv $4/System.map-$1 $4/System.map-$1.old
+fi
+cp $3 $4/System.map-$1
-CONFIG_SMP=y
-CONFIG_PCI=y
-CONFIG_PCIE_XILINX=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_IKCONFIG=y
CONFIG_CGROUP_BPF=y
CONFIG_NAMESPACES=y
CONFIG_USER_NS=y
+CONFIG_CHECKPOINT_RESTORE=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_EXPERT=y
-CONFIG_CHECKPOINT_RESTORE=y
CONFIG_BPF_SYSCALL=y
+CONFIG_SMP=y
+CONFIG_PCI=y
+CONFIG_PCIE_XILINX=y
+CONFIG_MODULES=y
+CONFIG_MODULE_UNLOAD=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_USB_STORAGE=y
CONFIG_USB_UAS=y
CONFIG_VIRTIO_MMIO=y
+CONFIG_SIFIVE_PLIC=y
CONFIG_RAS=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_ROOT_NFS=y
-# CONFIG_RCU_TRACE is not set
CONFIG_CRYPTO_USER_API_HASH=y
-CONFIG_MODULES=y
-CONFIG_MODULE_UNLOAD=y
-CONFIG_SIFIVE_PLIC=y
+CONFIG_PRINTK_TIME=y
+# CONFIG_RCU_TRACE is not set
#define MODULE_ARCH_VERMAGIC "riscv"
+struct module;
u64 module_emit_got_entry(struct module *mod, u64 val);
u64 module_emit_plt_entry(struct module *mod, u64 val);
unsigned long sstatus;
unsigned long sbadaddr;
unsigned long scause;
- /* a0 value before the syscall */
- unsigned long orig_a0;
+ /* a0 value before the syscall */
+ unsigned long orig_a0;
};
#ifdef CONFIG_64BIT
static inline unsigned long
raw_copy_from_user(void *to, const void __user *from, unsigned long n)
{
- return __asm_copy_to_user(to, from, n);
+ return __asm_copy_from_user(to, from, n);
}
static inline unsigned long
raw_copy_to_user(void __user *to, const void *from, unsigned long n)
{
- return __asm_copy_from_user(to, from, n);
+ return __asm_copy_to_user(to, from, n);
}
extern long strncpy_from_user(char *dest, const char __user *src, long count);
/*
* There is explicitly no include guard here because this file is expected to
- * be included multiple times. See uapi/asm/syscalls.h for more info.
+ * be included multiple times.
*/
-#define __ARCH_WANT_NEW_STAT
#define __ARCH_WANT_SYS_CLONE
+
#include <uapi/asm/unistd.h>
-#include <uapi/asm/syscalls.h>
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Copyright (C) 2017-2018 SiFive
- */
-
-/*
- * There is explicitly no include guard here because this file is expected to
- * be included multiple times in order to define the syscall macros via
- * __SYSCALL.
- */
-
-/*
- * Allows the instruction cache to be flushed from userspace. Despite RISC-V
- * having a direct 'fence.i' instruction available to userspace (which we
- * can't trap!), that's not actually viable when running on Linux because the
- * kernel might schedule a process on another hart. There is no way for
- * userspace to handle this without invoking the kernel (as it doesn't know the
- * thread->hart mappings), so we've defined a RISC-V specific system call to
- * flush the instruction cache.
- *
- * __NR_riscv_flush_icache is defined to flush the instruction cache over an
- * address range, with the flush applying to either all threads or just the
- * caller. We don't currently do anything with the address range, that's just
- * in there for forwards compatibility.
- */
-#ifndef __NR_riscv_flush_icache
-#define __NR_riscv_flush_icache (__NR_arch_specific_syscall + 15)
-#endif
-__SYSCALL(__NR_riscv_flush_icache, sys_riscv_flush_icache)
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (C) 2018 David Abdurachmanov <david.abdurachmanov@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifdef __LP64__
+#define __ARCH_WANT_NEW_STAT
+#endif /* __LP64__ */
+
+#include <asm-generic/unistd.h>
+
+/*
+ * Allows the instruction cache to be flushed from userspace. Despite RISC-V
+ * having a direct 'fence.i' instruction available to userspace (which we
+ * can't trap!), that's not actually viable when running on Linux because the
+ * kernel might schedule a process on another hart. There is no way for
+ * userspace to handle this without invoking the kernel (as it doesn't know the
+ * thread->hart mappings), so we've defined a RISC-V specific system call to
+ * flush the instruction cache.
+ *
+ * __NR_riscv_flush_icache is defined to flush the instruction cache over an
+ * address range, with the flush applying to either all threads or just the
+ * caller. We don't currently do anything with the address range, that's just
+ * in there for forwards compatibility.
+ */
+#ifndef __NR_riscv_flush_icache
+#define __NR_riscv_flush_icache (__NR_arch_specific_syscall + 15)
+#endif
+__SYSCALL(__NR_riscv_flush_icache, sys_riscv_flush_icache)
static void print_isa(struct seq_file *f, const char *orig_isa)
{
- static const char *ext = "mafdc";
+ static const char *ext = "mafdcsu";
const char *isa = orig_isa;
const char *e;
/*
* Check the rest of the ISA string for valid extensions, printing those
* we find. RISC-V ISA strings define an order, so we only print the
- * extension bits when they're in order.
+ * extension bits when they're in order. Hide the supervisor (S)
+ * extension from userspace as it's not accessible from there.
*/
for (e = ext; *e != '\0'; ++e) {
if (isa[0] == e[0]) {
- seq_write(f, isa, 1);
+ if (isa[0] != 's')
+ seq_write(f, isa, 1);
+
isa++;
}
}
{
unsigned long return_hooker = (unsigned long)&return_to_handler;
unsigned long old;
- struct ftrace_graph_ent trace;
int err;
if (unlikely(atomic_read(¤t->tracing_graph_pause)))
*/
old = *parent;
- trace.func = self_addr;
- trace.depth = current->curr_ret_stack + 1;
-
- if (!ftrace_graph_entry(&trace))
- return;
-
- err = ftrace_push_return_trace(old, self_addr, &trace.depth,
- frame_pointer, parent);
- if (err == -EBUSY)
- return;
- *parent = return_hooker;
+ if (function_graph_enter(old, self_addr, frame_pointer, parent))
+ *parent = return_hooker;
}
#ifdef CONFIG_DYNAMIC_FTRACE
amoadd.w a3, a2, (a3)
bnez a3, .Lsecondary_start
+ /* Clear BSS for flat non-ELF images */
+ la a3, __bss_start
+ la a4, __bss_stop
+ ble a4, a3, clear_bss_done
+clear_bss:
+ REG_S zero, (a3)
+ add a3, a3, RISCV_SZPTR
+ blt a3, a4, clear_bss
+clear_bss_done:
+
/* Save hart ID and DTB physical address */
mv s0, a0
mv s1, a1
{
if (v != (u32)v) {
pr_err("%s: value %016llx out of range for 32-bit field\n",
- me->name, v);
+ me->name, (long long)v);
return -EINVAL;
}
*location = v;
if (offset != (s32)offset) {
pr_err(
"%s: target %016llx can not be addressed by the 32-bit offset from PC = %p\n",
- me->name, v, location);
+ me->name, (long long)v, location);
return -EINVAL;
}
if (IS_ENABLED(CMODEL_MEDLOW)) {
pr_err(
"%s: target %016llx can not be addressed by the 32-bit offset from PC = %p\n",
- me->name, v, location);
+ me->name, (long long)v, location);
return -EINVAL;
}
} else {
pr_err(
"%s: can not generate the GOT entry for symbol = %016llx from PC = %p\n",
- me->name, v, location);
+ me->name, (long long)v, location);
return -EINVAL;
}
} else {
pr_err(
"%s: target %016llx can not be addressed by the 32-bit offset from PC = %p\n",
- me->name, v, location);
+ me->name, (long long)v, location);
return -EINVAL;
}
}
if (offset != fill_v) {
pr_err(
"%s: target %016llx can not be addressed by the 32-bit offset from PC = %p\n",
- me->name, v, location);
+ me->name, (long long)v, location);
return -EINVAL;
}
*(.sbss*)
}
- BSS_SECTION(0, 0, 0)
+ BSS_SECTION(PAGE_SIZE, PAGE_SIZE, 0)
EXCEPTION_TABLE(0x10)
NOTES
lib-y += memset.o
lib-y += uaccess.o
-lib-(CONFIG_64BIT) += tishift.o
+lib-$(CONFIG_64BIT) += tishift.o
lib-$(CONFIG_32BIT) += udivdi3.o
KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_DEBUG_INFO),-g)
KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_DEBUG_INFO_DWARF4), $(call cc-option, -gdwarf-4,))
UTS_MACHINE := s390x
-STACK_SIZE := $(if $(CONFIG_KASAN),32768,16384)
+STACK_SIZE := $(if $(CONFIG_KASAN),65536,16384)
CHECKFLAGS += -D__s390__ -D__s390x__
export LD_BFD
OBJECTS := $(addprefix $(obj)/,$(obj-y))
LDFLAGS_vmlinux := --oformat $(LD_BFD) -e startup -T
-$(obj)/vmlinux: $(obj)/vmlinux.lds $(objtree)/arch/s390/boot/startup.a $(OBJECTS)
+$(obj)/vmlinux: $(obj)/vmlinux.lds $(objtree)/arch/s390/boot/startup.a $(OBJECTS) FORCE
$(call if_changed,ld)
-OBJCOPYFLAGS_info.bin := -O binary --only-section=.vmlinux.info
+OBJCOPYFLAGS_info.bin := -O binary --only-section=.vmlinux.info --set-section-flags .vmlinux.info=load
$(obj)/info.bin: vmlinux FORCE
$(call if_changed,objcopy)
suffix-$(CONFIG_KERNEL_LZO) := .lzo
suffix-$(CONFIG_KERNEL_XZ) := .xz
-$(obj)/vmlinux.bin.gz: $(vmlinux.bin.all-y)
+$(obj)/vmlinux.bin.gz: $(vmlinux.bin.all-y) FORCE
$(call if_changed,gzip)
-$(obj)/vmlinux.bin.bz2: $(vmlinux.bin.all-y)
+$(obj)/vmlinux.bin.bz2: $(vmlinux.bin.all-y) FORCE
$(call if_changed,bzip2)
-$(obj)/vmlinux.bin.lz4: $(vmlinux.bin.all-y)
+$(obj)/vmlinux.bin.lz4: $(vmlinux.bin.all-y) FORCE
$(call if_changed,lz4)
-$(obj)/vmlinux.bin.lzma: $(vmlinux.bin.all-y)
+$(obj)/vmlinux.bin.lzma: $(vmlinux.bin.all-y) FORCE
$(call if_changed,lzma)
-$(obj)/vmlinux.bin.lzo: $(vmlinux.bin.all-y)
+$(obj)/vmlinux.bin.lzo: $(vmlinux.bin.all-y) FORCE
$(call if_changed,lzo)
-$(obj)/vmlinux.bin.xz: $(vmlinux.bin.all-y)
+$(obj)/vmlinux.bin.xz: $(vmlinux.bin.all-y) FORCE
$(call if_changed,xzkern)
OBJCOPYFLAGS_piggy.o := -I binary -O elf64-s390 -B s390:64-bit --rename-section .data=.vmlinux.bin.compressed
CONFIG_PREEMPT=y
CONFIG_HZ_100=y
CONFIG_KEXEC_FILE=y
+CONFIG_EXPOLINE=y
+CONFIG_EXPOLINE_AUTO=y
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_KSM=y
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_S390=y
CONFIG_CHSC_SCH=y
+CONFIG_VFIO_AP=m
CONFIG_CRASH_DUMP=y
CONFIG_BINFMT_MISC=m
CONFIG_HIBERNATION=y
+CONFIG_PM_DEBUG=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_PACKET_DIAG=m
CONFIG_NF_CT_NETLINK=m
CONFIG_NF_CT_NETLINK_TIMEOUT=m
CONFIG_NF_TABLES=m
-CONFIG_NFT_EXTHDR=m
-CONFIG_NFT_META=m
CONFIG_NFT_CT=m
CONFIG_NFT_COUNTER=m
CONFIG_NFT_LOG=m
CONFIG_NET_ACT_CSUM=m
CONFIG_DNS_RESOLVER=y
CONFIG_OPENVSWITCH=m
+CONFIG_VSOCKETS=m
+CONFIG_VIRTIO_VSOCKETS=m
CONFIG_NETLINK_DIAG=m
CONFIG_CGROUP_NET_PRIO=y
CONFIG_BPF_JIT=y
CONFIG_PPPOL2TP=m
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
+CONFIG_ISM=m
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_KEYBOARD is not set
# CONFIG_INPUT_MOUSE is not set
CONFIG_MLX5_INFINIBAND=m
CONFIG_VFIO=m
CONFIG_VFIO_PCI=m
+CONFIG_VFIO_MDEV=m
+CONFIG_VFIO_MDEV_DEVICE=m
CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_BALLOON=m
CONFIG_VIRTIO_INPUT=y
+CONFIG_S390_AP_IOMMU=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
CONFIG_RCU_TORTURE_TEST=m
CONFIG_RCU_CPU_STALL_TIMEOUT=300
CONFIG_NOTIFIER_ERROR_INJECTION=m
-CONFIG_PM_NOTIFIER_ERROR_INJECT=m
CONFIG_NETDEV_NOTIFIER_ERROR_INJECT=m
CONFIG_FAULT_INJECTION=y
CONFIG_FAILSLAB=y
CONFIG_KVM=m
CONFIG_KVM_S390_UCONTROL=y
CONFIG_VHOST_NET=m
+CONFIG_VHOST_VSOCK=m
CONFIG_NUMA=y
CONFIG_HZ_100=y
CONFIG_KEXEC_FILE=y
+CONFIG_EXPOLINE=y
+CONFIG_EXPOLINE_AUTO=y
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_KSM=y
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_S390=y
CONFIG_CHSC_SCH=y
+CONFIG_VFIO_AP=m
CONFIG_CRASH_DUMP=y
CONFIG_BINFMT_MISC=m
CONFIG_HIBERNATION=y
+CONFIG_PM_DEBUG=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_PACKET_DIAG=m
CONFIG_NF_CT_NETLINK=m
CONFIG_NF_CT_NETLINK_TIMEOUT=m
CONFIG_NF_TABLES=m
-CONFIG_NFT_EXTHDR=m
-CONFIG_NFT_META=m
CONFIG_NFT_CT=m
CONFIG_NFT_COUNTER=m
CONFIG_NFT_LOG=m
CONFIG_NET_ACT_CSUM=m
CONFIG_DNS_RESOLVER=y
CONFIG_OPENVSWITCH=m
+CONFIG_VSOCKETS=m
+CONFIG_VIRTIO_VSOCKETS=m
CONFIG_NETLINK_DIAG=m
CONFIG_CGROUP_NET_PRIO=y
CONFIG_BPF_JIT=y
CONFIG_PPPOL2TP=m
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
+CONFIG_ISM=m
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_KEYBOARD is not set
# CONFIG_INPUT_MOUSE is not set
CONFIG_MLX5_INFINIBAND=m
CONFIG_VFIO=m
CONFIG_VFIO_PCI=m
+CONFIG_VFIO_MDEV=m
+CONFIG_VFIO_MDEV_DEVICE=m
CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_BALLOON=m
CONFIG_VIRTIO_INPUT=y
+CONFIG_S390_AP_IOMMU=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
CONFIG_KVM=m
CONFIG_KVM_S390_UCONTROL=y
CONFIG_VHOST_NET=m
+CONFIG_VHOST_VSOCK=m
CONFIG_CGROUP_PERF=y
CONFIG_NAMESPACES=y
CONFIG_USER_NS=y
+CONFIG_CHECKPOINT_RESTORE=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_EXPERT=y
# CONFIG_SYSFS_SYSCALL is not set
-CONFIG_CHECKPOINT_RESTORE=y
CONFIG_BPF_SYSCALL=y
CONFIG_USERFAULTFD=y
# CONFIG_COMPAT_BRK is not set
CONFIG_PROFILING=y
+CONFIG_LIVEPATCH=y
+CONFIG_NR_CPUS=256
+CONFIG_NUMA=y
+CONFIG_HZ_100=y
+CONFIG_KEXEC_FILE=y
+CONFIG_CRASH_DUMP=y
+CONFIG_HIBERNATION=y
+CONFIG_PM_DEBUG=y
+CONFIG_CMM=m
CONFIG_OPROFILE=y
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
CONFIG_PARTITION_ADVANCED=y
CONFIG_IBM_PARTITION=y
CONFIG_DEFAULT_DEADLINE=y
-CONFIG_LIVEPATCH=y
-CONFIG_NR_CPUS=256
-CONFIG_NUMA=y
-CONFIG_HZ_100=y
-CONFIG_KEXEC_FILE=y
+CONFIG_BINFMT_MISC=m
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_KSM=y
CONFIG_ZSMALLOC=m
CONFIG_ZSMALLOC_STAT=y
CONFIG_IDLE_PAGE_TRACKING=y
-CONFIG_CRASH_DUMP=y
-CONFIG_BINFMT_MISC=m
-CONFIG_HIBERNATION=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_BLK_DEV_RAM=y
CONFIG_VIRTIO_BLK=y
CONFIG_SCSI=y
+# CONFIG_SCSI_MQ_DEFAULT is not set
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=y
CONFIG_BLK_DEV_SR=y
CONFIG_TUN=m
CONFIG_VIRTIO_NET=y
# CONFIG_NET_VENDOR_ALACRITECH is not set
+# CONFIG_NET_VENDOR_AURORA is not set
# CONFIG_NET_VENDOR_CORTINA is not set
# CONFIG_NET_VENDOR_SOLARFLARE is not set
# CONFIG_NET_VENDOR_SOCIONEXT is not set
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_HUGETLBFS=y
# CONFIG_NETWORK_FILESYSTEMS is not set
-CONFIG_DEBUG_INFO=y
-CONFIG_DEBUG_INFO_DWARF4=y
-CONFIG_GDB_SCRIPTS=y
-CONFIG_UNUSED_SYMBOLS=y
-CONFIG_DEBUG_SECTION_MISMATCH=y
-CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y
-CONFIG_MAGIC_SYSRQ=y
-CONFIG_DEBUG_PAGEALLOC=y
-CONFIG_DETECT_HUNG_TASK=y
-CONFIG_PANIC_ON_OOPS=y
-CONFIG_PROVE_LOCKING=y
-CONFIG_LOCK_STAT=y
-CONFIG_DEBUG_LOCKDEP=y
-CONFIG_DEBUG_ATOMIC_SLEEP=y
-CONFIG_DEBUG_LIST=y
-CONFIG_DEBUG_SG=y
-CONFIG_DEBUG_NOTIFIERS=y
-CONFIG_RCU_CPU_STALL_TIMEOUT=60
-CONFIG_LATENCYTOP=y
-CONFIG_SCHED_TRACER=y
-CONFIG_FTRACE_SYSCALLS=y
-CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP=y
-CONFIG_STACK_TRACER=y
-CONFIG_BLK_DEV_IO_TRACE=y
-CONFIG_FUNCTION_PROFILER=y
-# CONFIG_RUNTIME_TESTING_MENU is not set
-CONFIG_S390_PTDUMP=y
CONFIG_CRYPTO_CRYPTD=m
CONFIG_CRYPTO_AUTHENC=m
CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_CFB=m
CONFIG_CRYPTO_CTS=m
CONFIG_CRYPTO_LRW=m
+CONFIG_CRYPTO_OFB=m
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_XTS=m
CONFIG_CRYPTO_CMAC=m
CONFIG_CRYPTO_USER_API_SKCIPHER=m
CONFIG_CRYPTO_USER_API_RNG=m
CONFIG_ZCRYPT=m
-CONFIG_ZCRYPT_MULTIDEVNODES=y
CONFIG_PKEY=m
CONFIG_CRYPTO_PAES_S390=m
CONFIG_CRYPTO_SHA1_S390=m
# CONFIG_XZ_DEC_ARM is not set
# CONFIG_XZ_DEC_ARMTHUMB is not set
# CONFIG_XZ_DEC_SPARC is not set
-CONFIG_CMM=m
+CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF4=y
+CONFIG_GDB_SCRIPTS=y
+CONFIG_UNUSED_SYMBOLS=y
+CONFIG_DEBUG_SECTION_MISMATCH=y
+CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_DEBUG_PAGEALLOC=y
+CONFIG_DETECT_HUNG_TASK=y
+CONFIG_PANIC_ON_OOPS=y
+CONFIG_PROVE_LOCKING=y
+CONFIG_LOCK_STAT=y
+CONFIG_DEBUG_LOCKDEP=y
+CONFIG_DEBUG_ATOMIC_SLEEP=y
+CONFIG_DEBUG_LIST=y
+CONFIG_DEBUG_SG=y
+CONFIG_DEBUG_NOTIFIERS=y
+CONFIG_RCU_CPU_STALL_TIMEOUT=60
+CONFIG_LATENCYTOP=y
+CONFIG_SCHED_TRACER=y
+CONFIG_FTRACE_SYSCALLS=y
+CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP=y
+CONFIG_STACK_TRACER=y
+CONFIG_BLK_DEV_IO_TRACE=y
+CONFIG_FUNCTION_PROFILER=y
+# CONFIG_RUNTIME_TESTING_MENU is not set
+CONFIG_S390_PTDUMP=y
mm->context.asce_limit = STACK_TOP_MAX;
mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
_ASCE_USER_BITS | _ASCE_TYPE_REGION3;
- /* pgd_alloc() did not account this pud */
- mm_inc_nr_puds(mm);
break;
case -PAGE_SIZE:
/* forked 5-level task, set new asce with new_mm->pgd */
/* forked 2-level compat task, set new asce with new mm->pgd */
mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
_ASCE_USER_BITS | _ASCE_TYPE_SEGMENT;
- /* pgd_alloc() did not account this pmd */
- mm_inc_nr_pmds(mm);
- mm_inc_nr_puds(mm);
}
crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm));
return 0;
static inline unsigned long pgd_entry_type(struct mm_struct *mm)
{
- if (mm->context.asce_limit <= _REGION3_SIZE)
+ if (mm_pmd_folded(mm))
return _SEGMENT_ENTRY_EMPTY;
- if (mm->context.asce_limit <= _REGION2_SIZE)
+ if (mm_pud_folded(mm))
return _REGION3_ENTRY_EMPTY;
- if (mm->context.asce_limit <= _REGION1_SIZE)
+ if (mm_p4d_folded(mm))
return _REGION2_ENTRY_EMPTY;
return _REGION1_ENTRY_EMPTY;
}
_REGION_ENTRY_PROTECT | \
_REGION_ENTRY_NOEXEC)
+static inline bool mm_p4d_folded(struct mm_struct *mm)
+{
+ return mm->context.asce_limit <= _REGION1_SIZE;
+}
+#define mm_p4d_folded(mm) mm_p4d_folded(mm)
+
+static inline bool mm_pud_folded(struct mm_struct *mm)
+{
+ return mm->context.asce_limit <= _REGION2_SIZE;
+}
+#define mm_pud_folded(mm) mm_pud_folded(mm)
+
+static inline bool mm_pmd_folded(struct mm_struct *mm)
+{
+ return mm->context.asce_limit <= _REGION3_SIZE;
+}
+#define mm_pmd_folded(mm) mm_pmd_folded(mm)
+
static inline int mm_has_pgste(struct mm_struct *mm)
{
#ifdef CONFIG_PGSTE
return sp;
}
-static __no_sanitize_address_or_inline unsigned short stap(void)
+static __no_kasan_or_inline unsigned short stap(void)
{
unsigned short cpu_address;
* Set PSW mask to specified value, while leaving the
* PSW addr pointing to the next instruction.
*/
-static __no_sanitize_address_or_inline void __load_psw_mask(unsigned long mask)
+static __no_kasan_or_inline void __load_psw_mask(unsigned long mask)
{
unsigned long addr;
psw_t psw;
* General size of kernel stacks
*/
#ifdef CONFIG_KASAN
-#define THREAD_SIZE_ORDER 3
+#define THREAD_SIZE_ORDER 4
#else
#define THREAD_SIZE_ORDER 2
#endif
static inline void pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
unsigned long address)
{
- if (tlb->mm->context.asce_limit <= _REGION3_SIZE)
+ if (mm_pmd_folded(tlb->mm))
return;
pgtable_pmd_page_dtor(virt_to_page(pmd));
tlb_remove_table(tlb, pmd);
static inline void p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
unsigned long address)
{
- if (tlb->mm->context.asce_limit <= _REGION1_SIZE)
+ if (mm_p4d_folded(tlb->mm))
return;
tlb_remove_table(tlb, p4d);
}
static inline void pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
unsigned long address)
{
- if (tlb->mm->context.asce_limit <= _REGION2_SIZE)
+ if (mm_pud_folded(tlb->mm))
return;
tlb_remove_table(tlb, pud);
}
stmg %r6,%r15,__SF_GPRS(%r15) # store gprs of prev task
lghi %r4,__TASK_stack
lghi %r1,__TASK_thread
- lg %r5,0(%r4,%r3) # start of kernel stack of next
+ llill %r5,STACK_INIT
stg %r15,__THREAD_ksp(%r1,%r2) # store kernel stack of prev
- lgr %r15,%r5
- aghi %r15,STACK_INIT # end of kernel stack of next
+ lg %r15,0(%r4,%r3) # start of kernel stack of next
+ agr %r15,%r5 # end of kernel stack of next
stg %r3,__LC_CURRENT # store task struct of next
stg %r15,__LC_KERNEL_STACK # store end of kernel stack
lg %r15,__THREAD_ksp(%r1,%r3) # load kernel stack of next
*/
unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip)
{
- struct ftrace_graph_ent trace;
-
if (unlikely(ftrace_graph_is_dead()))
goto out;
if (unlikely(atomic_read(¤t->tracing_graph_pause)))
goto out;
ip -= MCOUNT_INSN_SIZE;
- trace.func = ip;
- trace.depth = current->curr_ret_stack + 1;
- /* Only trace if the calling function expects to. */
- if (!ftrace_graph_entry(&trace))
- goto out;
- if (ftrace_push_return_trace(parent, ip, &trace.depth, 0,
- NULL) == -EBUSY)
- goto out;
- parent = (unsigned long) return_to_handler;
+ if (!function_graph_enter(parent, ip, 0, NULL))
+ parent = (unsigned long) return_to_handler;
out:
return parent;
}
break;
case PERF_TYPE_HARDWARE:
+ if (is_sampling_event(event)) /* No sampling support */
+ return -ENOENT;
ev = attr->config;
/* Count user space (problem-state) only */
if (!attr->exclude_user && attr->exclude_kernel) {
return -ENOENT;
if (ev > PERF_CPUM_CF_MAX_CTR)
- return -EINVAL;
+ return -ENOENT;
/* Obtain the counter set to which the specified counter belongs */
set = get_counter_set(ev);
CPUMF_EVENT_ATTR(SF, SF_CYCLES_BASIC, PERF_EVENT_CPUM_SF);
CPUMF_EVENT_ATTR(SF, SF_CYCLES_BASIC_DIAG, PERF_EVENT_CPUM_SF_DIAG);
-static struct attribute *cpumsf_pmu_events_attr[] = {
- CPUMF_EVENT_PTR(SF, SF_CYCLES_BASIC),
- NULL,
- NULL,
+/* Attribute list for CPU_SF.
+ *
+ * The availablitiy depends on the CPU_MF sampling facility authorization
+ * for basic + diagnositic samples. This is determined at initialization
+ * time by the sampling facility device driver.
+ * If the authorization for basic samples is turned off, it should be
+ * also turned off for diagnostic sampling.
+ *
+ * During initialization of the device driver, check the authorization
+ * level for diagnostic sampling and installs the attribute
+ * file for diagnostic sampling if necessary.
+ *
+ * For now install a placeholder to reference all possible attributes:
+ * SF_CYCLES_BASIC and SF_CYCLES_BASIC_DIAG.
+ * Add another entry for the final NULL pointer.
+ */
+enum {
+ SF_CYCLES_BASIC_ATTR_IDX = 0,
+ SF_CYCLES_BASIC_DIAG_ATTR_IDX,
+ SF_CYCLES_ATTR_MAX
+};
+
+static struct attribute *cpumsf_pmu_events_attr[SF_CYCLES_ATTR_MAX + 1] = {
+ [SF_CYCLES_BASIC_ATTR_IDX] = CPUMF_EVENT_PTR(SF, SF_CYCLES_BASIC)
};
PMU_FORMAT_ATTR(event, "config:0-63");
if (si.ad) {
sfb_set_limits(CPUM_SF_MIN_SDB, CPUM_SF_MAX_SDB);
- cpumsf_pmu_events_attr[1] =
+ /* Sampling of diagnostic data authorized,
+ * install event into attribute list of PMU device.
+ */
+ cpumsf_pmu_events_attr[SF_CYCLES_BASIC_DIAG_ATTR_IDX] =
CPUMF_EVENT_PTR(SF, SF_CYCLES_BASIC_DIAG);
}
$(obj)/vdso32_wrapper.o : $(obj)/vdso32.so
# link rule for the .so file, .lds has to be first
-$(obj)/vdso32.so.dbg: $(src)/vdso32.lds $(obj-vdso32)
+$(obj)/vdso32.so.dbg: $(src)/vdso32.lds $(obj-vdso32) FORCE
$(call if_changed,vdso32ld)
# strip rule for the .so file
$(call if_changed,objcopy)
# assembly rules for the .S files
-$(obj-vdso32): %.o: %.S
+$(obj-vdso32): %.o: %.S FORCE
$(call if_changed_dep,vdso32as)
# actual build commands
quiet_cmd_vdso32ld = VDSO32L $@
- cmd_vdso32ld = $(CC) $(c_flags) -Wl,-T $^ -o $@
+ cmd_vdso32ld = $(CC) $(c_flags) -Wl,-T $(filter %.lds %.o,$^) -o $@
quiet_cmd_vdso32as = VDSO32A $@
cmd_vdso32as = $(CC) $(a_flags) -c -o $@ $<
$(obj)/vdso64_wrapper.o : $(obj)/vdso64.so
# link rule for the .so file, .lds has to be first
-$(obj)/vdso64.so.dbg: $(src)/vdso64.lds $(obj-vdso64)
+$(obj)/vdso64.so.dbg: $(src)/vdso64.lds $(obj-vdso64) FORCE
$(call if_changed,vdso64ld)
# strip rule for the .so file
$(call if_changed,objcopy)
# assembly rules for the .S files
-$(obj-vdso64): %.o: %.S
+$(obj-vdso64): %.o: %.S FORCE
$(call if_changed_dep,vdso64as)
# actual build commands
quiet_cmd_vdso64ld = VDSO64L $@
- cmd_vdso64ld = $(CC) $(c_flags) -Wl,-T $^ -o $@
+ cmd_vdso64ld = $(CC) $(c_flags) -Wl,-T $(filter %.lds %.o,$^) -o $@
quiet_cmd_vdso64as = VDSO64A $@
cmd_vdso64as = $(CC) $(a_flags) -c -o $@ $<
* uncompressed image info used by the decompressor
* it should match struct vmlinux_info
*/
- .vmlinux.info 0 : {
+ .vmlinux.info 0 (INFO) : {
QUAD(_stext) /* default_lma */
QUAD(startup_continue) /* entry */
QUAD(__bss_start - _stext) /* image_size */
QUAD(__bss_stop - __bss_start) /* bss_size */
QUAD(__boot_data_start) /* bootdata_off */
QUAD(__boot_data_end - __boot_data_start) /* bootdata_size */
- }
+ } :NONE
/* Debugging sections. */
STABS_DEBUG
mm->context.asce_limit = _REGION1_SIZE;
mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
_ASCE_USER_BITS | _ASCE_TYPE_REGION2;
+ mm_inc_nr_puds(mm);
} else {
crst_table_init(table, _REGION1_ENTRY_EMPTY);
pgd_populate(mm, (pgd_t *) table, (p4d_t *) pgd);
}
pgd = mm->pgd;
+ mm_dec_nr_pmds(mm);
mm->pgd = (pgd_t *) (pgd_val(*pgd) & _REGION_ENTRY_ORIGIN);
mm->context.asce_limit = _REGION3_SIZE;
mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
{
return mode->distance ? mode->distance(a, b) : 0;
}
+EXPORT_SYMBOL(__node_distance);
int numa_debug_enabled;
void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
{
unsigned long old;
- int faulted, err;
- struct ftrace_graph_ent trace;
+ int faulted;
unsigned long return_hooker = (unsigned long)&return_to_handler;
if (unlikely(ftrace_graph_is_dead()))
return;
}
- err = ftrace_push_return_trace(old, self_addr, &trace.depth, 0, NULL);
- if (err == -EBUSY) {
+ if (function_graph_enter(old, self_addr, 0, NULL))
__raw_writel(old, parent);
- return;
- }
-
- trace.func = self_addr;
-
- /* Only trace if the calling function expects to */
- if (!ftrace_graph_entry(&trace)) {
- current->curr_ret_stack--;
- __raw_writel(old, parent);
- }
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
unsigned long frame_pointer)
{
unsigned long return_hooker = (unsigned long) &return_to_handler;
- struct ftrace_graph_ent trace;
if (unlikely(atomic_read(¤t->tracing_graph_pause)))
return parent + 8UL;
- trace.func = self_addr;
- trace.depth = current->curr_ret_stack + 1;
-
- /* Only trace if the calling function expects to */
- if (!ftrace_graph_entry(&trace))
- return parent + 8UL;
-
- if (ftrace_push_return_trace(parent, self_addr, &trace.depth,
- frame_pointer, NULL) == -EBUSY)
+ if (function_graph_enter(parent, self_addr, frame_pointer, NULL))
return parent + 8UL;
return return_hooker;
/* Allocate and initialize the free area map. */
sz = num_tsb_entries / 8;
sz = (sz + 7UL) & ~7UL;
- iommu->tbl.map = kmalloc_node(sz, GFP_KERNEL, numa_node);
+ iommu->tbl.map = kzalloc_node(sz, GFP_KERNEL, numa_node);
if (!iommu->tbl.map)
return -ENOMEM;
- memset(iommu->tbl.map, 0, sz);
iommu_tbl_pool_init(&iommu->tbl, num_tsb_entries, IO_PAGE_SHIFT,
(tlb_type != hypervisor ? iommu_flushall : NULL),
{
u64 saved_fault_address = current_thread_info()->fault_address;
u8 saved_fault_code = get_thread_fault_code();
- mm_segment_t old_fs;
perf_callchain_store(entry, regs->tpc);
if (!current->mm)
return;
- old_fs = get_fs();
- set_fs(USER_DS);
-
flushw_user();
pagefault_disable();
pagefault_enable();
- set_fs(old_fs);
set_thread_fault_code(saved_fault_code);
current_thread_info()->fault_address = saved_fault_address;
}
regs->tpc -= 4;
regs->tnpc -= 4;
pt_regs_clear_syscall(regs);
+ /* fall through */
case ERESTART_RESTARTBLOCK:
regs->u_regs[UREG_G1] = __NR_restart_syscall;
regs->tpc -= 4;
regs->pc -= 4;
regs->npc -= 4;
pt_regs_clear_syscall(regs);
+ /* fall through */
case ERESTART_RESTARTBLOCK:
regs->u_regs[UREG_G1] = __NR_restart_syscall;
regs->pc -= 4;
regs->tpc -= 4;
regs->tnpc -= 4;
pt_regs_clear_syscall(regs);
+ /* fall through */
case ERESTART_RESTARTBLOCK:
regs->u_regs[UREG_G1] = __NR_restart_syscall;
regs->tpc -= 4;
.word sys_recvfrom, sys_setreuid16, sys_setregid16, sys_rename, compat_sys_truncate
/*130*/ .word compat_sys_ftruncate, sys_flock, compat_sys_lstat64, sys_sendto, sys_shutdown
.word sys_socketpair, sys_mkdir, sys_rmdir, compat_sys_utimes, compat_sys_stat64
-/*140*/ .word sys_sendfile64, sys_nis_syscall, compat_sys_futex, sys_gettid, compat_sys_getrlimit
+/*140*/ .word sys_sendfile64, sys_getpeername, compat_sys_futex, sys_gettid, compat_sys_getrlimit
.word compat_sys_setrlimit, sys_pivot_root, sys_prctl, sys_pciconfig_read, sys_pciconfig_write
-/*150*/ .word sys_nis_syscall, sys_inotify_init, sys_inotify_add_watch, sys_poll, sys_getdents64
+/*150*/ .word sys_getsockname, sys_inotify_init, sys_inotify_add_watch, sys_poll, sys_getdents64
.word compat_sys_fcntl64, sys_inotify_rm_watch, compat_sys_statfs, compat_sys_fstatfs, sys_oldumount
/*160*/ .word compat_sys_sched_setaffinity, compat_sys_sched_getaffinity, sys_getdomainname, sys_setdomainname, sys_nis_syscall
.word sys_quotactl, sys_set_tid_address, compat_sys_mount, compat_sys_ustat, sys_setxattr
}
/* Just skip the save instruction and the ctx register move. */
-#define BPF_TAILCALL_PROLOGUE_SKIP 16
+#define BPF_TAILCALL_PROLOGUE_SKIP 32
#define BPF_TAILCALL_CNT_SP_OFF (STACK_BIAS + 128)
static void build_prologue(struct jit_ctx *ctx)
const u8 vfp = bpf2sparc[BPF_REG_FP];
emit(ADD | IMMED | RS1(FP) | S13(STACK_BIAS) | RD(vfp), ctx);
+ } else {
+ emit_nop(ctx);
}
emit_reg_move(I0, O0, ctx);
+ emit_reg_move(I1, O1, ctx);
+ emit_reg_move(I2, O2, ctx);
+ emit_reg_move(I3, O3, ctx);
+ emit_reg_move(I4, O4, ctx);
/* If you add anything here, adjust BPF_TAILCALL_PROLOGUE_SKIP above. */
}
const u8 tmp2 = bpf2sparc[TMP_REG_2];
u32 opcode = 0, rs2;
+ if (insn->dst_reg == BPF_REG_FP)
+ ctx->saw_frame_pointer = true;
+
ctx->tmp_2_used = true;
emit_loadimm(imm, tmp2, ctx);
const u8 tmp = bpf2sparc[TMP_REG_1];
u32 opcode = 0, rs2;
+ if (insn->dst_reg == BPF_REG_FP)
+ ctx->saw_frame_pointer = true;
+
switch (BPF_SIZE(code)) {
case BPF_W:
opcode = ST32;
const u8 tmp2 = bpf2sparc[TMP_REG_2];
const u8 tmp3 = bpf2sparc[TMP_REG_3];
+ if (insn->dst_reg == BPF_REG_FP)
+ ctx->saw_frame_pointer = true;
+
ctx->tmp_1_used = true;
ctx->tmp_2_used = true;
ctx->tmp_3_used = true;
const u8 tmp2 = bpf2sparc[TMP_REG_2];
const u8 tmp3 = bpf2sparc[TMP_REG_3];
+ if (insn->dst_reg == BPF_REG_FP)
+ ctx->saw_frame_pointer = true;
+
ctx->tmp_1_used = true;
ctx->tmp_2_used = true;
ctx->tmp_3_used = true;
struct bpf_prog *tmp, *orig_prog = prog;
struct sparc64_jit_data *jit_data;
struct bpf_binary_header *header;
+ u32 prev_image_size, image_size;
bool tmp_blinded = false;
bool extra_pass = false;
struct jit_ctx ctx;
- u32 image_size;
u8 *image_ptr;
- int pass;
+ int pass, i;
if (!prog->jit_requested)
return orig_prog;
header = jit_data->header;
extra_pass = true;
image_size = sizeof(u32) * ctx.idx;
+ prev_image_size = image_size;
+ pass = 1;
goto skip_init_ctx;
}
memset(&ctx, 0, sizeof(ctx));
ctx.prog = prog;
- ctx.offset = kcalloc(prog->len, sizeof(unsigned int), GFP_KERNEL);
+ ctx.offset = kmalloc_array(prog->len, sizeof(unsigned int), GFP_KERNEL);
if (ctx.offset == NULL) {
prog = orig_prog;
goto out_off;
}
- /* Fake pass to detect features used, and get an accurate assessment
- * of what the final image size will be.
+ /* Longest sequence emitted is for bswap32, 12 instructions. Pre-cook
+ * the offset array so that we converge faster.
*/
- if (build_body(&ctx)) {
- prog = orig_prog;
- goto out_off;
- }
- build_prologue(&ctx);
- build_epilogue(&ctx);
-
- /* Now we know the actual image size. */
- image_size = sizeof(u32) * ctx.idx;
- header = bpf_jit_binary_alloc(image_size, &image_ptr,
- sizeof(u32), jit_fill_hole);
- if (header == NULL) {
- prog = orig_prog;
- goto out_off;
- }
+ for (i = 0; i < prog->len; i++)
+ ctx.offset[i] = i * (12 * 4);
- ctx.image = (u32 *)image_ptr;
-skip_init_ctx:
- for (pass = 1; pass < 3; pass++) {
+ prev_image_size = ~0U;
+ for (pass = 1; pass < 40; pass++) {
ctx.idx = 0;
build_prologue(&ctx);
-
if (build_body(&ctx)) {
- bpf_jit_binary_free(header);
prog = orig_prog;
goto out_off;
}
-
build_epilogue(&ctx);
if (bpf_jit_enable > 1)
- pr_info("Pass %d: shrink = %d, seen = [%c%c%c%c%c%c]\n", pass,
- image_size - (ctx.idx * 4),
+ pr_info("Pass %d: size = %u, seen = [%c%c%c%c%c%c]\n", pass,
+ ctx.idx * 4,
ctx.tmp_1_used ? '1' : ' ',
ctx.tmp_2_used ? '2' : ' ',
ctx.tmp_3_used ? '3' : ' ',
ctx.saw_frame_pointer ? 'F' : ' ',
ctx.saw_call ? 'C' : ' ',
ctx.saw_tail_call ? 'T' : ' ');
+
+ if (ctx.idx * 4 == prev_image_size)
+ break;
+ prev_image_size = ctx.idx * 4;
+ cond_resched();
+ }
+
+ /* Now we know the actual image size. */
+ image_size = sizeof(u32) * ctx.idx;
+ header = bpf_jit_binary_alloc(image_size, &image_ptr,
+ sizeof(u32), jit_fill_hole);
+ if (header == NULL) {
+ prog = orig_prog;
+ goto out_off;
+ }
+
+ ctx.image = (u32 *)image_ptr;
+skip_init_ctx:
+ ctx.idx = 0;
+
+ build_prologue(&ctx);
+
+ if (build_body(&ctx)) {
+ bpf_jit_binary_free(header);
+ prog = orig_prog;
+ goto out_off;
+ }
+
+ build_epilogue(&ctx);
+
+ if (ctx.idx * 4 != prev_image_size) {
+ pr_err("bpf_jit: Failed to converge, prev_size=%u size=%d\n",
+ prev_image_size, ctx.idx * 4);
+ bpf_jit_binary_free(header);
+ prog = orig_prog;
+ goto out_off;
}
if (bpf_jit_enable > 1)
io_req->fds[0] = dev->cow.fd;
else
io_req->fds[0] = dev->fd;
+ io_req->error = 0;
if (req_op(req) == REQ_OP_FLUSH) {
io_req->op = UBD_FLUSH;
io_req->cow_offset = -1;
io_req->offset = off;
io_req->length = bvec->bv_len;
- io_req->error = 0;
io_req->sector_mask = 0;
-
io_req->op = rq_data_dir(req) == READ ? UBD_READ : UBD_WRITE;
io_req->offsets[0] = 0;
io_req->offsets[1] = dev->cow.data_offset;
static blk_status_t ubd_queue_rq(struct blk_mq_hw_ctx *hctx,
const struct blk_mq_queue_data *bd)
{
+ struct ubd *ubd_dev = hctx->queue->queuedata;
struct request *req = bd->rq;
int ret = 0;
blk_mq_start_request(req);
+ spin_lock_irq(&ubd_dev->lock);
+
if (req_op(req) == REQ_OP_FLUSH) {
ret = ubd_queue_one_vec(hctx, req, 0, NULL);
} else {
}
}
out:
- if (ret < 0) {
+ spin_unlock_irq(&ubd_dev->lock);
+
+ if (ret < 0)
blk_mq_requeue_request(req, true);
- }
+
return BLK_STS_OK;
}
select HAVE_ARCH_PREL32_RELOCATIONS
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_THREAD_STRUCT_WHITELIST
+ select HAVE_ARCH_STACKLEAK
select HAVE_ARCH_TRACEHOOK
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
branches. Requires a compiler with -mindirect-branch=thunk-extern
support for full protection. The kernel may run slower.
- Without compiler support, at least indirect branches in assembler
- code are eliminated. Since this includes the syscall entry path,
- it is not entirely pointless.
-
config INTEL_RDT
bool "Intel Resource Director Technology support"
depends on X86 && CPU_SUP_INTEL
bool "ScaleMP vSMP"
select HYPERVISOR_GUEST
select PARAVIRT
- select PARAVIRT_XXL
depends on X86_64 && PCI
depends on X86_EXTENDED_PLATFORM
depends on SMP
to the kernel image.
config SCHED_SMT
- bool "SMT (Hyperthreading) scheduler support"
- depends on SMP
- ---help---
- SMT scheduler support improves the CPU scheduler's decision making
- when dealing with Intel Pentium 4 chips with HyperThreading at a
- cost of slightly increased overhead in some places. If unsure say
- N here.
+ def_bool y if SMP
config SCHED_MC
def_bool y
KBUILD_LDFLAGS += $(call ld-option, -z max-page-size=0x200000)
endif
-# Speed up the build
-KBUILD_CFLAGS += -pipe
# Workaround for a gcc prelease that unfortunately was shipped in a suse release
KBUILD_CFLAGS += -Wno-sign-compare
#
# Avoid indirect branches in kernel to deal with Spectre
ifdef CONFIG_RETPOLINE
-ifneq ($(RETPOLINE_CFLAGS),)
- KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
-endif
+ KBUILD_CFLAGS += $(RETPOLINE_CFLAGS)
endif
archscripts: scripts_basic
archmacros:
$(Q)$(MAKE) $(build)=arch/x86/kernel arch/x86/kernel/macros.s
-ASM_MACRO_FLAGS = -Wa,arch/x86/kernel/macros.s -Wa,-
+ASM_MACRO_FLAGS = -Wa,arch/x86/kernel/macros.s
export ASM_MACRO_FLAGS
KBUILD_CFLAGS += $(ASM_MACRO_FLAGS)
@echo Compiler lacks asm-goto support.
@exit 1
endif
+ifdef CONFIG_RETPOLINE
+ifeq ($(RETPOLINE_CFLAGS),)
+ @echo "You are building kernel with non-retpoline compiler." >&2
+ @echo "Please update your compiler." >&2
+ @false
+endif
+endif
archclean:
$(Q)rm -rf $(objtree)/arch/i386
+
/* -----------------------------------------------------------------------
*
* Copyright 2011 Intel Corporation; author Matt Fleming
return status;
}
+static efi_status_t allocate_e820(struct boot_params *params,
+ struct setup_data **e820ext,
+ u32 *e820ext_size)
+{
+ unsigned long map_size, desc_size, buff_size;
+ struct efi_boot_memmap boot_map;
+ efi_memory_desc_t *map;
+ efi_status_t status;
+ __u32 nr_desc;
+
+ boot_map.map = ↦
+ boot_map.map_size = &map_size;
+ boot_map.desc_size = &desc_size;
+ boot_map.desc_ver = NULL;
+ boot_map.key_ptr = NULL;
+ boot_map.buff_size = &buff_size;
+
+ status = efi_get_memory_map(sys_table, &boot_map);
+ if (status != EFI_SUCCESS)
+ return status;
+
+ nr_desc = buff_size / desc_size;
+
+ if (nr_desc > ARRAY_SIZE(params->e820_table)) {
+ u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table);
+
+ status = alloc_e820ext(nr_e820ext, e820ext, e820ext_size);
+ if (status != EFI_SUCCESS)
+ return status;
+ }
+
+ return EFI_SUCCESS;
+}
+
struct exit_boot_struct {
struct boot_params *boot_params;
struct efi_info *efi;
- struct setup_data *e820ext;
- __u32 e820ext_size;
};
static efi_status_t exit_boot_func(efi_system_table_t *sys_table_arg,
struct efi_boot_memmap *map,
void *priv)
{
- static bool first = true;
const char *signature;
__u32 nr_desc;
efi_status_t status;
struct exit_boot_struct *p = priv;
- if (first) {
- nr_desc = *map->buff_size / *map->desc_size;
- if (nr_desc > ARRAY_SIZE(p->boot_params->e820_table)) {
- u32 nr_e820ext = nr_desc -
- ARRAY_SIZE(p->boot_params->e820_table);
-
- status = alloc_e820ext(nr_e820ext, &p->e820ext,
- &p->e820ext_size);
- if (status != EFI_SUCCESS)
- return status;
- }
- first = false;
- }
-
signature = efi_is_64bit() ? EFI64_LOADER_SIGNATURE
: EFI32_LOADER_SIGNATURE;
memcpy(&p->efi->efi_loader_signature, signature, sizeof(__u32));
{
unsigned long map_sz, key, desc_size, buff_size;
efi_memory_desc_t *mem_map;
- struct setup_data *e820ext;
- __u32 e820ext_size;
+ struct setup_data *e820ext = NULL;
+ __u32 e820ext_size = 0;
efi_status_t status;
__u32 desc_version;
struct efi_boot_memmap map;
map.buff_size = &buff_size;
priv.boot_params = boot_params;
priv.efi = &boot_params->efi_info;
- priv.e820ext = NULL;
- priv.e820ext_size = 0;
+
+ status = allocate_e820(boot_params, &e820ext, &e820ext_size);
+ if (status != EFI_SUCCESS)
+ return status;
/* Might as well exit boot services now */
status = efi_exit_boot_services(sys_table, handle, &map, &priv,
if (status != EFI_SUCCESS)
return status;
- e820ext = priv.e820ext;
- e820ext_size = priv.e820ext_size;
-
/* Historic? */
boot_params->alt_mem_k = 32 * 1024;
{
int err;
- memset(&cpu.flags, 0, sizeof cpu.flags);
+ memset(&cpu.flags, 0, sizeof(cpu.flags));
cpu.level = 3;
if (has_eflag(X86_EFLAGS_AC))
int pos = 0;
int port = 0;
- if (cmdline_find_option("earlyprintk", arg, sizeof arg) > 0) {
+ if (cmdline_find_option("earlyprintk", arg, sizeof(arg)) > 0) {
char *e;
if (!strncmp(arg, "serial", 6)) {
* console=uart8250,io,0x3f8,115200n8
* need to make sure it is last one console !
*/
- if (cmdline_find_option("console", optstr, sizeof optstr) <= 0)
+ if (cmdline_find_option("console", optstr, sizeof(optstr)) <= 0)
return;
options = optstr;
{
struct biosregs ireg, oreg;
- memset(ei, 0, sizeof *ei);
+ memset(ei, 0, sizeof(*ei));
/* Check Extensions Present */
struct edd_info ei, *edp;
u32 *mbrptr;
- if (cmdline_find_option("edd", eddarg, sizeof eddarg) > 0) {
+ if (cmdline_find_option("edd", eddarg, sizeof(eddarg)) > 0) {
if (!strcmp(eddarg, "skipmbr") || !strcmp(eddarg, "skip")) {
do_edd = 1;
do_mbr = 0;
*/
if (!get_edd_info(devno, &ei)
&& boot_params.eddbuf_entries < EDDMAXNR) {
- memcpy(edp, &ei, sizeof ei);
+ memcpy(edp, &ei, sizeof(ei));
edp++;
boot_params.eddbuf_entries++;
}
# Part 2 of the header, from the old setup.S
.ascii "HdrS" # header signature
- .word 0x020e # header version number (>= 0x0105)
+ .word 0x020d # header version number (>= 0x0105)
# or else old loadlin-1.5 will fail)
.globl realmode_swtch
realmode_swtch: .word 0, 0 # default_switch, SETUPSEG
init_size: .long INIT_SIZE # kernel initialization size
handover_offset: .long 0 # Filled in by build.c
-acpi_rsdp_addr: .quad 0 # 64-bit physical pointer to the
- # ACPI RSDP table, added with
- # version 2.14
-
# End of setup header #####################################################
.section ".entrytext", "ax"
const struct old_cmdline * const oldcmd =
(const struct old_cmdline *)OLD_CL_ADDRESS;
- BUILD_BUG_ON(sizeof boot_params != 4096);
- memcpy(&boot_params.hdr, &hdr, sizeof hdr);
+ BUILD_BUG_ON(sizeof(boot_params) != 4096);
+ memcpy(&boot_params.hdr, &hdr, sizeof(hdr));
if (!boot_params.hdr.cmd_line_ptr &&
oldcmd->cl_magic == OLD_CL_MAGIC) {
initregs(&ireg);
ireg.ax = 0xe820;
- ireg.cx = sizeof buf;
+ ireg.cx = sizeof(buf);
ireg.edx = SMAP;
ireg.di = (size_t)&buf;
void initregs(struct biosregs *reg)
{
- memset(reg, 0, sizeof *reg);
+ memset(reg, 0, sizeof(*reg));
reg->eflags |= X86_EFLAGS_CF;
reg->ds = ds();
reg->es = ds();
if (mode & ~0x1ff)
continue;
- memset(&vminfo, 0, sizeof vminfo); /* Just in case... */
+ memset(&vminfo, 0, sizeof(vminfo)); /* Just in case... */
ireg.ax = 0x4f01;
ireg.cx = mode;
int is_graphic;
u16 vesa_mode = mode->mode - VIDEO_FIRST_VESA;
- memset(&vminfo, 0, sizeof vminfo); /* Just in case... */
+ memset(&vminfo, 0, sizeof(vminfo)); /* Just in case... */
initregs(&ireg);
ireg.ax = 0x4f01;
struct biosregs ireg, oreg;
/* Apparently used as a nonsense token... */
- memset(&boot_params.edid_info, 0x13, sizeof boot_params.edid_info);
+ memset(&boot_params.edid_info, 0x13, sizeof(boot_params.edid_info));
if (vginfo.version < 0x0200)
return; /* EDID requires VBE 2.0+ */
} else if ((key >= '0' && key <= '9') ||
(key >= 'A' && key <= 'Z') ||
(key >= 'a' && key <= 'z')) {
- if (len < sizeof entry_buf) {
+ if (len < sizeof(entry_buf)) {
entry_buf[len++] = key;
putchar(key);
}
#endif
+.macro STACKLEAK_ERASE_NOCLOBBER
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+ PUSH_AND_CLEAR_REGS
+ call stackleak_erase
+ POP_REGS
+#endif
+.endm
+
#endif /* CONFIG_X86_64 */
+.macro STACKLEAK_ERASE
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+ call stackleak_erase
+#endif
+.endm
+
/*
* This does 'call enter_from_user_mode' unless we can avoid it based on
* kernel config or using the static jump infrastructure.
#include <asm/frame.h>
#include <asm/nospec-branch.h>
+#include "calling.h"
+
.section .entry.text, "ax"
/*
/* When we fork, we trace the syscall return in the child, too. */
movl %esp, %eax
call syscall_return_slowpath
+ STACKLEAK_ERASE
jmp restore_all
/* kernel thread */
ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
"jmp .Lsyscall_32_done", X86_FEATURE_XENPV
+ STACKLEAK_ERASE
+
/* Opportunistic SYSEXIT */
TRACE_IRQS_ON /* User mode traces as IRQs on. */
call do_int80_syscall_32
.Lsyscall_32_done:
+ STACKLEAK_ERASE
+
restore_all:
TRACE_IRQS_IRET
SWITCH_TO_ENTRY_STACK
* We are on the trampoline stack. All regs except RDI are live.
* We can do future final exit work right here.
*/
+ STACKLEAK_ERASE_NOCLOBBER
+
SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
popq %rdi
ret
END(interrupt_entry)
+_ASM_NOKPROBE(interrupt_entry)
/* Interrupt entry/exit. */
* We are on the trampoline stack. All regs except RDI are live.
* We can do future final exit work right here.
*/
+ STACKLEAK_ERASE_NOCLOBBER
SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
jmp native_irq_return_iret
#endif
END(common_interrupt)
+_ASM_NOKPROBE(common_interrupt)
/*
* APIC interrupts.
call \do_sym /* rdi points to pt_regs */
jmp ret_from_intr
END(\sym)
+_ASM_NOKPROBE(\sym)
.endm
/* Make sure APIC interrupt handlers end up in the irqentry section: */
jmp error_exit
.endif
+_ASM_NOKPROBE(\sym)
END(\sym)
.endm
/* Opportunistic SYSRET */
sysret32_from_system_call:
+ /*
+ * We are not going to return to userspace from the trampoline
+ * stack. So let's erase the thread stack right now.
+ */
+ STACKLEAK_ERASE
TRACE_IRQS_ON /* User mode traces as IRQs on. */
movq RBX(%rsp), %rbx /* pt_regs->rbx */
movq RBP(%rsp), %rbp /* pt_regs->rbp */
CPPFLAGS_vdso.lds += -P -C
VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -soname linux-vdso.so.1 --no-undefined \
- -z max-page-size=4096 -z common-page-size=4096
+ -z max-page-size=4096
$(obj)/vdso64.so.dbg: $(obj)/vdso.lds $(vobjs) FORCE
$(call if_changed,vdso)
CPPFLAGS_vdsox32.lds = $(CPPFLAGS_vdso.lds)
VDSO_LDFLAGS_vdsox32.lds = -m elf32_x86_64 -soname linux-vdso.so.1 \
- -z max-page-size=4096 -z common-page-size=4096
+ -z max-page-size=4096
# x32-rebranded versions
vobjx32s-y := $(vobjs-y:.o=-x32.o)
if (config == -1LL)
return -EINVAL;
- /*
- * Branch tracing:
- */
- if (attr->config == PERF_COUNT_HW_BRANCH_INSTRUCTIONS &&
- !attr->freq && hwc->sample_period == 1) {
- /* BTS is not supported by this architecture. */
- if (!x86_pmu.bts_active)
- return -EOPNOTSUPP;
-
- /* BTS is currently only allowed for user-mode. */
- if (!attr->exclude_kernel)
- return -EOPNOTSUPP;
-
- /* disallow bts if conflicting events are present */
- if (x86_add_exclusive(x86_lbr_exclusive_lbr))
- return -EBUSY;
-
- event->destroy = hw_perf_lbr_event_destroy;
- }
-
hwc->config |= config;
return 0;
return handled;
}
-static bool disable_counter_freezing;
+static bool disable_counter_freezing = true;
static int __init intel_perf_counter_freezing_setup(char *s)
{
- disable_counter_freezing = true;
- pr_info("Intel PMU Counter freezing feature disabled\n");
+ bool res;
+
+ if (kstrtobool(s, &res))
+ return -EINVAL;
+
+ disable_counter_freezing = !res;
return 1;
}
-__setup("disable_counter_freezing", intel_perf_counter_freezing_setup);
+__setup("perf_v4_pmi=", intel_perf_counter_freezing_setup);
/*
* Simplified handler for Arch Perfmon v4:
static struct event_constraint *
intel_bts_constraints(struct perf_event *event)
{
- struct hw_perf_event *hwc = &event->hw;
- unsigned int hw_event, bts_event;
-
- if (event->attr.freq)
- return NULL;
-
- hw_event = hwc->config & INTEL_ARCH_EVENT_MASK;
- bts_event = x86_pmu.event_map(PERF_COUNT_HW_BRANCH_INSTRUCTIONS);
-
- if (unlikely(hw_event == bts_event && hwc->sample_period == 1))
+ if (unlikely(intel_pmu_has_bts(event)))
return &bts_constraint;
return NULL;
return flags;
}
+static int intel_pmu_bts_config(struct perf_event *event)
+{
+ struct perf_event_attr *attr = &event->attr;
+
+ if (unlikely(intel_pmu_has_bts(event))) {
+ /* BTS is not supported by this architecture. */
+ if (!x86_pmu.bts_active)
+ return -EOPNOTSUPP;
+
+ /* BTS is currently only allowed for user-mode. */
+ if (!attr->exclude_kernel)
+ return -EOPNOTSUPP;
+
+ /* BTS is not allowed for precise events. */
+ if (attr->precise_ip)
+ return -EOPNOTSUPP;
+
+ /* disallow bts if conflicting events are present */
+ if (x86_add_exclusive(x86_lbr_exclusive_lbr))
+ return -EBUSY;
+
+ event->destroy = hw_perf_lbr_event_destroy;
+ }
+
+ return 0;
+}
+
+static int core_pmu_hw_config(struct perf_event *event)
+{
+ int ret = x86_pmu_hw_config(event);
+
+ if (ret)
+ return ret;
+
+ return intel_pmu_bts_config(event);
+}
+
static int intel_pmu_hw_config(struct perf_event *event)
{
int ret = x86_pmu_hw_config(event);
+ if (ret)
+ return ret;
+
+ ret = intel_pmu_bts_config(event);
if (ret)
return ret;
/*
* BTS is set up earlier in this path, so don't account twice
*/
- if (!intel_pmu_has_bts(event)) {
+ if (!unlikely(intel_pmu_has_bts(event))) {
/* disallow lbr if conflicting events are present */
if (x86_add_exclusive(x86_lbr_exclusive_lbr))
return -EBUSY;
.enable_all = core_pmu_enable_all,
.enable = core_pmu_enable_event,
.disable = x86_pmu_disable_event,
- .hw_config = x86_pmu_hw_config,
+ .hw_config = core_pmu_hw_config,
.schedule_events = x86_schedule_events,
.eventsel = MSR_ARCH_PERFMON_EVENTSEL0,
.perfctr = MSR_ARCH_PERFMON_PERFCTR0,
}
}
- snprintf(pmu_name_str, sizeof pmu_name_str, "%s", name);
+ snprintf(pmu_name_str, sizeof(pmu_name_str), "%s", name);
if (version >= 2 && extra_attr) {
x86_pmu.format_attrs = merge_attr(intel_arch3_formats_attr,
struct intel_uncore_extra_reg shared_regs[0];
};
-#define UNCORE_BOX_FLAG_INITIATED 0
-#define UNCORE_BOX_FLAG_CTL_OFFS8 1 /* event config registers are 8-byte apart */
+/* CFL uncore 8th cbox MSRs */
+#define CFL_UNC_CBO_7_PERFEVTSEL0 0xf70
+#define CFL_UNC_CBO_7_PER_CTR0 0xf76
+
+#define UNCORE_BOX_FLAG_INITIATED 0
+/* event config registers are 8-byte apart */
+#define UNCORE_BOX_FLAG_CTL_OFFS8 1
+/* CFL 8th CBOX has different MSR space */
+#define UNCORE_BOX_FLAG_CFL8_CBOX_MSR_OFFS 2
struct uncore_event_desc {
struct kobj_attribute attr;
static inline
unsigned uncore_msr_event_ctl(struct intel_uncore_box *box, int idx)
{
- return box->pmu->type->event_ctl +
- (box->pmu->type->pair_ctr_ctl ? 2 * idx : idx) +
- uncore_msr_box_offset(box);
+ if (test_bit(UNCORE_BOX_FLAG_CFL8_CBOX_MSR_OFFS, &box->flags)) {
+ return CFL_UNC_CBO_7_PERFEVTSEL0 +
+ (box->pmu->type->pair_ctr_ctl ? 2 * idx : idx);
+ } else {
+ return box->pmu->type->event_ctl +
+ (box->pmu->type->pair_ctr_ctl ? 2 * idx : idx) +
+ uncore_msr_box_offset(box);
+ }
}
static inline
unsigned uncore_msr_perf_ctr(struct intel_uncore_box *box, int idx)
{
- return box->pmu->type->perf_ctr +
- (box->pmu->type->pair_ctr_ctl ? 2 * idx : idx) +
- uncore_msr_box_offset(box);
+ if (test_bit(UNCORE_BOX_FLAG_CFL8_CBOX_MSR_OFFS, &box->flags)) {
+ return CFL_UNC_CBO_7_PER_CTR0 +
+ (box->pmu->type->pair_ctr_ctl ? 2 * idx : idx);
+ } else {
+ return box->pmu->type->perf_ctr +
+ (box->pmu->type->pair_ctr_ctl ? 2 * idx : idx) +
+ uncore_msr_box_offset(box);
+ }
}
static inline
#define PCI_DEVICE_ID_INTEL_SKL_HQ_IMC 0x1910
#define PCI_DEVICE_ID_INTEL_SKL_SD_IMC 0x190f
#define PCI_DEVICE_ID_INTEL_SKL_SQ_IMC 0x191f
+#define PCI_DEVICE_ID_INTEL_KBL_Y_IMC 0x590c
+#define PCI_DEVICE_ID_INTEL_KBL_U_IMC 0x5904
+#define PCI_DEVICE_ID_INTEL_KBL_UQ_IMC 0x5914
+#define PCI_DEVICE_ID_INTEL_KBL_SD_IMC 0x590f
+#define PCI_DEVICE_ID_INTEL_KBL_SQ_IMC 0x591f
+#define PCI_DEVICE_ID_INTEL_CFL_2U_IMC 0x3ecc
+#define PCI_DEVICE_ID_INTEL_CFL_4U_IMC 0x3ed0
+#define PCI_DEVICE_ID_INTEL_CFL_4H_IMC 0x3e10
+#define PCI_DEVICE_ID_INTEL_CFL_6H_IMC 0x3ec4
+#define PCI_DEVICE_ID_INTEL_CFL_2S_D_IMC 0x3e0f
+#define PCI_DEVICE_ID_INTEL_CFL_4S_D_IMC 0x3e1f
+#define PCI_DEVICE_ID_INTEL_CFL_6S_D_IMC 0x3ec2
+#define PCI_DEVICE_ID_INTEL_CFL_8S_D_IMC 0x3e30
+#define PCI_DEVICE_ID_INTEL_CFL_4S_W_IMC 0x3e18
+#define PCI_DEVICE_ID_INTEL_CFL_6S_W_IMC 0x3ec6
+#define PCI_DEVICE_ID_INTEL_CFL_8S_W_IMC 0x3e31
+#define PCI_DEVICE_ID_INTEL_CFL_4S_S_IMC 0x3e33
+#define PCI_DEVICE_ID_INTEL_CFL_6S_S_IMC 0x3eca
+#define PCI_DEVICE_ID_INTEL_CFL_8S_S_IMC 0x3e32
/* SNB event control */
#define SNB_UNC_CTL_EV_SEL_MASK 0x000000ff
wrmsrl(SKL_UNC_PERF_GLOBAL_CTL,
SNB_UNC_GLOBAL_CTL_EN | SKL_UNC_GLOBAL_CTL_CORE_ALL);
}
+
+ /* The 8th CBOX has different MSR space */
+ if (box->pmu->pmu_idx == 7)
+ __set_bit(UNCORE_BOX_FLAG_CFL8_CBOX_MSR_OFFS, &box->flags);
}
static void skl_uncore_msr_enable_box(struct intel_uncore_box *box)
static struct intel_uncore_type skl_uncore_cbox = {
.name = "cbox",
.num_counters = 4,
- .num_boxes = 5,
+ .num_boxes = 8,
.perf_ctr_bits = 44,
.fixed_ctr_bits = 48,
.perf_ctr = SNB_UNC_CBO_0_PER_CTR0,
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_SKL_SQ_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
-
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_KBL_Y_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_KBL_U_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_KBL_UQ_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_KBL_SD_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_KBL_SQ_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_2U_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_4U_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_4H_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_6H_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_2S_D_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_4S_D_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_6S_D_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_8S_D_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_4S_W_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_6S_W_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_8S_W_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_4S_S_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_6S_S_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_8S_S_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
{ /* end: all zeroes */ },
};
IMC_DEV(SKL_HQ_IMC, &skl_uncore_pci_driver), /* 6th Gen Core H Quad Core */
IMC_DEV(SKL_SD_IMC, &skl_uncore_pci_driver), /* 6th Gen Core S Dual Core */
IMC_DEV(SKL_SQ_IMC, &skl_uncore_pci_driver), /* 6th Gen Core S Quad Core */
+ IMC_DEV(KBL_Y_IMC, &skl_uncore_pci_driver), /* 7th Gen Core Y */
+ IMC_DEV(KBL_U_IMC, &skl_uncore_pci_driver), /* 7th Gen Core U */
+ IMC_DEV(KBL_UQ_IMC, &skl_uncore_pci_driver), /* 7th Gen Core U Quad Core */
+ IMC_DEV(KBL_SD_IMC, &skl_uncore_pci_driver), /* 7th Gen Core S Dual Core */
+ IMC_DEV(KBL_SQ_IMC, &skl_uncore_pci_driver), /* 7th Gen Core S Quad Core */
+ IMC_DEV(CFL_2U_IMC, &skl_uncore_pci_driver), /* 8th Gen Core U 2 Cores */
+ IMC_DEV(CFL_4U_IMC, &skl_uncore_pci_driver), /* 8th Gen Core U 4 Cores */
+ IMC_DEV(CFL_4H_IMC, &skl_uncore_pci_driver), /* 8th Gen Core H 4 Cores */
+ IMC_DEV(CFL_6H_IMC, &skl_uncore_pci_driver), /* 8th Gen Core H 6 Cores */
+ IMC_DEV(CFL_2S_D_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 2 Cores Desktop */
+ IMC_DEV(CFL_4S_D_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 4 Cores Desktop */
+ IMC_DEV(CFL_6S_D_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 6 Cores Desktop */
+ IMC_DEV(CFL_8S_D_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 8 Cores Desktop */
+ IMC_DEV(CFL_4S_W_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 4 Cores Work Station */
+ IMC_DEV(CFL_6S_W_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 6 Cores Work Station */
+ IMC_DEV(CFL_8S_W_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 8 Cores Work Station */
+ IMC_DEV(CFL_4S_S_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 4 Cores Server */
+ IMC_DEV(CFL_6S_S_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 6 Cores Server */
+ IMC_DEV(CFL_8S_S_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 8 Cores Server */
{ /* end marker */ }
};
static inline bool intel_pmu_has_bts(struct perf_event *event)
{
- if (event->attr.config == PERF_COUNT_HW_BRANCH_INSTRUCTIONS &&
- !event->attr.freq && event->hw.sample_period == 1)
- return true;
+ struct hw_perf_event *hwc = &event->hw;
+ unsigned int hw_event, bts_event;
+
+ if (event->attr.freq)
+ return false;
+
+ hw_event = hwc->config & INTEL_ARCH_EVENT_MASK;
+ bts_event = x86_pmu.event_map(PERF_COUNT_HW_BRANCH_INSTRUCTIONS);
- return false;
+ return hw_event == bts_event && hwc->sample_period == 1;
}
int intel_pmu_save_and_restart(struct perf_event *event);
*/
if (boot_params->sentinel) {
/* fields in boot_params are left uninitialized, clear them */
+ boot_params->acpi_rsdp_addr = 0;
memset(&boot_params->ext_ramdisk_image, 0,
(char *)&boot_params->efi_info -
(char *)&boot_params->ext_ramdisk_image);
return false;
}
-static inline bool in_compat_syscall(void)
+static inline bool in_32bit_syscall(void)
{
return in_ia32_syscall() || in_x32_syscall();
}
+
+#ifdef CONFIG_COMPAT
+static inline bool in_compat_syscall(void)
+{
+ return in_32bit_syscall();
+}
#define in_compat_syscall in_compat_syscall /* override the generic impl */
+#endif
struct compat_siginfo;
int __copy_siginfo_to_user32(struct compat_siginfo __user *to,
#define X86_FEATURE_LA57 (16*32+16) /* 5-level page tables */
#define X86_FEATURE_RDPID (16*32+22) /* RDPID instruction */
#define X86_FEATURE_CLDEMOTE (16*32+25) /* CLDEMOTE instruction */
+#define X86_FEATURE_MOVDIRI (16*32+27) /* MOVDIRI instruction */
+#define X86_FEATURE_MOVDIR64B (16*32+28) /* MOVDIR64B instruction */
/* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */
#define X86_FEATURE_OVERFLOW_RECOV (17*32+ 0) /* MCA overflow recovery support */
"3: movl $-2,%[err]\n\t" \
"jmp 2b\n\t" \
".popsection\n\t" \
- _ASM_EXTABLE_UA(1b, 3b) \
+ _ASM_EXTABLE(1b, 3b) \
: [err] "=r" (err) \
: "D" (st), "m" (*st), "a" (lmask), "d" (hmask) \
: "memory")
#define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS 1
static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
{
- if (in_compat_syscall())
- return true;
- return false;
+ return in_32bit_syscall();
}
#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_IA32_EMULATION */
#endif /* !COMPILE_OFFSETS */
bool (*has_wbinvd_exit)(void);
u64 (*read_l1_tsc_offset)(struct kvm_vcpu *vcpu);
- void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
+ /* Returns actual tsc_offset set in active VMCS */
+ u64 (*write_l1_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
void (*get_exit_info)(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2);
int mce_available(struct cpuinfo_x86 *c);
bool mce_is_memory_error(struct mce *m);
+bool mce_is_correctable(struct mce *m);
+int mce_usable_address(struct mce *m);
DECLARE_PER_CPU(unsigned, mce_exception_count);
DECLARE_PER_CPU(unsigned, mce_poll_count);
: "cc");
}
#endif
- return hv_status;
+ return hv_status;
}
/*
#define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */
#define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */
-#define SPEC_CTRL_STIBP (1 << 1) /* Single Thread Indirect Branch Predictors */
+#define SPEC_CTRL_STIBP_SHIFT 1 /* Single Thread Indirect Branch Predictor (STIBP) bit */
+#define SPEC_CTRL_STIBP (1 << SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
#define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */
-#define SPEC_CTRL_SSBD (1 << SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
+#define SPEC_CTRL_SSBD (1 << SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */
#define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */
#ifndef _ASM_X86_NOSPEC_BRANCH_H_
#define _ASM_X86_NOSPEC_BRANCH_H_
+#include <linux/static_key.h>
+
#include <asm/alternative.h>
#include <asm/alternative-asm.h>
#include <asm/cpufeatures.h>
_ASM_PTR " 999b\n\t" \
".popsection\n\t"
-#if defined(CONFIG_X86_64) && defined(RETPOLINE)
+#ifdef CONFIG_RETPOLINE
+#ifdef CONFIG_X86_64
/*
- * Since the inline asm uses the %V modifier which is only in newer GCC,
- * the 64-bit one is dependent on RETPOLINE not CONFIG_RETPOLINE.
+ * Inline asm uses the %V modifier which is only in newer GCC
+ * which is ensured when CONFIG_RETPOLINE is defined.
*/
# define CALL_NOSPEC \
ANNOTATE_NOSPEC_ALTERNATIVE \
X86_FEATURE_RETPOLINE_AMD)
# define THUNK_TARGET(addr) [thunk_target] "r" (addr)
-#elif defined(CONFIG_X86_32) && defined(CONFIG_RETPOLINE)
+#else /* CONFIG_X86_32 */
/*
* For i386 we use the original ret-equivalent retpoline, because
* otherwise we'll run out of registers. We don't care about CET
X86_FEATURE_RETPOLINE_AMD)
# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
+#endif
#else /* No retpoline for C / inline asm */
# define CALL_NOSPEC "call *%[thunk_target]\n"
# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
/* The Spectre V2 mitigation variants */
enum spectre_v2_mitigation {
SPECTRE_V2_NONE,
- SPECTRE_V2_RETPOLINE_MINIMAL,
- SPECTRE_V2_RETPOLINE_MINIMAL_AMD,
SPECTRE_V2_RETPOLINE_GENERIC,
SPECTRE_V2_RETPOLINE_AMD,
SPECTRE_V2_IBRS_ENHANCED,
};
+/* The indirect branch speculation control variants */
+enum spectre_v2_user_mitigation {
+ SPECTRE_V2_USER_NONE,
+ SPECTRE_V2_USER_STRICT,
+ SPECTRE_V2_USER_PRCTL,
+ SPECTRE_V2_USER_SECCOMP,
+};
+
/* The Speculative Store Bypass disable variants */
enum ssb_mitigation {
SPEC_STORE_BYPASS_NONE,
preempt_enable(); \
} while (0)
+DECLARE_STATIC_KEY_FALSE(switch_to_cond_stibp);
+DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
+DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+
#endif /* __ASSEMBLY__ */
/*
/*
* Set __PAGE_OFFSET to the most negative possible address +
- * PGDIR_SIZE*16 (pgd slot 272). The gap is to allow a space for a
- * hypervisor to fit. Choosing 16 slots here is arbitrary, but it's
- * what Xen requires.
+ * PGDIR_SIZE*17 (pgd slot 273).
+ *
+ * The gap is to allow a space for LDT remap for PTI (1 pgd slot) and space for
+ * a hypervisor (16 slots). Choosing 16 slots for a hypervisor is arbitrary,
+ * but it's what Xen requires.
*/
-#define __PAGE_OFFSET_BASE_L5 _AC(0xff10000000000000, UL)
-#define __PAGE_OFFSET_BASE_L4 _AC(0xffff880000000000, UL)
+#define __PAGE_OFFSET_BASE_L5 _AC(0xff11000000000000, UL)
+#define __PAGE_OFFSET_BASE_L4 _AC(0xffff888000000000, UL)
#ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
#define __PAGE_OFFSET page_offset_base
__visible extern const char start_##ops##_##name[], end_##ops##_##name[]; \
asm(NATIVE_LABEL("start_", ops, name) code NATIVE_LABEL("end_", ops, name))
-unsigned paravirt_patch_ident_32(void *insnbuf, unsigned len);
unsigned paravirt_patch_ident_64(void *insnbuf, unsigned len);
unsigned paravirt_patch_default(u8 type, void *insnbuf,
unsigned long addr, unsigned len);
void paravirt_flush_lazy_mmu(void);
void _paravirt_nop(void);
-u32 _paravirt_ident_32(u32);
u64 _paravirt_ident_64(u64);
#define paravirt_nop ((void *)_paravirt_nop)
*/
#define MAXMEM (1UL << MAX_PHYSMEM_BITS)
-#define LDT_PGD_ENTRY_L4 -3UL
-#define LDT_PGD_ENTRY_L5 -112UL
-#define LDT_PGD_ENTRY (pgtable_l5_enabled() ? LDT_PGD_ENTRY_L5 : LDT_PGD_ENTRY_L4)
+#define LDT_PGD_ENTRY -240UL
#define LDT_BASE_ADDR (LDT_PGD_ENTRY << PGDIR_SHIFT)
#define LDT_END_ADDR (LDT_BASE_ADDR + PGDIR_SIZE)
#define queued_fetch_set_pending_acquire queued_fetch_set_pending_acquire
static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lock)
{
- u32 val = 0;
-
- if (GEN_BINARY_RMWcc(LOCK_PREFIX "btsl", lock->val.counter, c,
- "I", _Q_PENDING_OFFSET))
- val |= _Q_PENDING_VAL;
+ u32 val;
+ /*
+ * We can't use GEN_BINARY_RMWcc() inside an if() stmt because asm goto
+ * and CONFIG_PROFILE_ALL_BRANCHES=y results in a label inside a
+ * statement expression, which GCC doesn't like.
+ */
+ val = GEN_BINARY_RMWcc(LOCK_PREFIX "btsl", lock->val.counter, c,
+ "I", _Q_PENDING_OFFSET) * _Q_PENDING_VAL;
val |= atomic_read(&lock->val) & ~_Q_PENDING_MASK;
return val;
return (tifn & _TIF_SSBD) >> (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT);
}
+static inline u64 stibp_tif_to_spec_ctrl(u64 tifn)
+{
+ BUILD_BUG_ON(TIF_SPEC_IB < SPEC_CTRL_STIBP_SHIFT);
+ return (tifn & _TIF_SPEC_IB) >> (TIF_SPEC_IB - SPEC_CTRL_STIBP_SHIFT);
+}
+
static inline unsigned long ssbd_spec_ctrl_to_tif(u64 spec_ctrl)
{
BUILD_BUG_ON(TIF_SSBD < SPEC_CTRL_SSBD_SHIFT);
return (spec_ctrl & SPEC_CTRL_SSBD) << (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT);
}
+static inline unsigned long stibp_spec_ctrl_to_tif(u64 spec_ctrl)
+{
+ BUILD_BUG_ON(TIF_SPEC_IB < SPEC_CTRL_STIBP_SHIFT);
+ return (spec_ctrl & SPEC_CTRL_STIBP) << (TIF_SPEC_IB - SPEC_CTRL_STIBP_SHIFT);
+}
+
static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn)
{
return (tifn & _TIF_SSBD) ? x86_amd_ls_cfg_ssbd_mask : 0ULL;
static inline void speculative_store_bypass_ht_init(void) { }
#endif
-extern void speculative_store_bypass_update(unsigned long tif);
-
-static inline void speculative_store_bypass_update_current(void)
-{
- speculative_store_bypass_update(current_thread_info()->flags);
-}
+extern void speculation_ctrl_update(unsigned long tif);
+extern void speculation_ctrl_update_current(void);
#endif
__visible struct task_struct *__switch_to(struct task_struct *prev,
struct task_struct *next);
-struct tss_struct;
-void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
- struct tss_struct *tss);
/* This runs runs on the previous thread's stack. */
static inline void prepare_switch_to(struct task_struct *next)
#define TIF_SIGPENDING 2 /* signal pending */
#define TIF_NEED_RESCHED 3 /* rescheduling necessary */
#define TIF_SINGLESTEP 4 /* reenable singlestep on user return*/
-#define TIF_SSBD 5 /* Reduced data speculation */
+#define TIF_SSBD 5 /* Speculative store bypass disable */
#define TIF_SYSCALL_EMU 6 /* syscall emulation active */
#define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */
#define TIF_SECCOMP 8 /* secure computing */
+#define TIF_SPEC_IB 9 /* Indirect branch speculation mitigation */
+#define TIF_SPEC_FORCE_UPDATE 10 /* Force speculation MSR update in context switch */
#define TIF_USER_RETURN_NOTIFY 11 /* notify kernel of userspace return */
#define TIF_UPROBE 12 /* breakpointed or singlestepping */
#define TIF_PATCH_PENDING 13 /* pending live patching update */
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
#define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
#define _TIF_SECCOMP (1 << TIF_SECCOMP)
+#define _TIF_SPEC_IB (1 << TIF_SPEC_IB)
+#define _TIF_SPEC_FORCE_UPDATE (1 << TIF_SPEC_FORCE_UPDATE)
#define _TIF_USER_RETURN_NOTIFY (1 << TIF_USER_RETURN_NOTIFY)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_PATCH_PENDING (1 << TIF_PATCH_PENDING)
_TIF_FSCHECK)
/* flags to check in __switch_to() */
-#define _TIF_WORK_CTXSW \
- (_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP|_TIF_SSBD)
+#define _TIF_WORK_CTXSW_BASE \
+ (_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP| \
+ _TIF_SSBD | _TIF_SPEC_FORCE_UPDATE)
+
+/*
+ * Avoid calls to __switch_to_xtra() on UP as STIBP is not evaluated.
+ */
+#ifdef CONFIG_SMP
+# define _TIF_WORK_CTXSW (_TIF_WORK_CTXSW_BASE | _TIF_SPEC_IB)
+#else
+# define _TIF_WORK_CTXSW (_TIF_WORK_CTXSW_BASE)
+#endif
#define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY)
#define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW)
#define LOADED_MM_SWITCHING ((struct mm_struct *)1)
+ /* Last user mm for optimizing IBPB */
+ union {
+ struct mm_struct *last_user_mm;
+ unsigned long last_user_mm_ibpb;
+ };
+
u16 loaded_mm_asid;
u16 next_asid;
- /* last user mm's ctx id */
- u64 last_ctx_id;
/*
* We can be in one of several states:
*/
static inline void __flush_tlb_all(void)
{
+ /*
+ * This is to catch users with enabled preemption and the PGE feature
+ * and don't trigger the warning in __native_flush_tlb().
+ */
+ VM_WARN_ON_ONCE(preemptible());
+
if (boot_cpu_has(X86_FEATURE_PGE)) {
__flush_tlb_global();
} else {
extern void x86_init_uint_noop(unsigned int unused);
extern bool x86_pnpbios_disabled(void);
-void x86_verify_bootdata_version(void);
-
#endif
#include <linux/mm.h>
#include <linux/device.h>
-#include <linux/uaccess.h>
+#include <asm/extable.h>
#include <asm/page.h>
#include <asm/pgtable.h>
*/
static inline int xen_safe_write_ulong(unsigned long *addr, unsigned long val)
{
- return __put_user(val, (unsigned long __user *)addr);
+ int ret = 0;
+
+ asm volatile("1: mov %[val], %[ptr]\n"
+ "2:\n"
+ ".section .fixup, \"ax\"\n"
+ "3: sub $1, %[ret]\n"
+ " jmp 2b\n"
+ ".previous\n"
+ _ASM_EXTABLE(1b, 3b)
+ : [ret] "+r" (ret), [ptr] "=m" (*addr)
+ : [val] "r" (val));
+
+ return ret;
}
-static inline int xen_safe_read_ulong(unsigned long *addr, unsigned long *val)
+static inline int xen_safe_read_ulong(const unsigned long *addr,
+ unsigned long *val)
{
- return __get_user(*val, (unsigned long __user *)addr);
+ int ret = 0;
+ unsigned long rval = ~0ul;
+
+ asm volatile("1: mov %[ptr], %[rval]\n"
+ "2:\n"
+ ".section .fixup, \"ax\"\n"
+ "3: sub $1, %[ret]\n"
+ " jmp 2b\n"
+ ".previous\n"
+ _ASM_EXTABLE(1b, 3b)
+ : [ret] "+r" (ret), [rval] "+r" (rval)
+ : [ptr] "m" (*addr));
+ *val = rval;
+
+ return ret;
}
#ifdef CONFIG_XEN_PV
#define RAMDISK_PROMPT_FLAG 0x8000
#define RAMDISK_LOAD_FLAG 0x4000
-/* version flags */
-#define VERSION_WRITTEN 0x8000
-
/* loadflags */
#define LOADED_HIGH (1<<0)
#define KASLR_FLAG (1<<1)
__u64 pref_address;
__u32 init_size;
__u32 handover_offset;
- __u64 acpi_rsdp_addr;
} __attribute__((packed));
struct sys_desc_table {
__u8 _pad2[4]; /* 0x054 */
__u64 tboot_addr; /* 0x058 */
struct ist_info ist_info; /* 0x060 */
- __u8 _pad3[16]; /* 0x070 */
+ __u64 acpi_rsdp_addr; /* 0x070 */
+ __u8 _pad3[8]; /* 0x078 */
__u8 hd0_info[16]; /* obsolete! */ /* 0x080 */
__u8 hd1_info[16]; /* obsolete! */ /* 0x090 */
struct sys_desc_table sys_desc_table; /* obsolete! */ /* 0x0a0 */
u64 x86_default_get_root_pointer(void)
{
- return boot_params.hdr.acpi_rsdp_addr;
+ return boot_params.acpi_rsdp_addr;
}
#include <linux/module.h>
#include <linux/nospec.h>
#include <linux/prctl.h>
+#include <linux/sched/smt.h>
#include <asm/spec-ctrl.h>
#include <asm/cmdline.h>
u64 __ro_after_init x86_amd_ls_cfg_base;
u64 __ro_after_init x86_amd_ls_cfg_ssbd_mask;
+/* Control conditional STIPB in switch_to() */
+DEFINE_STATIC_KEY_FALSE(switch_to_cond_stibp);
+/* Control conditional IBPB in switch_mm() */
+DEFINE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
+/* Control unconditional IBPB in switch_mm() */
+DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+
void __init check_bugs(void)
{
identify_boot_cpu();
#endif
}
-/* The kernel command line selection */
-enum spectre_v2_mitigation_cmd {
- SPECTRE_V2_CMD_NONE,
- SPECTRE_V2_CMD_AUTO,
- SPECTRE_V2_CMD_FORCE,
- SPECTRE_V2_CMD_RETPOLINE,
- SPECTRE_V2_CMD_RETPOLINE_GENERIC,
- SPECTRE_V2_CMD_RETPOLINE_AMD,
-};
-
-static const char *spectre_v2_strings[] = {
- [SPECTRE_V2_NONE] = "Vulnerable",
- [SPECTRE_V2_RETPOLINE_MINIMAL] = "Vulnerable: Minimal generic ASM retpoline",
- [SPECTRE_V2_RETPOLINE_MINIMAL_AMD] = "Vulnerable: Minimal AMD ASM retpoline",
- [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline",
- [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline",
- [SPECTRE_V2_IBRS_ENHANCED] = "Mitigation: Enhanced IBRS",
-};
-
-#undef pr_fmt
-#define pr_fmt(fmt) "Spectre V2 : " fmt
-
-static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init =
- SPECTRE_V2_NONE;
-
void
x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
{
static_cpu_has(X86_FEATURE_AMD_SSBD))
hostval |= ssbd_tif_to_spec_ctrl(ti->flags);
+ /* Conditional STIBP enabled? */
+ if (static_branch_unlikely(&switch_to_cond_stibp))
+ hostval |= stibp_tif_to_spec_ctrl(ti->flags);
+
if (hostval != guestval) {
msrval = setguest ? guestval : hostval;
wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
tif = setguest ? ssbd_spec_ctrl_to_tif(guestval) :
ssbd_spec_ctrl_to_tif(hostval);
- speculative_store_bypass_update(tif);
+ speculation_ctrl_update(tif);
}
}
EXPORT_SYMBOL_GPL(x86_virt_spec_ctrl);
wrmsrl(MSR_AMD64_LS_CFG, msrval);
}
+#undef pr_fmt
+#define pr_fmt(fmt) "Spectre V2 : " fmt
+
+static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init =
+ SPECTRE_V2_NONE;
+
+static enum spectre_v2_user_mitigation spectre_v2_user __ro_after_init =
+ SPECTRE_V2_USER_NONE;
+
#ifdef RETPOLINE
static bool spectre_v2_bad_module;
static inline const char *spectre_v2_module_string(void) { return ""; }
#endif
-static void __init spec2_print_if_insecure(const char *reason)
+static inline bool match_option(const char *arg, int arglen, const char *opt)
{
- if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
- pr_info("%s selected on command line.\n", reason);
+ int len = strlen(opt);
+
+ return len == arglen && !strncmp(arg, opt, len);
}
-static void __init spec2_print_if_secure(const char *reason)
+/* The kernel command line selection for spectre v2 */
+enum spectre_v2_mitigation_cmd {
+ SPECTRE_V2_CMD_NONE,
+ SPECTRE_V2_CMD_AUTO,
+ SPECTRE_V2_CMD_FORCE,
+ SPECTRE_V2_CMD_RETPOLINE,
+ SPECTRE_V2_CMD_RETPOLINE_GENERIC,
+ SPECTRE_V2_CMD_RETPOLINE_AMD,
+};
+
+enum spectre_v2_user_cmd {
+ SPECTRE_V2_USER_CMD_NONE,
+ SPECTRE_V2_USER_CMD_AUTO,
+ SPECTRE_V2_USER_CMD_FORCE,
+ SPECTRE_V2_USER_CMD_PRCTL,
+ SPECTRE_V2_USER_CMD_PRCTL_IBPB,
+ SPECTRE_V2_USER_CMD_SECCOMP,
+ SPECTRE_V2_USER_CMD_SECCOMP_IBPB,
+};
+
+static const char * const spectre_v2_user_strings[] = {
+ [SPECTRE_V2_USER_NONE] = "User space: Vulnerable",
+ [SPECTRE_V2_USER_STRICT] = "User space: Mitigation: STIBP protection",
+ [SPECTRE_V2_USER_PRCTL] = "User space: Mitigation: STIBP via prctl",
+ [SPECTRE_V2_USER_SECCOMP] = "User space: Mitigation: STIBP via seccomp and prctl",
+};
+
+static const struct {
+ const char *option;
+ enum spectre_v2_user_cmd cmd;
+ bool secure;
+} v2_user_options[] __initdata = {
+ { "auto", SPECTRE_V2_USER_CMD_AUTO, false },
+ { "off", SPECTRE_V2_USER_CMD_NONE, false },
+ { "on", SPECTRE_V2_USER_CMD_FORCE, true },
+ { "prctl", SPECTRE_V2_USER_CMD_PRCTL, false },
+ { "prctl,ibpb", SPECTRE_V2_USER_CMD_PRCTL_IBPB, false },
+ { "seccomp", SPECTRE_V2_USER_CMD_SECCOMP, false },
+ { "seccomp,ibpb", SPECTRE_V2_USER_CMD_SECCOMP_IBPB, false },
+};
+
+static void __init spec_v2_user_print_cond(const char *reason, bool secure)
{
- if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
- pr_info("%s selected on command line.\n", reason);
+ if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2) != secure)
+ pr_info("spectre_v2_user=%s forced on command line.\n", reason);
}
-static inline bool retp_compiler(void)
+static enum spectre_v2_user_cmd __init
+spectre_v2_parse_user_cmdline(enum spectre_v2_mitigation_cmd v2_cmd)
{
- return __is_defined(RETPOLINE);
+ char arg[20];
+ int ret, i;
+
+ switch (v2_cmd) {
+ case SPECTRE_V2_CMD_NONE:
+ return SPECTRE_V2_USER_CMD_NONE;
+ case SPECTRE_V2_CMD_FORCE:
+ return SPECTRE_V2_USER_CMD_FORCE;
+ default:
+ break;
+ }
+
+ ret = cmdline_find_option(boot_command_line, "spectre_v2_user",
+ arg, sizeof(arg));
+ if (ret < 0)
+ return SPECTRE_V2_USER_CMD_AUTO;
+
+ for (i = 0; i < ARRAY_SIZE(v2_user_options); i++) {
+ if (match_option(arg, ret, v2_user_options[i].option)) {
+ spec_v2_user_print_cond(v2_user_options[i].option,
+ v2_user_options[i].secure);
+ return v2_user_options[i].cmd;
+ }
+ }
+
+ pr_err("Unknown user space protection option (%s). Switching to AUTO select\n", arg);
+ return SPECTRE_V2_USER_CMD_AUTO;
}
-static inline bool match_option(const char *arg, int arglen, const char *opt)
+static void __init
+spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd)
{
- int len = strlen(opt);
+ enum spectre_v2_user_mitigation mode = SPECTRE_V2_USER_NONE;
+ bool smt_possible = IS_ENABLED(CONFIG_SMP);
+ enum spectre_v2_user_cmd cmd;
- return len == arglen && !strncmp(arg, opt, len);
+ if (!boot_cpu_has(X86_FEATURE_IBPB) && !boot_cpu_has(X86_FEATURE_STIBP))
+ return;
+
+ if (cpu_smt_control == CPU_SMT_FORCE_DISABLED ||
+ cpu_smt_control == CPU_SMT_NOT_SUPPORTED)
+ smt_possible = false;
+
+ cmd = spectre_v2_parse_user_cmdline(v2_cmd);
+ switch (cmd) {
+ case SPECTRE_V2_USER_CMD_NONE:
+ goto set_mode;
+ case SPECTRE_V2_USER_CMD_FORCE:
+ mode = SPECTRE_V2_USER_STRICT;
+ break;
+ case SPECTRE_V2_USER_CMD_PRCTL:
+ case SPECTRE_V2_USER_CMD_PRCTL_IBPB:
+ mode = SPECTRE_V2_USER_PRCTL;
+ break;
+ case SPECTRE_V2_USER_CMD_AUTO:
+ case SPECTRE_V2_USER_CMD_SECCOMP:
+ case SPECTRE_V2_USER_CMD_SECCOMP_IBPB:
+ if (IS_ENABLED(CONFIG_SECCOMP))
+ mode = SPECTRE_V2_USER_SECCOMP;
+ else
+ mode = SPECTRE_V2_USER_PRCTL;
+ break;
+ }
+
+ /* Initialize Indirect Branch Prediction Barrier */
+ if (boot_cpu_has(X86_FEATURE_IBPB)) {
+ setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
+
+ switch (cmd) {
+ case SPECTRE_V2_USER_CMD_FORCE:
+ case SPECTRE_V2_USER_CMD_PRCTL_IBPB:
+ case SPECTRE_V2_USER_CMD_SECCOMP_IBPB:
+ static_branch_enable(&switch_mm_always_ibpb);
+ break;
+ case SPECTRE_V2_USER_CMD_PRCTL:
+ case SPECTRE_V2_USER_CMD_AUTO:
+ case SPECTRE_V2_USER_CMD_SECCOMP:
+ static_branch_enable(&switch_mm_cond_ibpb);
+ break;
+ default:
+ break;
+ }
+
+ pr_info("mitigation: Enabling %s Indirect Branch Prediction Barrier\n",
+ static_key_enabled(&switch_mm_always_ibpb) ?
+ "always-on" : "conditional");
+ }
+
+ /* If enhanced IBRS is enabled no STIPB required */
+ if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+ return;
+
+ /*
+ * If SMT is not possible or STIBP is not available clear the STIPB
+ * mode.
+ */
+ if (!smt_possible || !boot_cpu_has(X86_FEATURE_STIBP))
+ mode = SPECTRE_V2_USER_NONE;
+set_mode:
+ spectre_v2_user = mode;
+ /* Only print the STIBP mode when SMT possible */
+ if (smt_possible)
+ pr_info("%s\n", spectre_v2_user_strings[mode]);
}
+static const char * const spectre_v2_strings[] = {
+ [SPECTRE_V2_NONE] = "Vulnerable",
+ [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline",
+ [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline",
+ [SPECTRE_V2_IBRS_ENHANCED] = "Mitigation: Enhanced IBRS",
+};
+
static const struct {
const char *option;
enum spectre_v2_mitigation_cmd cmd;
bool secure;
-} mitigation_options[] = {
- { "off", SPECTRE_V2_CMD_NONE, false },
- { "on", SPECTRE_V2_CMD_FORCE, true },
- { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false },
- { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false },
- { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false },
- { "auto", SPECTRE_V2_CMD_AUTO, false },
+} mitigation_options[] __initdata = {
+ { "off", SPECTRE_V2_CMD_NONE, false },
+ { "on", SPECTRE_V2_CMD_FORCE, true },
+ { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false },
+ { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false },
+ { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false },
+ { "auto", SPECTRE_V2_CMD_AUTO, false },
};
+static void __init spec_v2_print_cond(const char *reason, bool secure)
+{
+ if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2) != secure)
+ pr_info("%s selected on command line.\n", reason);
+}
+
static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
{
+ enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO;
char arg[20];
int ret, i;
- enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO;
if (cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
return SPECTRE_V2_CMD_NONE;
- else {
- ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
- if (ret < 0)
- return SPECTRE_V2_CMD_AUTO;
- for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) {
- if (!match_option(arg, ret, mitigation_options[i].option))
- continue;
- cmd = mitigation_options[i].cmd;
- break;
- }
+ ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
+ if (ret < 0)
+ return SPECTRE_V2_CMD_AUTO;
- if (i >= ARRAY_SIZE(mitigation_options)) {
- pr_err("unknown option (%s). Switching to AUTO select\n", arg);
- return SPECTRE_V2_CMD_AUTO;
- }
+ for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) {
+ if (!match_option(arg, ret, mitigation_options[i].option))
+ continue;
+ cmd = mitigation_options[i].cmd;
+ break;
+ }
+
+ if (i >= ARRAY_SIZE(mitigation_options)) {
+ pr_err("unknown option (%s). Switching to AUTO select\n", arg);
+ return SPECTRE_V2_CMD_AUTO;
}
if ((cmd == SPECTRE_V2_CMD_RETPOLINE ||
return SPECTRE_V2_CMD_AUTO;
}
- if (mitigation_options[i].secure)
- spec2_print_if_secure(mitigation_options[i].option);
- else
- spec2_print_if_insecure(mitigation_options[i].option);
-
+ spec_v2_print_cond(mitigation_options[i].option,
+ mitigation_options[i].secure);
return cmd;
}
-static bool stibp_needed(void)
-{
- if (spectre_v2_enabled == SPECTRE_V2_NONE)
- return false;
-
- if (!boot_cpu_has(X86_FEATURE_STIBP))
- return false;
-
- return true;
-}
-
-static void update_stibp_msr(void *info)
-{
- wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
-}
-
-void arch_smt_update(void)
-{
- u64 mask;
-
- if (!stibp_needed())
- return;
-
- mutex_lock(&spec_ctrl_mutex);
- mask = x86_spec_ctrl_base;
- if (cpu_smt_control == CPU_SMT_ENABLED)
- mask |= SPEC_CTRL_STIBP;
- else
- mask &= ~SPEC_CTRL_STIBP;
-
- if (mask != x86_spec_ctrl_base) {
- pr_info("Spectre v2 cross-process SMT mitigation: %s STIBP\n",
- cpu_smt_control == CPU_SMT_ENABLED ?
- "Enabling" : "Disabling");
- x86_spec_ctrl_base = mask;
- on_each_cpu(update_stibp_msr, NULL, 1);
- }
- mutex_unlock(&spec_ctrl_mutex);
-}
-
static void __init spectre_v2_select_mitigation(void)
{
enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
pr_err("Spectre mitigation: LFENCE not serializing, switching to generic retpoline\n");
goto retpoline_generic;
}
- mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_AMD :
- SPECTRE_V2_RETPOLINE_MINIMAL_AMD;
+ mode = SPECTRE_V2_RETPOLINE_AMD;
setup_force_cpu_cap(X86_FEATURE_RETPOLINE_AMD);
setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
} else {
retpoline_generic:
- mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_GENERIC :
- SPECTRE_V2_RETPOLINE_MINIMAL;
+ mode = SPECTRE_V2_RETPOLINE_GENERIC;
setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
}
setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
- /* Initialize Indirect Branch Prediction Barrier if supported */
- if (boot_cpu_has(X86_FEATURE_IBPB)) {
- setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
- pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n");
- }
-
/*
* Retpoline means the kernel is safe because it has no indirect
* branches. Enhanced IBRS protects firmware too, so, enable restricted
pr_info("Enabling Restricted Speculation for firmware calls\n");
}
+ /* Set up IBPB and STIBP depending on the general spectre V2 command */
+ spectre_v2_user_select_mitigation(cmd);
+
/* Enable STIBP if appropriate */
arch_smt_update();
}
+static void update_stibp_msr(void * __unused)
+{
+ wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+}
+
+/* Update x86_spec_ctrl_base in case SMT state changed. */
+static void update_stibp_strict(void)
+{
+ u64 mask = x86_spec_ctrl_base & ~SPEC_CTRL_STIBP;
+
+ if (sched_smt_active())
+ mask |= SPEC_CTRL_STIBP;
+
+ if (mask == x86_spec_ctrl_base)
+ return;
+
+ pr_info("Update user space SMT mitigation: STIBP %s\n",
+ mask & SPEC_CTRL_STIBP ? "always-on" : "off");
+ x86_spec_ctrl_base = mask;
+ on_each_cpu(update_stibp_msr, NULL, 1);
+}
+
+/* Update the static key controlling the evaluation of TIF_SPEC_IB */
+static void update_indir_branch_cond(void)
+{
+ if (sched_smt_active())
+ static_branch_enable(&switch_to_cond_stibp);
+ else
+ static_branch_disable(&switch_to_cond_stibp);
+}
+
+void arch_smt_update(void)
+{
+ /* Enhanced IBRS implies STIBP. No update required. */
+ if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+ return;
+
+ mutex_lock(&spec_ctrl_mutex);
+
+ switch (spectre_v2_user) {
+ case SPECTRE_V2_USER_NONE:
+ break;
+ case SPECTRE_V2_USER_STRICT:
+ update_stibp_strict();
+ break;
+ case SPECTRE_V2_USER_PRCTL:
+ case SPECTRE_V2_USER_SECCOMP:
+ update_indir_branch_cond();
+ break;
+ }
+
+ mutex_unlock(&spec_ctrl_mutex);
+}
+
#undef pr_fmt
#define pr_fmt(fmt) "Speculative Store Bypass: " fmt
SPEC_STORE_BYPASS_CMD_SECCOMP,
};
-static const char *ssb_strings[] = {
+static const char * const ssb_strings[] = {
[SPEC_STORE_BYPASS_NONE] = "Vulnerable",
[SPEC_STORE_BYPASS_DISABLE] = "Mitigation: Speculative Store Bypass disabled",
[SPEC_STORE_BYPASS_PRCTL] = "Mitigation: Speculative Store Bypass disabled via prctl",
static const struct {
const char *option;
enum ssb_mitigation_cmd cmd;
-} ssb_mitigation_options[] = {
+} ssb_mitigation_options[] __initdata = {
{ "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */
{ "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */
{ "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */
#undef pr_fmt
#define pr_fmt(fmt) "Speculation prctl: " fmt
-static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
+static void task_update_spec_tif(struct task_struct *tsk)
{
- bool update;
+ /* Force the update of the real TIF bits */
+ set_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE);
+ /*
+ * Immediately update the speculation control MSRs for the current
+ * task, but for a non-current task delay setting the CPU
+ * mitigation until it is scheduled next.
+ *
+ * This can only happen for SECCOMP mitigation. For PRCTL it's
+ * always the current task.
+ */
+ if (tsk == current)
+ speculation_ctrl_update_current();
+}
+
+static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
+{
if (ssb_mode != SPEC_STORE_BYPASS_PRCTL &&
ssb_mode != SPEC_STORE_BYPASS_SECCOMP)
return -ENXIO;
if (task_spec_ssb_force_disable(task))
return -EPERM;
task_clear_spec_ssb_disable(task);
- update = test_and_clear_tsk_thread_flag(task, TIF_SSBD);
+ task_update_spec_tif(task);
break;
case PR_SPEC_DISABLE:
task_set_spec_ssb_disable(task);
- update = !test_and_set_tsk_thread_flag(task, TIF_SSBD);
+ task_update_spec_tif(task);
break;
case PR_SPEC_FORCE_DISABLE:
task_set_spec_ssb_disable(task);
task_set_spec_ssb_force_disable(task);
- update = !test_and_set_tsk_thread_flag(task, TIF_SSBD);
+ task_update_spec_tif(task);
break;
default:
return -ERANGE;
}
+ return 0;
+}
- /*
- * If being set on non-current task, delay setting the CPU
- * mitigation until it is next scheduled.
- */
- if (task == current && update)
- speculative_store_bypass_update_current();
-
+static int ib_prctl_set(struct task_struct *task, unsigned long ctrl)
+{
+ switch (ctrl) {
+ case PR_SPEC_ENABLE:
+ if (spectre_v2_user == SPECTRE_V2_USER_NONE)
+ return 0;
+ /*
+ * Indirect branch speculation is always disabled in strict
+ * mode.
+ */
+ if (spectre_v2_user == SPECTRE_V2_USER_STRICT)
+ return -EPERM;
+ task_clear_spec_ib_disable(task);
+ task_update_spec_tif(task);
+ break;
+ case PR_SPEC_DISABLE:
+ case PR_SPEC_FORCE_DISABLE:
+ /*
+ * Indirect branch speculation is always allowed when
+ * mitigation is force disabled.
+ */
+ if (spectre_v2_user == SPECTRE_V2_USER_NONE)
+ return -EPERM;
+ if (spectre_v2_user == SPECTRE_V2_USER_STRICT)
+ return 0;
+ task_set_spec_ib_disable(task);
+ if (ctrl == PR_SPEC_FORCE_DISABLE)
+ task_set_spec_ib_force_disable(task);
+ task_update_spec_tif(task);
+ break;
+ default:
+ return -ERANGE;
+ }
return 0;
}
switch (which) {
case PR_SPEC_STORE_BYPASS:
return ssb_prctl_set(task, ctrl);
+ case PR_SPEC_INDIRECT_BRANCH:
+ return ib_prctl_set(task, ctrl);
default:
return -ENODEV;
}
{
if (ssb_mode == SPEC_STORE_BYPASS_SECCOMP)
ssb_prctl_set(task, PR_SPEC_FORCE_DISABLE);
+ if (spectre_v2_user == SPECTRE_V2_USER_SECCOMP)
+ ib_prctl_set(task, PR_SPEC_FORCE_DISABLE);
}
#endif
}
}
+static int ib_prctl_get(struct task_struct *task)
+{
+ if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
+ return PR_SPEC_NOT_AFFECTED;
+
+ switch (spectre_v2_user) {
+ case SPECTRE_V2_USER_NONE:
+ return PR_SPEC_ENABLE;
+ case SPECTRE_V2_USER_PRCTL:
+ case SPECTRE_V2_USER_SECCOMP:
+ if (task_spec_ib_force_disable(task))
+ return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
+ if (task_spec_ib_disable(task))
+ return PR_SPEC_PRCTL | PR_SPEC_DISABLE;
+ return PR_SPEC_PRCTL | PR_SPEC_ENABLE;
+ case SPECTRE_V2_USER_STRICT:
+ return PR_SPEC_DISABLE;
+ default:
+ return PR_SPEC_NOT_AFFECTED;
+ }
+}
+
int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)
{
switch (which) {
case PR_SPEC_STORE_BYPASS:
return ssb_prctl_get(task);
+ case PR_SPEC_INDIRECT_BRANCH:
+ return ib_prctl_get(task);
default:
return -ENODEV;
}
#define L1TF_DEFAULT_MSG "Mitigation: PTE Inversion"
#if IS_ENABLED(CONFIG_KVM_INTEL)
-static const char *l1tf_vmx_states[] = {
+static const char * const l1tf_vmx_states[] = {
[VMENTER_L1D_FLUSH_AUTO] = "auto",
[VMENTER_L1D_FLUSH_NEVER] = "vulnerable",
[VMENTER_L1D_FLUSH_COND] = "conditional cache flushes",
if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_EPT_DISABLED ||
(l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_NEVER &&
- cpu_smt_control == CPU_SMT_ENABLED))
+ sched_smt_active())) {
return sprintf(buf, "%s; VMX: %s\n", L1TF_DEFAULT_MSG,
l1tf_vmx_states[l1tf_vmx_mitigation]);
+ }
return sprintf(buf, "%s; VMX: %s, SMT %s\n", L1TF_DEFAULT_MSG,
l1tf_vmx_states[l1tf_vmx_mitigation],
- cpu_smt_control == CPU_SMT_ENABLED ? "vulnerable" : "disabled");
+ sched_smt_active() ? "vulnerable" : "disabled");
}
#else
static ssize_t l1tf_show_state(char *buf)
}
#endif
+static char *stibp_state(void)
+{
+ if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+ return "";
+
+ switch (spectre_v2_user) {
+ case SPECTRE_V2_USER_NONE:
+ return ", STIBP: disabled";
+ case SPECTRE_V2_USER_STRICT:
+ return ", STIBP: forced";
+ case SPECTRE_V2_USER_PRCTL:
+ case SPECTRE_V2_USER_SECCOMP:
+ if (static_key_enabled(&switch_to_cond_stibp))
+ return ", STIBP: conditional";
+ }
+ return "";
+}
+
+static char *ibpb_state(void)
+{
+ if (boot_cpu_has(X86_FEATURE_IBPB)) {
+ if (static_key_enabled(&switch_mm_always_ibpb))
+ return ", IBPB: always-on";
+ if (static_key_enabled(&switch_mm_cond_ibpb))
+ return ", IBPB: conditional";
+ return ", IBPB: disabled";
+ }
+ return "";
+}
+
static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
char *buf, unsigned int bug)
{
- int ret;
-
if (!boot_cpu_has_bug(bug))
return sprintf(buf, "Not affected\n");
return sprintf(buf, "Mitigation: __user pointer sanitization\n");
case X86_BUG_SPECTRE_V2:
- ret = sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
- boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
+ return sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+ ibpb_state(),
boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
- (x86_spec_ctrl_base & SPEC_CTRL_STIBP) ? ", STIBP" : "",
+ stibp_state(),
boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
spectre_v2_module_string());
- return ret;
case X86_BUG_SPEC_STORE_BYPASS:
return sprintf(buf, "%s\n", ssb_strings[ssb_mode]);
#endif
c->x86_cache_alignment = c->x86_clflush_size;
- memset(&c->x86_capability, 0, sizeof c->x86_capability);
+ memset(&c->x86_capability, 0, sizeof(c->x86_capability));
c->extended_cpuid_level = 0;
if (!have_cpuid_p())
c->x86_virt_bits = 32;
#endif
c->x86_cache_alignment = c->x86_clflush_size;
- memset(&c->x86_capability, 0, sizeof c->x86_capability);
+ memset(&c->x86_capability, 0, sizeof(c->x86_capability));
generic_identify(c);
* be somewhat complicated (e.g. segment offset would require an instruction
* parser). So only support physical addresses up to page granuality for now.
*/
-static int mce_usable_address(struct mce *m)
+int mce_usable_address(struct mce *m)
{
if (!(m->status & MCI_STATUS_ADDRV))
return 0;
return 1;
}
+EXPORT_SYMBOL_GPL(mce_usable_address);
bool mce_is_memory_error(struct mce *m)
{
}
EXPORT_SYMBOL_GPL(mce_is_memory_error);
-static bool mce_is_correctable(struct mce *m)
+bool mce_is_correctable(struct mce *m)
{
if (m->cpuvendor == X86_VENDOR_AMD && m->status & MCI_STATUS_DEFERRED)
return false;
return true;
}
+EXPORT_SYMBOL_GPL(mce_is_correctable);
static bool cec_add_mce(struct mce *m)
{
if (dev)
return 0;
- dev = kzalloc(sizeof *dev, GFP_KERNEL);
+ dev = kzalloc(sizeof(*dev), GFP_KERNEL);
if (!dev)
return -ENOMEM;
dev->id = cpu;
/* Threshold LVT offset is at MSR0xC0000410[15:12] */
#define SMCA_THR_LVT_OFF 0xF000
-static bool thresholding_en;
+static bool thresholding_irq_en;
static const char * const th_names[] = {
"load_store",
set_offset:
offset = setup_APIC_mce_threshold(offset, new);
-
- if ((offset == new) && (mce_threshold_vector != amd_threshold_interrupt))
- mce_threshold_vector = amd_threshold_interrupt;
+ if (offset == new)
+ thresholding_irq_en = true;
done:
mce_threshold_block_init(&b, offset);
{
unsigned int bank;
- if (!thresholding_en)
- return 0;
-
for (bank = 0; bank < mca_cfg.banks; ++bank) {
if (!(per_cpu(bank_map, cpu) & (1 << bank)))
continue;
struct threshold_bank **bp;
int err = 0;
- if (!thresholding_en)
- return 0;
-
bp = per_cpu(threshold_banks, cpu);
if (bp)
return 0;
{
unsigned lcpu = 0;
- if (mce_threshold_vector == amd_threshold_interrupt)
- thresholding_en = true;
-
/* to hit CPUs online before the notifier is up */
for_each_online_cpu(lcpu) {
int err = mce_threshold_create_device(lcpu);
return err;
}
+ if (thresholding_irq_en)
+ mce_threshold_vector = amd_threshold_interrupt;
+
return 0;
}
/*
}
static DEVICE_ATTR_WO(reload);
-static DEVICE_ATTR(version, 0400, version_show, NULL);
-static DEVICE_ATTR(processor_flags, 0400, pf_show, NULL);
+static DEVICE_ATTR(version, 0444, version_show, NULL);
+static DEVICE_ATTR(processor_flags, 0444, pf_show, NULL);
static struct attribute *mc_default_attrs[] = {
&dev_attr_version.attr,
#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/kexec.h>
+#include <linux/i8253.h>
#include <asm/processor.h>
#include <asm/hypervisor.h>
#include <asm/hyperv-tlfs.h>
if (efi_enabled(EFI_BOOT))
x86_platform.get_nmi_reason = hv_get_nmi_reason;
+ /*
+ * Hyper-V VMs have a PIT emulation quirk such that zeroing the
+ * counter register during PIT shutdown restarts the PIT. So it
+ * continues to interrupt @18.2 HZ. Setting i8253_clear_counter
+ * to false tells pit_shutdown() not to zero the counter so that
+ * the PIT really is shutdown. Generation 2 VMs don't have a PIT,
+ * and setting this value has no effect.
+ */
+ i8253_clear_counter_on_shutdown = false;
+
#if IS_ENABLED(CONFIG_HYPERV)
/*
* Setup the hook to get control post apic initialization.
local_irq_restore(flags);
/* Use the atomic bitops to update the global mask */
- for (count = 0; count < sizeof mask * 8; ++count) {
+ for (count = 0; count < sizeof(mask) * 8; ++count) {
if (mask & 0x01)
set_bit(count, &smp_changes_mask);
mask >>= 1;
case MTRRIOC_SET_PAGE_ENTRY:
case MTRRIOC_DEL_PAGE_ENTRY:
case MTRRIOC_KILL_PAGE_ENTRY:
- if (copy_from_user(&sentry, arg, sizeof sentry))
+ if (copy_from_user(&sentry, arg, sizeof(sentry)))
return -EFAULT;
break;
case MTRRIOC_GET_ENTRY:
case MTRRIOC_GET_PAGE_ENTRY:
- if (copy_from_user(&gentry, arg, sizeof gentry))
+ if (copy_from_user(&gentry, arg, sizeof(gentry)))
return -EFAULT;
break;
#ifdef CONFIG_COMPAT
switch (cmd) {
case MTRRIOC_GET_ENTRY:
case MTRRIOC_GET_PAGE_ENTRY:
- if (copy_to_user(arg, &gentry, sizeof gentry))
+ if (copy_to_user(arg, &gentry, sizeof(gentry)))
err = -EFAULT;
break;
#ifdef CONFIG_COMPAT
}
early_param("no-vmw-sched-clock", setup_vmw_sched_clock);
-static unsigned long long vmware_sched_clock(void)
+static unsigned long long notrace vmware_sched_clock(void)
{
unsigned long long ns;
* early_pci_serial_init()
*
* This function is invoked when the early_printk param starts with "pciserial"
- * The rest of the param should be ",B:D.F,baud" where B, D & F describe the
- * location of a PCI device that must be a UART device.
+ * The rest of the param should be "[force],B:D.F,baud", where B, D & F describe
+ * the location of a PCI device that must be a UART device. "force" is optional
+ * and overrides the use of an UART device with a wrong PCI class code.
*/
static __init void early_pci_serial_init(char *s)
{
u32 classcode, bar0;
u16 cmdreg;
char *e;
+ int force = 0;
-
- /*
- * First, part the param to get the BDF values
- */
if (*s == ',')
++s;
if (*s == 0)
return;
+ /* Force the use of an UART device with wrong class code */
+ if (!strncmp(s, "force,", 6)) {
+ force = 1;
+ s += 6;
+ }
+
+ /*
+ * Part the param to get the BDF values
+ */
bus = (u8)simple_strtoul(s, &e, 16);
s = e;
if (*s != ':')
s++;
/*
- * Second, find the device from the BDF
+ * Find the device from the BDF
*/
cmdreg = read_pci_config(bus, slot, func, PCI_COMMAND);
classcode = read_pci_config(bus, slot, func, PCI_CLASS_REVISION);
*/
if (((classcode >> 16 != PCI_CLASS_COMMUNICATION_MODEM) &&
(classcode >> 16 != PCI_CLASS_COMMUNICATION_SERIAL)) ||
- (((classcode >> 8) & 0xff) != 0x02)) /* 16550 I/F at BAR0 */
- return;
+ (((classcode >> 8) & 0xff) != 0x02)) /* 16550 I/F at BAR0 */ {
+ if (!force)
+ return;
+ }
/*
* Determine if it is IO or memory mapped
}
/*
- * Lastly, initialize the hardware
+ * Initialize the hardware
*/
if (*s) {
if (strcmp(s, "nocfg") == 0)
sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
}
+ local_bh_disable();
fpu->initialized = 1;
- preempt_disable();
fpu__restore(fpu);
- preempt_enable();
+ local_bh_enable();
return err;
} else {
{
unsigned long old;
int faulted;
- struct ftrace_graph_ent trace;
unsigned long return_hooker = (unsigned long)
&return_to_handler;
return;
}
- trace.func = self_addr;
- trace.depth = current->curr_ret_stack + 1;
-
- /* Only trace if the calling function expects to */
- if (!ftrace_graph_entry(&trace)) {
+ if (function_graph_enter(old, self_addr, frame_pointer, parent))
*parent = old;
- return;
- }
-
- if (ftrace_push_return_trace(old, self_addr, &trace.depth,
- frame_pointer, parent) == -EBUSY) {
- *parent = old;
- return;
- }
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
cr4_init_shadow();
sanitize_boot_params(&boot_params);
- x86_verify_bootdata_version();
x86_early_init_platform_quirks();
*/
sme_map_bootdata(real_mode_data);
- memcpy(&boot_params, real_mode_data, sizeof boot_params);
+ memcpy(&boot_params, real_mode_data, sizeof(boot_params));
sanitize_boot_params(&boot_params);
cmd_line_ptr = get_cmd_line_ptr();
if (cmd_line_ptr) {
if (!boot_params.hdr.version)
copy_bootdata(__va(real_mode_data));
- x86_verify_bootdata_version();
-
x86_early_init_platform_quirks();
switch (boot_params.hdr.hardware_subarch) {
int len = 0, ret;
while (len < RELATIVEJUMP_SIZE) {
- ret = __copy_instruction(dest + len, src + len, real, &insn);
+ ret = __copy_instruction(dest + len, src + len, real + len, &insn);
if (!ret || !can_boost(&insn, src + len))
return -EINVAL;
len += ret;
/*
* If PTI is enabled, this maps the LDT into the kernelmode and
* usermode tables for the given mm.
- *
- * There is no corresponding unmap function. Even if the LDT is freed, we
- * leave the PTEs around until the slot is reused or the mm is destroyed.
- * This is harmless: the LDT is always in ordinary memory, and no one will
- * access the freed slot.
- *
- * If we wanted to unmap freed LDTs, we'd also need to do a flush to make
- * it useful, and the flush would slow down modify_ldt().
*/
static int
map_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt, int slot)
unsigned long va;
bool is_vmalloc;
spinlock_t *ptl;
- pgd_t *pgd;
- int i;
+ int i, nr_pages;
if (!static_cpu_has(X86_FEATURE_PTI))
return 0;
/* Check if the current mappings are sane */
sanity_check_ldt_mapping(mm);
- /*
- * Did we already have the top level entry allocated? We can't
- * use pgd_none() for this because it doens't do anything on
- * 4-level page table kernels.
- */
- pgd = pgd_offset(mm, LDT_BASE_ADDR);
-
is_vmalloc = is_vmalloc_addr(ldt->entries);
- for (i = 0; i * PAGE_SIZE < ldt->nr_entries * LDT_ENTRY_SIZE; i++) {
+ nr_pages = DIV_ROUND_UP(ldt->nr_entries * LDT_ENTRY_SIZE, PAGE_SIZE);
+
+ for (i = 0; i < nr_pages; i++) {
unsigned long offset = i << PAGE_SHIFT;
const void *src = (char *)ldt->entries + offset;
unsigned long pfn;
/* Propagate LDT mapping to the user page-table */
map_ldt_struct_to_user(mm);
- va = (unsigned long)ldt_slot_va(slot);
- flush_tlb_mm_range(mm, va, va + LDT_SLOT_STRIDE, PAGE_SHIFT, false);
-
ldt->slot = slot;
return 0;
}
+static void unmap_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt)
+{
+ unsigned long va;
+ int i, nr_pages;
+
+ if (!ldt)
+ return;
+
+ /* LDT map/unmap is only required for PTI */
+ if (!static_cpu_has(X86_FEATURE_PTI))
+ return;
+
+ nr_pages = DIV_ROUND_UP(ldt->nr_entries * LDT_ENTRY_SIZE, PAGE_SIZE);
+
+ for (i = 0; i < nr_pages; i++) {
+ unsigned long offset = i << PAGE_SHIFT;
+ spinlock_t *ptl;
+ pte_t *ptep;
+
+ va = (unsigned long)ldt_slot_va(ldt->slot) + offset;
+ ptep = get_locked_pte(mm, va, &ptl);
+ pte_clear(mm, va, ptep);
+ pte_unmap_unlock(ptep, ptl);
+ }
+
+ va = (unsigned long)ldt_slot_va(ldt->slot);
+ flush_tlb_mm_range(mm, va, va + nr_pages * PAGE_SIZE, PAGE_SHIFT, false);
+}
+
#else /* !CONFIG_PAGE_TABLE_ISOLATION */
static int
{
return 0;
}
+
+static void unmap_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt)
+{
+}
#endif /* CONFIG_PAGE_TABLE_ISOLATION */
static void free_ldt_pgtables(struct mm_struct *mm)
}
install_ldt(mm, new_ldt);
+ unmap_ldt_struct(mm, old_ldt);
free_ldt_struct(old_ldt);
error = 0;
err = -EBADF;
break;
}
- if (copy_from_user(®s, uregs, sizeof regs)) {
+ if (copy_from_user(®s, uregs, sizeof(regs))) {
err = -EFAULT;
break;
}
err = rdmsr_safe_regs_on_cpu(cpu, regs);
if (err)
break;
- if (copy_to_user(uregs, ®s, sizeof regs))
+ if (copy_to_user(uregs, ®s, sizeof(regs)))
err = -EFAULT;
break;
err = -EBADF;
break;
}
- if (copy_from_user(®s, uregs, sizeof regs)) {
+ if (copy_from_user(®s, uregs, sizeof(regs))) {
err = -EFAULT;
break;
}
err = wrmsr_safe_regs_on_cpu(cpu, regs);
if (err)
break;
- if (copy_to_user(uregs, ®s, sizeof regs))
+ if (copy_to_user(uregs, ®s, sizeof(regs)))
err = -EFAULT;
break;
".type _paravirt_nop, @function\n\t"
".popsection");
-/* identity function, which can be inlined */
-u32 notrace _paravirt_ident_32(u32 x)
-{
- return x;
-}
-
-u64 notrace _paravirt_ident_64(u64 x)
-{
- return x;
-}
-
void __init default_banner(void)
{
printk(KERN_INFO "Booting paravirtualized kernel on %s\n",
}
#ifdef CONFIG_PARAVIRT_XXL
+/* identity function, which can be inlined */
+u64 notrace _paravirt_ident_64(u64 x)
+{
+ return x;
+}
+
static unsigned paravirt_patch_jmp(void *insnbuf, const void *target,
unsigned long addr, unsigned len)
{
else if (opfunc == _paravirt_nop)
ret = 0;
+#ifdef CONFIG_PARAVIRT_XXL
/* identity functions just return their single argument */
- else if (opfunc == _paravirt_ident_32)
- ret = paravirt_patch_ident_32(insnbuf, len);
else if (opfunc == _paravirt_ident_64)
ret = paravirt_patch_ident_64(insnbuf, len);
-#ifdef CONFIG_PARAVIRT_XXL
else if (type == PARAVIRT_PATCH(cpu.iret) ||
type == PARAVIRT_PATCH(cpu.usergs_sysret64))
/* If operation requires a jmp, then jmp */
#endif
};
-#if defined(CONFIG_X86_32) && !defined(CONFIG_X86_PAE)
-/* 32-bit pagetable entries */
-#define PTE_IDENT __PV_IS_CALLEE_SAVE(_paravirt_ident_32)
-#else
/* 64-bit pagetable entries */
#define PTE_IDENT __PV_IS_CALLEE_SAVE(_paravirt_ident_64)
-#endif
struct paravirt_patch_template pv_ops = {
/* Init ops. */
NOKPROBE_SYMBOL(native_load_idt);
#endif
-EXPORT_SYMBOL_GPL(pv_ops);
+EXPORT_SYMBOL(pv_ops);
EXPORT_SYMBOL_GPL(pv_info);
DEF_NATIVE(mmu, read_cr2, "mov %cr2, %eax");
DEF_NATIVE(mmu, write_cr3, "mov %eax, %cr3");
DEF_NATIVE(mmu, read_cr3, "mov %cr3, %eax");
-#endif
-
-#if defined(CONFIG_PARAVIRT_SPINLOCKS)
-DEF_NATIVE(lock, queued_spin_unlock, "movb $0, (%eax)");
-DEF_NATIVE(lock, vcpu_is_preempted, "xor %eax, %eax");
-#endif
-
-unsigned paravirt_patch_ident_32(void *insnbuf, unsigned len)
-{
- /* arg in %eax, return in %eax */
- return 0;
-}
unsigned paravirt_patch_ident_64(void *insnbuf, unsigned len)
{
/* arg in %edx:%eax, return in %edx:%eax */
return 0;
}
+#endif
+
+#if defined(CONFIG_PARAVIRT_SPINLOCKS)
+DEF_NATIVE(lock, queued_spin_unlock, "movb $0, (%eax)");
+DEF_NATIVE(lock, vcpu_is_preempted, "xor %eax, %eax");
+#endif
extern bool pv_is_native_spin_unlock(void);
extern bool pv_is_native_vcpu_is_preempted(void);
DEF_NATIVE(cpu, usergs_sysret64, "swapgs; sysretq");
DEF_NATIVE(cpu, swapgs, "swapgs");
-#endif
-
-DEF_NATIVE(, mov32, "mov %edi, %eax");
DEF_NATIVE(, mov64, "mov %rdi, %rax");
-#if defined(CONFIG_PARAVIRT_SPINLOCKS)
-DEF_NATIVE(lock, queued_spin_unlock, "movb $0, (%rdi)");
-DEF_NATIVE(lock, vcpu_is_preempted, "xor %eax, %eax");
-#endif
-
-unsigned paravirt_patch_ident_32(void *insnbuf, unsigned len)
-{
- return paravirt_patch_insns(insnbuf, len,
- start__mov32, end__mov32);
-}
-
unsigned paravirt_patch_ident_64(void *insnbuf, unsigned len)
{
return paravirt_patch_insns(insnbuf, len,
start__mov64, end__mov64);
}
+#endif
+
+#if defined(CONFIG_PARAVIRT_SPINLOCKS)
+DEF_NATIVE(lock, queued_spin_unlock, "movb $0, (%rdi)");
+DEF_NATIVE(lock, vcpu_is_preempted, "xor %eax, %eax");
+#endif
extern bool pv_is_native_spin_unlock(void);
extern bool pv_is_native_vcpu_is_preempted(void);
#include <asm/prctl.h>
#include <asm/spec-ctrl.h>
+#include "process.h"
+
/*
* per-CPU TSS segments. Threads are completely 'soft' on Linux,
* no more per-task TSS's. The TSS size is kept cacheline-aligned
enable_cpuid();
}
-static inline void switch_to_bitmap(struct tss_struct *tss,
- struct thread_struct *prev,
+static inline void switch_to_bitmap(struct thread_struct *prev,
struct thread_struct *next,
unsigned long tifp, unsigned long tifn)
{
+ struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw);
+
if (tifn & _TIF_IO_BITMAP) {
/*
* Copy the relevant range of the IO bitmap.
wrmsrl(MSR_AMD64_VIRT_SPEC_CTRL, ssbd_tif_to_spec_ctrl(tifn));
}
-static __always_inline void intel_set_ssb_state(unsigned long tifn)
+/*
+ * Update the MSRs managing speculation control, during context switch.
+ *
+ * tifp: Previous task's thread flags
+ * tifn: Next task's thread flags
+ */
+static __always_inline void __speculation_ctrl_update(unsigned long tifp,
+ unsigned long tifn)
{
- u64 msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn);
+ unsigned long tif_diff = tifp ^ tifn;
+ u64 msr = x86_spec_ctrl_base;
+ bool updmsr = false;
+
+ /*
+ * If TIF_SSBD is different, select the proper mitigation
+ * method. Note that if SSBD mitigation is disabled or permanentely
+ * enabled this branch can't be taken because nothing can set
+ * TIF_SSBD.
+ */
+ if (tif_diff & _TIF_SSBD) {
+ if (static_cpu_has(X86_FEATURE_VIRT_SSBD)) {
+ amd_set_ssb_virt_state(tifn);
+ } else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) {
+ amd_set_core_ssb_state(tifn);
+ } else if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) ||
+ static_cpu_has(X86_FEATURE_AMD_SSBD)) {
+ msr |= ssbd_tif_to_spec_ctrl(tifn);
+ updmsr = true;
+ }
+ }
+
+ /*
+ * Only evaluate TIF_SPEC_IB if conditional STIBP is enabled,
+ * otherwise avoid the MSR write.
+ */
+ if (IS_ENABLED(CONFIG_SMP) &&
+ static_branch_unlikely(&switch_to_cond_stibp)) {
+ updmsr |= !!(tif_diff & _TIF_SPEC_IB);
+ msr |= stibp_tif_to_spec_ctrl(tifn);
+ }
- wrmsrl(MSR_IA32_SPEC_CTRL, msr);
+ if (updmsr)
+ wrmsrl(MSR_IA32_SPEC_CTRL, msr);
}
-static __always_inline void __speculative_store_bypass_update(unsigned long tifn)
+static unsigned long speculation_ctrl_update_tif(struct task_struct *tsk)
{
- if (static_cpu_has(X86_FEATURE_VIRT_SSBD))
- amd_set_ssb_virt_state(tifn);
- else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD))
- amd_set_core_ssb_state(tifn);
- else
- intel_set_ssb_state(tifn);
+ if (test_and_clear_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE)) {
+ if (task_spec_ssb_disable(tsk))
+ set_tsk_thread_flag(tsk, TIF_SSBD);
+ else
+ clear_tsk_thread_flag(tsk, TIF_SSBD);
+
+ if (task_spec_ib_disable(tsk))
+ set_tsk_thread_flag(tsk, TIF_SPEC_IB);
+ else
+ clear_tsk_thread_flag(tsk, TIF_SPEC_IB);
+ }
+ /* Return the updated threadinfo flags*/
+ return task_thread_info(tsk)->flags;
}
-void speculative_store_bypass_update(unsigned long tif)
+void speculation_ctrl_update(unsigned long tif)
{
+ /* Forced update. Make sure all relevant TIF flags are different */
preempt_disable();
- __speculative_store_bypass_update(tif);
+ __speculation_ctrl_update(~tif, tif);
preempt_enable();
}
-void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
- struct tss_struct *tss)
+/* Called from seccomp/prctl update */
+void speculation_ctrl_update_current(void)
+{
+ preempt_disable();
+ speculation_ctrl_update(speculation_ctrl_update_tif(current));
+ preempt_enable();
+}
+
+void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p)
{
struct thread_struct *prev, *next;
unsigned long tifp, tifn;
tifn = READ_ONCE(task_thread_info(next_p)->flags);
tifp = READ_ONCE(task_thread_info(prev_p)->flags);
- switch_to_bitmap(tss, prev, next, tifp, tifn);
+ switch_to_bitmap(prev, next, tifp, tifn);
propagate_user_return_notify(prev_p, next_p);
if ((tifp ^ tifn) & _TIF_NOCPUID)
set_cpuid_faulting(!!(tifn & _TIF_NOCPUID));
- if ((tifp ^ tifn) & _TIF_SSBD)
- __speculative_store_bypass_update(tifn);
+ if (likely(!((tifp | tifn) & _TIF_SPEC_FORCE_UPDATE))) {
+ __speculation_ctrl_update(tifp, tifn);
+ } else {
+ speculation_ctrl_update_tif(prev_p);
+ tifn = speculation_ctrl_update_tif(next_p);
+
+ /* Enforce MSR update to ensure consistent state */
+ __speculation_ctrl_update(~tifn, tifn);
+ }
}
/*
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+//
+// Code shared between 32 and 64 bit
+
+#include <asm/spec-ctrl.h>
+
+void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p);
+
+/*
+ * This needs to be inline to optimize for the common case where no extra
+ * work needs to be done.
+ */
+static inline void switch_to_extra(struct task_struct *prev,
+ struct task_struct *next)
+{
+ unsigned long next_tif = task_thread_info(next)->flags;
+ unsigned long prev_tif = task_thread_info(prev)->flags;
+
+ if (IS_ENABLED(CONFIG_SMP)) {
+ /*
+ * Avoid __switch_to_xtra() invocation when conditional
+ * STIPB is disabled and the only different bit is
+ * TIF_SPEC_IB. For CONFIG_SMP=n TIF_SPEC_IB is not
+ * in the TIF_WORK_CTXSW masks.
+ */
+ if (!static_branch_likely(&switch_to_cond_stibp)) {
+ prev_tif &= ~_TIF_SPEC_IB;
+ next_tif &= ~_TIF_SPEC_IB;
+ }
+ }
+
+ /*
+ * __switch_to_xtra() handles debug registers, i/o bitmaps,
+ * speculation mitigations etc.
+ */
+ if (unlikely(next_tif & _TIF_WORK_CTXSW_NEXT ||
+ prev_tif & _TIF_WORK_CTXSW_PREV))
+ __switch_to_xtra(prev, next);
+}
#include <asm/intel_rdt_sched.h>
#include <asm/proto.h>
+#include "process.h"
+
void __show_regs(struct pt_regs *regs, enum show_regs_mode mode)
{
unsigned long cr0 = 0L, cr2 = 0L, cr3 = 0L, cr4 = 0L;
struct fpu *prev_fpu = &prev->fpu;
struct fpu *next_fpu = &next->fpu;
int cpu = smp_processor_id();
- struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu);
/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
set_iopl_mask(next->iopl);
- /*
- * Now maybe handle debug registers and/or IO bitmaps
- */
- if (unlikely(task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV ||
- task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT))
- __switch_to_xtra(prev_p, next_p, tss);
+ switch_to_extra(prev_p, next_p);
/*
* Leave lazy mode, flushing any hypercalls made here.
#include <asm/unistd_32_ia32.h>
#endif
+#include "process.h"
+
/* Prints also some state that isn't saved in the pt_regs */
void __show_regs(struct pt_regs *regs, enum show_regs_mode mode)
{
struct fpu *prev_fpu = &prev->fpu;
struct fpu *next_fpu = &next->fpu;
int cpu = smp_processor_id();
- struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu);
WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ENTRY) &&
this_cpu_read(irq_count) != -1);
/* Reload sp0. */
update_task_stack(next_p);
- /*
- * Now maybe reload the debug registers and handle I/O bitmaps
- */
- if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT ||
- task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
- __switch_to_xtra(prev_p, next_p, tss);
+ switch_to_extra(prev_p, next_p);
#ifdef CONFIG_XEN_PV
/*
current->mm->context.ia32_compat = TIF_X32;
current->personality &= ~READ_IMPLIES_EXEC;
/*
- * in_compat_syscall() uses the presence of the x32 syscall bit
+ * in_32bit_syscall() uses the presence of the x32 syscall bit
* flag to determine compat status. The x86 mmap() code relies on
* the syscall bitness so set x32 syscall bit right here to make
- * in_compat_syscall() work during exec().
+ * in_32bit_syscall() work during exec().
*
* Pretend to come from a x32 execve.
*/
unwind_init();
}
-/*
- * From boot protocol 2.14 onwards we expect the bootloader to set the
- * version to "0x8000 | <used version>". In case we find a version >= 2.14
- * without the 0x8000 we assume the boot loader supports 2.13 only and
- * reset the version accordingly. The 0x8000 flag is removed in any case.
- */
-void __init x86_verify_bootdata_version(void)
-{
- if (boot_params.hdr.version & VERSION_WRITTEN)
- boot_params.hdr.version &= ~VERSION_WRITTEN;
- else if (boot_params.hdr.version >= 0x020e)
- boot_params.hdr.version = 0x020d;
-
- if (boot_params.hdr.version < 0x020e)
- boot_params.hdr.acpi_rsdp_addr = 0;
-}
-
#ifdef CONFIG_X86_32
static struct resource video_ram_resource = {
static void find_start_end(unsigned long addr, unsigned long flags,
unsigned long *begin, unsigned long *end)
{
- if (!in_compat_syscall() && (flags & MAP_32BIT)) {
+ if (!in_32bit_syscall() && (flags & MAP_32BIT)) {
/* This is usually used needed to map code in small
model, so it needs to be in the first 31bit. Limit
it to that. This means we need to move the
}
*begin = get_mmap_base(1);
- if (in_compat_syscall())
+ if (in_32bit_syscall())
*end = task_size_32bit();
else
*end = task_size_64bit(addr > DEFAULT_MAP_WINDOW);
return addr;
/* for MAP_32BIT mappings we force the legacy mmap base */
- if (!in_compat_syscall() && (flags & MAP_32BIT))
+ if (!in_32bit_syscall() && (flags & MAP_32BIT))
goto bottomup;
/* requesting a specific address */
* If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
* in the full address space.
*
- * !in_compat_syscall() check to avoid high addresses for x32.
+ * !in_32bit_syscall() check to avoid high addresses for x32
+ * (and make it no op on native i386).
*/
- if (addr > DEFAULT_MAP_WINDOW && !in_compat_syscall())
+ if (addr > DEFAULT_MAP_WINDOW && !in_32bit_syscall())
info.high_limit += TASK_SIZE_MAX - DEFAULT_MAP_WINDOW;
info.align_mask = 0;
die(message, regs, 0);
/* Be absolutely certain we don't return. */
- panic(message);
+ panic("%s", message);
}
#endif
#define TOPOLOGY_REGISTER_OFFSET 0x10
-#if defined CONFIG_PCI && defined CONFIG_PARAVIRT_XXL
-/*
- * Interrupt control on vSMPowered systems:
- * ~AC is a shadow of IF. If IF is 'on' AC should be 'off'
- * and vice versa.
- */
-
-asmlinkage __visible unsigned long vsmp_save_fl(void)
-{
- unsigned long flags = native_save_fl();
-
- if (!(flags & X86_EFLAGS_IF) || (flags & X86_EFLAGS_AC))
- flags &= ~X86_EFLAGS_IF;
- return flags;
-}
-PV_CALLEE_SAVE_REGS_THUNK(vsmp_save_fl);
-
-__visible void vsmp_restore_fl(unsigned long flags)
-{
- if (flags & X86_EFLAGS_IF)
- flags &= ~X86_EFLAGS_AC;
- else
- flags |= X86_EFLAGS_AC;
- native_restore_fl(flags);
-}
-PV_CALLEE_SAVE_REGS_THUNK(vsmp_restore_fl);
-
-asmlinkage __visible void vsmp_irq_disable(void)
-{
- unsigned long flags = native_save_fl();
-
- native_restore_fl((flags & ~X86_EFLAGS_IF) | X86_EFLAGS_AC);
-}
-PV_CALLEE_SAVE_REGS_THUNK(vsmp_irq_disable);
-
-asmlinkage __visible void vsmp_irq_enable(void)
-{
- unsigned long flags = native_save_fl();
-
- native_restore_fl((flags | X86_EFLAGS_IF) & (~X86_EFLAGS_AC));
-}
-PV_CALLEE_SAVE_REGS_THUNK(vsmp_irq_enable);
-
-static unsigned __init vsmp_patch(u8 type, void *ibuf,
- unsigned long addr, unsigned len)
-{
- switch (type) {
- case PARAVIRT_PATCH(irq.irq_enable):
- case PARAVIRT_PATCH(irq.irq_disable):
- case PARAVIRT_PATCH(irq.save_fl):
- case PARAVIRT_PATCH(irq.restore_fl):
- return paravirt_patch_default(type, ibuf, addr, len);
- default:
- return native_patch(type, ibuf, addr, len);
- }
-
-}
-
-static void __init set_vsmp_pv_ops(void)
+#ifdef CONFIG_PCI
+static void __init set_vsmp_ctl(void)
{
void __iomem *address;
unsigned int cap, ctl, cfg;
}
#endif
- if (cap & ctl & (1 << 4)) {
- /* Setup irq ops and turn on vSMP IRQ fastpath handling */
- pv_ops.irq.irq_disable = PV_CALLEE_SAVE(vsmp_irq_disable);
- pv_ops.irq.irq_enable = PV_CALLEE_SAVE(vsmp_irq_enable);
- pv_ops.irq.save_fl = PV_CALLEE_SAVE(vsmp_save_fl);
- pv_ops.irq.restore_fl = PV_CALLEE_SAVE(vsmp_restore_fl);
- pv_ops.init.patch = vsmp_patch;
- ctl &= ~(1 << 4);
- }
writel(ctl, address + 4);
ctl = readl(address + 4);
pr_info("vSMP CTL: control set to:0x%08x\n", ctl);
early_iounmap(address, 8);
}
-#else
-static void __init set_vsmp_pv_ops(void)
-{
-}
-#endif
-
-#ifdef CONFIG_PCI
static int is_vsmp = -1;
static void __init detect_vsmp_box(void)
{
return 0;
}
+static void __init set_vsmp_ctl(void)
+{
+}
#endif
static void __init vsmp_cap_cpus(void)
{
-#if !defined(CONFIG_X86_VSMP) && defined(CONFIG_SMP)
+#if !defined(CONFIG_X86_VSMP) && defined(CONFIG_SMP) && defined(CONFIG_PCI)
void __iomem *address;
unsigned int cfg, topology, node_shift, maxcpus;
vsmp_cap_cpus();
- set_vsmp_pv_ops();
+ set_vsmp_ctl();
return;
}
return emulate_gp(ctxt, index << 3 | 0x2);
addr = dt.address + index * 8;
- return linear_read_system(ctxt, addr, desc, sizeof *desc);
+ return linear_read_system(ctxt, addr, desc, sizeof(*desc));
}
static void get_descriptor_table_ptr(struct x86_emulate_ctxt *ctxt,
struct desc_struct desc;
u16 sel;
- memset (dt, 0, sizeof *dt);
+ memset(dt, 0, sizeof(*dt));
if (!ops->get_segment(ctxt, &sel, &desc, &base3,
VCPU_SREG_LDTR))
return;
if (rc != X86EMUL_CONTINUE)
return rc;
- return linear_write_system(ctxt, addr, desc, sizeof *desc);
+ return linear_write_system(ctxt, addr, desc, sizeof(*desc));
}
static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
u16 dummy;
u32 base3 = 0;
- memset(&seg_desc, 0, sizeof seg_desc);
+ memset(&seg_desc, 0, sizeof(seg_desc));
if (ctxt->mode == X86EMUL_MODE_REAL) {
/* set real mode segment descriptor (keep limit etc. for
int ret;
u32 new_tss_base = get_desc_base(new_desc);
- ret = linear_read_system(ctxt, old_tss_base, &tss_seg, sizeof tss_seg);
+ ret = linear_read_system(ctxt, old_tss_base, &tss_seg, sizeof(tss_seg));
if (ret != X86EMUL_CONTINUE)
return ret;
save_state_to_tss16(ctxt, &tss_seg);
- ret = linear_write_system(ctxt, old_tss_base, &tss_seg, sizeof tss_seg);
+ ret = linear_write_system(ctxt, old_tss_base, &tss_seg, sizeof(tss_seg));
if (ret != X86EMUL_CONTINUE)
return ret;
- ret = linear_read_system(ctxt, new_tss_base, &tss_seg, sizeof tss_seg);
+ ret = linear_read_system(ctxt, new_tss_base, &tss_seg, sizeof(tss_seg));
if (ret != X86EMUL_CONTINUE)
return ret;
ret = linear_write_system(ctxt, new_tss_base,
&tss_seg.prev_task_link,
- sizeof tss_seg.prev_task_link);
+ sizeof(tss_seg.prev_task_link));
if (ret != X86EMUL_CONTINUE)
return ret;
}
u32 eip_offset = offsetof(struct tss_segment_32, eip);
u32 ldt_sel_offset = offsetof(struct tss_segment_32, ldt_selector);
- ret = linear_read_system(ctxt, old_tss_base, &tss_seg, sizeof tss_seg);
+ ret = linear_read_system(ctxt, old_tss_base, &tss_seg, sizeof(tss_seg));
if (ret != X86EMUL_CONTINUE)
return ret;
if (ret != X86EMUL_CONTINUE)
return ret;
- ret = linear_read_system(ctxt, new_tss_base, &tss_seg, sizeof tss_seg);
+ ret = linear_read_system(ctxt, new_tss_base, &tss_seg, sizeof(tss_seg));
if (ret != X86EMUL_CONTINUE)
return ret;
ret = linear_write_system(ctxt, new_tss_base,
&tss_seg.prev_task_link,
- sizeof tss_seg.prev_task_link);
+ sizeof(tss_seg.prev_task_link));
if (ret != X86EMUL_CONTINUE)
return ret;
}
#define PRIo64 "o"
/* #define apic_debug(fmt,arg...) printk(KERN_WARNING fmt,##arg) */
-#define apic_debug(fmt, arg...)
+#define apic_debug(fmt, arg...) do {} while (0)
/* 14 is the version for Xeon and Pentium 8.4.8*/
#define APIC_VERSION (0x14UL | ((KVM_APIC_LVT_NUM - 1) << 16))
rcu_read_lock();
map = rcu_dereference(kvm->arch.apic_map);
+ if (unlikely(!map)) {
+ count = -EOPNOTSUPP;
+ goto out;
+ }
+
if (min > map->max_apic_id)
goto out;
/* Bits above cluster_size are masked in the caller. */
r = kvm_apic_state_fixup(vcpu, s, true);
if (r)
return r;
- memcpy(vcpu->arch.apic->regs, s->regs, sizeof *s);
+ memcpy(vcpu->arch.apic->regs, s->regs, sizeof(*s));
recalculate_apic_map(vcpu->kvm);
kvm_apic_set_version(vcpu);
}
static u64 mmu_pte_write_fetch_gpte(struct kvm_vcpu *vcpu, gpa_t *gpa,
- const u8 *new, int *bytes)
+ int *bytes)
{
- u64 gentry;
+ u64 gentry = 0;
int r;
/*
/* Handle a 32-bit guest writing two halves of a 64-bit gpte */
*gpa &= ~(gpa_t)7;
*bytes = 8;
- r = kvm_vcpu_read_guest(vcpu, *gpa, &gentry, 8);
- if (r)
- gentry = 0;
- new = (const u8 *)&gentry;
}
- switch (*bytes) {
- case 4:
- gentry = *(const u32 *)new;
- break;
- case 8:
- gentry = *(const u64 *)new;
- break;
- default:
- gentry = 0;
- break;
+ if (*bytes == 4 || *bytes == 8) {
+ r = kvm_vcpu_read_guest_atomic(vcpu, *gpa, &gentry, *bytes);
+ if (r)
+ gentry = 0;
}
return gentry;
pgprintk("%s: gpa %llx bytes %d\n", __func__, gpa, bytes);
- gentry = mmu_pte_write_fetch_gpte(vcpu, &gpa, new, &bytes);
-
/*
* No need to care whether allocation memory is successful
* or not since pte prefetch is skiped if it does not have
mmu_topup_memory_caches(vcpu);
spin_lock(&vcpu->kvm->mmu_lock);
+
+ gentry = mmu_pte_write_fetch_gpte(vcpu, &gpa, &bytes);
+
++vcpu->kvm->stat.mmu_pte_write;
kvm_mmu_audit(vcpu, AUDIT_PRE_PTE_WRITE);
return vcpu->arch.tsc_offset;
}
-static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
+static u64 svm_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
{
struct vcpu_svm *svm = to_svm(vcpu);
u64 g_tsc_offset = 0;
svm->vmcb->control.tsc_offset = offset + g_tsc_offset;
mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
+ return svm->vmcb->control.tsc_offset;
}
static void avic_init_vmcb(struct vcpu_svm *svm)
static int avic_init_access_page(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
- int ret;
+ int ret = 0;
+ mutex_lock(&kvm->slots_lock);
if (kvm->arch.apic_access_page_done)
- return 0;
+ goto out;
- ret = x86_set_memory_region(kvm,
- APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
- APIC_DEFAULT_PHYS_BASE,
- PAGE_SIZE);
+ ret = __x86_set_memory_region(kvm,
+ APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
+ APIC_DEFAULT_PHYS_BASE,
+ PAGE_SIZE);
if (ret)
- return ret;
+ goto out;
kvm->arch.apic_access_page_done = true;
- return 0;
+out:
+ mutex_unlock(&kvm->slots_lock);
+ return ret;
}
static int avic_init_backing_page(struct kvm_vcpu *vcpu)
return ERR_PTR(err);
}
+static void svm_clear_current_vmcb(struct vmcb *vmcb)
+{
+ int i;
+
+ for_each_online_cpu(i)
+ cmpxchg(&per_cpu(svm_data, i)->current_vmcb, vmcb, NULL);
+}
+
static void svm_free_vcpu(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
+ /*
+ * The vmcb page can be recycled, causing a false negative in
+ * svm_vcpu_load(). So, ensure that no logical CPU has this
+ * vmcb page recorded as its current vmcb.
+ */
+ svm_clear_current_vmcb(svm->vmcb);
+
__free_page(pfn_to_page(__sme_clr(svm->vmcb_pa) >> PAGE_SHIFT));
__free_pages(virt_to_page(svm->msrpm), MSRPM_ALLOC_ORDER);
__free_page(virt_to_page(svm->nested.hsave));
__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
kvm_vcpu_uninit(vcpu);
kmem_cache_free(kvm_vcpu_cache, svm);
- /*
- * The vmcb page can be recycled, causing a false negative in
- * svm_vcpu_load(). So do a full IBPB now.
- */
- indirect_branch_prediction_barrier();
}
static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
.has_wbinvd_exit = svm_has_wbinvd_exit,
.read_l1_tsc_offset = svm_read_l1_tsc_offset,
- .write_tsc_offset = svm_write_tsc_offset,
+ .write_l1_tsc_offset = svm_write_l1_tsc_offset,
.set_tdp_cr3 = set_tdp_cr3,
* refer SDM volume 3b section 21.6.13 & 22.1.3.
*/
static unsigned int ple_gap = KVM_DEFAULT_PLE_GAP;
+module_param(ple_gap, uint, 0444);
static unsigned int ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
module_param(ple_window, uint, 0444);
struct shared_msr_entry *guest_msrs;
int nmsrs;
int save_nmsrs;
+ bool guest_msrs_dirty;
unsigned long host_idt_base;
#ifdef CONFIG_X86_64
u64 msr_host_kernel_gs_base;
static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
u16 error_code);
static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
-static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
+static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
u32 msr, int type);
static DEFINE_PER_CPU(struct vmcs *, vmxarea);
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
- /* We don't support disabling the feature for simplicity. */
- if (vmx->nested.enlightened_vmcs_enabled)
- return 0;
-
- vmx->nested.enlightened_vmcs_enabled = true;
-
/*
* vmcs_version represents the range of supported Enlightened VMCS
* versions: lower 8 bits is the minimal version, higher 8 bits is the
if (vmcs_version)
*vmcs_version = (KVM_EVMCS_VERSION << 8) | 1;
+ /* We don't support disabling the feature for simplicity. */
+ if (vmx->nested.enlightened_vmcs_enabled)
+ return 0;
+
+ vmx->nested.enlightened_vmcs_enabled = true;
+
vmx->nested.msrs.pinbased_ctls_high &= ~EVMCS1_UNSUPPORTED_PINCTRL;
vmx->nested.msrs.entry_ctls_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
vmx->nested.msrs.exit_ctls_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
vmx->req_immediate_exit = false;
+ /*
+ * Note that guest MSRs to be saved/restored can also be changed
+ * when guest state is loaded. This happens when guest transitions
+ * to/from long-mode by setting MSR_EFER.LMA.
+ */
+ if (!vmx->loaded_cpu_state || vmx->guest_msrs_dirty) {
+ vmx->guest_msrs_dirty = false;
+ for (i = 0; i < vmx->save_nmsrs; ++i)
+ kvm_set_shared_msr(vmx->guest_msrs[i].index,
+ vmx->guest_msrs[i].data,
+ vmx->guest_msrs[i].mask);
+
+ }
+
if (vmx->loaded_cpu_state)
return;
vmcs_writel(HOST_GS_BASE, gs_base);
host_state->gs_base = gs_base;
}
-
- for (i = 0; i < vmx->save_nmsrs; ++i)
- kvm_set_shared_msr(vmx->guest_msrs[i].index,
- vmx->guest_msrs[i].data,
- vmx->guest_msrs[i].mask);
}
static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
move_msr_up(vmx, index, save_nmsrs++);
vmx->save_nmsrs = save_nmsrs;
+ vmx->guest_msrs_dirty = true;
if (cpu_has_vmx_msr_bitmap())
vmx_update_msr_bitmap(&vmx->vcpu);
return vcpu->arch.tsc_offset;
}
-/*
- * writes 'offset' into guest's timestamp counter offset register
- */
-static void vmx_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
+static u64 vmx_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
{
+ u64 active_offset = offset;
if (is_guest_mode(vcpu)) {
/*
* We're here if L1 chose not to trap WRMSR to TSC. According
* set for L2 remains unchanged, and still needs to be added
* to the newly set TSC to get L2's TSC.
*/
- struct vmcs12 *vmcs12;
- /* recalculate vmcs02.TSC_OFFSET: */
- vmcs12 = get_vmcs12(vcpu);
- vmcs_write64(TSC_OFFSET, offset +
- (nested_cpu_has(vmcs12, CPU_BASED_USE_TSC_OFFSETING) ?
- vmcs12->tsc_offset : 0));
+ struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+ if (nested_cpu_has(vmcs12, CPU_BASED_USE_TSC_OFFSETING))
+ active_offset += vmcs12->tsc_offset;
} else {
trace_kvm_write_tsc_offset(vcpu->vcpu_id,
vmcs_read64(TSC_OFFSET), offset);
- vmcs_write64(TSC_OFFSET, offset);
}
+
+ vmcs_write64(TSC_OFFSET, active_offset);
+ return active_offset;
}
/*
spin_unlock(&vmx_vpid_lock);
}
-static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
+static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
u32 msr, int type)
{
int f = sizeof(unsigned long);
}
}
-static void __always_inline vmx_enable_intercept_for_msr(unsigned long *msr_bitmap,
+static __always_inline void vmx_enable_intercept_for_msr(unsigned long *msr_bitmap,
u32 msr, int type)
{
int f = sizeof(unsigned long);
}
}
-static void __always_inline vmx_set_intercept_for_msr(unsigned long *msr_bitmap,
+static __always_inline void vmx_set_intercept_for_msr(unsigned long *msr_bitmap,
u32 msr, int type, bool value)
{
if (value)
struct vmcs12 *vmcs12 = vmx->nested.cached_vmcs12;
struct hv_enlightened_vmcs *evmcs = vmx->nested.hv_evmcs;
- vmcs12->hdr.revision_id = evmcs->revision_id;
-
/* HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE */
vmcs12->tpr_threshold = evmcs->tpr_threshold;
vmcs12->guest_rip = evmcs->guest_rip;
vmx->nested.hv_evmcs = kmap(vmx->nested.hv_evmcs_page);
- if (vmx->nested.hv_evmcs->revision_id != VMCS12_REVISION) {
+ /*
+ * Currently, KVM only supports eVMCS version 1
+ * (== KVM_EVMCS_VERSION) and thus we expect guest to set this
+ * value to first u32 field of eVMCS which should specify eVMCS
+ * VersionNumber.
+ *
+ * Guest should be aware of supported eVMCS versions by host by
+ * examining CPUID.0x4000000A.EAX[0:15]. Host userspace VMM is
+ * expected to set this CPUID leaf according to the value
+ * returned in vmcs_version from nested_enable_evmcs().
+ *
+ * However, it turns out that Microsoft Hyper-V fails to comply
+ * to their own invented interface: When Hyper-V use eVMCS, it
+ * just sets first u32 field of eVMCS to revision_id specified
+ * in MSR_IA32_VMX_BASIC. Instead of used eVMCS version number
+ * which is one of the supported versions specified in
+ * CPUID.0x4000000A.EAX[0:15].
+ *
+ * To overcome Hyper-V bug, we accept here either a supported
+ * eVMCS version or VMCS12 revision_id as valid values for first
+ * u32 field of eVMCS.
+ */
+ if ((vmx->nested.hv_evmcs->revision_id != KVM_EVMCS_VERSION) &&
+ (vmx->nested.hv_evmcs->revision_id != VMCS12_REVISION)) {
nested_release_evmcs(vcpu);
return 0;
}
* present in struct hv_enlightened_vmcs, ...). Make sure there
* are no leftovers.
*/
- if (from_launch)
- memset(vmx->nested.cached_vmcs12, 0,
- sizeof(*vmx->nested.cached_vmcs12));
+ if (from_launch) {
+ struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+ memset(vmcs12, 0, sizeof(*vmcs12));
+ vmcs12->hdr.revision_id = VMCS12_REVISION;
+ }
}
return 1;
.has_wbinvd_exit = cpu_has_vmx_wbinvd_exit,
.read_l1_tsc_offset = vmx_read_l1_tsc_offset,
- .write_tsc_offset = vmx_write_tsc_offset,
+ .write_l1_tsc_offset = vmx_write_l1_tsc_offset,
.set_tdp_cr3 = vmx_set_cr3,
static void kvm_vcpu_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
{
- kvm_x86_ops->write_tsc_offset(vcpu, offset);
- vcpu->arch.tsc_offset = offset;
+ vcpu->arch.tsc_offset = kvm_x86_ops->write_l1_tsc_offset(vcpu, offset);
}
static inline bool kvm_check_tsc_unstable(void)
static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
s64 adjustment)
{
- kvm_vcpu_write_tsc_offset(vcpu, vcpu->arch.tsc_offset + adjustment);
+ u64 tsc_offset = kvm_x86_ops->read_l1_tsc_offset(vcpu);
+ kvm_vcpu_write_tsc_offset(vcpu, tsc_offset + adjustment);
}
static inline void adjust_tsc_offset_host(struct kvm_vcpu *vcpu, s64 adjustment)
unsigned size;
r = -EFAULT;
- if (copy_from_user(&msrs, user_msrs, sizeof msrs))
+ if (copy_from_user(&msrs, user_msrs, sizeof(msrs)))
goto out;
r = -E2BIG;
unsigned n;
r = -EFAULT;
- if (copy_from_user(&msr_list, user_msr_list, sizeof msr_list))
+ if (copy_from_user(&msr_list, user_msr_list, sizeof(msr_list)))
goto out;
n = msr_list.nmsrs;
msr_list.nmsrs = num_msrs_to_save + num_emulated_msrs;
- if (copy_to_user(user_msr_list, &msr_list, sizeof msr_list))
+ if (copy_to_user(user_msr_list, &msr_list, sizeof(msr_list)))
goto out;
r = -E2BIG;
if (n < msr_list.nmsrs)
struct kvm_cpuid2 cpuid;
r = -EFAULT;
- if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
+ if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
r = kvm_dev_ioctl_get_cpuid(&cpuid, cpuid_arg->entries,
goto out;
r = -EFAULT;
- if (copy_to_user(cpuid_arg, &cpuid, sizeof cpuid))
+ if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid)))
goto out;
r = 0;
break;
struct kvm_interrupt irq;
r = -EFAULT;
- if (copy_from_user(&irq, argp, sizeof irq))
+ if (copy_from_user(&irq, argp, sizeof(irq)))
goto out;
r = kvm_vcpu_ioctl_interrupt(vcpu, &irq);
break;
struct kvm_cpuid cpuid;
r = -EFAULT;
- if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
+ if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
r = kvm_vcpu_ioctl_set_cpuid(vcpu, &cpuid, cpuid_arg->entries);
break;
struct kvm_cpuid2 cpuid;
r = -EFAULT;
- if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
+ if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
r = kvm_vcpu_ioctl_set_cpuid2(vcpu, &cpuid,
cpuid_arg->entries);
struct kvm_cpuid2 cpuid;
r = -EFAULT;
- if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
+ if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
r = kvm_vcpu_ioctl_get_cpuid2(vcpu, &cpuid,
cpuid_arg->entries);
if (r)
goto out;
r = -EFAULT;
- if (copy_to_user(cpuid_arg, &cpuid, sizeof cpuid))
+ if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid)))
goto out;
r = 0;
break;
struct kvm_tpr_access_ctl tac;
r = -EFAULT;
- if (copy_from_user(&tac, argp, sizeof tac))
+ if (copy_from_user(&tac, argp, sizeof(tac)))
goto out;
r = vcpu_ioctl_tpr_access_reporting(vcpu, &tac);
if (r)
goto out;
r = -EFAULT;
- if (copy_to_user(argp, &tac, sizeof tac))
+ if (copy_to_user(argp, &tac, sizeof(tac)))
goto out;
r = 0;
break;
if (!lapic_in_kernel(vcpu))
goto out;
r = -EFAULT;
- if (copy_from_user(&va, argp, sizeof va))
+ if (copy_from_user(&va, argp, sizeof(va)))
goto out;
idx = srcu_read_lock(&vcpu->kvm->srcu);
r = kvm_lapic_set_vapic_addr(vcpu, va.vapic_addr);
u64 mcg_cap;
r = -EFAULT;
- if (copy_from_user(&mcg_cap, argp, sizeof mcg_cap))
+ if (copy_from_user(&mcg_cap, argp, sizeof(mcg_cap)))
goto out;
r = kvm_vcpu_ioctl_x86_setup_mce(vcpu, mcg_cap);
break;
struct kvm_x86_mce mce;
r = -EFAULT;
- if (copy_from_user(&mce, argp, sizeof mce))
+ if (copy_from_user(&mce, argp, sizeof(mce)))
goto out;
r = kvm_vcpu_ioctl_x86_set_mce(vcpu, &mce);
break;
if (kvm->created_vcpus)
goto set_identity_unlock;
r = -EFAULT;
- if (copy_from_user(&ident_addr, argp, sizeof ident_addr))
+ if (copy_from_user(&ident_addr, argp, sizeof(ident_addr)))
goto set_identity_unlock;
r = kvm_vm_ioctl_set_identity_map_addr(kvm, ident_addr);
set_identity_unlock:
if (r)
goto get_irqchip_out;
r = -EFAULT;
- if (copy_to_user(argp, chip, sizeof *chip))
+ if (copy_to_user(argp, chip, sizeof(*chip)))
goto get_irqchip_out;
r = 0;
get_irqchip_out:
}
case KVM_SET_PIT: {
r = -EFAULT;
- if (copy_from_user(&u.ps, argp, sizeof u.ps))
+ if (copy_from_user(&u.ps, argp, sizeof(u.ps)))
goto out;
r = -ENXIO;
if (!kvm->arch.vpit)
clock_pairing.nsec = ts.tv_nsec;
clock_pairing.tsc = kvm_read_l1_tsc(vcpu, cycle);
clock_pairing.flags = 0;
+ memset(&clock_pairing.pad, 0, sizeof(clock_pairing.pad));
ret = 0;
if (kvm_write_guest(vcpu->kvm, paddr, &clock_pairing,
else {
if (vcpu->arch.apicv_active)
kvm_x86_ops->sync_pir_to_irr(vcpu);
- kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
+ if (ioapic_in_kernel(vcpu->kvm))
+ kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
}
if (is_guest_mode(vcpu))
sregs->efer = vcpu->arch.efer;
sregs->apic_base = kvm_get_apic_base(vcpu);
- memset(sregs->interrupt_bitmap, 0, sizeof sregs->interrupt_bitmap);
+ memset(sregs->interrupt_bitmap, 0, sizeof(sregs->interrupt_bitmap));
if (vcpu->arch.interrupt.injected && !vcpu->arch.interrupt.soft)
set_bit(vcpu->arch.interrupt.nr,
fpu->last_opcode = fxsave->fop;
fpu->last_ip = fxsave->rip;
fpu->last_dp = fxsave->rdp;
- memcpy(fpu->xmm, fxsave->xmm_space, sizeof fxsave->xmm_space);
+ memcpy(fpu->xmm, fxsave->xmm_space, sizeof(fxsave->xmm_space));
vcpu_put(vcpu);
return 0;
fxsave->fop = fpu->last_opcode;
fxsave->rip = fpu->last_ip;
fxsave->rdp = fpu->last_dp;
- memcpy(fxsave->xmm_space, fpu->xmm, sizeof fxsave->xmm_space);
+ memcpy(fxsave->xmm_space, fpu->xmm, sizeof(fxsave->xmm_space));
vcpu_put(vcpu);
return 0;
* If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
* in the full address space.
*/
- info.high_limit = in_compat_syscall() ?
+ info.high_limit = in_32bit_syscall() ?
task_size_32bit() : task_size_64bit(addr > DEFAULT_MAP_WINDOW);
info.align_mask = PAGE_MASK & ~huge_page_mask(h);
* If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
* in the full address space.
*/
- if (addr > DEFAULT_MAP_WINDOW && !in_compat_syscall())
+ if (addr > DEFAULT_MAP_WINDOW && !in_32bit_syscall())
info.high_limit += TASK_SIZE_MAX - DEFAULT_MAP_WINDOW;
info.align_mask = PAGE_MASK & ~huge_page_mask(h);
struct mm_struct *mm = current->mm;
#ifdef CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES
- if (in_compat_syscall()) {
+ if (in_32bit_syscall()) {
return is_legacy ? mm->mmap_compat_legacy_base
: mm->mmap_compat_base;
}
n = simple_strtoul(emu_cmdline, &emu_cmdline, 0);
ret = -1;
for_each_node_mask(i, physnode_mask) {
+ /*
+ * The reason we pass in blk[0] is due to
+ * numa_remove_memblk_from() called by
+ * emu_setup_memblk() will delete entry 0
+ * and then move everything else up in the pi.blk
+ * array. Therefore we should always be looking
+ * at blk[0].
+ */
ret = split_nodes_size_interleave_uniform(&ei, &pi,
- pi.blk[i].start, pi.blk[i].end, 0,
- n, &pi.blk[i], nid);
+ pi.blk[0].start, pi.blk[0].end, 0,
+ n, &pi.blk[0], nid);
if (ret < 0)
break;
if (ret < n) {
/*
* We should perform an IPI and flush all tlbs,
- * but that can deadlock->flush only current cpu:
+ * but that can deadlock->flush only current cpu.
+ * Preemption needs to be disabled around __flush_tlb_all() due to
+ * CR3 reload in __native_flush_tlb().
*/
+ preempt_disable();
__flush_tlb_all();
+ preempt_enable();
arch_flush_lazy_mmu_mode();
}
#include <linux/export.h>
#include <linux/cpu.h>
#include <linux/debugfs.h>
-#include <linux/ptrace.h>
#include <asm/tlbflush.h>
#include <asm/mmu_context.h>
* Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
*/
+/*
+ * Use bit 0 to mangle the TIF_SPEC_IB state into the mm pointer which is
+ * stored in cpu_tlb_state.last_user_mm_ibpb.
+ */
+#define LAST_USER_MM_IBPB 0x1UL
+
/*
* We get here when we do something requiring a TLB invalidation
* but could not go invalidate all of the contexts. We do the
}
}
-static bool ibpb_needed(struct task_struct *tsk, u64 last_ctx_id)
+static inline unsigned long mm_mangle_tif_spec_ib(struct task_struct *next)
+{
+ unsigned long next_tif = task_thread_info(next)->flags;
+ unsigned long ibpb = (next_tif >> TIF_SPEC_IB) & LAST_USER_MM_IBPB;
+
+ return (unsigned long)next->mm | ibpb;
+}
+
+static void cond_ibpb(struct task_struct *next)
{
+ if (!next || !next->mm)
+ return;
+
/*
- * Check if the current (previous) task has access to the memory
- * of the @tsk (next) task. If access is denied, make sure to
- * issue a IBPB to stop user->user Spectre-v2 attacks.
- *
- * Note: __ptrace_may_access() returns 0 or -ERRNO.
+ * Both, the conditional and the always IBPB mode use the mm
+ * pointer to avoid the IBPB when switching between tasks of the
+ * same process. Using the mm pointer instead of mm->context.ctx_id
+ * opens a hypothetical hole vs. mm_struct reuse, which is more or
+ * less impossible to control by an attacker. Aside of that it
+ * would only affect the first schedule so the theoretically
+ * exposed data is not really interesting.
*/
- return (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id &&
- ptrace_may_access_sched(tsk, PTRACE_MODE_SPEC_IBPB));
+ if (static_branch_likely(&switch_mm_cond_ibpb)) {
+ unsigned long prev_mm, next_mm;
+
+ /*
+ * This is a bit more complex than the always mode because
+ * it has to handle two cases:
+ *
+ * 1) Switch from a user space task (potential attacker)
+ * which has TIF_SPEC_IB set to a user space task
+ * (potential victim) which has TIF_SPEC_IB not set.
+ *
+ * 2) Switch from a user space task (potential attacker)
+ * which has TIF_SPEC_IB not set to a user space task
+ * (potential victim) which has TIF_SPEC_IB set.
+ *
+ * This could be done by unconditionally issuing IBPB when
+ * a task which has TIF_SPEC_IB set is either scheduled in
+ * or out. Though that results in two flushes when:
+ *
+ * - the same user space task is scheduled out and later
+ * scheduled in again and only a kernel thread ran in
+ * between.
+ *
+ * - a user space task belonging to the same process is
+ * scheduled in after a kernel thread ran in between
+ *
+ * - a user space task belonging to the same process is
+ * scheduled in immediately.
+ *
+ * Optimize this with reasonably small overhead for the
+ * above cases. Mangle the TIF_SPEC_IB bit into the mm
+ * pointer of the incoming task which is stored in
+ * cpu_tlbstate.last_user_mm_ibpb for comparison.
+ */
+ next_mm = mm_mangle_tif_spec_ib(next);
+ prev_mm = this_cpu_read(cpu_tlbstate.last_user_mm_ibpb);
+
+ /*
+ * Issue IBPB only if the mm's are different and one or
+ * both have the IBPB bit set.
+ */
+ if (next_mm != prev_mm &&
+ (next_mm | prev_mm) & LAST_USER_MM_IBPB)
+ indirect_branch_prediction_barrier();
+
+ this_cpu_write(cpu_tlbstate.last_user_mm_ibpb, next_mm);
+ }
+
+ if (static_branch_unlikely(&switch_mm_always_ibpb)) {
+ /*
+ * Only flush when switching to a user space task with a
+ * different context than the user space task which ran
+ * last on this CPU.
+ */
+ if (this_cpu_read(cpu_tlbstate.last_user_mm) != next->mm) {
+ indirect_branch_prediction_barrier();
+ this_cpu_write(cpu_tlbstate.last_user_mm, next->mm);
+ }
+ }
}
void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
new_asid = prev_asid;
need_flush = true;
} else {
- u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
-
/*
* Avoid user/user BTB poisoning by flushing the branch
* predictor when switching between processes. This stops
* one process from doing Spectre-v2 attacks on another.
- *
- * As an optimization, flush indirect branches only when
- * switching into a processes that can't be ptrace by the
- * current one (as in such case, attacker has much more
- * convenient way how to tamper with the next process than
- * branch buffer poisoning).
*/
- if (static_cpu_has(X86_FEATURE_USE_IBPB) &&
- ibpb_needed(tsk, last_ctx_id))
- indirect_branch_prediction_barrier();
+ cond_ibpb(tsk);
if (IS_ENABLED(CONFIG_VMAP_STACK)) {
/*
trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0);
}
- /*
- * Record last user mm's context id, so we can avoid
- * flushing branch buffer with IBPB if we switch back
- * to the same user.
- */
- if (next != &init_mm)
- this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
-
/* Make sure we write CR3 before loaded_mm. */
barrier();
write_cr3(build_cr3(mm->pgd, 0));
/* Reinitialize tlbstate. */
- this_cpu_write(cpu_tlbstate.last_ctx_id, mm->context.ctx_id);
+ this_cpu_write(cpu_tlbstate.last_user_mm_ibpb, LAST_USER_MM_IBPB);
this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
this_cpu_write(cpu_tlbstate.next_asid, 1);
this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
num--;
}
- if (efi_x >= si->lfb_width) {
+ if (efi_x + font->width > si->lfb_width) {
efi_x = 0;
efi_y += font->height;
}
REG_EXTENDED|REG_NOSUB);
if (err) {
- regerror(err, &sym_regex_c[i], errbuf, sizeof errbuf);
+ regerror(err, &sym_regex_c[i], errbuf, sizeof(errbuf));
die("%s", errbuf);
}
}
}
for (i = 0; i < ehdr.e_shnum; i++) {
struct section *sec = &secs[i];
- if (fread(&shdr, sizeof shdr, 1, fp) != 1)
+ if (fread(&shdr, sizeof(shdr), 1, fp) != 1)
die("Cannot read ELF section headers %d/%d: %s\n",
i, ehdr.e_shnum, strerror(errno));
sec->shdr.sh_name = elf_word_to_cpu(shdr.sh_name);
typedef unsigned long elf_greg_t;
-#define ELF_NGREG (sizeof (struct user_regs_struct) / sizeof(elf_greg_t))
+#define ELF_NGREG (sizeof(struct user_regs_struct) / sizeof(elf_greg_t))
typedef elf_greg_t elf_gregset_t[ELF_NGREG];
typedef struct user_i387_struct elf_fpregset_t;
#include <xen/xen.h>
#include <xen/features.h>
#include <xen/page.h>
-#include <xen/interface/memory.h>
#include <asm/xen/hypercall.h>
#include <asm/xen/hypervisor.h>
}
EXPORT_SYMBOL(xen_arch_unregister_cpu);
#endif
-
-#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
-void __init arch_xen_balloon_init(struct resource *hostmem_resource)
-{
- struct xen_memory_map memmap;
- int rc;
- unsigned int i, last_guest_ram;
- phys_addr_t max_addr = PFN_PHYS(max_pfn);
- struct e820_table *xen_e820_table;
- const struct e820_entry *entry;
- struct resource *res;
-
- if (!xen_initial_domain())
- return;
-
- xen_e820_table = kmalloc(sizeof(*xen_e820_table), GFP_KERNEL);
- if (!xen_e820_table)
- return;
-
- memmap.nr_entries = ARRAY_SIZE(xen_e820_table->entries);
- set_xen_guest_handle(memmap.buffer, xen_e820_table->entries);
- rc = HYPERVISOR_memory_op(XENMEM_machine_memory_map, &memmap);
- if (rc) {
- pr_warn("%s: Can't read host e820 (%d)\n", __func__, rc);
- goto out;
- }
-
- last_guest_ram = 0;
- for (i = 0; i < memmap.nr_entries; i++) {
- if (xen_e820_table->entries[i].addr >= max_addr)
- break;
- if (xen_e820_table->entries[i].type == E820_TYPE_RAM)
- last_guest_ram = i;
- }
-
- entry = &xen_e820_table->entries[last_guest_ram];
- if (max_addr >= entry->addr + entry->size)
- goto out; /* No unallocated host RAM. */
-
- hostmem_resource->start = max_addr;
- hostmem_resource->end = entry->addr + entry->size;
-
- /*
- * Mark non-RAM regions between the end of dom0 RAM and end of host RAM
- * as unavailable. The rest of that region can be used for hotplug-based
- * ballooning.
- */
- for (; i < memmap.nr_entries; i++) {
- entry = &xen_e820_table->entries[i];
-
- if (entry->type == E820_TYPE_RAM)
- continue;
-
- if (entry->addr >= hostmem_resource->end)
- break;
-
- res = kzalloc(sizeof(*res), GFP_KERNEL);
- if (!res)
- goto out;
-
- res->name = "Unavailable host RAM";
- res->start = entry->addr;
- res->end = (entry->addr + entry->size < hostmem_resource->end) ?
- entry->addr + entry->size : hostmem_resource->end;
- rc = insert_resource(hostmem_resource, res);
- if (rc) {
- pr_warn("%s: Can't insert [%llx - %llx) (%d)\n",
- __func__, res->start, res->end, rc);
- kfree(res);
- goto out;
- }
- }
-
- out:
- kfree(xen_e820_table);
-}
-#endif /* CONFIG_XEN_BALLOON_MEMORY_HOTPLUG */
init_top_pgt[0] = __pgd(0);
/* Pre-constructed entries are in pfn, so convert to mfn */
- /* L4[272] -> level3_ident_pgt */
+ /* L4[273] -> level3_ident_pgt */
/* L4[511] -> level3_kernel_pgt */
convert_pfn_mfn(init_top_pgt);
addr[0] = (unsigned long)pgd;
addr[1] = (unsigned long)l3;
addr[2] = (unsigned long)l2;
- /* Graft it onto L4[272][0]. Note that we creating an aliasing problem:
- * Both L4[272][0] and L4[511][510] have entries that point to the same
+ /* Graft it onto L4[273][0]. Note that we creating an aliasing problem:
+ * Both L4[273][0] and L4[511][510] have entries that point to the same
* L2 (PMD) tables. Meaning that if you modify it in __va space
* it will be also modified in the __ka space! (But if you just
* modify the PMD table to point to other PTE's or none, then you
trace_xen_mc_flush(b->mcidx, b->argidx, b->cbidx);
+#if MC_DEBUG
+ memcpy(b->debug, b->entries,
+ b->mcidx * sizeof(struct multicall_entry));
+#endif
+
switch (b->mcidx) {
case 0:
/* no-op */
break;
default:
-#if MC_DEBUG
- memcpy(b->debug, b->entries,
- b->mcidx * sizeof(struct multicall_entry));
-#endif
-
if (HYPERVISOR_multicall(b->entries, b->mcidx) != 0)
BUG();
for (i = 0; i < b->mcidx; i++)
if (b->entries[i].result < 0)
ret++;
+ }
+ if (WARN_ON(ret)) {
+ pr_err("%d of %d multicall(s) failed: cpu %d\n",
+ ret, b->mcidx, smp_processor_id());
+ for (i = 0; i < b->mcidx; i++) {
+ if (b->entries[i].result < 0) {
#if MC_DEBUG
- if (ret) {
- printk(KERN_ERR "%d multicall(s) failed: cpu %d\n",
- ret, smp_processor_id());
- dump_stack();
- for (i = 0; i < b->mcidx; i++) {
- printk(KERN_DEBUG " call %2d/%d: op=%lu arg=[%lx] result=%ld\t%pF\n",
- i+1, b->mcidx,
+ pr_err(" call %2d: op=%lu arg=[%lx] result=%ld\t%pF\n",
+ i + 1,
b->debug[i].op,
b->debug[i].args[0],
b->entries[i].result,
b->caller[i]);
+#else
+ pr_err(" call %2d: op=%lu arg=[%lx] result=%ld\n",
+ i + 1,
+ b->entries[i].op,
+ b->entries[i].args[0],
+ b->entries[i].result);
+#endif
}
}
-#endif
}
b->mcidx = 0;
b->cbidx = 0;
local_irq_restore(flags);
-
- WARN_ON(ret);
}
struct multicall_space __xen_mc_entry(size_t args)
/*
* The interface requires atomic updates on p2m elements.
- * xen_safe_write_ulong() is using __put_user which does an atomic
- * store via asm().
+ * xen_safe_write_ulong() is using an atomic store via asm().
*/
if (likely(!xen_safe_write_ulong(xen_p2m_addr + pfn, mfn)))
return true;
addr = xen_e820_table.entries[0].addr;
size = xen_e820_table.entries[0].size;
while (i < xen_e820_table.nr_entries) {
+ bool discard = false;
chunk_size = size;
type = xen_e820_table.entries[i].type;
xen_add_extra_mem(pfn_s, n_pfns);
xen_max_p2m_pfn = pfn_s + n_pfns;
} else
- type = E820_TYPE_UNUSABLE;
+ discard = true;
}
- xen_align_and_add_e820_region(addr, chunk_size, type);
+ if (!discard)
+ xen_align_and_add_e820_region(addr, chunk_size, type);
addr += chunk_size;
size -= chunk_size;
* Split spinlock implementation out into its own file, so it can be
* compiled in a FTRACE-compatible way.
*/
-#include <linux/kernel_stat.h>
+#include <linux/kernel.h>
#include <linux/spinlock.h>
-#include <linux/debugfs.h>
-#include <linux/log2.h>
-#include <linux/gfp.h>
#include <linux/slab.h>
+#include <linux/atomic.h>
#include <asm/paravirt.h>
#include <asm/qspinlock.h>
-#include <xen/interface/xen.h>
#include <xen/events.h>
#include "xen-ops.h"
-#include "debugfs.h"
static DEFINE_PER_CPU(int, lock_kicker_irq) = -1;
static DEFINE_PER_CPU(char *, irq_name);
+static DEFINE_PER_CPU(atomic_t, xen_qlock_wait_nest);
static bool xen_pvspin = true;
static void xen_qlock_kick(int cpu)
*/
static void xen_qlock_wait(u8 *byte, u8 val)
{
- unsigned long flags;
int irq = __this_cpu_read(lock_kicker_irq);
+ atomic_t *nest_cnt = this_cpu_ptr(&xen_qlock_wait_nest);
/* If kicker interrupts not initialized yet, just spin */
if (irq == -1 || in_nmi())
return;
- /* Guard against reentry. */
- local_irq_save(flags);
+ /* Detect reentry. */
+ atomic_inc(nest_cnt);
- /* If irq pending already clear it. */
- if (xen_test_irq_pending(irq)) {
+ /* If irq pending already and no nested call clear it. */
+ if (atomic_read(nest_cnt) == 1 && xen_test_irq_pending(irq)) {
xen_clear_irq_pending(irq);
} else if (READ_ONCE(*byte) == val) {
/* Block until irq becomes pending (or a spurious wakeup) */
xen_poll_irq(irq);
}
- local_irq_restore(flags);
+ atomic_dec(nest_cnt);
}
static irqreturn_t dummy_handler(int irq, void *dev_id)
# SPDX-License-Identifier: GPL-2.0
-config ZONE_DMA
- def_bool y
-
config XTENSA
def_bool y
select ARCH_HAS_SG_CHAIN
boot-elf boot-redboot: $(addprefix $(obj)/,$(subdir-y))
$(Q)$(MAKE) $(build)=$(obj)/$@ $(MAKECMDGOALS)
-OBJCOPYFLAGS = --strip-all -R .comment -R .note.gnu.build-id -O binary
+OBJCOPYFLAGS = --strip-all -R .comment -R .notes -O binary
vmlinux.bin: vmlinux FORCE
$(call if_changed,objcopy)
# error Linux requires the Xtensa Windowed Registers Option.
#endif
-#define ARCH_SLAB_MINALIGN XCHAL_DATA_WIDTH
+/* Xtensa ABI requires stack alignment to be at least 16 */
+
+#define STACK_ALIGN (XCHAL_DATA_WIDTH > 16 ? XCHAL_DATA_WIDTH : 16)
+
+#define ARCH_SLAB_MINALIGN STACK_ALIGN
/*
* User space process size: 1 GB.
DEFINE(THREAD_SP, offsetof (struct task_struct, thread.sp));
DEFINE(THREAD_CPENABLE, offsetof (struct thread_info, cpenable));
#if XTENSA_HAVE_COPROCESSORS
- DEFINE(THREAD_XTREGS_CP0, offsetof (struct thread_info, xtregs_cp));
- DEFINE(THREAD_XTREGS_CP1, offsetof (struct thread_info, xtregs_cp));
- DEFINE(THREAD_XTREGS_CP2, offsetof (struct thread_info, xtregs_cp));
- DEFINE(THREAD_XTREGS_CP3, offsetof (struct thread_info, xtregs_cp));
- DEFINE(THREAD_XTREGS_CP4, offsetof (struct thread_info, xtregs_cp));
- DEFINE(THREAD_XTREGS_CP5, offsetof (struct thread_info, xtregs_cp));
- DEFINE(THREAD_XTREGS_CP6, offsetof (struct thread_info, xtregs_cp));
- DEFINE(THREAD_XTREGS_CP7, offsetof (struct thread_info, xtregs_cp));
+ DEFINE(THREAD_XTREGS_CP0, offsetof(struct thread_info, xtregs_cp.cp0));
+ DEFINE(THREAD_XTREGS_CP1, offsetof(struct thread_info, xtregs_cp.cp1));
+ DEFINE(THREAD_XTREGS_CP2, offsetof(struct thread_info, xtregs_cp.cp2));
+ DEFINE(THREAD_XTREGS_CP3, offsetof(struct thread_info, xtregs_cp.cp3));
+ DEFINE(THREAD_XTREGS_CP4, offsetof(struct thread_info, xtregs_cp.cp4));
+ DEFINE(THREAD_XTREGS_CP5, offsetof(struct thread_info, xtregs_cp.cp5));
+ DEFINE(THREAD_XTREGS_CP6, offsetof(struct thread_info, xtregs_cp.cp6));
+ DEFINE(THREAD_XTREGS_CP7, offsetof(struct thread_info, xtregs_cp.cp7));
#endif
DEFINE(THREAD_XTREGS_USER, offsetof (struct thread_info, xtregs_user));
DEFINE(XTREGS_USER_SIZE, sizeof(xtregs_user_t));
initialize_mmu
#if defined(CONFIG_MMU) && XCHAL_HAVE_PTP_MMU && XCHAL_HAVE_SPANNING_WAY
rsr a2, excsave1
- movi a3, 0x08000000
+ movi a3, XCHAL_KSEG_PADDR
+ bltu a2, a3, 1f
+ sub a2, a2, a3
+ movi a3, XCHAL_KSEG_SIZE
bgeu a2, a3, 1f
- movi a3, 0xd0000000
+ movi a3, XCHAL_KSEG_CACHED_VADDR
add a2, a2, a3
wsr a2, excsave1
1:
void coprocessor_flush_all(struct thread_info *ti)
{
- unsigned long cpenable;
+ unsigned long cpenable, old_cpenable;
int i;
preempt_disable();
+ RSR_CPENABLE(old_cpenable);
cpenable = ti->cpenable;
+ WSR_CPENABLE(cpenable);
for (i = 0; i < XCHAL_CP_MAX; i++) {
if ((cpenable & 1) != 0 && coprocessor_owner[i] == ti)
coprocessor_flush(ti, i);
cpenable >>= 1;
}
+ WSR_CPENABLE(old_cpenable);
preempt_enable();
}
}
+#if XTENSA_HAVE_COPROCESSORS
+#define CP_OFFSETS(cp) \
+ { \
+ .elf_xtregs_offset = offsetof(elf_xtregs_t, cp), \
+ .ti_offset = offsetof(struct thread_info, xtregs_cp.cp), \
+ .sz = sizeof(xtregs_ ## cp ## _t), \
+ }
+
+static const struct {
+ size_t elf_xtregs_offset;
+ size_t ti_offset;
+ size_t sz;
+} cp_offsets[] = {
+ CP_OFFSETS(cp0),
+ CP_OFFSETS(cp1),
+ CP_OFFSETS(cp2),
+ CP_OFFSETS(cp3),
+ CP_OFFSETS(cp4),
+ CP_OFFSETS(cp5),
+ CP_OFFSETS(cp6),
+ CP_OFFSETS(cp7),
+};
+#endif
+
static int ptrace_getxregs(struct task_struct *child, void __user *uregs)
{
struct pt_regs *regs = task_pt_regs(child);
struct thread_info *ti = task_thread_info(child);
elf_xtregs_t __user *xtregs = uregs;
int ret = 0;
+ int i __maybe_unused;
if (!access_ok(VERIFY_WRITE, uregs, sizeof(elf_xtregs_t)))
return -EIO;
#if XTENSA_HAVE_COPROCESSORS
/* Flush all coprocessor registers to memory. */
coprocessor_flush_all(ti);
- ret |= __copy_to_user(&xtregs->cp0, &ti->xtregs_cp,
- sizeof(xtregs_coprocessor_t));
+
+ for (i = 0; i < ARRAY_SIZE(cp_offsets); ++i)
+ ret |= __copy_to_user((char __user *)xtregs +
+ cp_offsets[i].elf_xtregs_offset,
+ (const char *)ti +
+ cp_offsets[i].ti_offset,
+ cp_offsets[i].sz);
#endif
ret |= __copy_to_user(&xtregs->opt, ®s->xtregs_opt,
sizeof(xtregs->opt));
struct pt_regs *regs = task_pt_regs(child);
elf_xtregs_t *xtregs = uregs;
int ret = 0;
+ int i __maybe_unused;
if (!access_ok(VERIFY_READ, uregs, sizeof(elf_xtregs_t)))
return -EFAULT;
coprocessor_flush_all(ti);
coprocessor_release_all(ti);
- ret |= __copy_from_user(&ti->xtregs_cp, &xtregs->cp0,
- sizeof(xtregs_coprocessor_t));
+ for (i = 0; i < ARRAY_SIZE(cp_offsets); ++i)
+ ret |= __copy_from_user((char *)ti + cp_offsets[i].ti_offset,
+ (const char __user *)xtregs +
+ cp_offsets[i].elf_xtregs_offset,
+ cp_offsets[i].sz);
#endif
ret |= __copy_from_user(®s->xtregs_opt, &xtregs->opt,
sizeof(xtregs->opt));
.fixup : { *(.fixup) }
EXCEPTION_TABLE(16)
+ NOTES
/* Data section */
_sdata = .;
_end = .;
- .xt.lit : { *(.xt.lit) }
- .xt.prop : { *(.xt.prop) }
-
- .debug 0 : { *(.debug) }
- .line 0 : { *(.line) }
- .debug_srcinfo 0 : { *(.debug_srcinfo) }
- .debug_sfnames 0 : { *(.debug_sfnames) }
- .debug_aranges 0 : { *(.debug_aranges) }
- .debug_pubnames 0 : { *(.debug_pubnames) }
- .debug_info 0 : { *(.debug_info) }
- .debug_abbrev 0 : { *(.debug_abbrev) }
- .debug_line 0 : { *(.debug_line) }
- .debug_frame 0 : { *(.debug_frame) }
- .debug_str 0 : { *(.debug_str) }
- .debug_loc 0 : { *(.debug_loc) }
- .debug_macinfo 0 : { *(.debug_macinfo) }
- .debug_weaknames 0 : { *(.debug_weaknames) }
- .debug_funcnames 0 : { *(.debug_funcnames) }
- .debug_typenames 0 : { *(.debug_typenames) }
- .debug_varnames 0 : { *(.debug_varnames) }
-
- .xt.insn 0 :
- {
- *(.xt.insn)
- *(.gnu.linkonce.x*)
- }
+ DWARF_DEBUG
- .xt.lit 0 :
- {
- *(.xt.lit)
- *(.gnu.linkonce.p*)
- }
+ .xt.prop 0 : { KEEP(*(.xt.prop .xt.prop.* .gnu.linkonce.prop.*)) }
+ .xt.insn 0 : { KEEP(*(.xt.insn .xt.insn.* .gnu.linkonce.x*)) }
+ .xt.lit 0 : { KEEP(*(.xt.lit .xt.lit.* .gnu.linkonce.p*)) }
/* Sections to be discarded */
DISCARDS
{
/* All pages are DMA-able, so we put them all in the DMA zone. */
unsigned long zones_size[MAX_NR_ZONES] = {
- [ZONE_DMA] = max_low_pfn - ARCH_PFN_OFFSET,
+ [ZONE_NORMAL] = max_low_pfn - ARCH_PFN_OFFSET,
#ifdef CONFIG_HIGHMEM
[ZONE_HIGHMEM] = max_pfn - max_low_pfn,
#endif
uint64_t serial_nr;
rcu_read_lock();
- serial_nr = __bio_blkcg(bio)->css.serial_nr;
+ serial_nr = bio_blkcg(bio)->css.serial_nr;
/*
* Check whether blkcg has changed. The condition may trigger
if (unlikely(!bfqd) || likely(bic->blkcg_serial_nr == serial_nr))
goto out;
- bfqg = __bfq_bic_change_cgroup(bfqd, bic, __bio_blkcg(bio));
+ bfqg = __bfq_bic_change_cgroup(bfqd, bic, bio_blkcg(bio));
/*
* Update blkg_path for bfq_log_* functions. We cache this
* path, and update it here, for the following
bfqd->queue_weights_tree.rb_node->rb_right)
#ifdef CONFIG_BFQ_GROUP_IOSCHED
) ||
- (bfqd->num_active_groups > 0
+ (bfqd->num_groups_with_pending_reqs > 0
#endif
);
}
*/
break;
}
- bfqd->num_active_groups--;
+
+ /*
+ * The decrement of num_groups_with_pending_reqs is
+ * not performed immediately upon the deactivation of
+ * entity, but it is delayed to when it also happens
+ * that the first leaf descendant bfqq of entity gets
+ * all its pending requests completed. The following
+ * instructions perform this delayed decrement, if
+ * needed. See the comments on
+ * num_groups_with_pending_reqs for details.
+ */
+ if (entity->in_groups_with_pending_reqs) {
+ entity->in_groups_with_pending_reqs = false;
+ bfqd->num_groups_with_pending_reqs--;
+ }
}
}
* fact, if there are active groups, then, for condition (i)
* to become false, it is enough that an active group contains
* more active processes or sub-groups than some other active
- * group. We address this issue with the following bi-modal
- * behavior, implemented in the function
+ * group. More precisely, for condition (i) to hold because of
+ * such a group, it is not even necessary that the group is
+ * (still) active: it is sufficient that, even if the group
+ * has become inactive, some of its descendant processes still
+ * have some request already dispatched but still waiting for
+ * completion. In fact, requests have still to be guaranteed
+ * their share of the throughput even after being
+ * dispatched. In this respect, it is easy to show that, if a
+ * group frequently becomes inactive while still having
+ * in-flight requests, and if, when this happens, the group is
+ * not considered in the calculation of whether the scenario
+ * is asymmetric, then the group may fail to be guaranteed its
+ * fair share of the throughput (basically because idling may
+ * not be performed for the descendant processes of the group,
+ * but it had to be). We address this issue with the
+ * following bi-modal behavior, implemented in the function
* bfq_symmetric_scenario().
*
- * If there are active groups, then the scenario is tagged as
+ * If there are groups with requests waiting for completion
+ * (as commented above, some of these groups may even be
+ * already inactive), then the scenario is tagged as
* asymmetric, conservatively, without checking any of the
* conditions (i) and (ii). So the device is idled for bfqq.
* This behavior matches also the fact that groups are created
- * exactly if controlling I/O (to preserve bandwidth and
- * latency guarantees) is a primary concern.
+ * exactly if controlling I/O is a primary concern (to
+ * preserve bandwidth and latency guarantees).
*
- * On the opposite end, if there are no active groups, then
- * only condition (i) is actually controlled, i.e., provided
- * that condition (i) holds, idling is not performed,
- * regardless of whether condition (ii) holds. In other words,
- * only if condition (i) does not hold, then idling is
- * allowed, and the device tends to be prevented from queueing
- * many requests, possibly of several processes. Since there
- * are no active groups, then, to control condition (i) it is
- * enough to check whether all active queues have the same
- * weight.
+ * On the opposite end, if there are no groups with requests
+ * waiting for completion, then only condition (i) is actually
+ * controlled, i.e., provided that condition (i) holds, idling
+ * is not performed, regardless of whether condition (ii)
+ * holds. In other words, only if condition (i) does not hold,
+ * then idling is allowed, and the device tends to be
+ * prevented from queueing many requests, possibly of several
+ * processes. Since there are no groups with requests waiting
+ * for completion, then, to control condition (i) it is enough
+ * to check just whether all the queues with requests waiting
+ * for completion also have the same weight.
*
* Not checking condition (ii) evidently exposes bfqq to the
* risk of getting less throughput than its fair share.
* bfqq is weight-raised is checked explicitly here. More
* precisely, the compound condition below takes into account
* also the fact that, even if bfqq is being weight-raised,
- * the scenario is still symmetric if all active queues happen
- * to be weight-raised. Actually, we should be even more
- * precise here, and differentiate between interactive weight
- * raising and soft real-time weight raising.
+ * the scenario is still symmetric if all queues with requests
+ * waiting for completion happen to be
+ * weight-raised. Actually, we should be even more precise
+ * here, and differentiate between interactive weight raising
+ * and soft real-time weight raising.
*
* As a side note, it is worth considering that the above
* device-idling countermeasures may however fail in the
rcu_read_lock();
- bfqg = bfq_find_set_group(bfqd, __bio_blkcg(bio));
+ bfqg = bfq_find_set_group(bfqd, bio_blkcg(bio));
if (!bfqg) {
bfqq = &bfqd->oom_bfqq;
goto out;
bfqd->idle_slice_timer.function = bfq_idle_slice_timer;
bfqd->queue_weights_tree = RB_ROOT;
- bfqd->num_active_groups = 0;
+ bfqd->num_groups_with_pending_reqs = 0;
INIT_LIST_HEAD(&bfqd->active_list);
INIT_LIST_HEAD(&bfqd->idle_list);
/* flag, set to request a weight, ioprio or ioprio_class change */
int prio_changed;
+
+ /* flag, set if the entity is counted in groups_with_pending_reqs */
+ bool in_groups_with_pending_reqs;
};
struct bfq_group;
* bfq_weights_tree_[add|remove] for further details).
*/
struct rb_root queue_weights_tree;
+
/*
- * number of groups with requests still waiting for completion
+ * Number of groups with at least one descendant process that
+ * has at least one request waiting for completion. Note that
+ * this accounts for also requests already dispatched, but not
+ * yet completed. Therefore this number of groups may differ
+ * (be larger) than the number of active groups, as a group is
+ * considered active only if its corresponding entity has
+ * descendant queues with at least one request queued. This
+ * number is used to decide whether a scenario is symmetric.
+ * For a detailed explanation see comments on the computation
+ * of the variable asymmetric_scenario in the function
+ * bfq_better_to_idle().
+ *
+ * However, it is hard to compute this number exactly, for
+ * groups with multiple descendant processes. Consider a group
+ * that is inactive, i.e., that has no descendant process with
+ * pending I/O inside BFQ queues. Then suppose that
+ * num_groups_with_pending_reqs is still accounting for this
+ * group, because the group has descendant processes with some
+ * I/O request still in flight. num_groups_with_pending_reqs
+ * should be decremented when the in-flight request of the
+ * last descendant process is finally completed (assuming that
+ * nothing else has changed for the group in the meantime, in
+ * terms of composition of the group and active/inactive state of child
+ * groups and processes). To accomplish this, an additional
+ * pending-request counter must be added to entities, and must
+ * be updated correctly. To avoid this additional field and operations,
+ * we resort to the following tradeoff between simplicity and
+ * accuracy: for an inactive group that is still counted in
+ * num_groups_with_pending_reqs, we decrement
+ * num_groups_with_pending_reqs when the first descendant
+ * process of the group remains with no request waiting for
+ * completion.
+ *
+ * Even this simpler decrement strategy requires a little
+ * carefulness: to avoid multiple decrements, we flag a group,
+ * more precisely an entity representing a group, as still
+ * counted in num_groups_with_pending_reqs when it becomes
+ * inactive. Then, when the first descendant queue of the
+ * entity remains with no request waiting for completion,
+ * num_groups_with_pending_reqs is decremented, and this flag
+ * is reset. After this flag is reset for the entity,
+ * num_groups_with_pending_reqs won't be decremented any
+ * longer in case a new descendant queue of the entity remains
+ * with no request waiting for completion.
*/
- unsigned int num_active_groups;
+ unsigned int num_groups_with_pending_reqs;
/*
* Number of bfq_queues containing requests (including the
container_of(entity, struct bfq_group, entity);
struct bfq_data *bfqd = bfqg->bfqd;
- bfqd->num_active_groups++;
+ if (!entity->in_groups_with_pending_reqs) {
+ entity->in_groups_with_pending_reqs = true;
+ bfqd->num_groups_with_pending_reqs++;
+ }
}
#endif
if (bio_flagged(bio_src, BIO_THROTTLED))
bio_set_flag(bio, BIO_THROTTLED);
bio->bi_opf = bio_src->bi_opf;
+ bio->bi_ioprio = bio_src->bi_ioprio;
bio->bi_write_hint = bio_src->bi_write_hint;
bio->bi_iter = bio_src->bi_iter;
bio->bi_io_vec = bio_src->bi_io_vec;
- bio_clone_blkg_association(bio, bio_src);
-
- blkcg_bio_issue_init(bio);
+ bio_clone_blkcg_association(bio, bio_src);
}
EXPORT_SYMBOL(__bio_clone_fast);
/*
* success
*/
- if (((iter->type & WRITE) && (!map_data || !map_data->null_mapped)) ||
+ if ((iov_iter_rw(iter) == WRITE && (!map_data || !map_data->null_mapped)) ||
(map_data && map_data->from_user)) {
ret = bio_copy_from_iter(bio, iter);
if (ret)
goto cleanup;
} else {
+ zero_fill_bio(bio);
iov_iter_advance(iter, bio->bi_iter.bi_size);
}
#ifdef CONFIG_BLK_CGROUP
-/**
- * bio_associate_blkg - associate a bio with the a blkg
- * @bio: target bio
- * @blkg: the blkg to associate
- *
- * This tries to associate @bio with the specified blkg. Association failure
- * is handled by walking up the blkg tree. Therefore, the blkg associated can
- * be anything between @blkg and the root_blkg. This situation only happens
- * when a cgroup is dying and then the remaining bios will spill to the closest
- * alive blkg.
- *
- * A reference will be taken on the @blkg and will be released when @bio is
- * freed.
- */
-int bio_associate_blkg(struct bio *bio, struct blkcg_gq *blkg)
-{
- if (unlikely(bio->bi_blkg))
- return -EBUSY;
- bio->bi_blkg = blkg_tryget_closest(blkg);
- return 0;
-}
-
-/**
- * __bio_associate_blkg_from_css - internal blkg association function
- *
- * This in the core association function that all association paths rely on.
- * A blkg reference is taken which is released upon freeing of the bio.
- */
-static int __bio_associate_blkg_from_css(struct bio *bio,
- struct cgroup_subsys_state *css)
-{
- struct request_queue *q = bio->bi_disk->queue;
- struct blkcg_gq *blkg;
- int ret;
-
- rcu_read_lock();
-
- if (!css || !css->parent)
- blkg = q->root_blkg;
- else
- blkg = blkg_lookup_create(css_to_blkcg(css), q);
-
- ret = bio_associate_blkg(bio, blkg);
-
- rcu_read_unlock();
- return ret;
-}
-
-/**
- * bio_associate_blkg_from_css - associate a bio with a specified css
- * @bio: target bio
- * @css: target css
- *
- * Associate @bio with the blkg found by combining the css's blkg and the
- * request_queue of the @bio. This falls back to the queue's root_blkg if
- * the association fails with the css.
- */
-int bio_associate_blkg_from_css(struct bio *bio,
- struct cgroup_subsys_state *css)
-{
- if (unlikely(bio->bi_blkg))
- return -EBUSY;
- return __bio_associate_blkg_from_css(bio, css);
-}
-EXPORT_SYMBOL_GPL(bio_associate_blkg_from_css);
-
#ifdef CONFIG_MEMCG
/**
- * bio_associate_blkg_from_page - associate a bio with the page's blkg
+ * bio_associate_blkcg_from_page - associate a bio with the page's blkcg
* @bio: target bio
* @page: the page to lookup the blkcg from
*
- * Associate @bio with the blkg from @page's owning memcg and the respective
- * request_queue. If cgroup_e_css returns NULL, fall back to the queue's
- * root_blkg.
- *
- * Note: this must be called after bio has an associated device.
+ * Associate @bio with the blkcg from @page's owning memcg. This works like
+ * every other associate function wrt references.
*/
-int bio_associate_blkg_from_page(struct bio *bio, struct page *page)
+int bio_associate_blkcg_from_page(struct bio *bio, struct page *page)
{
- struct cgroup_subsys_state *css;
- int ret;
+ struct cgroup_subsys_state *blkcg_css;
- if (unlikely(bio->bi_blkg))
+ if (unlikely(bio->bi_css))
return -EBUSY;
if (!page->mem_cgroup)
return 0;
-
- rcu_read_lock();
-
- css = cgroup_e_css(page->mem_cgroup->css.cgroup, &io_cgrp_subsys);
-
- ret = __bio_associate_blkg_from_css(bio, css);
-
- rcu_read_unlock();
- return ret;
+ blkcg_css = cgroup_get_e_css(page->mem_cgroup->css.cgroup,
+ &io_cgrp_subsys);
+ bio->bi_css = blkcg_css;
+ return 0;
}
#endif /* CONFIG_MEMCG */
/**
- * bio_associate_create_blkg - associate a bio with a blkg from q
- * @q: request_queue where bio is going
+ * bio_associate_blkcg - associate a bio with the specified blkcg
* @bio: target bio
+ * @blkcg_css: css of the blkcg to associate
+ *
+ * Associate @bio with the blkcg specified by @blkcg_css. Block layer will
+ * treat @bio as if it were issued by a task which belongs to the blkcg.
*
- * Associate @bio with the blkg found from the bio's css and the request_queue.
- * If one is not found, bio_lookup_blkg creates the blkg. This falls back to
- * the queue's root_blkg if association fails.
+ * This function takes an extra reference of @blkcg_css which will be put
+ * when @bio is released. The caller must own @bio and is responsible for
+ * synchronizing calls to this function.
*/
-int bio_associate_create_blkg(struct request_queue *q, struct bio *bio)
+int bio_associate_blkcg(struct bio *bio, struct cgroup_subsys_state *blkcg_css)
{
- struct cgroup_subsys_state *css;
- int ret = 0;
-
- /* someone has already associated this bio with a blkg */
- if (bio->bi_blkg)
- return ret;
-
- rcu_read_lock();
-
- css = blkcg_css();
-
- ret = __bio_associate_blkg_from_css(bio, css);
-
- rcu_read_unlock();
- return ret;
+ if (unlikely(bio->bi_css))
+ return -EBUSY;
+ css_get(blkcg_css);
+ bio->bi_css = blkcg_css;
+ return 0;
}
+EXPORT_SYMBOL_GPL(bio_associate_blkcg);
/**
- * bio_reassociate_blkg - reassociate a bio with a blkg from q
- * @q: request_queue where bio is going
+ * bio_associate_blkg - associate a bio with the specified blkg
* @bio: target bio
+ * @blkg: the blkg to associate
*
- * When submitting a bio, multiple recursive calls to make_request() may occur.
- * This causes the initial associate done in blkcg_bio_issue_check() to be
- * incorrect and reference the prior request_queue. This performs reassociation
- * when this situation happens.
+ * Associate @bio with the blkg specified by @blkg. This is the queue specific
+ * blkcg information associated with the @bio, a reference will be taken on the
+ * @blkg and will be freed when the bio is freed.
*/
-int bio_reassociate_blkg(struct request_queue *q, struct bio *bio)
+int bio_associate_blkg(struct bio *bio, struct blkcg_gq *blkg)
{
- if (bio->bi_blkg) {
- blkg_put(bio->bi_blkg);
- bio->bi_blkg = NULL;
- }
-
- return bio_associate_create_blkg(q, bio);
+ if (unlikely(bio->bi_blkg))
+ return -EBUSY;
+ if (!blkg_try_get(blkg))
+ return -ENODEV;
+ bio->bi_blkg = blkg;
+ return 0;
}
/**
put_io_context(bio->bi_ioc);
bio->bi_ioc = NULL;
}
+ if (bio->bi_css) {
+ css_put(bio->bi_css);
+ bio->bi_css = NULL;
+ }
if (bio->bi_blkg) {
blkg_put(bio->bi_blkg);
bio->bi_blkg = NULL;
}
/**
- * bio_clone_blkg_association - clone blkg association from src to dst bio
+ * bio_clone_blkcg_association - clone blkcg association from src to dst bio
* @dst: destination bio
* @src: source bio
*/
-void bio_clone_blkg_association(struct bio *dst, struct bio *src)
+void bio_clone_blkcg_association(struct bio *dst, struct bio *src)
{
- if (src->bi_blkg)
- bio_associate_blkg(dst, src->bi_blkg);
+ if (src->bi_css)
+ WARN_ON(bio_associate_blkcg(dst, src->bi_css));
}
-EXPORT_SYMBOL_GPL(bio_clone_blkg_association);
+EXPORT_SYMBOL_GPL(bio_clone_blkcg_association);
#endif /* CONFIG_BLK_CGROUP */
static void __init biovec_init_slabs(void)
kfree(blkg);
}
-static void __blkg_release(struct rcu_head *rcu)
-{
- struct blkcg_gq *blkg = container_of(rcu, struct blkcg_gq, rcu_head);
-
- percpu_ref_exit(&blkg->refcnt);
-
- /* release the blkcg and parent blkg refs this blkg has been holding */
- css_put(&blkg->blkcg->css);
- if (blkg->parent)
- blkg_put(blkg->parent);
-
- wb_congested_put(blkg->wb_congested);
-
- blkg_free(blkg);
-}
-
-/*
- * A group is RCU protected, but having an rcu lock does not mean that one
- * can access all the fields of blkg and assume these are valid. For
- * example, don't try to follow throtl_data and request queue links.
- *
- * Having a reference to blkg under an rcu allows accesses to only values
- * local to groups like group stats and group rate limits.
- */
-static void blkg_release(struct percpu_ref *ref)
-{
- struct blkcg_gq *blkg = container_of(ref, struct blkcg_gq, refcnt);
-
- call_rcu(&blkg->rcu_head, __blkg_release);
-}
-
/**
* blkg_alloc - allocate a blkg
* @blkcg: block cgroup the new blkg is associated with
blkg->q = q;
INIT_LIST_HEAD(&blkg->q_node);
blkg->blkcg = blkcg;
+ atomic_set(&blkg->refcnt, 1);
/* root blkg uses @q->root_rl, init rl only for !root blkgs */
if (blkcg != &blkcg_root) {
blkg_get(blkg->parent);
}
- ret = percpu_ref_init(&blkg->refcnt, blkg_release, 0,
- GFP_NOWAIT | __GFP_NOWARN);
- if (ret)
- goto err_cancel_ref;
-
/* invoke per-policy init */
for (i = 0; i < BLKCG_MAX_POLS; i++) {
struct blkcg_policy *pol = blkcg_policy[i];
blkg_put(blkg);
return ERR_PTR(ret);
-err_cancel_ref:
- percpu_ref_exit(&blkg->refcnt);
err_put_congested:
wb_congested_put(wb_congested);
err_put_css:
}
/**
- * __blkg_lookup_create - lookup blkg, try to create one if not there
+ * blkg_lookup_create - lookup blkg, try to create one if not there
* @blkcg: blkcg of interest
* @q: request_queue of interest
*
* that all non-root blkg's have access to the parent blkg. This function
* should be called under RCU read lock and @q->queue_lock.
*
- * Returns the blkg or the closest blkg if blkg_create fails as it walks
- * down from root.
+ * Returns pointer to the looked up or created blkg on success, ERR_PTR()
+ * value on error. If @q is dead, returns ERR_PTR(-EINVAL). If @q is not
+ * dead and bypassing, returns ERR_PTR(-EBUSY).
*/
-struct blkcg_gq *__blkg_lookup_create(struct blkcg *blkcg,
- struct request_queue *q)
+struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
+ struct request_queue *q)
{
struct blkcg_gq *blkg;
* we shouldn't allow anything to go through for a bypassing queue.
*/
if (unlikely(blk_queue_bypass(q)))
- return q->root_blkg;
+ return ERR_PTR(blk_queue_dying(q) ? -ENODEV : -EBUSY);
blkg = __blkg_lookup(blkcg, q, true);
if (blkg)
/*
* Create blkgs walking down from blkcg_root to @blkcg, so that all
- * non-root blkgs have access to their parents. Returns the closest
- * blkg to the intended blkg should blkg_create() fail.
+ * non-root blkgs have access to their parents.
*/
while (true) {
struct blkcg *pos = blkcg;
struct blkcg *parent = blkcg_parent(blkcg);
- struct blkcg_gq *ret_blkg = q->root_blkg;
-
- while (parent) {
- blkg = __blkg_lookup(parent, q, false);
- if (blkg) {
- /* remember closest blkg */
- ret_blkg = blkg;
- break;
- }
+
+ while (parent && !__blkg_lookup(parent, q, false)) {
pos = parent;
parent = blkcg_parent(parent);
}
blkg = blkg_create(pos, q, NULL);
- if (IS_ERR(blkg))
- return ret_blkg;
- if (pos == blkcg)
+ if (pos == blkcg || IS_ERR(blkg))
return blkg;
}
}
-/**
- * blkg_lookup_create - find or create a blkg
- * @blkcg: target block cgroup
- * @q: target request_queue
- *
- * This looks up or creates the blkg representing the unique pair
- * of the blkcg and the request_queue.
- */
-struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
- struct request_queue *q)
-{
- struct blkcg_gq *blkg = blkg_lookup(blkcg, q);
- unsigned long flags;
-
- if (unlikely(!blkg)) {
- spin_lock_irqsave(q->queue_lock, flags);
-
- blkg = __blkg_lookup_create(blkcg, q);
-
- spin_unlock_irqrestore(q->queue_lock, flags);
- }
-
- return blkg;
-}
-
static void blkg_destroy(struct blkcg_gq *blkg)
{
struct blkcg *blkcg = blkg->blkcg;
* Put the reference taken at the time of creation so that when all
* queues are gone, group can be destroyed.
*/
- percpu_ref_kill(&blkg->refcnt);
+ blkg_put(blkg);
}
/**
q->root_rl.blkg = NULL;
}
+/*
+ * A group is RCU protected, but having an rcu lock does not mean that one
+ * can access all the fields of blkg and assume these are valid. For
+ * example, don't try to follow throtl_data and request queue links.
+ *
+ * Having a reference to blkg under an rcu allows accesses to only values
+ * local to groups like group stats and group rate limits.
+ */
+void __blkg_release_rcu(struct rcu_head *rcu_head)
+{
+ struct blkcg_gq *blkg = container_of(rcu_head, struct blkcg_gq, rcu_head);
+
+ /* release the blkcg and parent blkg refs this blkg has been holding */
+ css_put(&blkg->blkcg->css);
+ if (blkg->parent)
+ blkg_put(blkg->parent);
+
+ wb_congested_put(blkg->wb_congested);
+
+ blkg_free(blkg);
+}
+EXPORT_SYMBOL_GPL(__blkg_release_rcu);
+
/*
* The next function used by blk_queue_for_each_rl(). It's a bit tricky
* because the root blkg uses @q->root_rl instead of its own rl.
blkg = blkg_lookup(blkcg, q);
if (!blkg)
goto out;
- if (!blkg_tryget(blkg))
+ blkg = blkg_try_get(blkg);
+ if (!blkg)
goto out;
rcu_read_unlock();
* prevent that q->request_fn() gets invoked after draining finished.
*/
blk_freeze_queue(q);
+
+ rq_qos_exit(q);
+
spin_lock_irq(lock);
queue_flag_set(QUEUE_FLAG_DEAD, q);
spin_unlock_irq(lock);
* dispatch may still be in-progress since we dispatch requests
* from more than one contexts.
*
- * No need to quiesce queue if it isn't initialized yet since
- * blk_freeze_queue() should be enough for cases of passthrough
- * request.
+ * We rely on driver to deal with the race in case that queue
+ * initialization isn't done.
*/
if (q->mq_ops && blk_queue_init_done(q))
blk_mq_quiesce_queue(q);
if (q)
blk_queue_exit(q);
q = bio->bi_disk->queue;
- bio_reassociate_blkg(q, bio);
flags = 0;
if (bio->bi_opf & REQ_NOWAIT)
flags = BLK_MQ_REQ_NOWAIT;
spinlock_t *lock)
{
struct blk_iolatency *blkiolat = BLKIOLATENCY(rqos);
- struct blkcg_gq *blkg = bio->bi_blkg;
+ struct blkcg *blkcg;
+ struct blkcg_gq *blkg;
+ struct request_queue *q = rqos->q;
bool issue_as_root = bio_issue_as_root_blkg(bio);
if (!blk_iolatency_enabled(blkiolat))
return;
+ rcu_read_lock();
+ blkcg = bio_blkcg(bio);
+ bio_associate_blkcg(bio, &blkcg->css);
+ blkg = blkg_lookup(blkcg, q);
+ if (unlikely(!blkg)) {
+ if (!lock)
+ spin_lock_irq(q->queue_lock);
+ blkg = blkg_lookup_create(blkcg, q);
+ if (IS_ERR(blkg))
+ blkg = NULL;
+ if (!lock)
+ spin_unlock_irq(q->queue_lock);
+ }
+ if (!blkg)
+ goto out;
+
+ bio_issue_init(&bio->bi_issue, bio_sectors(bio));
+ bio_associate_blkg(bio, blkg);
+out:
+ rcu_read_unlock();
while (blkg && blkg->parent) {
struct iolatency_grp *iolat = blkg_to_lat(blkg);
if (!iolat) {
* We could be exiting, don't access the pd unless we have a
* ref on the blkg.
*/
- if (!blkg_tryget(blkg))
+ if (!blkg_try_get(blkg))
continue;
iolat = blkg_to_lat(blkg);
if ((sector | nr_sects) & bs_mask)
return -EINVAL;
- while (nr_sects) {
- unsigned int req_sects = nr_sects;
- sector_t end_sect;
+ if (!nr_sects)
+ return -EINVAL;
- if (!req_sects)
- goto fail;
- if (req_sects > UINT_MAX >> 9)
- req_sects = UINT_MAX >> 9;
+ while (nr_sects) {
+ sector_t req_sects = min_t(sector_t, nr_sects,
+ bio_allowed_max_sectors(q));
- end_sect = sector + req_sects;
+ WARN_ON_ONCE((req_sects << 9) > UINT_MAX);
bio = blk_next_bio(bio, 0, gfp_mask);
bio->bi_iter.bi_sector = sector;
bio_set_op_attrs(bio, op, 0);
bio->bi_iter.bi_size = req_sects << 9;
+ sector += req_sects;
nr_sects -= req_sects;
- sector = end_sect;
/*
* We can loop for a long time in here, if someone does
*biop = bio;
return 0;
-
-fail:
- if (bio) {
- submit_bio_wait(bio);
- bio_put(bio);
- }
- *biop = NULL;
- return -EOPNOTSUPP;
}
EXPORT_SYMBOL(__blkdev_issue_discard);
return -EOPNOTSUPP;
/* Ensure that max_write_same_sectors doesn't overflow bi_size */
- max_write_same_sectors = UINT_MAX >> 9;
+ max_write_same_sectors = bio_allowed_max_sectors(q);
while (nr_sects) {
bio = blk_next_bio(bio, 1, gfp_mask);
bio_get_first_bvec(prev_rq->bio, &pb);
else
bio_get_first_bvec(prev, &pb);
- if (pb.bv_offset)
+ if (pb.bv_offset & queue_virt_boundary(q))
return true;
/*
/* Zero-sector (unknown) and one-sector granularities are the same. */
granularity = max(q->limits.discard_granularity >> 9, 1U);
- max_discard_sectors = min(q->limits.max_discard_sectors, UINT_MAX >> 9);
+ max_discard_sectors = min(q->limits.max_discard_sectors,
+ bio_allowed_max_sectors(q));
max_discard_sectors -= max_discard_sectors % granularity;
if (unlikely(!max_discard_sectors)) {
part_stat_unlock();
}
}
+/*
+ * Two cases of handling DISCARD merge:
+ * If max_discard_segments > 1, the driver takes every bio
+ * as a range and send them to controller together. The ranges
+ * needn't to be contiguous.
+ * Otherwise, the bios/requests will be handled as same as
+ * others which should be contiguous.
+ */
+static inline bool blk_discard_mergable(struct request *req)
+{
+ if (req_op(req) == REQ_OP_DISCARD &&
+ queue_max_discard_segments(req->q) > 1)
+ return true;
+ return false;
+}
+
+enum elv_merge blk_try_req_merge(struct request *req, struct request *next)
+{
+ if (blk_discard_mergable(req))
+ return ELEVATOR_DISCARD_MERGE;
+ else if (blk_rq_pos(req) + blk_rq_sectors(req) == blk_rq_pos(next))
+ return ELEVATOR_BACK_MERGE;
+
+ return ELEVATOR_NO_MERGE;
+}
/*
* For non-mq, this has to be called with the request spinlock acquired.
if (req_op(req) != req_op(next))
return NULL;
- /*
- * not contiguous
- */
- if (blk_rq_pos(req) + blk_rq_sectors(req) != blk_rq_pos(next))
- return NULL;
-
if (rq_data_dir(req) != rq_data_dir(next)
|| req->rq_disk != next->rq_disk
|| req_no_special_merge(next))
* counts here. Handle DISCARDs separately, as they
* have separate settings.
*/
- if (req_op(req) == REQ_OP_DISCARD) {
+
+ switch (blk_try_req_merge(req, next)) {
+ case ELEVATOR_DISCARD_MERGE:
if (!req_attempt_discard_merge(q, req, next))
return NULL;
- } else if (!ll_merge_requests_fn(q, req, next))
+ break;
+ case ELEVATOR_BACK_MERGE:
+ if (!ll_merge_requests_fn(q, req, next))
+ return NULL;
+ break;
+ default:
return NULL;
+ }
/*
* If failfast settings disagree or any of the two is already
req->__data_len += blk_rq_bytes(next);
- if (req_op(req) != REQ_OP_DISCARD)
+ if (!blk_discard_mergable(req))
elv_merge_requests(q, req, next);
/*
enum elv_merge blk_try_merge(struct request *rq, struct bio *bio)
{
- if (req_op(rq) == REQ_OP_DISCARD &&
- queue_max_discard_segments(rq->q) > 1)
+ if (blk_discard_mergable(rq))
return ELEVATOR_DISCARD_MERGE;
else if (blk_rq_pos(rq) + blk_rq_sectors(rq) == bio->bi_iter.bi_sector)
return ELEVATOR_BACK_MERGE;
if (bypass_insert)
return BLK_STS_RESOURCE;
- blk_mq_sched_insert_request(rq, false, run_queue, false);
+ blk_mq_request_bypass_insert(rq, run_queue);
return BLK_STS_OK;
}
ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
- blk_mq_sched_insert_request(rq, false, true, false);
+ blk_mq_request_bypass_insert(rq, true);
else if (ret != BLK_STS_OK)
blk_mq_end_request(rq, ret);
if (ret != BLK_STS_OK) {
if (ret == BLK_STS_RESOURCE ||
ret == BLK_STS_DEV_RESOURCE) {
- list_add(&rq->queuelist, list);
+ blk_mq_request_bypass_insert(rq,
+ list_empty(list));
break;
}
blk_mq_end_request(rq, ret);
kobject_del(&q->kobj);
blk_trace_remove_sysfs(disk_to_dev(disk));
- rq_qos_exit(q);
-
mutex_lock(&q->sysfs_lock);
if (q->request_fn || (q->mq_ops && q->elevator))
elv_unregister_queue(q);
}
#endif
+static void blk_throtl_assoc_bio(struct throtl_grp *tg, struct bio *bio)
+{
+#ifdef CONFIG_BLK_DEV_THROTTLING_LOW
+ /* fallback to root_blkg if we fail to get a blkg ref */
+ if (bio->bi_css && (bio_associate_blkg(bio, tg_to_blkg(tg)) == -ENODEV))
+ bio_associate_blkg(bio, bio->bi_disk->queue->root_blkg);
+ bio_issue_init(&bio->bi_issue, bio_sectors(bio));
+#endif
+}
+
bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
struct bio *bio)
{
struct throtl_qnode *qn = NULL;
- struct throtl_grp *tg = blkg_to_tg(blkg);
+ struct throtl_grp *tg = blkg_to_tg(blkg ?: q->root_blkg);
struct throtl_service_queue *sq;
bool rw = bio_data_dir(bio);
bool throttled = false;
if (unlikely(blk_queue_bypass(q)))
goto out_unlock;
+ blk_throtl_assoc_bio(tg, bio);
blk_throtl_update_idletime(tg);
sq = &tg->service_queue;
static inline bool __bvec_gap_to_prev(struct request_queue *q,
struct bio_vec *bprv, unsigned int offset)
{
- return offset ||
+ return (offset & queue_virt_boundary(q)) ||
((bprv->bv_offset + bprv->bv_len) & queue_virt_boundary(q));
}
return rq->__deadline & ~0x1UL;
}
+/*
+ * The max size one bio can handle is UINT_MAX becasue bvec_iter.bi_size
+ * is defined as 'unsigned int', meantime it has to aligned to with logical
+ * block size which is the minimum accepted unit by hardware.
+ */
+static inline unsigned int bio_allowed_max_sectors(struct request_queue *q)
+{
+ return round_down(UINT_MAX, queue_logical_block_size(q)) >> 9;
+}
+
/*
* Internal io_context interface
*/
return NULL;
bio->bi_disk = bio_src->bi_disk;
bio->bi_opf = bio_src->bi_opf;
+ bio->bi_ioprio = bio_src->bi_ioprio;
bio->bi_write_hint = bio_src->bi_write_hint;
bio->bi_iter.bi_sector = bio_src->bi_iter.bi_sector;
bio->bi_iter.bi_size = bio_src->bi_iter.bi_size;
}
}
- bio_clone_blkg_association(bio, bio_src);
-
- blkcg_bio_issue_init(bio);
+ bio_clone_blkcg_association(bio, bio_src);
return bio;
}
uint64_t serial_nr;
rcu_read_lock();
- serial_nr = __bio_blkcg(bio)->css.serial_nr;
+ serial_nr = bio_blkcg(bio)->css.serial_nr;
rcu_read_unlock();
/*
struct cfq_group *cfqg;
rcu_read_lock();
- cfqg = cfq_lookup_cfqg(cfqd, __bio_blkcg(bio));
+ cfqg = cfq_lookup_cfqg(cfqd, bio_blkcg(bio));
if (!cfqg) {
cfqq = &cfqd->oom_cfqq;
goto out;
cipher algorithms.
config CRYPTO_STATS
- bool "Crypto usage statistics for User-space"
+ bool
help
This option enables the gathering of crypto stats.
This will collect:
appropriate hash algorithms (such as SHA-1) must be available.
ENOPKG will be reported if the requisite algorithm is unavailable.
+config ASYMMETRIC_TPM_KEY_SUBTYPE
+ tristate "Asymmetric TPM backed private key subtype"
+ depends on TCG_TPM
+ depends on TRUSTED_KEYS
+ select CRYPTO_HMAC
+ select CRYPTO_SHA1
+ select CRYPTO_HASH_INFO
+ help
+ This option provides support for TPM backed private key type handling.
+ Operations such as sign, verify, encrypt, decrypt are performed by
+ the TPM after the private key is loaded.
+
config X509_CERTIFICATE_PARSER
tristate "X.509 certificate parser"
depends on ASYMMETRIC_PUBLIC_KEY_SUBTYPE
data and provides the ability to instantiate a crypto key from a
public key packet found inside the certificate.
+config PKCS8_PRIVATE_KEY_PARSER
+ tristate "PKCS#8 private key parser"
+ depends on ASYMMETRIC_PUBLIC_KEY_SUBTYPE
+ select ASN1
+ select OID_REGISTRY
+ help
+ This option provides support for parsing PKCS#8 format blobs for
+ private key data and provides the ability to instantiate a crypto key
+ from that data.
+
+config TPM_KEY_PARSER
+ tristate "TPM private key parser"
+ depends on ASYMMETRIC_TPM_KEY_SUBTYPE
+ select ASN1
+ help
+ This option provides support for parsing TPM format blobs for
+ private key data and provides the ability to instantiate a crypto key
+ from that data.
+
config PKCS7_MESSAGE_PARSER
tristate "PKCS#7 message parser"
depends on X509_CERTIFICATE_PARSER
signature.o
obj-$(CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE) += public_key.o
+obj-$(CONFIG_ASYMMETRIC_TPM_KEY_SUBTYPE) += asym_tpm.o
#
# X.509 Certificate handling
$(obj)/x509.asn1.o: $(obj)/x509.asn1.c $(obj)/x509.asn1.h
$(obj)/x509_akid.asn1.o: $(obj)/x509_akid.asn1.c $(obj)/x509_akid.asn1.h
+#
+# PKCS#8 private key handling
+#
+obj-$(CONFIG_PKCS8_PRIVATE_KEY_PARSER) += pkcs8_key_parser.o
+pkcs8_key_parser-y := \
+ pkcs8.asn1.o \
+ pkcs8_parser.o
+
+$(obj)/pkcs8_parser.o: $(obj)/pkcs8.asn1.h
+$(obj)/pkcs8-asn1.o: $(obj)/pkcs8.asn1.c $(obj)/pkcs8.asn1.h
+
+clean-files += pkcs8.asn1.c pkcs8.asn1.h
+
#
# PKCS#7 message handling
#
$(obj)/mscode_parser.o: $(obj)/mscode.asn1.h $(obj)/mscode.asn1.h
$(obj)/mscode.asn1.o: $(obj)/mscode.asn1.c $(obj)/mscode.asn1.h
+
+#
+# TPM private key parsing
+#
+obj-$(CONFIG_TPM_KEY_PARSER) += tpm_key_parser.o
+tpm_key_parser-y := \
+ tpm.asn1.o \
+ tpm_parser.o
+
+$(obj)/tpm_parser.o: $(obj)/tpm.asn1.h
+$(obj)/tpm.asn1.o: $(obj)/tpm.asn1.c $(obj)/tpm.asn1.h
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) "ASYM-TPM: "fmt
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/export.h>
+#include <linux/kernel.h>
+#include <linux/seq_file.h>
+#include <linux/scatterlist.h>
+#include <linux/tpm.h>
+#include <linux/tpm_command.h>
+#include <crypto/akcipher.h>
+#include <crypto/hash.h>
+#include <crypto/sha.h>
+#include <asm/unaligned.h>
+#include <keys/asymmetric-subtype.h>
+#include <keys/trusted.h>
+#include <crypto/asym_tpm_subtype.h>
+#include <crypto/public_key.h>
+
+#define TPM_ORD_FLUSHSPECIFIC 186
+#define TPM_ORD_LOADKEY2 65
+#define TPM_ORD_UNBIND 30
+#define TPM_ORD_SIGN 60
+#define TPM_LOADKEY2_SIZE 59
+#define TPM_FLUSHSPECIFIC_SIZE 18
+#define TPM_UNBIND_SIZE 63
+#define TPM_SIGN_SIZE 63
+
+#define TPM_RT_KEY 0x00000001
+
+/*
+ * Load a TPM key from the blob provided by userspace
+ */
+static int tpm_loadkey2(struct tpm_buf *tb,
+ uint32_t keyhandle, unsigned char *keyauth,
+ const unsigned char *keyblob, int keybloblen,
+ uint32_t *newhandle)
+{
+ unsigned char nonceodd[TPM_NONCE_SIZE];
+ unsigned char enonce[TPM_NONCE_SIZE];
+ unsigned char authdata[SHA1_DIGEST_SIZE];
+ uint32_t authhandle = 0;
+ unsigned char cont = 0;
+ uint32_t ordinal;
+ int ret;
+
+ ordinal = htonl(TPM_ORD_LOADKEY2);
+
+ /* session for loading the key */
+ ret = oiap(tb, &authhandle, enonce);
+ if (ret < 0) {
+ pr_info("oiap failed (%d)\n", ret);
+ return ret;
+ }
+
+ /* generate odd nonce */
+ ret = tpm_get_random(NULL, nonceodd, TPM_NONCE_SIZE);
+ if (ret < 0) {
+ pr_info("tpm_get_random failed (%d)\n", ret);
+ return ret;
+ }
+
+ /* calculate authorization HMAC value */
+ ret = TSS_authhmac(authdata, keyauth, SHA1_DIGEST_SIZE, enonce,
+ nonceodd, cont, sizeof(uint32_t), &ordinal,
+ keybloblen, keyblob, 0, 0);
+ if (ret < 0)
+ return ret;
+
+ /* build the request buffer */
+ INIT_BUF(tb);
+ store16(tb, TPM_TAG_RQU_AUTH1_COMMAND);
+ store32(tb, TPM_LOADKEY2_SIZE + keybloblen);
+ store32(tb, TPM_ORD_LOADKEY2);
+ store32(tb, keyhandle);
+ storebytes(tb, keyblob, keybloblen);
+ store32(tb, authhandle);
+ storebytes(tb, nonceodd, TPM_NONCE_SIZE);
+ store8(tb, cont);
+ storebytes(tb, authdata, SHA1_DIGEST_SIZE);
+
+ ret = trusted_tpm_send(tb->data, MAX_BUF_SIZE);
+ if (ret < 0) {
+ pr_info("authhmac failed (%d)\n", ret);
+ return ret;
+ }
+
+ ret = TSS_checkhmac1(tb->data, ordinal, nonceodd, keyauth,
+ SHA1_DIGEST_SIZE, 0, 0);
+ if (ret < 0) {
+ pr_info("TSS_checkhmac1 failed (%d)\n", ret);
+ return ret;
+ }
+
+ *newhandle = LOAD32(tb->data, TPM_DATA_OFFSET);
+ return 0;
+}
+
+/*
+ * Execute the FlushSpecific TPM command
+ */
+static int tpm_flushspecific(struct tpm_buf *tb, uint32_t handle)
+{
+ INIT_BUF(tb);
+ store16(tb, TPM_TAG_RQU_COMMAND);
+ store32(tb, TPM_FLUSHSPECIFIC_SIZE);
+ store32(tb, TPM_ORD_FLUSHSPECIFIC);
+ store32(tb, handle);
+ store32(tb, TPM_RT_KEY);
+
+ return trusted_tpm_send(tb->data, MAX_BUF_SIZE);
+}
+
+/*
+ * Decrypt a blob provided by userspace using a specific key handle.
+ * The handle is a well known handle or previously loaded by e.g. LoadKey2
+ */
+static int tpm_unbind(struct tpm_buf *tb,
+ uint32_t keyhandle, unsigned char *keyauth,
+ const unsigned char *blob, uint32_t bloblen,
+ void *out, uint32_t outlen)
+{
+ unsigned char nonceodd[TPM_NONCE_SIZE];
+ unsigned char enonce[TPM_NONCE_SIZE];
+ unsigned char authdata[SHA1_DIGEST_SIZE];
+ uint32_t authhandle = 0;
+ unsigned char cont = 0;
+ uint32_t ordinal;
+ uint32_t datalen;
+ int ret;
+
+ ordinal = htonl(TPM_ORD_UNBIND);
+ datalen = htonl(bloblen);
+
+ /* session for loading the key */
+ ret = oiap(tb, &authhandle, enonce);
+ if (ret < 0) {
+ pr_info("oiap failed (%d)\n", ret);
+ return ret;
+ }
+
+ /* generate odd nonce */
+ ret = tpm_get_random(NULL, nonceodd, TPM_NONCE_SIZE);
+ if (ret < 0) {
+ pr_info("tpm_get_random failed (%d)\n", ret);
+ return ret;
+ }
+
+ /* calculate authorization HMAC value */
+ ret = TSS_authhmac(authdata, keyauth, SHA1_DIGEST_SIZE, enonce,
+ nonceodd, cont, sizeof(uint32_t), &ordinal,
+ sizeof(uint32_t), &datalen,
+ bloblen, blob, 0, 0);
+ if (ret < 0)
+ return ret;
+
+ /* build the request buffer */
+ INIT_BUF(tb);
+ store16(tb, TPM_TAG_RQU_AUTH1_COMMAND);
+ store32(tb, TPM_UNBIND_SIZE + bloblen);
+ store32(tb, TPM_ORD_UNBIND);
+ store32(tb, keyhandle);
+ store32(tb, bloblen);
+ storebytes(tb, blob, bloblen);
+ store32(tb, authhandle);
+ storebytes(tb, nonceodd, TPM_NONCE_SIZE);
+ store8(tb, cont);
+ storebytes(tb, authdata, SHA1_DIGEST_SIZE);
+
+ ret = trusted_tpm_send(tb->data, MAX_BUF_SIZE);
+ if (ret < 0) {
+ pr_info("authhmac failed (%d)\n", ret);
+ return ret;
+ }
+
+ datalen = LOAD32(tb->data, TPM_DATA_OFFSET);
+
+ ret = TSS_checkhmac1(tb->data, ordinal, nonceodd,
+ keyauth, SHA1_DIGEST_SIZE,
+ sizeof(uint32_t), TPM_DATA_OFFSET,
+ datalen, TPM_DATA_OFFSET + sizeof(uint32_t),
+ 0, 0);
+ if (ret < 0) {
+ pr_info("TSS_checkhmac1 failed (%d)\n", ret);
+ return ret;
+ }
+
+ memcpy(out, tb->data + TPM_DATA_OFFSET + sizeof(uint32_t),
+ min(outlen, datalen));
+
+ return datalen;
+}
+
+/*
+ * Sign a blob provided by userspace (that has had the hash function applied)
+ * using a specific key handle. The handle is assumed to have been previously
+ * loaded by e.g. LoadKey2.
+ *
+ * Note that the key signature scheme of the used key should be set to
+ * TPM_SS_RSASSAPKCS1v15_DER. This allows the hashed input to be of any size
+ * up to key_length_in_bytes - 11 and not be limited to size 20 like the
+ * TPM_SS_RSASSAPKCS1v15_SHA1 signature scheme.
+ */
+static int tpm_sign(struct tpm_buf *tb,
+ uint32_t keyhandle, unsigned char *keyauth,
+ const unsigned char *blob, uint32_t bloblen,
+ void *out, uint32_t outlen)
+{
+ unsigned char nonceodd[TPM_NONCE_SIZE];
+ unsigned char enonce[TPM_NONCE_SIZE];
+ unsigned char authdata[SHA1_DIGEST_SIZE];
+ uint32_t authhandle = 0;
+ unsigned char cont = 0;
+ uint32_t ordinal;
+ uint32_t datalen;
+ int ret;
+
+ ordinal = htonl(TPM_ORD_SIGN);
+ datalen = htonl(bloblen);
+
+ /* session for loading the key */
+ ret = oiap(tb, &authhandle, enonce);
+ if (ret < 0) {
+ pr_info("oiap failed (%d)\n", ret);
+ return ret;
+ }
+
+ /* generate odd nonce */
+ ret = tpm_get_random(NULL, nonceodd, TPM_NONCE_SIZE);
+ if (ret < 0) {
+ pr_info("tpm_get_random failed (%d)\n", ret);
+ return ret;
+ }
+
+ /* calculate authorization HMAC value */
+ ret = TSS_authhmac(authdata, keyauth, SHA1_DIGEST_SIZE, enonce,
+ nonceodd, cont, sizeof(uint32_t), &ordinal,
+ sizeof(uint32_t), &datalen,
+ bloblen, blob, 0, 0);
+ if (ret < 0)
+ return ret;
+
+ /* build the request buffer */
+ INIT_BUF(tb);
+ store16(tb, TPM_TAG_RQU_AUTH1_COMMAND);
+ store32(tb, TPM_SIGN_SIZE + bloblen);
+ store32(tb, TPM_ORD_SIGN);
+ store32(tb, keyhandle);
+ store32(tb, bloblen);
+ storebytes(tb, blob, bloblen);
+ store32(tb, authhandle);
+ storebytes(tb, nonceodd, TPM_NONCE_SIZE);
+ store8(tb, cont);
+ storebytes(tb, authdata, SHA1_DIGEST_SIZE);
+
+ ret = trusted_tpm_send(tb->data, MAX_BUF_SIZE);
+ if (ret < 0) {
+ pr_info("authhmac failed (%d)\n", ret);
+ return ret;
+ }
+
+ datalen = LOAD32(tb->data, TPM_DATA_OFFSET);
+
+ ret = TSS_checkhmac1(tb->data, ordinal, nonceodd,
+ keyauth, SHA1_DIGEST_SIZE,
+ sizeof(uint32_t), TPM_DATA_OFFSET,
+ datalen, TPM_DATA_OFFSET + sizeof(uint32_t),
+ 0, 0);
+ if (ret < 0) {
+ pr_info("TSS_checkhmac1 failed (%d)\n", ret);
+ return ret;
+ }
+
+ memcpy(out, tb->data + TPM_DATA_OFFSET + sizeof(uint32_t),
+ min(datalen, outlen));
+
+ return datalen;
+}
+/*
+ * Maximum buffer size for the BER/DER encoded public key. The public key
+ * is of the form SEQUENCE { INTEGER n, INTEGER e } where n is a maximum 2048
+ * bit key and e is usually 65537
+ * The encoding overhead is:
+ * - max 4 bytes for SEQUENCE
+ * - max 4 bytes for INTEGER n type/length
+ * - 257 bytes of n
+ * - max 2 bytes for INTEGER e type/length
+ * - 3 bytes of e
+ */
+#define PUB_KEY_BUF_SIZE (4 + 4 + 257 + 2 + 3)
+
+/*
+ * Provide a part of a description of the key for /proc/keys.
+ */
+static void asym_tpm_describe(const struct key *asymmetric_key,
+ struct seq_file *m)
+{
+ struct tpm_key *tk = asymmetric_key->payload.data[asym_crypto];
+
+ if (!tk)
+ return;
+
+ seq_printf(m, "TPM1.2/Blob");
+}
+
+static void asym_tpm_destroy(void *payload0, void *payload3)
+{
+ struct tpm_key *tk = payload0;
+
+ if (!tk)
+ return;
+
+ kfree(tk->blob);
+ tk->blob_len = 0;
+
+ kfree(tk);
+}
+
+/* How many bytes will it take to encode the length */
+static inline uint32_t definite_length(uint32_t len)
+{
+ if (len <= 127)
+ return 1;
+ if (len <= 255)
+ return 2;
+ return 3;
+}
+
+static inline uint8_t *encode_tag_length(uint8_t *buf, uint8_t tag,
+ uint32_t len)
+{
+ *buf++ = tag;
+
+ if (len <= 127) {
+ buf[0] = len;
+ return buf + 1;
+ }
+
+ if (len <= 255) {
+ buf[0] = 0x81;
+ buf[1] = len;
+ return buf + 2;
+ }
+
+ buf[0] = 0x82;
+ put_unaligned_be16(len, buf + 1);
+ return buf + 3;
+}
+
+static uint32_t derive_pub_key(const void *pub_key, uint32_t len, uint8_t *buf)
+{
+ uint8_t *cur = buf;
+ uint32_t n_len = definite_length(len) + 1 + len + 1;
+ uint32_t e_len = definite_length(3) + 1 + 3;
+ uint8_t e[3] = { 0x01, 0x00, 0x01 };
+
+ /* SEQUENCE */
+ cur = encode_tag_length(cur, 0x30, n_len + e_len);
+ /* INTEGER n */
+ cur = encode_tag_length(cur, 0x02, len + 1);
+ cur[0] = 0x00;
+ memcpy(cur + 1, pub_key, len);
+ cur += len + 1;
+ cur = encode_tag_length(cur, 0x02, sizeof(e));
+ memcpy(cur, e, sizeof(e));
+ cur += sizeof(e);
+
+ return cur - buf;
+}
+
+/*
+ * Determine the crypto algorithm name.
+ */
+static int determine_akcipher(const char *encoding, const char *hash_algo,
+ char alg_name[CRYPTO_MAX_ALG_NAME])
+{
+ if (strcmp(encoding, "pkcs1") == 0) {
+ if (!hash_algo) {
+ strcpy(alg_name, "pkcs1pad(rsa)");
+ return 0;
+ }
+
+ if (snprintf(alg_name, CRYPTO_MAX_ALG_NAME, "pkcs1pad(rsa,%s)",
+ hash_algo) >= CRYPTO_MAX_ALG_NAME)
+ return -EINVAL;
+
+ return 0;
+ }
+
+ if (strcmp(encoding, "raw") == 0) {
+ strcpy(alg_name, "rsa");
+ return 0;
+ }
+
+ return -ENOPKG;
+}
+
+/*
+ * Query information about a key.
+ */
+static int tpm_key_query(const struct kernel_pkey_params *params,
+ struct kernel_pkey_query *info)
+{
+ struct tpm_key *tk = params->key->payload.data[asym_crypto];
+ int ret;
+ char alg_name[CRYPTO_MAX_ALG_NAME];
+ struct crypto_akcipher *tfm;
+ uint8_t der_pub_key[PUB_KEY_BUF_SIZE];
+ uint32_t der_pub_key_len;
+ int len;
+
+ /* TPM only works on private keys, public keys still done in software */
+ ret = determine_akcipher(params->encoding, params->hash_algo, alg_name);
+ if (ret < 0)
+ return ret;
+
+ tfm = crypto_alloc_akcipher(alg_name, 0, 0);
+ if (IS_ERR(tfm))
+ return PTR_ERR(tfm);
+
+ der_pub_key_len = derive_pub_key(tk->pub_key, tk->pub_key_len,
+ der_pub_key);
+
+ ret = crypto_akcipher_set_pub_key(tfm, der_pub_key, der_pub_key_len);
+ if (ret < 0)
+ goto error_free_tfm;
+
+ len = crypto_akcipher_maxsize(tfm);
+
+ info->key_size = tk->key_len;
+ info->max_data_size = tk->key_len / 8;
+ info->max_sig_size = len;
+ info->max_enc_size = len;
+ info->max_dec_size = tk->key_len / 8;
+
+ info->supported_ops = KEYCTL_SUPPORTS_ENCRYPT |
+ KEYCTL_SUPPORTS_DECRYPT |
+ KEYCTL_SUPPORTS_VERIFY |
+ KEYCTL_SUPPORTS_SIGN;
+
+ ret = 0;
+error_free_tfm:
+ crypto_free_akcipher(tfm);
+ pr_devel("<==%s() = %d\n", __func__, ret);
+ return ret;
+}
+
+/*
+ * Encryption operation is performed with the public key. Hence it is done
+ * in software
+ */
+static int tpm_key_encrypt(struct tpm_key *tk,
+ struct kernel_pkey_params *params,
+ const void *in, void *out)
+{
+ char alg_name[CRYPTO_MAX_ALG_NAME];
+ struct crypto_akcipher *tfm;
+ struct akcipher_request *req;
+ struct crypto_wait cwait;
+ struct scatterlist in_sg, out_sg;
+ uint8_t der_pub_key[PUB_KEY_BUF_SIZE];
+ uint32_t der_pub_key_len;
+ int ret;
+
+ pr_devel("==>%s()\n", __func__);
+
+ ret = determine_akcipher(params->encoding, params->hash_algo, alg_name);
+ if (ret < 0)
+ return ret;
+
+ tfm = crypto_alloc_akcipher(alg_name, 0, 0);
+ if (IS_ERR(tfm))
+ return PTR_ERR(tfm);
+
+ der_pub_key_len = derive_pub_key(tk->pub_key, tk->pub_key_len,
+ der_pub_key);
+
+ ret = crypto_akcipher_set_pub_key(tfm, der_pub_key, der_pub_key_len);
+ if (ret < 0)
+ goto error_free_tfm;
+
+ req = akcipher_request_alloc(tfm, GFP_KERNEL);
+ if (!req)
+ goto error_free_tfm;
+
+ sg_init_one(&in_sg, in, params->in_len);
+ sg_init_one(&out_sg, out, params->out_len);
+ akcipher_request_set_crypt(req, &in_sg, &out_sg, params->in_len,
+ params->out_len);
+ crypto_init_wait(&cwait);
+ akcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG |
+ CRYPTO_TFM_REQ_MAY_SLEEP,
+ crypto_req_done, &cwait);
+
+ ret = crypto_akcipher_encrypt(req);
+ ret = crypto_wait_req(ret, &cwait);
+
+ if (ret == 0)
+ ret = req->dst_len;
+
+ akcipher_request_free(req);
+error_free_tfm:
+ crypto_free_akcipher(tfm);
+ pr_devel("<==%s() = %d\n", __func__, ret);
+ return ret;
+}
+
+/*
+ * Decryption operation is performed with the private key in the TPM.
+ */
+static int tpm_key_decrypt(struct tpm_key *tk,
+ struct kernel_pkey_params *params,
+ const void *in, void *out)
+{
+ struct tpm_buf *tb;
+ uint32_t keyhandle;
+ uint8_t srkauth[SHA1_DIGEST_SIZE];
+ uint8_t keyauth[SHA1_DIGEST_SIZE];
+ int r;
+
+ pr_devel("==>%s()\n", __func__);
+
+ if (params->hash_algo)
+ return -ENOPKG;
+
+ if (strcmp(params->encoding, "pkcs1"))
+ return -ENOPKG;
+
+ tb = kzalloc(sizeof(*tb), GFP_KERNEL);
+ if (!tb)
+ return -ENOMEM;
+
+ /* TODO: Handle a non-all zero SRK authorization */
+ memset(srkauth, 0, sizeof(srkauth));
+
+ r = tpm_loadkey2(tb, SRKHANDLE, srkauth,
+ tk->blob, tk->blob_len, &keyhandle);
+ if (r < 0) {
+ pr_devel("loadkey2 failed (%d)\n", r);
+ goto error;
+ }
+
+ /* TODO: Handle a non-all zero key authorization */
+ memset(keyauth, 0, sizeof(keyauth));
+
+ r = tpm_unbind(tb, keyhandle, keyauth,
+ in, params->in_len, out, params->out_len);
+ if (r < 0)
+ pr_devel("tpm_unbind failed (%d)\n", r);
+
+ if (tpm_flushspecific(tb, keyhandle) < 0)
+ pr_devel("flushspecific failed (%d)\n", r);
+
+error:
+ kzfree(tb);
+ pr_devel("<==%s() = %d\n", __func__, r);
+ return r;
+}
+
+/*
+ * Hash algorithm OIDs plus ASN.1 DER wrappings [RFC4880 sec 5.2.2].
+ */
+static const u8 digest_info_md5[] = {
+ 0x30, 0x20, 0x30, 0x0c, 0x06, 0x08,
+ 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x02, 0x05, /* OID */
+ 0x05, 0x00, 0x04, 0x10
+};
+
+static const u8 digest_info_sha1[] = {
+ 0x30, 0x21, 0x30, 0x09, 0x06, 0x05,
+ 0x2b, 0x0e, 0x03, 0x02, 0x1a,
+ 0x05, 0x00, 0x04, 0x14
+};
+
+static const u8 digest_info_rmd160[] = {
+ 0x30, 0x21, 0x30, 0x09, 0x06, 0x05,
+ 0x2b, 0x24, 0x03, 0x02, 0x01,
+ 0x05, 0x00, 0x04, 0x14
+};
+
+static const u8 digest_info_sha224[] = {
+ 0x30, 0x2d, 0x30, 0x0d, 0x06, 0x09,
+ 0x60, 0x86, 0x48, 0x01, 0x65, 0x03, 0x04, 0x02, 0x04,
+ 0x05, 0x00, 0x04, 0x1c
+};
+
+static const u8 digest_info_sha256[] = {
+ 0x30, 0x31, 0x30, 0x0d, 0x06, 0x09,
+ 0x60, 0x86, 0x48, 0x01, 0x65, 0x03, 0x04, 0x02, 0x01,
+ 0x05, 0x00, 0x04, 0x20
+};
+
+static const u8 digest_info_sha384[] = {
+ 0x30, 0x41, 0x30, 0x0d, 0x06, 0x09,
+ 0x60, 0x86, 0x48, 0x01, 0x65, 0x03, 0x04, 0x02, 0x02,
+ 0x05, 0x00, 0x04, 0x30
+};
+
+static const u8 digest_info_sha512[] = {
+ 0x30, 0x51, 0x30, 0x0d, 0x06, 0x09,
+ 0x60, 0x86, 0x48, 0x01, 0x65, 0x03, 0x04, 0x02, 0x03,
+ 0x05, 0x00, 0x04, 0x40
+};
+
+static const struct asn1_template {
+ const char *name;
+ const u8 *data;
+ size_t size;
+} asn1_templates[] = {
+#define _(X) { #X, digest_info_##X, sizeof(digest_info_##X) }
+ _(md5),
+ _(sha1),
+ _(rmd160),
+ _(sha256),
+ _(sha384),
+ _(sha512),
+ _(sha224),
+ { NULL }
+#undef _
+};
+
+static const struct asn1_template *lookup_asn1(const char *name)
+{
+ const struct asn1_template *p;
+
+ for (p = asn1_templates; p->name; p++)
+ if (strcmp(name, p->name) == 0)
+ return p;
+ return NULL;
+}
+
+/*
+ * Sign operation is performed with the private key in the TPM.
+ */
+static int tpm_key_sign(struct tpm_key *tk,
+ struct kernel_pkey_params *params,
+ const void *in, void *out)
+{
+ struct tpm_buf *tb;
+ uint32_t keyhandle;
+ uint8_t srkauth[SHA1_DIGEST_SIZE];
+ uint8_t keyauth[SHA1_DIGEST_SIZE];
+ void *asn1_wrapped = NULL;
+ uint32_t in_len = params->in_len;
+ int r;
+
+ pr_devel("==>%s()\n", __func__);
+
+ if (strcmp(params->encoding, "pkcs1"))
+ return -ENOPKG;
+
+ if (params->hash_algo) {
+ const struct asn1_template *asn1 =
+ lookup_asn1(params->hash_algo);
+
+ if (!asn1)
+ return -ENOPKG;
+
+ /* request enough space for the ASN.1 template + input hash */
+ asn1_wrapped = kzalloc(in_len + asn1->size, GFP_KERNEL);
+ if (!asn1_wrapped)
+ return -ENOMEM;
+
+ /* Copy ASN.1 template, then the input */
+ memcpy(asn1_wrapped, asn1->data, asn1->size);
+ memcpy(asn1_wrapped + asn1->size, in, in_len);
+
+ in = asn1_wrapped;
+ in_len += asn1->size;
+ }
+
+ if (in_len > tk->key_len / 8 - 11) {
+ r = -EOVERFLOW;
+ goto error_free_asn1_wrapped;
+ }
+
+ r = -ENOMEM;
+ tb = kzalloc(sizeof(*tb), GFP_KERNEL);
+ if (!tb)
+ goto error_free_asn1_wrapped;
+
+ /* TODO: Handle a non-all zero SRK authorization */
+ memset(srkauth, 0, sizeof(srkauth));
+
+ r = tpm_loadkey2(tb, SRKHANDLE, srkauth,
+ tk->blob, tk->blob_len, &keyhandle);
+ if (r < 0) {
+ pr_devel("loadkey2 failed (%d)\n", r);
+ goto error_free_tb;
+ }
+
+ /* TODO: Handle a non-all zero key authorization */
+ memset(keyauth, 0, sizeof(keyauth));
+
+ r = tpm_sign(tb, keyhandle, keyauth, in, in_len, out, params->out_len);
+ if (r < 0)
+ pr_devel("tpm_sign failed (%d)\n", r);
+
+ if (tpm_flushspecific(tb, keyhandle) < 0)
+ pr_devel("flushspecific failed (%d)\n", r);
+
+error_free_tb:
+ kzfree(tb);
+error_free_asn1_wrapped:
+ kfree(asn1_wrapped);
+ pr_devel("<==%s() = %d\n", __func__, r);
+ return r;
+}
+
+/*
+ * Do encryption, decryption and signing ops.
+ */
+static int tpm_key_eds_op(struct kernel_pkey_params *params,
+ const void *in, void *out)
+{
+ struct tpm_key *tk = params->key->payload.data[asym_crypto];
+ int ret = -EOPNOTSUPP;
+
+ /* Perform the encryption calculation. */
+ switch (params->op) {
+ case kernel_pkey_encrypt:
+ ret = tpm_key_encrypt(tk, params, in, out);
+ break;
+ case kernel_pkey_decrypt:
+ ret = tpm_key_decrypt(tk, params, in, out);
+ break;
+ case kernel_pkey_sign:
+ ret = tpm_key_sign(tk, params, in, out);
+ break;
+ default:
+ BUG();
+ }
+
+ return ret;
+}
+
+/*
+ * Verify a signature using a public key.
+ */
+static int tpm_key_verify_signature(const struct key *key,
+ const struct public_key_signature *sig)
+{
+ const struct tpm_key *tk = key->payload.data[asym_crypto];
+ struct crypto_wait cwait;
+ struct crypto_akcipher *tfm;
+ struct akcipher_request *req;
+ struct scatterlist sig_sg, digest_sg;
+ char alg_name[CRYPTO_MAX_ALG_NAME];
+ uint8_t der_pub_key[PUB_KEY_BUF_SIZE];
+ uint32_t der_pub_key_len;
+ void *output;
+ unsigned int outlen;
+ int ret;
+
+ pr_devel("==>%s()\n", __func__);
+
+ BUG_ON(!tk);
+ BUG_ON(!sig);
+ BUG_ON(!sig->s);
+
+ if (!sig->digest)
+ return -ENOPKG;
+
+ ret = determine_akcipher(sig->encoding, sig->hash_algo, alg_name);
+ if (ret < 0)
+ return ret;
+
+ tfm = crypto_alloc_akcipher(alg_name, 0, 0);
+ if (IS_ERR(tfm))
+ return PTR_ERR(tfm);
+
+ der_pub_key_len = derive_pub_key(tk->pub_key, tk->pub_key_len,
+ der_pub_key);
+
+ ret = crypto_akcipher_set_pub_key(tfm, der_pub_key, der_pub_key_len);
+ if (ret < 0)
+ goto error_free_tfm;
+
+ ret = -ENOMEM;
+ req = akcipher_request_alloc(tfm, GFP_KERNEL);
+ if (!req)
+ goto error_free_tfm;
+
+ ret = -ENOMEM;
+ outlen = crypto_akcipher_maxsize(tfm);
+ output = kmalloc(outlen, GFP_KERNEL);
+ if (!output)
+ goto error_free_req;
+
+ sg_init_one(&sig_sg, sig->s, sig->s_size);
+ sg_init_one(&digest_sg, output, outlen);
+ akcipher_request_set_crypt(req, &sig_sg, &digest_sg, sig->s_size,
+ outlen);
+ crypto_init_wait(&cwait);
+ akcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG |
+ CRYPTO_TFM_REQ_MAY_SLEEP,
+ crypto_req_done, &cwait);
+
+ /* Perform the verification calculation. This doesn't actually do the
+ * verification, but rather calculates the hash expected by the
+ * signature and returns that to us.
+ */
+ ret = crypto_wait_req(crypto_akcipher_verify(req), &cwait);
+ if (ret)
+ goto out_free_output;
+
+ /* Do the actual verification step. */
+ if (req->dst_len != sig->digest_size ||
+ memcmp(sig->digest, output, sig->digest_size) != 0)
+ ret = -EKEYREJECTED;
+
+out_free_output:
+ kfree(output);
+error_free_req:
+ akcipher_request_free(req);
+error_free_tfm:
+ crypto_free_akcipher(tfm);
+ pr_devel("<==%s() = %d\n", __func__, ret);
+ if (WARN_ON_ONCE(ret > 0))
+ ret = -EINVAL;
+ return ret;
+}
+
+/*
+ * Parse enough information out of TPM_KEY structure:
+ * TPM_STRUCT_VER -> 4 bytes
+ * TPM_KEY_USAGE -> 2 bytes
+ * TPM_KEY_FLAGS -> 4 bytes
+ * TPM_AUTH_DATA_USAGE -> 1 byte
+ * TPM_KEY_PARMS -> variable
+ * UINT32 PCRInfoSize -> 4 bytes
+ * BYTE* -> PCRInfoSize bytes
+ * TPM_STORE_PUBKEY
+ * UINT32 encDataSize;
+ * BYTE* -> encDataSize;
+ *
+ * TPM_KEY_PARMS:
+ * TPM_ALGORITHM_ID -> 4 bytes
+ * TPM_ENC_SCHEME -> 2 bytes
+ * TPM_SIG_SCHEME -> 2 bytes
+ * UINT32 parmSize -> 4 bytes
+ * BYTE* -> variable
+ */
+static int extract_key_parameters(struct tpm_key *tk)
+{
+ const void *cur = tk->blob;
+ uint32_t len = tk->blob_len;
+ const void *pub_key;
+ uint32_t sz;
+ uint32_t key_len;
+
+ if (len < 11)
+ return -EBADMSG;
+
+ /* Ensure this is a legacy key */
+ if (get_unaligned_be16(cur + 4) != 0x0015)
+ return -EBADMSG;
+
+ /* Skip to TPM_KEY_PARMS */
+ cur += 11;
+ len -= 11;
+
+ if (len < 12)
+ return -EBADMSG;
+
+ /* Make sure this is an RSA key */
+ if (get_unaligned_be32(cur) != 0x00000001)
+ return -EBADMSG;
+
+ /* Make sure this is TPM_ES_RSAESPKCSv15 encoding scheme */
+ if (get_unaligned_be16(cur + 4) != 0x0002)
+ return -EBADMSG;
+
+ /* Make sure this is TPM_SS_RSASSAPKCS1v15_DER signature scheme */
+ if (get_unaligned_be16(cur + 6) != 0x0003)
+ return -EBADMSG;
+
+ sz = get_unaligned_be32(cur + 8);
+ if (len < sz + 12)
+ return -EBADMSG;
+
+ /* Move to TPM_RSA_KEY_PARMS */
+ len -= 12;
+ cur += 12;
+
+ /* Grab the RSA key length */
+ key_len = get_unaligned_be32(cur);
+
+ switch (key_len) {
+ case 512:
+ case 1024:
+ case 1536:
+ case 2048:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* Move just past TPM_KEY_PARMS */
+ cur += sz;
+ len -= sz;
+
+ if (len < 4)
+ return -EBADMSG;
+
+ sz = get_unaligned_be32(cur);
+ if (len < 4 + sz)
+ return -EBADMSG;
+
+ /* Move to TPM_STORE_PUBKEY */
+ cur += 4 + sz;
+ len -= 4 + sz;
+
+ /* Grab the size of the public key, it should jive with the key size */
+ sz = get_unaligned_be32(cur);
+ if (sz > 256)
+ return -EINVAL;
+
+ pub_key = cur + 4;
+
+ tk->key_len = key_len;
+ tk->pub_key = pub_key;
+ tk->pub_key_len = sz;
+
+ return 0;
+}
+
+/* Given the blob, parse it and load it into the TPM */
+struct tpm_key *tpm_key_create(const void *blob, uint32_t blob_len)
+{
+ int r;
+ struct tpm_key *tk;
+
+ r = tpm_is_tpm2(NULL);
+ if (r < 0)
+ goto error;
+
+ /* We don't support TPM2 yet */
+ if (r > 0) {
+ r = -ENODEV;
+ goto error;
+ }
+
+ r = -ENOMEM;
+ tk = kzalloc(sizeof(struct tpm_key), GFP_KERNEL);
+ if (!tk)
+ goto error;
+
+ tk->blob = kmemdup(blob, blob_len, GFP_KERNEL);
+ if (!tk->blob)
+ goto error_memdup;
+
+ tk->blob_len = blob_len;
+
+ r = extract_key_parameters(tk);
+ if (r < 0)
+ goto error_extract;
+
+ return tk;
+
+error_extract:
+ kfree(tk->blob);
+ tk->blob_len = 0;
+error_memdup:
+ kfree(tk);
+error:
+ return ERR_PTR(r);
+}
+EXPORT_SYMBOL_GPL(tpm_key_create);
+
+/*
+ * TPM-based asymmetric key subtype
+ */
+struct asymmetric_key_subtype asym_tpm_subtype = {
+ .owner = THIS_MODULE,
+ .name = "asym_tpm",
+ .name_len = sizeof("asym_tpm") - 1,
+ .describe = asym_tpm_describe,
+ .destroy = asym_tpm_destroy,
+ .query = tpm_key_query,
+ .eds_op = tpm_key_eds_op,
+ .verify_signature = tpm_key_verify_signature,
+};
+EXPORT_SYMBOL_GPL(asym_tpm_subtype);
+
+MODULE_DESCRIPTION("TPM based asymmetric key subtype");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_LICENSE("GPL v2");
extern int __asymmetric_key_hex_to_key_id(const char *id,
struct asymmetric_key_id *match_id,
size_t hexlen);
+
+extern int asymmetric_key_eds_op(struct kernel_pkey_params *params,
+ const void *in, void *out);
#include <linux/slab.h>
#include <linux/ctype.h>
#include <keys/system_keyring.h>
+#include <keys/user-type.h>
#include "asymmetric_keys.h"
MODULE_LICENSE("GPL");
return ret;
}
+int asymmetric_key_eds_op(struct kernel_pkey_params *params,
+ const void *in, void *out)
+{
+ const struct asymmetric_key_subtype *subtype;
+ struct key *key = params->key;
+ int ret;
+
+ pr_devel("==>%s()\n", __func__);
+
+ if (key->type != &key_type_asymmetric)
+ return -EINVAL;
+ subtype = asymmetric_key_subtype(key);
+ if (!subtype ||
+ !key->payload.data[0])
+ return -EINVAL;
+ if (!subtype->eds_op)
+ return -ENOTSUPP;
+
+ ret = subtype->eds_op(params, in, out);
+
+ pr_devel("<==%s() = %d\n", __func__, ret);
+ return ret;
+}
+
+static int asymmetric_key_verify_signature(struct kernel_pkey_params *params,
+ const void *in, const void *in2)
+{
+ struct public_key_signature sig = {
+ .s_size = params->in2_len,
+ .digest_size = params->in_len,
+ .encoding = params->encoding,
+ .hash_algo = params->hash_algo,
+ .digest = (void *)in,
+ .s = (void *)in2,
+ };
+
+ return verify_signature(params->key, &sig);
+}
+
struct key_type key_type_asymmetric = {
.name = "asymmetric",
.preparse = asymmetric_key_preparse,
.destroy = asymmetric_key_destroy,
.describe = asymmetric_key_describe,
.lookup_restriction = asymmetric_lookup_restriction,
+ .asym_query = query_asymmetric_key,
+ .asym_eds_op = asymmetric_key_eds_op,
+ .asym_verify_signature = asymmetric_key_verify_signature,
};
EXPORT_SYMBOL_GPL(key_type_asymmetric);
switch (ctx->last_oid) {
case OID_rsaEncryption:
ctx->sinfo->sig->pkey_algo = "rsa";
+ ctx->sinfo->sig->encoding = "pkcs1";
break;
default:
printk("Unsupported pkey algo: %u\n", ctx->last_oid);
--- /dev/null
+--
+-- This is the unencrypted variant
+--
+PrivateKeyInfo ::= SEQUENCE {
+ version Version,
+ privateKeyAlgorithm PrivateKeyAlgorithmIdentifier,
+ privateKey PrivateKey,
+ attributes [0] IMPLICIT Attributes OPTIONAL
+}
+
+Version ::= INTEGER ({ pkcs8_note_version })
+
+PrivateKeyAlgorithmIdentifier ::= AlgorithmIdentifier ({ pkcs8_note_algo })
+
+PrivateKey ::= OCTET STRING ({ pkcs8_note_key })
+
+Attributes ::= SET OF Attribute
+
+Attribute ::= ANY
+
+AlgorithmIdentifier ::= SEQUENCE {
+ algorithm OBJECT IDENTIFIER ({ pkcs8_note_OID }),
+ parameters ANY OPTIONAL
+}
--- /dev/null
+/* PKCS#8 Private Key parser [RFC 5208].
+ *
+ * Copyright (C) 2016 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) "PKCS8: "fmt
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/err.h>
+#include <linux/oid_registry.h>
+#include <keys/asymmetric-subtype.h>
+#include <keys/asymmetric-parser.h>
+#include <crypto/public_key.h>
+#include "pkcs8.asn1.h"
+
+struct pkcs8_parse_context {
+ struct public_key *pub;
+ unsigned long data; /* Start of data */
+ enum OID last_oid; /* Last OID encountered */
+ enum OID algo_oid; /* Algorithm OID */
+ u32 key_size;
+ const void *key;
+};
+
+/*
+ * Note an OID when we find one for later processing when we know how to
+ * interpret it.
+ */
+int pkcs8_note_OID(void *context, size_t hdrlen,
+ unsigned char tag,
+ const void *value, size_t vlen)
+{
+ struct pkcs8_parse_context *ctx = context;
+
+ ctx->last_oid = look_up_OID(value, vlen);
+ if (ctx->last_oid == OID__NR) {
+ char buffer[50];
+
+ sprint_oid(value, vlen, buffer, sizeof(buffer));
+ pr_info("Unknown OID: [%lu] %s\n",
+ (unsigned long)value - ctx->data, buffer);
+ }
+ return 0;
+}
+
+/*
+ * Note the version number of the ASN.1 blob.
+ */
+int pkcs8_note_version(void *context, size_t hdrlen,
+ unsigned char tag,
+ const void *value, size_t vlen)
+{
+ if (vlen != 1 || ((const u8 *)value)[0] != 0) {
+ pr_warn("Unsupported PKCS#8 version\n");
+ return -EBADMSG;
+ }
+ return 0;
+}
+
+/*
+ * Note the public algorithm.
+ */
+int pkcs8_note_algo(void *context, size_t hdrlen,
+ unsigned char tag,
+ const void *value, size_t vlen)
+{
+ struct pkcs8_parse_context *ctx = context;
+
+ if (ctx->last_oid != OID_rsaEncryption)
+ return -ENOPKG;
+
+ ctx->pub->pkey_algo = "rsa";
+ return 0;
+}
+
+/*
+ * Note the key data of the ASN.1 blob.
+ */
+int pkcs8_note_key(void *context, size_t hdrlen,
+ unsigned char tag,
+ const void *value, size_t vlen)
+{
+ struct pkcs8_parse_context *ctx = context;
+
+ ctx->key = value;
+ ctx->key_size = vlen;
+ return 0;
+}
+
+/*
+ * Parse a PKCS#8 private key blob.
+ */
+static struct public_key *pkcs8_parse(const void *data, size_t datalen)
+{
+ struct pkcs8_parse_context ctx;
+ struct public_key *pub;
+ long ret;
+
+ memset(&ctx, 0, sizeof(ctx));
+
+ ret = -ENOMEM;
+ ctx.pub = kzalloc(sizeof(struct public_key), GFP_KERNEL);
+ if (!ctx.pub)
+ goto error;
+
+ ctx.data = (unsigned long)data;
+
+ /* Attempt to decode the private key */
+ ret = asn1_ber_decoder(&pkcs8_decoder, &ctx, data, datalen);
+ if (ret < 0)
+ goto error_decode;
+
+ ret = -ENOMEM;
+ pub = ctx.pub;
+ pub->key = kmemdup(ctx.key, ctx.key_size, GFP_KERNEL);
+ if (!pub->key)
+ goto error_decode;
+
+ pub->keylen = ctx.key_size;
+ pub->key_is_private = true;
+ return pub;
+
+error_decode:
+ kfree(ctx.pub);
+error:
+ return ERR_PTR(ret);
+}
+
+/*
+ * Attempt to parse a data blob for a key as a PKCS#8 private key.
+ */
+static int pkcs8_key_preparse(struct key_preparsed_payload *prep)
+{
+ struct public_key *pub;
+
+ pub = pkcs8_parse(prep->data, prep->datalen);
+ if (IS_ERR(pub))
+ return PTR_ERR(pub);
+
+ pr_devel("Cert Key Algo: %s\n", pub->pkey_algo);
+ pub->id_type = "PKCS8";
+
+ /* We're pinning the module by being linked against it */
+ __module_get(public_key_subtype.owner);
+ prep->payload.data[asym_subtype] = &public_key_subtype;
+ prep->payload.data[asym_key_ids] = NULL;
+ prep->payload.data[asym_crypto] = pub;
+ prep->payload.data[asym_auth] = NULL;
+ prep->quotalen = 100;
+ return 0;
+}
+
+static struct asymmetric_key_parser pkcs8_key_parser = {
+ .owner = THIS_MODULE,
+ .name = "pkcs8",
+ .parse = pkcs8_key_preparse,
+};
+
+/*
+ * Module stuff
+ */
+static int __init pkcs8_key_init(void)
+{
+ return register_asymmetric_key_parser(&pkcs8_key_parser);
+}
+
+static void __exit pkcs8_key_exit(void)
+{
+ unregister_asymmetric_key_parser(&pkcs8_key_parser);
+}
+
+module_init(pkcs8_key_init);
+module_exit(pkcs8_key_exit);
+
+MODULE_DESCRIPTION("PKCS#8 certificate parser");
+MODULE_LICENSE("GPL");
public_key_signature_free(payload3);
}
+/*
+ * Determine the crypto algorithm name.
+ */
+static
+int software_key_determine_akcipher(const char *encoding,
+ const char *hash_algo,
+ const struct public_key *pkey,
+ char alg_name[CRYPTO_MAX_ALG_NAME])
+{
+ int n;
+
+ if (strcmp(encoding, "pkcs1") == 0) {
+ /* The data wangled by the RSA algorithm is typically padded
+ * and encoded in some manner, such as EMSA-PKCS1-1_5 [RFC3447
+ * sec 8.2].
+ */
+ if (!hash_algo)
+ n = snprintf(alg_name, CRYPTO_MAX_ALG_NAME,
+ "pkcs1pad(%s)",
+ pkey->pkey_algo);
+ else
+ n = snprintf(alg_name, CRYPTO_MAX_ALG_NAME,
+ "pkcs1pad(%s,%s)",
+ pkey->pkey_algo, hash_algo);
+ return n >= CRYPTO_MAX_ALG_NAME ? -EINVAL : 0;
+ }
+
+ if (strcmp(encoding, "raw") == 0) {
+ strcpy(alg_name, pkey->pkey_algo);
+ return 0;
+ }
+
+ return -ENOPKG;
+}
+
+/*
+ * Query information about a key.
+ */
+static int software_key_query(const struct kernel_pkey_params *params,
+ struct kernel_pkey_query *info)
+{
+ struct crypto_akcipher *tfm;
+ struct public_key *pkey = params->key->payload.data[asym_crypto];
+ char alg_name[CRYPTO_MAX_ALG_NAME];
+ int ret, len;
+
+ ret = software_key_determine_akcipher(params->encoding,
+ params->hash_algo,
+ pkey, alg_name);
+ if (ret < 0)
+ return ret;
+
+ tfm = crypto_alloc_akcipher(alg_name, 0, 0);
+ if (IS_ERR(tfm))
+ return PTR_ERR(tfm);
+
+ if (pkey->key_is_private)
+ ret = crypto_akcipher_set_priv_key(tfm,
+ pkey->key, pkey->keylen);
+ else
+ ret = crypto_akcipher_set_pub_key(tfm,
+ pkey->key, pkey->keylen);
+ if (ret < 0)
+ goto error_free_tfm;
+
+ len = crypto_akcipher_maxsize(tfm);
+ info->key_size = len * 8;
+ info->max_data_size = len;
+ info->max_sig_size = len;
+ info->max_enc_size = len;
+ info->max_dec_size = len;
+ info->supported_ops = (KEYCTL_SUPPORTS_ENCRYPT |
+ KEYCTL_SUPPORTS_VERIFY);
+ if (pkey->key_is_private)
+ info->supported_ops |= (KEYCTL_SUPPORTS_DECRYPT |
+ KEYCTL_SUPPORTS_SIGN);
+ ret = 0;
+
+error_free_tfm:
+ crypto_free_akcipher(tfm);
+ pr_devel("<==%s() = %d\n", __func__, ret);
+ return ret;
+}
+
+/*
+ * Do encryption, decryption and signing ops.
+ */
+static int software_key_eds_op(struct kernel_pkey_params *params,
+ const void *in, void *out)
+{
+ const struct public_key *pkey = params->key->payload.data[asym_crypto];
+ struct akcipher_request *req;
+ struct crypto_akcipher *tfm;
+ struct crypto_wait cwait;
+ struct scatterlist in_sg, out_sg;
+ char alg_name[CRYPTO_MAX_ALG_NAME];
+ int ret;
+
+ pr_devel("==>%s()\n", __func__);
+
+ ret = software_key_determine_akcipher(params->encoding,
+ params->hash_algo,
+ pkey, alg_name);
+ if (ret < 0)
+ return ret;
+
+ tfm = crypto_alloc_akcipher(alg_name, 0, 0);
+ if (IS_ERR(tfm))
+ return PTR_ERR(tfm);
+
+ req = akcipher_request_alloc(tfm, GFP_KERNEL);
+ if (!req)
+ goto error_free_tfm;
+
+ if (pkey->key_is_private)
+ ret = crypto_akcipher_set_priv_key(tfm,
+ pkey->key, pkey->keylen);
+ else
+ ret = crypto_akcipher_set_pub_key(tfm,
+ pkey->key, pkey->keylen);
+ if (ret)
+ goto error_free_req;
+
+ sg_init_one(&in_sg, in, params->in_len);
+ sg_init_one(&out_sg, out, params->out_len);
+ akcipher_request_set_crypt(req, &in_sg, &out_sg, params->in_len,
+ params->out_len);
+ crypto_init_wait(&cwait);
+ akcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG |
+ CRYPTO_TFM_REQ_MAY_SLEEP,
+ crypto_req_done, &cwait);
+
+ /* Perform the encryption calculation. */
+ switch (params->op) {
+ case kernel_pkey_encrypt:
+ ret = crypto_akcipher_encrypt(req);
+ break;
+ case kernel_pkey_decrypt:
+ ret = crypto_akcipher_decrypt(req);
+ break;
+ case kernel_pkey_sign:
+ ret = crypto_akcipher_sign(req);
+ break;
+ default:
+ BUG();
+ }
+
+ ret = crypto_wait_req(ret, &cwait);
+ if (ret == 0)
+ ret = req->dst_len;
+
+error_free_req:
+ akcipher_request_free(req);
+error_free_tfm:
+ crypto_free_akcipher(tfm);
+ pr_devel("<==%s() = %d\n", __func__, ret);
+ return ret;
+}
+
/*
* Verify a signature using a public key.
*/
struct crypto_akcipher *tfm;
struct akcipher_request *req;
struct scatterlist sig_sg, digest_sg;
- const char *alg_name;
- char alg_name_buf[CRYPTO_MAX_ALG_NAME];
+ char alg_name[CRYPTO_MAX_ALG_NAME];
void *output;
unsigned int outlen;
int ret;
BUG_ON(!sig);
BUG_ON(!sig->s);
- if (!sig->digest)
- return -ENOPKG;
-
- alg_name = sig->pkey_algo;
- if (strcmp(sig->pkey_algo, "rsa") == 0) {
- /* The data wangled by the RSA algorithm is typically padded
- * and encoded in some manner, such as EMSA-PKCS1-1_5 [RFC3447
- * sec 8.2].
- */
- if (snprintf(alg_name_buf, CRYPTO_MAX_ALG_NAME,
- "pkcs1pad(rsa,%s)", sig->hash_algo
- ) >= CRYPTO_MAX_ALG_NAME)
- return -EINVAL;
- alg_name = alg_name_buf;
- }
+ ret = software_key_determine_akcipher(sig->encoding,
+ sig->hash_algo,
+ pkey, alg_name);
+ if (ret < 0)
+ return ret;
tfm = crypto_alloc_akcipher(alg_name, 0, 0);
if (IS_ERR(tfm))
if (!req)
goto error_free_tfm;
- ret = crypto_akcipher_set_pub_key(tfm, pkey->key, pkey->keylen);
+ if (pkey->key_is_private)
+ ret = crypto_akcipher_set_priv_key(tfm,
+ pkey->key, pkey->keylen);
+ else
+ ret = crypto_akcipher_set_pub_key(tfm,
+ pkey->key, pkey->keylen);
if (ret)
goto error_free_req;
.name_len = sizeof("public_key") - 1,
.describe = public_key_describe,
.destroy = public_key_destroy,
+ .query = software_key_query,
+ .eds_op = software_key_eds_op,
.verify_signature = public_key_verify_signature_2,
};
EXPORT_SYMBOL_GPL(public_key_subtype);
#include <linux/export.h>
#include <linux/err.h>
#include <linux/slab.h>
+#include <linux/keyctl.h>
#include <crypto/public_key.h>
+#include <keys/user-type.h>
#include "asymmetric_keys.h"
/*
}
EXPORT_SYMBOL_GPL(public_key_signature_free);
+/**
+ * query_asymmetric_key - Get information about an aymmetric key.
+ * @params: Various parameters.
+ * @info: Where to put the information.
+ */
+int query_asymmetric_key(const struct kernel_pkey_params *params,
+ struct kernel_pkey_query *info)
+{
+ const struct asymmetric_key_subtype *subtype;
+ struct key *key = params->key;
+ int ret;
+
+ pr_devel("==>%s()\n", __func__);
+
+ if (key->type != &key_type_asymmetric)
+ return -EINVAL;
+ subtype = asymmetric_key_subtype(key);
+ if (!subtype ||
+ !key->payload.data[0])
+ return -EINVAL;
+ if (!subtype->query)
+ return -ENOTSUPP;
+
+ ret = subtype->query(params, info);
+
+ pr_devel("<==%s() = %d\n", __func__, ret);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(query_asymmetric_key);
+
+/**
+ * encrypt_blob - Encrypt data using an asymmetric key
+ * @params: Various parameters
+ * @data: Data blob to be encrypted, length params->data_len
+ * @enc: Encrypted data buffer, length params->enc_len
+ *
+ * Encrypt the specified data blob using the private key specified by
+ * params->key. The encrypted data is wrapped in an encoding if
+ * params->encoding is specified (eg. "pkcs1").
+ *
+ * Returns the length of the data placed in the encrypted data buffer or an
+ * error.
+ */
+int encrypt_blob(struct kernel_pkey_params *params,
+ const void *data, void *enc)
+{
+ params->op = kernel_pkey_encrypt;
+ return asymmetric_key_eds_op(params, data, enc);
+}
+EXPORT_SYMBOL_GPL(encrypt_blob);
+
+/**
+ * decrypt_blob - Decrypt data using an asymmetric key
+ * @params: Various parameters
+ * @enc: Encrypted data to be decrypted, length params->enc_len
+ * @data: Decrypted data buffer, length params->data_len
+ *
+ * Decrypt the specified data blob using the private key specified by
+ * params->key. The decrypted data is wrapped in an encoding if
+ * params->encoding is specified (eg. "pkcs1").
+ *
+ * Returns the length of the data placed in the decrypted data buffer or an
+ * error.
+ */
+int decrypt_blob(struct kernel_pkey_params *params,
+ const void *enc, void *data)
+{
+ params->op = kernel_pkey_decrypt;
+ return asymmetric_key_eds_op(params, enc, data);
+}
+EXPORT_SYMBOL_GPL(decrypt_blob);
+
+/**
+ * create_signature - Sign some data using an asymmetric key
+ * @params: Various parameters
+ * @data: Data blob to be signed, length params->data_len
+ * @enc: Signature buffer, length params->enc_len
+ *
+ * Sign the specified data blob using the private key specified by params->key.
+ * The signature is wrapped in an encoding if params->encoding is specified
+ * (eg. "pkcs1"). If the encoding needs to know the digest type, this can be
+ * passed through params->hash_algo (eg. "sha1").
+ *
+ * Returns the length of the data placed in the signature buffer or an error.
+ */
+int create_signature(struct kernel_pkey_params *params,
+ const void *data, void *enc)
+{
+ params->op = kernel_pkey_sign;
+ return asymmetric_key_eds_op(params, data, enc);
+}
+EXPORT_SYMBOL_GPL(create_signature);
+
/**
* verify_signature - Initiate the use of an asymmetric key to verify a signature
* @key: The asymmetric key to verify against
--- /dev/null
+--
+-- Unencryted TPM Blob. For details of the format, see:
+-- http://david.woodhou.se/draft-woodhouse-cert-best-practice.html#I-D.mavrogiannopoulos-tpmuri
+--
+PrivateKeyInfo ::= OCTET STRING ({ tpm_note_key })
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) "TPM-PARSER: "fmt
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/err.h>
+#include <keys/asymmetric-subtype.h>
+#include <keys/asymmetric-parser.h>
+#include <crypto/asym_tpm_subtype.h>
+#include "tpm.asn1.h"
+
+struct tpm_parse_context {
+ const void *blob;
+ u32 blob_len;
+};
+
+/*
+ * Note the key data of the ASN.1 blob.
+ */
+int tpm_note_key(void *context, size_t hdrlen,
+ unsigned char tag,
+ const void *value, size_t vlen)
+{
+ struct tpm_parse_context *ctx = context;
+
+ ctx->blob = value;
+ ctx->blob_len = vlen;
+
+ return 0;
+}
+
+/*
+ * Parse a TPM-encrypted private key blob.
+ */
+static struct tpm_key *tpm_parse(const void *data, size_t datalen)
+{
+ struct tpm_parse_context ctx;
+ long ret;
+
+ memset(&ctx, 0, sizeof(ctx));
+
+ /* Attempt to decode the private key */
+ ret = asn1_ber_decoder(&tpm_decoder, &ctx, data, datalen);
+ if (ret < 0)
+ goto error;
+
+ return tpm_key_create(ctx.blob, ctx.blob_len);
+
+error:
+ return ERR_PTR(ret);
+}
+/*
+ * Attempt to parse a data blob for a key as a TPM private key blob.
+ */
+static int tpm_key_preparse(struct key_preparsed_payload *prep)
+{
+ struct tpm_key *tk;
+
+ /*
+ * TPM 1.2 keys are max 2048 bits long, so assume the blob is no
+ * more than 4x that
+ */
+ if (prep->datalen > 256 * 4)
+ return -EMSGSIZE;
+
+ tk = tpm_parse(prep->data, prep->datalen);
+
+ if (IS_ERR(tk))
+ return PTR_ERR(tk);
+
+ /* We're pinning the module by being linked against it */
+ __module_get(asym_tpm_subtype.owner);
+ prep->payload.data[asym_subtype] = &asym_tpm_subtype;
+ prep->payload.data[asym_key_ids] = NULL;
+ prep->payload.data[asym_crypto] = tk;
+ prep->payload.data[asym_auth] = NULL;
+ prep->quotalen = 100;
+ return 0;
+}
+
+static struct asymmetric_key_parser tpm_key_parser = {
+ .owner = THIS_MODULE,
+ .name = "tpm_parser",
+ .parse = tpm_key_preparse,
+};
+
+static int __init tpm_key_init(void)
+{
+ return register_asymmetric_key_parser(&tpm_key_parser);
+}
+
+static void __exit tpm_key_exit(void)
+{
+ unregister_asymmetric_key_parser(&tpm_key_parser);
+}
+
+module_init(tpm_key_init);
+module_exit(tpm_key_exit);
+
+MODULE_DESCRIPTION("TPM private key-blob parser");
+MODULE_LICENSE("GPL v2");
case OID_md4WithRSAEncryption:
ctx->cert->sig->hash_algo = "md4";
- ctx->cert->sig->pkey_algo = "rsa";
- break;
+ goto rsa_pkcs1;
case OID_sha1WithRSAEncryption:
ctx->cert->sig->hash_algo = "sha1";
- ctx->cert->sig->pkey_algo = "rsa";
- break;
+ goto rsa_pkcs1;
case OID_sha256WithRSAEncryption:
ctx->cert->sig->hash_algo = "sha256";
- ctx->cert->sig->pkey_algo = "rsa";
- break;
+ goto rsa_pkcs1;
case OID_sha384WithRSAEncryption:
ctx->cert->sig->hash_algo = "sha384";
- ctx->cert->sig->pkey_algo = "rsa";
- break;
+ goto rsa_pkcs1;
case OID_sha512WithRSAEncryption:
ctx->cert->sig->hash_algo = "sha512";
- ctx->cert->sig->pkey_algo = "rsa";
- break;
+ goto rsa_pkcs1;
case OID_sha224WithRSAEncryption:
ctx->cert->sig->hash_algo = "sha224";
- ctx->cert->sig->pkey_algo = "rsa";
- break;
+ goto rsa_pkcs1;
}
+rsa_pkcs1:
+ ctx->cert->sig->pkey_algo = "rsa";
+ ctx->cert->sig->encoding = "pkcs1";
ctx->algo_oid = ctx->last_oid;
return 0;
}
spawn = skcipher_instance_ctx(inst);
err = crypto_init_spawn(spawn, alg, skcipher_crypto_instance(inst),
CRYPTO_ALG_TYPE_MASK);
- crypto_mod_put(alg);
if (err)
- goto err_free_inst;
+ goto err_put_alg;
err = crypto_inst_setname(skcipher_crypto_instance(inst), "cbc", alg);
if (err)
err = skcipher_register_instance(tmpl, inst);
if (err)
goto err_drop_spawn;
+ crypto_mod_put(alg);
out:
return err;
err_drop_spawn:
crypto_drop_spawn(spawn);
+err_put_alg:
+ crypto_mod_put(alg);
err_free_inst:
kfree(inst);
goto out;
spawn = skcipher_instance_ctx(inst);
err = crypto_init_spawn(spawn, alg, skcipher_crypto_instance(inst),
CRYPTO_ALG_TYPE_MASK);
- crypto_mod_put(alg);
if (err)
- goto err_free_inst;
+ goto err_put_alg;
err = crypto_inst_setname(skcipher_crypto_instance(inst), "cfb", alg);
if (err)
err = skcipher_register_instance(tmpl, inst);
if (err)
goto err_drop_spawn;
+ crypto_mod_put(alg);
out:
return err;
err_drop_spawn:
crypto_drop_spawn(spawn);
+err_put_alg:
+ crypto_mod_put(alg);
err_free_inst:
kfree(inst);
goto out;
{
struct crypto_report_cipher rcipher;
- strlcpy(rcipher.type, "cipher", sizeof(rcipher.type));
+ strncpy(rcipher.type, "cipher", sizeof(rcipher.type));
rcipher.blocksize = alg->cra_blocksize;
rcipher.min_keysize = alg->cra_cipher.cia_min_keysize;
{
struct crypto_report_comp rcomp;
- strlcpy(rcomp.type, "compression", sizeof(rcomp.type));
+ strncpy(rcomp.type, "compression", sizeof(rcomp.type));
if (nla_put(skb, CRYPTOCFGA_REPORT_COMPRESS,
sizeof(struct crypto_report_comp), &rcomp))
goto nla_put_failure;
{
struct crypto_report_acomp racomp;
- strlcpy(racomp.type, "acomp", sizeof(racomp.type));
+ strncpy(racomp.type, "acomp", sizeof(racomp.type));
if (nla_put(skb, CRYPTOCFGA_REPORT_ACOMP,
sizeof(struct crypto_report_acomp), &racomp))
{
struct crypto_report_akcipher rakcipher;
- strlcpy(rakcipher.type, "akcipher", sizeof(rakcipher.type));
+ strncpy(rakcipher.type, "akcipher", sizeof(rakcipher.type));
if (nla_put(skb, CRYPTOCFGA_REPORT_AKCIPHER,
sizeof(struct crypto_report_akcipher), &rakcipher))
{
struct crypto_report_kpp rkpp;
- strlcpy(rkpp.type, "kpp", sizeof(rkpp.type));
+ strncpy(rkpp.type, "kpp", sizeof(rkpp.type));
if (nla_put(skb, CRYPTOCFGA_REPORT_KPP,
sizeof(struct crypto_report_kpp), &rkpp))
static int crypto_report_one(struct crypto_alg *alg,
struct crypto_user_alg *ualg, struct sk_buff *skb)
{
- strlcpy(ualg->cru_name, alg->cra_name, sizeof(ualg->cru_name));
- strlcpy(ualg->cru_driver_name, alg->cra_driver_name,
+ strncpy(ualg->cru_name, alg->cra_name, sizeof(ualg->cru_name));
+ strncpy(ualg->cru_driver_name, alg->cra_driver_name,
sizeof(ualg->cru_driver_name));
- strlcpy(ualg->cru_module_name, module_name(alg->cra_module),
+ strncpy(ualg->cru_module_name, module_name(alg->cra_module),
sizeof(ualg->cru_module_name));
ualg->cru_type = 0;
if (alg->cra_flags & CRYPTO_ALG_LARVAL) {
struct crypto_report_larval rl;
- strlcpy(rl.type, "larval", sizeof(rl.type));
+ strncpy(rl.type, "larval", sizeof(rl.type));
if (nla_put(skb, CRYPTOCFGA_REPORT_LARVAL,
sizeof(struct crypto_report_larval), &rl))
goto nla_put_failure;
u64 v64;
u32 v32;
+ memset(&raead, 0, sizeof(raead));
+
strncpy(raead.type, "aead", sizeof(raead.type));
v32 = atomic_read(&alg->encrypt_cnt);
u64 v64;
u32 v32;
+ memset(&rcipher, 0, sizeof(rcipher));
+
strlcpy(rcipher.type, "cipher", sizeof(rcipher.type));
v32 = atomic_read(&alg->encrypt_cnt);
u64 v64;
u32 v32;
+ memset(&rcomp, 0, sizeof(rcomp));
+
strlcpy(rcomp.type, "compression", sizeof(rcomp.type));
v32 = atomic_read(&alg->compress_cnt);
rcomp.stat_compress_cnt = v32;
u64 v64;
u32 v32;
+ memset(&racomp, 0, sizeof(racomp));
+
strlcpy(racomp.type, "acomp", sizeof(racomp.type));
v32 = atomic_read(&alg->compress_cnt);
racomp.stat_compress_cnt = v32;
u64 v64;
u32 v32;
+ memset(&rakcipher, 0, sizeof(rakcipher));
+
strncpy(rakcipher.type, "akcipher", sizeof(rakcipher.type));
v32 = atomic_read(&alg->encrypt_cnt);
rakcipher.stat_encrypt_cnt = v32;
struct crypto_stat rkpp;
u32 v;
+ memset(&rkpp, 0, sizeof(rkpp));
+
strlcpy(rkpp.type, "kpp", sizeof(rkpp.type));
v = atomic_read(&alg->setsecret_cnt);
u64 v64;
u32 v32;
+ memset(&rhash, 0, sizeof(rhash));
+
strncpy(rhash.type, "ahash", sizeof(rhash.type));
v32 = atomic_read(&alg->hash_cnt);
u64 v64;
u32 v32;
+ memset(&rhash, 0, sizeof(rhash));
+
strncpy(rhash.type, "shash", sizeof(rhash.type));
v32 = atomic_read(&alg->hash_cnt);
u64 v64;
u32 v32;
+ memset(&rrng, 0, sizeof(rrng));
+
strncpy(rrng.type, "rng", sizeof(rrng.type));
v32 = atomic_read(&alg->generate_cnt);
struct crypto_user_alg *ualg,
struct sk_buff *skb)
{
+ memset(ualg, 0, sizeof(*ualg));
+
strlcpy(ualg->cru_name, alg->cra_name, sizeof(ualg->cru_name));
strlcpy(ualg->cru_driver_name, alg->cra_driver_name,
sizeof(ualg->cru_driver_name));
if (alg->cra_flags & CRYPTO_ALG_LARVAL) {
struct crypto_stat rl;
+ memset(&rl, 0, sizeof(rl));
strlcpy(rl.type, "larval", sizeof(rl.type));
if (nla_put(skb, CRYPTOCFGA_STAT_LARVAL,
sizeof(struct crypto_stat), &rl))
spawn = skcipher_instance_ctx(inst);
err = crypto_init_spawn(spawn, alg, skcipher_crypto_instance(inst),
CRYPTO_ALG_TYPE_MASK);
- crypto_mod_put(alg);
if (err)
- goto err_free_inst;
+ goto err_put_alg;
err = crypto_inst_setname(skcipher_crypto_instance(inst), "pcbc", alg);
if (err)
err = skcipher_register_instance(tmpl, inst);
if (err)
goto err_drop_spawn;
+ crypto_mod_put(alg);
out:
return err;
err_drop_spawn:
crypto_drop_spawn(spawn);
+err_put_alg:
+ crypto_mod_put(alg);
err_free_inst:
kfree(inst);
goto out;
if (!ctx->key_size)
return -EINVAL;
- digest_size = digest_info->size;
+ if (digest_info)
+ digest_size = digest_info->size;
if (req->src_len + digest_size > ctx->key_size - 11)
return -EOVERFLOW;
memset(req_ctx->in_buf + 1, 0xff, ps_end - 1);
req_ctx->in_buf[ps_end] = 0x00;
- memcpy(req_ctx->in_buf + ps_end + 1, digest_info->data,
- digest_info->size);
+ if (digest_info)
+ memcpy(req_ctx->in_buf + ps_end + 1, digest_info->data,
+ digest_info->size);
pkcs1pad_sg_set_buf(req_ctx->in_sg, req_ctx->in_buf,
ctx->key_size - 1 - req->src_len, req->src);
goto done;
pos++;
- if (crypto_memneq(out_buf + pos, digest_info->data, digest_info->size))
- goto done;
+ if (digest_info) {
+ if (crypto_memneq(out_buf + pos, digest_info->data,
+ digest_info->size))
+ goto done;
- pos += digest_info->size;
+ pos += digest_info->size;
+ }
err = 0;
hash_name = crypto_attr_alg_name(tb[2]);
if (IS_ERR(hash_name))
- return PTR_ERR(hash_name);
+ hash_name = NULL;
- digest_info = rsa_lookup_asn1(hash_name);
- if (!digest_info)
- return -EINVAL;
+ if (hash_name) {
+ digest_info = rsa_lookup_asn1(hash_name);
+ if (!digest_info)
+ return -EINVAL;
+ } else
+ digest_info = NULL;
inst = kzalloc(sizeof(*inst) + sizeof(*ctx), GFP_KERNEL);
if (!inst)
err = -ENAMETOOLONG;
- if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME,
- "pkcs1pad(%s,%s)", rsa_alg->base.cra_name, hash_name) >=
- CRYPTO_MAX_ALG_NAME ||
- snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME,
- "pkcs1pad(%s,%s)",
- rsa_alg->base.cra_driver_name, hash_name) >=
- CRYPTO_MAX_ALG_NAME)
- goto out_drop_alg;
+ if (!hash_name) {
+ if (snprintf(inst->alg.base.cra_name,
+ CRYPTO_MAX_ALG_NAME, "pkcs1pad(%s)",
+ rsa_alg->base.cra_name) >= CRYPTO_MAX_ALG_NAME)
+ goto out_drop_alg;
+
+ if (snprintf(inst->alg.base.cra_driver_name,
+ CRYPTO_MAX_ALG_NAME, "pkcs1pad(%s)",
+ rsa_alg->base.cra_driver_name) >=
+ CRYPTO_MAX_ALG_NAME)
+ goto out_drop_alg;
+ } else {
+ if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME,
+ "pkcs1pad(%s,%s)", rsa_alg->base.cra_name,
+ hash_name) >= CRYPTO_MAX_ALG_NAME)
+ goto out_drop_alg;
+
+ if (snprintf(inst->alg.base.cra_driver_name,
+ CRYPTO_MAX_ALG_NAME, "pkcs1pad(%s,%s)",
+ rsa_alg->base.cra_driver_name,
+ hash_name) >= CRYPTO_MAX_ALG_NAME)
+ goto out_drop_alg;
+ }
inst->alg.base.cra_flags = rsa_alg->base.cra_flags & CRYPTO_ALG_ASYNC;
inst->alg.base.cra_priority = rsa_alg->base.cra_priority;
ctx->cryptd_tfm = cryptd_tfm;
- reqsize = sizeof(struct skcipher_request);
- reqsize += crypto_skcipher_reqsize(&cryptd_tfm->base);
+ reqsize = crypto_skcipher_reqsize(cryptd_skcipher_child(cryptd_tfm));
+ reqsize = max(reqsize, crypto_skcipher_reqsize(&cryptd_tfm->base));
+ reqsize += sizeof(struct skcipher_request);
crypto_skcipher_set_reqsize(tfm, reqsize);
config XPOWER_PMIC_OPREGION
bool "ACPI operation region support for XPower AXP288 PMIC"
- depends on MFD_AXP20X_I2C && IOSF_MBI
+ depends on MFD_AXP20X_I2C && IOSF_MBI=y
help
This config adds ACPI operation region support for XPower AXP288 PMIC.
{"PNP0200", 0}, /* AT DMA Controller */
{"ACPI0009", 0}, /* IOxAPIC */
{"ACPI000A", 0}, /* IOAPIC */
+ {"SMB0001", 0}, /* ACPI SMBUS virtual device */
{"", 0},
};
{
acpi_status status;
u32 buffer_length;
- u32 data_length;
void *buffer;
union acpi_operand_object *buffer_desc;
u32 function;
case ACPI_ADR_SPACE_SMBUS:
buffer_length = ACPI_SMBUS_BUFFER_SIZE;
- data_length = ACPI_SMBUS_DATA_SIZE;
function = ACPI_WRITE | (obj_desc->field.attribute << 16);
break;
case ACPI_ADR_SPACE_IPMI:
buffer_length = ACPI_IPMI_BUFFER_SIZE;
- data_length = ACPI_IPMI_DATA_SIZE;
function = ACPI_WRITE;
break;
/* Add header length to get the full size of the buffer */
buffer_length += ACPI_SERIAL_HEADER_SIZE;
- data_length = source_desc->buffer.pointer[1];
function = ACPI_WRITE | (accessor_type << 16);
break;
return_ACPI_STATUS(AE_AML_INVALID_SPACE_ID);
}
-#if 0
- OBSOLETE ?
- /* Check for possible buffer overflow */
- if (data_length > source_desc->buffer.length) {
- ACPI_ERROR((AE_INFO,
- "Length in buffer header (%u)(%u) is greater than "
- "the physical buffer length (%u) and will overflow",
- data_length, buffer_length,
- source_desc->buffer.length));
-
- return_ACPI_STATUS(AE_AML_BUFFER_LIMIT);
- }
-#endif
-
/* Create the transfer/bidirectional/return buffer */
buffer_desc = acpi_ut_create_buffer_object(buffer_length);
/* Copy the input buffer data to the transfer buffer */
buffer = buffer_desc->buffer.pointer;
- memcpy(buffer, source_desc->buffer.pointer, data_length);
+ memcpy(buffer, source_desc->buffer.pointer,
+ min(buffer_length, source_desc->buffer.length));
/* Lock entire transaction if requested */
*/
static struct irq_domain *iort_get_platform_device_domain(struct device *dev)
{
- struct acpi_iort_node *node, *msi_parent;
+ struct acpi_iort_node *node, *msi_parent = NULL;
struct fwnode_handle *iort_fwnode;
struct acpi_iort_its_group *its;
int i;
return 0;
}
+EXPORT_SYMBOL(acpi_device_get_power);
static int acpi_dev_pm_explicit_set(struct acpi_device *adev, int state)
{
if (nd_desc) {
struct acpi_nfit_desc *acpi_desc = to_acpi_desc(nd_desc);
- rc = acpi_nfit_ars_rescan(acpi_desc, 0);
+ rc = acpi_nfit_ars_rescan(acpi_desc, ARS_REQ_LONG);
}
device_unlock(dev);
if (rc)
return rc;
if (ars_status_process_records(acpi_desc))
- return -ENOMEM;
+ dev_err(acpi_desc->dev, "Failed to process ARS records\n");
- return 0;
+ return rc;
}
static int ars_register(struct acpi_nfit_desc *acpi_desc,
struct nvdimm *nvdimm, unsigned int cmd)
{
struct acpi_nfit_desc *acpi_desc = to_acpi_nfit_desc(nd_desc);
- struct nfit_spa *nfit_spa;
- int rc = 0;
if (nvdimm)
return 0;
* just needs guarantees that any ARS it initiates are not
* interrupted by any intervening start requests from userspace.
*/
- mutex_lock(&acpi_desc->init_mutex);
- list_for_each_entry(nfit_spa, &acpi_desc->spas, list)
- if (acpi_desc->scrub_spa
- || test_bit(ARS_REQ_SHORT, &nfit_spa->ars_state)
- || test_bit(ARS_REQ_LONG, &nfit_spa->ars_state)) {
- rc = -EBUSY;
- break;
- }
- mutex_unlock(&acpi_desc->init_mutex);
+ if (work_busy(&acpi_desc->dwork.work))
+ return -EBUSY;
- return rc;
+ return 0;
}
int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc,
struct acpi_nfit_desc *acpi_desc;
struct nfit_spa *nfit_spa;
- /* We only care about memory errors */
- if (!mce_is_memory_error(mce))
+ /* We only care about uncorrectable memory errors */
+ if (!mce_is_memory_error(mce) || mce_is_correctable(mce))
+ return NOTIFY_DONE;
+
+ /* Verify the address reported in the MCE is valid. */
+ if (!mce_usable_address(mce))
return NOTIFY_DONE;
/*
t->buffer = NULL;
goto err_binder_alloc_buf_failed;
}
- t->buffer->allow_user_free = 0;
t->buffer->debug_id = t->debug_id;
t->buffer->transaction = t;
t->buffer->target_node = target_node;
buffer = binder_alloc_prepare_to_free(&proc->alloc,
data_ptr);
- if (buffer == NULL) {
- binder_user_error("%d:%d BC_FREE_BUFFER u%016llx no match\n",
- proc->pid, thread->pid, (u64)data_ptr);
- break;
- }
- if (!buffer->allow_user_free) {
- binder_user_error("%d:%d BC_FREE_BUFFER u%016llx matched unreturned buffer\n",
- proc->pid, thread->pid, (u64)data_ptr);
+ if (IS_ERR_OR_NULL(buffer)) {
+ if (PTR_ERR(buffer) == -EPERM) {
+ binder_user_error(
+ "%d:%d BC_FREE_BUFFER u%016llx matched unreturned or currently freeing buffer\n",
+ proc->pid, thread->pid,
+ (u64)data_ptr);
+ } else {
+ binder_user_error(
+ "%d:%d BC_FREE_BUFFER u%016llx no match\n",
+ proc->pid, thread->pid,
+ (u64)data_ptr);
+ }
break;
}
binder_debug(BINDER_DEBUG_FREE_BUFFER,
else {
/*
* Guard against user threads attempting to
- * free the buffer twice
+ * free the buffer when in use by kernel or
+ * after it's already been freed.
*/
- if (buffer->free_in_progress) {
- binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
- "%d:%d FREE_BUFFER u%016llx user freed buffer twice\n",
- alloc->pid, current->pid,
- (u64)user_ptr);
- return NULL;
- }
- buffer->free_in_progress = 1;
+ if (!buffer->allow_user_free)
+ return ERR_PTR(-EPERM);
+ buffer->allow_user_free = 0;
return buffer;
}
}
rb_erase(best_fit, &alloc->free_buffers);
buffer->free = 0;
- buffer->free_in_progress = 0;
+ buffer->allow_user_free = 0;
binder_insert_allocated_buffer_locked(alloc, buffer);
binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
"%d: binder_alloc_buf size %zd got %pK\n",
unsigned free:1;
unsigned allow_user_free:1;
unsigned async_transaction:1;
- unsigned free_in_progress:1;
- unsigned debug_id:28;
+ unsigned debug_id:29;
struct binder_transaction *transaction;
/* These specific Samsung models/firmware-revs do not handle LPM well */
{ "SAMSUNG MZMPC128HBFU-000MV", "CXM14M1Q", ATA_HORKAGE_NOLPM, },
{ "SAMSUNG SSD PM830 mSATA *", "CXM13D1Q", ATA_HORKAGE_NOLPM, },
- { "SAMSUNG MZ7TD256HAFV-000L9", "DXT02L5Q", ATA_HORKAGE_NOLPM, },
+ { "SAMSUNG MZ7TD256HAFV-000L9", NULL, ATA_HORKAGE_NOLPM, },
/* devices that don't properly handle queued TRIM commands */
{ "Micron_M500IT_*", "MU01", ATA_HORKAGE_NO_NCQ_TRIM |
{ "SSD*INTEL*", NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "Samsung*SSD*", NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "SAMSUNG*SSD*", NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, },
+ { "SAMSUNG*MZ7KM*", NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "ST[1248][0248]0[FH]*", NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, },
/*
+// SPDX-License-Identifier: GPL-2.0+
/*
* Renesas R-Car SATA driver
*
* Author: Vladimir Barinov <source@cogentembedded.com>
* Copyright (C) 2013-2015 Cogent Embedded, Inc.
* Copyright (C) 2013-2015 Renesas Solutions Corp.
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms of the GNU General Public License as published by the
- * Free Software Foundation; either version 2 of the License, or (at your
- * option) any later version.
*/
#include <linux/kernel.h>
func_enter ();
- fs_dprintk (FS_DEBUG_INIT, "Inititing queue at %x: %d entries:\n",
+ fs_dprintk (FS_DEBUG_INIT, "Initializing queue at %x: %d entries:\n",
queue, nentries);
p = aligned_kmalloc (sz, GFP_KERNEL, 0x10);
{
func_enter ();
- fs_dprintk (FS_DEBUG_INIT, "Inititing free pool at %x:\n", queue);
+ fs_dprintk (FS_DEBUG_INIT, "Initializing free pool at %x:\n", queue);
write_fs (dev, FP_CNF(queue), (bufsize * RBFP_RBS) | RBFP_RBSVAL | RBFP_CME);
write_fs (dev, FP_SA(queue), 0);
int release_data;
} std;
struct { /* valid when type == INPUT_TYPE_KBD */
- /* strings can be non null-terminated */
- char press_str[sizeof(void *) + sizeof(int)];
- char repeat_str[sizeof(void *) + sizeof(int)];
- char release_str[sizeof(void *) + sizeof(int)];
+ char press_str[sizeof(void *) + sizeof(int)] __nonstring;
+ char repeat_str[sizeof(void *) + sizeof(int)] __nonstring;
+ char release_str[sizeof(void *) + sizeof(int)] __nonstring;
} kbd;
} u;
};
struct devres {
struct devres_node node;
- /* -- 3 pointers */
- unsigned long long data[]; /* guarantee ull alignment */
+ /*
+ * Some archs want to perform DMA into kmalloc caches
+ * and need a guaranteed alignment larger than
+ * the alignment of a 64-bit integer.
+ * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
+ * buffer alignment as if it was allocated by plain kmalloc().
+ */
+ u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
};
struct devres_group {
disk->first_minor = i * max_part;
disk->fops = &brd_fops;
disk->private_data = brd;
- disk->queue = brd->brd_queue;
disk->flags = GENHD_FL_EXT_DEVT;
sprintf(disk->disk_name, "ram%d", i);
set_capacity(disk, rd_size * 2);
- disk->queue->backing_dev_info->capabilities |= BDI_CAP_SYNCHRONOUS_IO;
+ brd->brd_queue->backing_dev_info->capabilities |= BDI_CAP_SYNCHRONOUS_IO;
/* Tell the block layer that this is not a rotational device */
- blk_queue_flag_set(QUEUE_FLAG_NONROT, disk->queue);
- blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, disk->queue);
+ blk_queue_flag_set(QUEUE_FLAG_NONROT, brd->brd_queue);
+ blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, brd->brd_queue);
return brd;
brd = brd_alloc(i);
if (brd) {
+ brd->brd_disk->queue = brd->brd_queue;
add_disk(brd->brd_disk);
list_add_tail(&brd->brd_list, &brd_devices);
}
/* point of no return */
- list_for_each_entry(brd, &brd_devices, brd_list)
+ list_for_each_entry(brd, &brd_devices, brd_list) {
+ /*
+ * associate with queue just before adding disk for
+ * avoiding to mess up failure path
+ */
+ brd->brd_disk->queue = brd->brd_queue;
add_disk(brd->brd_disk);
+ }
blk_register_region(MKDEV(RAMDISK_MAJOR, 0), 1UL << MINORBITS,
THIS_MODULE, brd_probe, NULL, NULL);
/* THINK if (signal_pending) return ... ? */
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, &iov, 1, size);
+ iov_iter_kvec(&msg.msg_iter, WRITE, &iov, 1, size);
if (sock == connection->data.socket) {
rcu_read_lock();
struct msghdr msg = {
.msg_flags = (flags ? flags : MSG_WAITALL | MSG_NOSIGNAL)
};
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &iov, 1, size);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, size);
return sock_recvmsg(sock, &msg, msg.msg_flags);
}
bio.bi_end_io = floppy_rb0_cb;
bio_set_op_attrs(&bio, REQ_OP_READ, 0);
+ init_completion(&cbdata.complete);
+
submit_bio(&bio);
process_fd_request();
- init_completion(&cbdata.complete);
wait_for_completion(&cbdata.complete);
__free_page(page);
#include <linux/falloc.h>
#include <linux/uio.h>
#include <linux/ioprio.h>
-#include <linux/blk-cgroup.h>
#include "loop.h"
struct iov_iter i;
ssize_t bw;
- iov_iter_bvec(&i, ITER_BVEC | WRITE, bvec, 1, bvec->bv_len);
+ iov_iter_bvec(&i, WRITE, bvec, 1, bvec->bv_len);
file_start_write(file);
bw = vfs_iter_write(file, &i, ppos, 0);
ssize_t len;
rq_for_each_segment(bvec, rq, iter) {
- iov_iter_bvec(&i, ITER_BVEC, &bvec, 1, bvec.bv_len);
+ iov_iter_bvec(&i, READ, &bvec, 1, bvec.bv_len);
len = vfs_iter_read(lo->lo_backing_file, &i, &pos, 0);
if (len < 0)
return len;
b.bv_offset = 0;
b.bv_len = bvec.bv_len;
- iov_iter_bvec(&i, ITER_BVEC, &b, 1, b.bv_len);
+ iov_iter_bvec(&i, READ, &b, 1, b.bv_len);
len = vfs_iter_read(lo->lo_backing_file, &i, &pos, 0);
if (len < 0) {
ret = len;
}
atomic_set(&cmd->ref, 2);
- iov_iter_bvec(&iter, ITER_BVEC | rw, bvec,
- segments, blk_rq_bytes(rq));
+ iov_iter_bvec(&iter, rw, bvec, segments, blk_rq_bytes(rq));
iter.iov_offset = offset;
cmd->iocb.ki_pos = pos;
/* always use the first bio's css */
#ifdef CONFIG_BLK_CGROUP
- if (cmd->use_aio && rq->bio && rq->bio->bi_blkg) {
- cmd->css = &bio_blkcg(rq->bio)->css;
+ if (cmd->use_aio && rq->bio && rq->bio->bi_css) {
+ cmd->css = rq->bio->bi_css;
css_get(cmd->css);
} else
#endif
dev_warn(&dd->pdev->dev,
"data movement but "
"sect_count is 0\n");
- err = -EINVAL;
- goto abort;
+ err = -EINVAL;
+ goto abort;
}
}
}
u32 nbd_cmd_flags = 0;
int sent = nsock->sent, skip = 0;
- iov_iter_kvec(&from, WRITE | ITER_KVEC, &iov, 1, sizeof(request));
+ iov_iter_kvec(&from, WRITE, &iov, 1, sizeof(request));
switch (req_op(req)) {
case REQ_OP_DISCARD:
dev_dbg(nbd_to_dev(nbd), "request %p: sending %d bytes data\n",
req, bvec.bv_len);
- iov_iter_bvec(&from, ITER_BVEC | WRITE,
- &bvec, 1, bvec.bv_len);
+ iov_iter_bvec(&from, WRITE, &bvec, 1, bvec.bv_len);
if (skip) {
if (skip >= iov_iter_count(&from)) {
skip -= iov_iter_count(&from);
int ret = 0;
reply.magic = 0;
- iov_iter_kvec(&to, READ | ITER_KVEC, &iov, 1, sizeof(reply));
+ iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
if (result <= 0) {
if (!nbd_disconnected(config))
struct bio_vec bvec;
rq_for_each_segment(bvec, req, iter) {
- iov_iter_bvec(&to, ITER_BVEC | READ,
- &bvec, 1, bvec.bv_len);
+ iov_iter_bvec(&to, READ, &bvec, 1, bvec.bv_len);
result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
if (result <= 0) {
dev_err(disk_to_dev(nbd->disk), "Receive data failed (result %d)\n",
for (i = 0; i < config->num_connections; i++) {
struct nbd_sock *nsock = config->socks[i];
- iov_iter_kvec(&from, WRITE | ITER_KVEC, &iov, 1, sizeof(request));
+ iov_iter_kvec(&from, WRITE, &iov, 1, sizeof(request));
mutex_lock(&nsock->tx_lock);
ret = sock_xmit(nbd, i, 1, &from, 0, NULL);
if (ret <= 0)
GFP_KERNEL);
if (!info->rinfo) {
xenbus_dev_fatal(info->xbdev, -ENOMEM, "allocating ring_info structure");
+ info->nr_rings = 0;
return -ENOMEM;
}
{
struct clk *clk = platform_get_drvdata(pdev);
+ of_clk_del_provider(pdev->dev.of_node);
clk_unregister_fixed_factor(clk);
return 0;
.ops = &clk_regmap_gate_ops,
.parent_names = (const char *[]){ "fclk_div2_div" },
.num_parents = 1,
+ .flags = CLK_IS_CRITICAL,
},
};
.ops = &clk_regmap_gate_ops,
.parent_names = (const char *[]){ "fclk_div3_div" },
.num_parents = 1,
+ /*
+ * FIXME:
+ * This clock, as fdiv2, is used by the SCPI FW and is required
+ * by the platform to operate correctly.
+ * Until the following condition are met, we need this clock to
+ * be marked as critical:
+ * a) The SCPI generic driver claims and enable all the clocks
+ * it needs
+ * b) CCF has a clock hand-off mechanism to make the sure the
+ * clock stays on until the proper driver comes along
+ */
+ .flags = CLK_IS_CRITICAL,
},
};
.ops = &clk_regmap_gate_ops,
.parent_names = (const char *[]){ "fclk_div3_div" },
.num_parents = 1,
+ /*
+ * FIXME:
+ * This clock, as fdiv2, is used by the SCPI FW and is required
+ * by the platform to operate correctly.
+ * Until the following condition are met, we need this clock to
+ * be marked as critical:
+ * a) The SCPI generic driver claims and enable all the clocks
+ * it needs
+ * b) CCF has a clock hand-off mechanism to make the sure the
+ * clock stays on until the proper driver comes along
+ */
+ .flags = CLK_IS_CRITICAL,
},
};
pr_err("CLK %d has invalid pointer %p\n", id, clk);
return;
}
- if (id > unit->nr_clks) {
+ if (id >= unit->nr_clks) {
pr_err("CLK %d is invalid\n", id);
return;
}
unsigned int idx = clkspec->args[1];
if (type == CP110_CLK_TYPE_CORE) {
- if (idx > CP110_MAX_CORE_CLOCKS)
+ if (idx >= CP110_MAX_CORE_CLOCKS)
return ERR_PTR(-EINVAL);
return clk_data->hws[idx];
} else if (type == CP110_CLK_TYPE_GATABLE) {
- if (idx > CP110_MAX_GATABLE_CLOCKS)
+ if (idx >= CP110_MAX_GATABLE_CLOCKS)
return ERR_PTR(-EINVAL);
return clk_data->hws[CP110_MAX_CORE_CLOCKS + idx];
}
}
EXPORT_SYMBOL_GPL(qcom_cc_register_sleep_clk);
+/* Drop 'protected-clocks' from the list of clocks to register */
+static void qcom_cc_drop_protected(struct device *dev, struct qcom_cc *cc)
+{
+ struct device_node *np = dev->of_node;
+ struct property *prop;
+ const __be32 *p;
+ u32 i;
+
+ of_property_for_each_u32(np, "protected-clocks", prop, p, i) {
+ if (i >= cc->num_rclks)
+ continue;
+
+ cc->rclks[i] = NULL;
+ }
+}
+
static struct clk_hw *qcom_cc_clk_hw_get(struct of_phandle_args *clkspec,
void *data)
{
cc->rclks = rclks;
cc->num_rclks = num_clks;
+ qcom_cc_drop_protected(dev, cc);
+
for (i = 0; i < num_clks; i++) {
if (!rclks[i])
continue;
.div = 1,
.hw.init = &(struct clk_init_data){
.name = "cxo",
- .parent_names = (const char *[]){ "xo_board" },
+ .parent_names = (const char *[]){ "xo-board" },
.num_parents = 1,
.ops = &clk_fixed_factor_ops,
},
*/
static inline int zynqmp_is_valid_clock(u32 clk_id)
{
- if (clk_id > clock_max_idx)
+ if (clk_id >= clock_max_idx)
return -ENODEV;
return clock[clk_id].valid;
qdata.arg1 = clk_id;
ret = eemi_ops->query_data(qdata, ret_payload);
+ if (ret)
+ return ERR_PTR(ret);
+
mult = ret_payload[1];
div = ret_payload[2];
is accessed via both the SBI and the rdcycle instruction. This is
required for all RISC-V systems.
+config CSKY_MP_TIMER
+ bool "SMP Timer for the C-SKY platform" if COMPILE_TEST
+ depends on CSKY
+ select TIMER_OF
+ help
+ Say yes here to enable C-SKY SMP timer driver used for C-SKY SMP
+ system.
+ csky,mptimer is not only used in SMP system, it also could be used
+ single core system. It's not a mmio reg and it use mtcr/mfcr instruction.
+
+config GX6605S_TIMER
+ bool "Gx6605s SOC system timer driver" if COMPILE_TEST
+ depends on CSKY
+ select CLKSRC_MMIO
+ select TIMER_OF
+ help
+ This option enables support for gx6605s SOC's timer.
+
endmenu
obj-$(CONFIG_X86_NUMACHIP) += numachip.o
obj-$(CONFIG_ATCPIT100_TIMER) += timer-atcpit100.o
obj-$(CONFIG_RISCV_TIMER) += riscv_timer.o
+obj-$(CONFIG_CSKY_MP_TIMER) += timer-mp-csky.o
+obj-$(CONFIG_GX6605S_TIMER) += timer-gx6605s.o
DEFINE_RAW_SPINLOCK(i8253_lock);
EXPORT_SYMBOL(i8253_lock);
+/*
+ * Handle PIT quirk in pit_shutdown() where zeroing the counter register
+ * restarts the PIT, negating the shutdown. On platforms with the quirk,
+ * platform specific code can set this to false.
+ */
+bool i8253_clear_counter_on_shutdown __ro_after_init = true;
+
#ifdef CONFIG_CLKSRC_I8253
/*
* Since the PIT overflows every tick, its not very useful
raw_spin_lock(&i8253_lock);
outb_p(0x30, PIT_MODE);
- outb_p(0, PIT_CH0);
- outb_p(0, PIT_CH0);
+
+ if (i8253_clear_counter_on_shutdown) {
+ outb_p(0, PIT_CH0);
+ outb_p(0, PIT_CH0);
+ }
raw_spin_unlock(&i8253_lock);
return 0;
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd.
+
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/sched_clock.h>
+
+#include "timer-of.h"
+
+#define CLKSRC_OFFSET 0x40
+
+#define TIMER_STATUS 0x00
+#define TIMER_VALUE 0x04
+#define TIMER_CONTRL 0x10
+#define TIMER_CONFIG 0x20
+#define TIMER_DIV 0x24
+#define TIMER_INI 0x28
+
+#define GX6605S_STATUS_CLR BIT(0)
+#define GX6605S_CONTRL_RST BIT(0)
+#define GX6605S_CONTRL_START BIT(1)
+#define GX6605S_CONFIG_EN BIT(0)
+#define GX6605S_CONFIG_IRQ_EN BIT(1)
+
+static irqreturn_t gx6605s_timer_interrupt(int irq, void *dev)
+{
+ struct clock_event_device *ce = dev;
+ void __iomem *base = timer_of_base(to_timer_of(ce));
+
+ writel_relaxed(GX6605S_STATUS_CLR, base + TIMER_STATUS);
+
+ ce->event_handler(ce);
+
+ return IRQ_HANDLED;
+}
+
+static int gx6605s_timer_set_oneshot(struct clock_event_device *ce)
+{
+ void __iomem *base = timer_of_base(to_timer_of(ce));
+
+ /* reset and stop counter */
+ writel_relaxed(GX6605S_CONTRL_RST, base + TIMER_CONTRL);
+
+ /* enable with irq and start */
+ writel_relaxed(GX6605S_CONFIG_EN | GX6605S_CONFIG_IRQ_EN,
+ base + TIMER_CONFIG);
+
+ return 0;
+}
+
+static int gx6605s_timer_set_next_event(unsigned long delta,
+ struct clock_event_device *ce)
+{
+ void __iomem *base = timer_of_base(to_timer_of(ce));
+
+ /* use reset to pause timer */
+ writel_relaxed(GX6605S_CONTRL_RST, base + TIMER_CONTRL);
+
+ /* config next timeout value */
+ writel_relaxed(ULONG_MAX - delta, base + TIMER_INI);
+ writel_relaxed(GX6605S_CONTRL_START, base + TIMER_CONTRL);
+
+ return 0;
+}
+
+static int gx6605s_timer_shutdown(struct clock_event_device *ce)
+{
+ void __iomem *base = timer_of_base(to_timer_of(ce));
+
+ writel_relaxed(0, base + TIMER_CONTRL);
+ writel_relaxed(0, base + TIMER_CONFIG);
+
+ return 0;
+}
+
+static struct timer_of to = {
+ .flags = TIMER_OF_IRQ | TIMER_OF_BASE | TIMER_OF_CLOCK,
+ .clkevt = {
+ .rating = 300,
+ .features = CLOCK_EVT_FEAT_DYNIRQ |
+ CLOCK_EVT_FEAT_ONESHOT,
+ .set_state_shutdown = gx6605s_timer_shutdown,
+ .set_state_oneshot = gx6605s_timer_set_oneshot,
+ .set_next_event = gx6605s_timer_set_next_event,
+ .cpumask = cpu_possible_mask,
+ },
+ .of_irq = {
+ .handler = gx6605s_timer_interrupt,
+ .flags = IRQF_TIMER | IRQF_IRQPOLL,
+ },
+};
+
+static u64 notrace gx6605s_sched_clock_read(void)
+{
+ void __iomem *base;
+
+ base = timer_of_base(&to) + CLKSRC_OFFSET;
+
+ return (u64)readl_relaxed(base + TIMER_VALUE);
+}
+
+static void gx6605s_clkevt_init(void __iomem *base)
+{
+ writel_relaxed(0, base + TIMER_DIV);
+ writel_relaxed(0, base + TIMER_CONFIG);
+
+ clockevents_config_and_register(&to.clkevt, timer_of_rate(&to), 2,
+ ULONG_MAX);
+}
+
+static int gx6605s_clksrc_init(void __iomem *base)
+{
+ writel_relaxed(0, base + TIMER_DIV);
+ writel_relaxed(0, base + TIMER_INI);
+
+ writel_relaxed(GX6605S_CONTRL_RST, base + TIMER_CONTRL);
+
+ writel_relaxed(GX6605S_CONFIG_EN, base + TIMER_CONFIG);
+
+ writel_relaxed(GX6605S_CONTRL_START, base + TIMER_CONTRL);
+
+ sched_clock_register(gx6605s_sched_clock_read, 32, timer_of_rate(&to));
+
+ return clocksource_mmio_init(base + TIMER_VALUE, "gx6605s",
+ timer_of_rate(&to), 200, 32, clocksource_mmio_readl_up);
+}
+
+static int __init gx6605s_timer_init(struct device_node *np)
+{
+ int ret;
+
+ /*
+ * The timer driver is for nationalchip gx6605s SOC and there are two
+ * same timer in gx6605s. We use one for clkevt and another for clksrc.
+ *
+ * The timer is mmio map to access, so we need give mmio address in dts.
+ *
+ * It provides a 32bit countup timer and interrupt will be caused by
+ * count-overflow.
+ * So we need set-next-event by ULONG_MAX - delta in TIMER_INI reg.
+ *
+ * The counter at 0x0 offset is clock event.
+ * The counter at 0x40 offset is clock source.
+ * They are the same in hardware, just different used by driver.
+ */
+ ret = timer_of_init(np, &to);
+ if (ret)
+ return ret;
+
+ gx6605s_clkevt_init(timer_of_base(&to));
+
+ return gx6605s_clksrc_init(timer_of_base(&to) + CLKSRC_OFFSET);
+}
+TIMER_OF_DECLARE(csky_gx6605s_timer, "csky,gx6605s-timer", gx6605s_timer_init);
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd.
+
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/sched_clock.h>
+#include <linux/cpu.h>
+#include <linux/of_irq.h>
+#include <asm/reg_ops.h>
+
+#include "timer-of.h"
+
+#define PTIM_CCVR "cr<3, 14>"
+#define PTIM_CTLR "cr<0, 14>"
+#define PTIM_LVR "cr<6, 14>"
+#define PTIM_TSR "cr<1, 14>"
+
+static int csky_mptimer_irq;
+
+static int csky_mptimer_set_next_event(unsigned long delta,
+ struct clock_event_device *ce)
+{
+ mtcr(PTIM_LVR, delta);
+
+ return 0;
+}
+
+static int csky_mptimer_shutdown(struct clock_event_device *ce)
+{
+ mtcr(PTIM_CTLR, 0);
+
+ return 0;
+}
+
+static int csky_mptimer_oneshot(struct clock_event_device *ce)
+{
+ mtcr(PTIM_CTLR, 1);
+
+ return 0;
+}
+
+static int csky_mptimer_oneshot_stopped(struct clock_event_device *ce)
+{
+ mtcr(PTIM_CTLR, 0);
+
+ return 0;
+}
+
+static DEFINE_PER_CPU(struct timer_of, csky_to) = {
+ .flags = TIMER_OF_CLOCK,
+ .clkevt = {
+ .rating = 300,
+ .features = CLOCK_EVT_FEAT_PERCPU |
+ CLOCK_EVT_FEAT_ONESHOT,
+ .set_state_shutdown = csky_mptimer_shutdown,
+ .set_state_oneshot = csky_mptimer_oneshot,
+ .set_state_oneshot_stopped = csky_mptimer_oneshot_stopped,
+ .set_next_event = csky_mptimer_set_next_event,
+ },
+};
+
+static irqreturn_t csky_timer_interrupt(int irq, void *dev)
+{
+ struct timer_of *to = this_cpu_ptr(&csky_to);
+
+ mtcr(PTIM_TSR, 0);
+
+ to->clkevt.event_handler(&to->clkevt);
+
+ return IRQ_HANDLED;
+}
+
+/*
+ * clock event for percpu
+ */
+static int csky_mptimer_starting_cpu(unsigned int cpu)
+{
+ struct timer_of *to = per_cpu_ptr(&csky_to, cpu);
+
+ to->clkevt.cpumask = cpumask_of(cpu);
+
+ clockevents_config_and_register(&to->clkevt, timer_of_rate(to),
+ 2, ULONG_MAX);
+
+ enable_percpu_irq(csky_mptimer_irq, 0);
+
+ return 0;
+}
+
+static int csky_mptimer_dying_cpu(unsigned int cpu)
+{
+ disable_percpu_irq(csky_mptimer_irq);
+
+ return 0;
+}
+
+/*
+ * clock source
+ */
+static u64 sched_clock_read(void)
+{
+ return (u64)mfcr(PTIM_CCVR);
+}
+
+static u64 clksrc_read(struct clocksource *c)
+{
+ return (u64)mfcr(PTIM_CCVR);
+}
+
+struct clocksource csky_clocksource = {
+ .name = "csky",
+ .rating = 400,
+ .mask = CLOCKSOURCE_MASK(32),
+ .flags = CLOCK_SOURCE_IS_CONTINUOUS,
+ .read = clksrc_read,
+};
+
+static int __init csky_mptimer_init(struct device_node *np)
+{
+ int ret, cpu, cpu_rollback;
+ struct timer_of *to = NULL;
+
+ /*
+ * Csky_mptimer is designed for C-SKY SMP multi-processors and
+ * every core has it's own private irq and regs for clkevt and
+ * clksrc.
+ *
+ * The regs is accessed by cpu instruction: mfcr/mtcr instead of
+ * mmio map style. So we needn't mmio-address in dts, but we still
+ * need to give clk and irq number.
+ *
+ * We use private irq for the mptimer and irq number is the same
+ * for every core. So we use request_percpu_irq() in timer_of_init.
+ */
+ csky_mptimer_irq = irq_of_parse_and_map(np, 0);
+ if (csky_mptimer_irq <= 0)
+ return -EINVAL;
+
+ ret = request_percpu_irq(csky_mptimer_irq, csky_timer_interrupt,
+ "csky_mp_timer", &csky_to);
+ if (ret)
+ return -EINVAL;
+
+ for_each_possible_cpu(cpu) {
+ to = per_cpu_ptr(&csky_to, cpu);
+ ret = timer_of_init(np, to);
+ if (ret)
+ goto rollback;
+ }
+
+ clocksource_register_hz(&csky_clocksource, timer_of_rate(to));
+ sched_clock_register(sched_clock_read, 32, timer_of_rate(to));
+
+ ret = cpuhp_setup_state(CPUHP_AP_CSKY_TIMER_STARTING,
+ "clockevents/csky/timer:starting",
+ csky_mptimer_starting_cpu,
+ csky_mptimer_dying_cpu);
+ if (ret)
+ return -EINVAL;
+
+ return 0;
+
+rollback:
+ for_each_possible_cpu(cpu_rollback) {
+ if (cpu_rollback == cpu)
+ break;
+
+ to = per_cpu_ptr(&csky_to, cpu_rollback);
+ timer_of_cleanup(to);
+ }
+ return -EINVAL;
+}
+TIMER_OF_DECLARE(csky_mptimer, "csky,mptimer", csky_mptimer_init);
/* Ensure the arm clock divider is what we expect */
ret = clk_set_rate(clks[ARM].clk, new_freq * 1000);
if (ret) {
+ int ret1;
+
dev_err(cpu_dev, "failed to set clock rate: %d\n", ret);
- regulator_set_voltage_tol(arm_reg, volt_old, 0);
+ ret1 = regulator_set_voltage_tol(arm_reg, volt_old, 0);
+ if (ret1)
+ dev_warn(cpu_dev,
+ "failed to restore vddarm voltage: %d\n", ret1);
return ret;
}
{},
};
+static const struct of_device_id *ti_cpufreq_match_node(void)
+{
+ struct device_node *np;
+ const struct of_device_id *match;
+
+ np = of_find_node_by_path("/");
+ match = of_match_node(ti_cpufreq_of_match, np);
+ of_node_put(np);
+
+ return match;
+}
+
static int ti_cpufreq_probe(struct platform_device *pdev)
{
u32 version[VERSION_COUNT];
- struct device_node *np;
const struct of_device_id *match;
struct opp_table *ti_opp_table;
struct ti_cpufreq_data *opp_data;
const char * const reg_names[] = {"vdd", "vbb"};
int ret;
- np = of_find_node_by_path("/");
- match = of_match_node(ti_cpufreq_of_match, np);
- of_node_put(np);
+ match = dev_get_platdata(&pdev->dev);
if (!match)
return -ENODEV;
static int ti_cpufreq_init(void)
{
- platform_device_register_simple("ti-cpufreq", -1, NULL, 0);
+ const struct of_device_id *match;
+
+ /* Check to ensure we are on a compatible platform */
+ match = ti_cpufreq_match_node();
+ if (match)
+ platform_device_register_data(NULL, "ti-cpufreq", -1, match,
+ sizeof(*match));
+
return 0;
}
module_init(ti_cpufreq_init);
{
int ret;
struct cpuidle_driver *drv;
- struct cpuidle_device *dev;
drv = kmemdup(&arm_idle_driver, sizeof(*drv), GFP_KERNEL);
if (!drv)
goto out_kfree_drv;
}
- ret = cpuidle_register_driver(drv);
- if (ret) {
- if (ret != -EBUSY)
- pr_err("Failed to register cpuidle driver\n");
- goto out_kfree_drv;
- }
-
/*
* Call arch CPU operations in order to initialize
* idle states suspend back-end specific data
ret = arm_cpuidle_init(cpu);
/*
- * Skip the cpuidle device initialization if the reported
+ * Allow the initialization to continue for other CPUs, if the reported
* failure is a HW misconfiguration/breakage (-ENXIO).
*/
- if (ret == -ENXIO)
- return 0;
-
if (ret) {
pr_err("CPU %d failed to init idle CPU ops\n", cpu);
- goto out_unregister_drv;
- }
-
- dev = kzalloc(sizeof(*dev), GFP_KERNEL);
- if (!dev) {
- ret = -ENOMEM;
- goto out_unregister_drv;
+ ret = ret == -ENXIO ? 0 : ret;
+ goto out_kfree_drv;
}
- dev->cpu = cpu;
- ret = cpuidle_register_device(dev);
- if (ret) {
- pr_err("Failed to register cpuidle device for CPU %d\n",
- cpu);
- goto out_kfree_dev;
- }
+ ret = cpuidle_register(drv, NULL);
+ if (ret)
+ goto out_kfree_drv;
return 0;
-out_kfree_dev:
- kfree(dev);
-out_unregister_drv:
- cpuidle_unregister_driver(drv);
out_kfree_drv:
kfree(drv);
return ret;
while (--cpu >= 0) {
dev = per_cpu(cpuidle_devices, cpu);
drv = cpuidle_get_cpu_driver(dev);
- cpuidle_unregister_device(dev);
- cpuidle_unregister_driver(drv);
- kfree(dev);
+ cpuidle_unregister(drv);
kfree(drv);
}
int *splits_in_nents;
int *splits_out_nents = NULL;
struct sec_request_el *el, *temp;
+ bool split = skreq->src != skreq->dst;
mutex_init(&sec_req->lock);
sec_req->req_base = &skreq->base;
if (ret)
goto err_free_split_sizes;
- if (skreq->src != skreq->dst) {
+ if (split) {
sec_req->len_out = sg_nents(skreq->dst);
ret = sec_map_and_split_sg(skreq->dst, split_sizes, steps,
&splits_out, &splits_out_nents,
split_sizes[i],
skreq->src != skreq->dst,
splits_in[i], splits_in_nents[i],
- splits_out[i],
- splits_out_nents[i], info);
+ split ? splits_out[i] : NULL,
+ split ? splits_out_nents[i] : 0,
+ info);
if (IS_ERR(el)) {
ret = PTR_ERR(el);
goto err_free_elements;
* more refined but this is unlikely to happen so no need.
*/
- /* Cleanup - all elements in pointer arrays have been coppied */
- kfree(splits_in_nents);
- kfree(splits_in);
- kfree(splits_out_nents);
- kfree(splits_out);
- kfree(split_sizes);
-
/* Grab a big lock for a long time to avoid concurrency issues */
mutex_lock(&queue->queuelock);
(!queue->havesoftqueue ||
kfifo_avail(&queue->softqueue) > steps)) ||
!list_empty(&ctx->backlog)) {
+ ret = -EBUSY;
if ((skreq->base.flags & CRYPTO_TFM_REQ_MAY_BACKLOG)) {
list_add_tail(&sec_req->backlog_head, &ctx->backlog);
mutex_unlock(&queue->queuelock);
- return -EBUSY;
+ goto out;
}
- ret = -EBUSY;
mutex_unlock(&queue->queuelock);
goto err_free_elements;
}
if (ret)
goto err_free_elements;
- return -EINPROGRESS;
+ ret = -EINPROGRESS;
+out:
+ /* Cleanup - all elements in pointer arrays have been copied */
+ kfree(splits_in_nents);
+ kfree(splits_in);
+ kfree(splits_out_nents);
+ kfree(splits_out);
+ kfree(split_sizes);
+ return ret;
err_free_elements:
list_for_each_entry_safe(el, temp, &sec_req->elements, head) {
crypto_skcipher_ivsize(atfm),
DMA_BIDIRECTIONAL);
err_unmap_out_sg:
- if (skreq->src != skreq->dst)
+ if (split)
sec_unmap_sg_on_err(skreq->dst, steps, splits_out,
splits_out_nents, sec_req->len_out,
info->dev);
exp_info.ops = &udmabuf_ops;
exp_info.size = ubuf->pagecount << PAGE_SHIFT;
exp_info.priv = ubuf;
+ exp_info.flags = O_RDWR;
buf = dma_buf_export(&exp_info);
if (IS_ERR(buf)) {
atchan->descs_allocated = 0;
atchan->status = 0;
+ /*
+ * Free atslave allocated in at_dma_xlate()
+ */
+ kfree(chan->private);
+ chan->private = NULL;
+
dev_vdbg(chan2dev(chan), "free_chan_resources: done\n");
}
dma_cap_zero(mask);
dma_cap_set(DMA_SLAVE, mask);
- atslave = devm_kzalloc(&dmac_pdev->dev, sizeof(*atslave), GFP_KERNEL);
+ atslave = kzalloc(sizeof(*atslave), GFP_KERNEL);
if (!atslave)
return NULL;
struct resource *io;
at_dma_off(atdma);
+ if (pdev->dev.of_node)
+ of_dma_controller_free(pdev->dev.of_node);
dma_async_device_unregister(&atdma->dma_common);
dma_pool_destroy(atdma->memset_pool);
/*
* Program FIFO size of channels.
*
- * By default full FIFO (1024 bytes) is assigned to channel 0. Here we
+ * By default full FIFO (512 bytes) is assigned to channel 0. Here we
* slice FIFO on equal parts between channels.
*/
static void idma32_fifo_partition(struct dw_dma *dw)
{
- u64 value = IDMA32C_FP_PSIZE_CH0(128) | IDMA32C_FP_PSIZE_CH1(128) |
+ u64 value = IDMA32C_FP_PSIZE_CH0(64) | IDMA32C_FP_PSIZE_CH1(64) |
IDMA32C_FP_UPDATE;
u64 fifo_partition = 0;
/* Fill FIFO_PARTITION high bits (Channels 2..3, 6..7) */
fifo_partition |= value << 32;
- /* Program FIFO Partition registers - 128 bytes for each channel */
+ /* Program FIFO Partition registers - 64 bytes per channel */
idma32_writeq(dw, FIFO_PARTITION1, fifo_partition);
idma32_writeq(dw, FIFO_PARTITION0, fifo_partition);
}
#include <linux/spinlock.h>
#include <linux/device.h>
#include <linux/dma-mapping.h>
-#include <linux/dmapool.h>
#include <linux/firmware.h>
#include <linux/slab.h>
#include <linux/platform_device.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
#include <linux/of_dma.h>
+#include <linux/workqueue.h>
#include <asm/irq.h>
#include <linux/platform_data/dma-imx-sdma.h>
u32 shp_addr, per_addr;
enum dma_status status;
struct imx_dma_data data;
- struct dma_pool *bd_pool;
+ struct work_struct terminate_worker;
};
#define IMX_DMA_SG_LOOP BIT(0)
return 0;
}
-
-static int sdma_disable_channel_with_delay(struct dma_chan *chan)
+static void sdma_channel_terminate_work(struct work_struct *work)
{
- struct sdma_channel *sdmac = to_sdma_chan(chan);
+ struct sdma_channel *sdmac = container_of(work, struct sdma_channel,
+ terminate_worker);
unsigned long flags;
LIST_HEAD(head);
- sdma_disable_channel(chan);
- spin_lock_irqsave(&sdmac->vc.lock, flags);
- vchan_get_all_descriptors(&sdmac->vc, &head);
- sdmac->desc = NULL;
- spin_unlock_irqrestore(&sdmac->vc.lock, flags);
- vchan_dma_desc_free_list(&sdmac->vc, &head);
-
/*
* According to NXP R&D team a delay of one BD SDMA cost time
* (maximum is 1ms) should be added after disable of the channel
* bit, to ensure SDMA core has really been stopped after SDMA
* clients call .device_terminate_all.
*/
- mdelay(1);
+ usleep_range(1000, 2000);
+
+ spin_lock_irqsave(&sdmac->vc.lock, flags);
+ vchan_get_all_descriptors(&sdmac->vc, &head);
+ sdmac->desc = NULL;
+ spin_unlock_irqrestore(&sdmac->vc.lock, flags);
+ vchan_dma_desc_free_list(&sdmac->vc, &head);
+}
+
+static int sdma_disable_channel_async(struct dma_chan *chan)
+{
+ struct sdma_channel *sdmac = to_sdma_chan(chan);
+
+ sdma_disable_channel(chan);
+
+ if (sdmac->desc)
+ schedule_work(&sdmac->terminate_worker);
return 0;
}
+static void sdma_channel_synchronize(struct dma_chan *chan)
+{
+ struct sdma_channel *sdmac = to_sdma_chan(chan);
+
+ vchan_synchronize(&sdmac->vc);
+
+ flush_work(&sdmac->terminate_worker);
+}
+
static void sdma_set_watermarklevel_for_p2p(struct sdma_channel *sdmac)
{
struct sdma_engine *sdma = sdmac->sdma;
static int sdma_alloc_bd(struct sdma_desc *desc)
{
+ u32 bd_size = desc->num_bd * sizeof(struct sdma_buffer_descriptor);
int ret = 0;
- desc->bd = dma_pool_alloc(desc->sdmac->bd_pool, GFP_NOWAIT,
- &desc->bd_phys);
+ desc->bd = dma_zalloc_coherent(NULL, bd_size, &desc->bd_phys,
+ GFP_NOWAIT);
if (!desc->bd) {
ret = -ENOMEM;
goto out;
static void sdma_free_bd(struct sdma_desc *desc)
{
- dma_pool_free(desc->sdmac->bd_pool, desc->bd, desc->bd_phys);
+ u32 bd_size = desc->num_bd * sizeof(struct sdma_buffer_descriptor);
+
+ dma_free_coherent(NULL, bd_size, desc->bd, desc->bd_phys);
}
static void sdma_desc_free(struct virt_dma_desc *vd)
if (ret)
goto disable_clk_ahb;
- sdmac->bd_pool = dma_pool_create("bd_pool", chan->device->dev,
- sizeof(struct sdma_buffer_descriptor),
- 32, 0);
-
return 0;
disable_clk_ahb:
struct sdma_channel *sdmac = to_sdma_chan(chan);
struct sdma_engine *sdma = sdmac->sdma;
- sdma_disable_channel_with_delay(chan);
+ sdma_disable_channel_async(chan);
+
+ sdma_channel_synchronize(chan);
if (sdmac->event_id0)
sdma_event_disable(sdmac, sdmac->event_id0);
clk_disable(sdma->clk_ipg);
clk_disable(sdma->clk_ahb);
-
- dma_pool_destroy(sdmac->bd_pool);
- sdmac->bd_pool = NULL;
}
static struct sdma_desc *sdma_transfer_init(struct sdma_channel *sdmac,
sdmac->channel = i;
sdmac->vc.desc_free = sdma_desc_free;
+ INIT_WORK(&sdmac->terminate_worker,
+ sdma_channel_terminate_work);
/*
* Add the channel to the DMAC list. Do not add channel 0 though
* because we need it internally in the SDMA driver. This also means
sdma->dma_device.device_prep_slave_sg = sdma_prep_slave_sg;
sdma->dma_device.device_prep_dma_cyclic = sdma_prep_dma_cyclic;
sdma->dma_device.device_config = sdma_config;
- sdma->dma_device.device_terminate_all = sdma_disable_channel_with_delay;
+ sdma->dma_device.device_terminate_all = sdma_disable_channel_async;
+ sdma->dma_device.device_synchronize = sdma_channel_synchronize;
sdma->dma_device.src_addr_widths = SDMA_DMA_BUSWIDTHS;
sdma->dma_device.dst_addr_widths = SDMA_DMA_BUSWIDTHS;
sdma->dma_device.directions = SDMA_DMA_DIRECTIONS;
desc_phys = lower_32_bits(c->desc_phys);
desc_num = (desc_phys - cdd->descs_phys) / sizeof(struct cppi41_desc);
- if (!cdd->chan_busy[desc_num])
+ if (!cdd->chan_busy[desc_num]) {
+ struct cppi41_channel *cc, *_ct;
+
+ /*
+ * channels might still be in the pendling list if
+ * cppi41_dma_issue_pending() is called after
+ * cppi41_runtime_suspend() is called
+ */
+ list_for_each_entry_safe(cc, _ct, &cdd->pending, node) {
+ if (cc != c)
+ continue;
+ list_del(&cc->node);
+ break;
+ }
return 0;
+ }
ret = cppi41_tear_down_chan(c);
if (ret)
depends on PCI && X86_64 && X86_MCE_INTEL && PCI_MMCONFIG
depends on ACPI_NFIT || !ACPI_NFIT # if ACPI_NFIT=m, EDAC_SKX can't be y
select DMI
+ select ACPI_ADXL if ACPI
help
Support for error detection and correction the Intel
Skylake server Integrated Memory Controllers. If your
#include <linux/bitmap.h>
#include <linux/math64.h>
#include <linux/mod_devicetable.h>
+#include <linux/adxl.h>
#include <acpi/nfit.h>
#include <asm/cpu_device_id.h>
#include <asm/intel-family.h>
#include "edac_module.h"
#define EDAC_MOD_STR "skx_edac"
+#define MSG_SIZE 1024
/*
* Debug macros
static LIST_HEAD(skx_edac_list);
static u64 skx_tolm, skx_tohm;
+static char *skx_msg;
+static unsigned int nvdimm_count;
+
+enum {
+ INDEX_SOCKET,
+ INDEX_MEMCTRL,
+ INDEX_CHANNEL,
+ INDEX_DIMM,
+ INDEX_MAX
+};
+
+static const char * const component_names[] = {
+ [INDEX_SOCKET] = "ProcessorSocketId",
+ [INDEX_MEMCTRL] = "MemoryControllerId",
+ [INDEX_CHANNEL] = "ChannelId",
+ [INDEX_DIMM] = "DimmSlotId",
+};
+
+static int component_indices[ARRAY_SIZE(component_names)];
+static int adxl_component_count;
+static const char * const *adxl_component_names;
+static u64 *adxl_values;
+static char *adxl_msg;
#define NUM_IMC 2 /* memory controllers per socket */
#define NUM_CHANNELS 3 /* channels per memory controller */
u16 flags;
u64 size = 0;
+ nvdimm_count++;
+
dev_handle = ACPI_NFIT_BUILD_DEVICE_HANDLE(dimmno, chan, imc->lmc,
imc->src_id, 0);
}
#endif /*CONFIG_EDAC_DEBUG*/
+static bool skx_adxl_decode(struct decoded_addr *res)
+
+{
+ int i, len = 0;
+
+ if (res->addr >= skx_tohm || (res->addr >= skx_tolm &&
+ res->addr < BIT_ULL(32))) {
+ edac_dbg(0, "Address 0x%llx out of range\n", res->addr);
+ return false;
+ }
+
+ if (adxl_decode(res->addr, adxl_values)) {
+ edac_dbg(0, "Failed to decode 0x%llx\n", res->addr);
+ return false;
+ }
+
+ res->socket = (int)adxl_values[component_indices[INDEX_SOCKET]];
+ res->imc = (int)adxl_values[component_indices[INDEX_MEMCTRL]];
+ res->channel = (int)adxl_values[component_indices[INDEX_CHANNEL]];
+ res->dimm = (int)adxl_values[component_indices[INDEX_DIMM]];
+
+ for (i = 0; i < adxl_component_count; i++) {
+ if (adxl_values[i] == ~0x0ull)
+ continue;
+
+ len += snprintf(adxl_msg + len, MSG_SIZE - len, " %s:0x%llx",
+ adxl_component_names[i], adxl_values[i]);
+ if (MSG_SIZE - len <= 0)
+ break;
+ }
+
+ return true;
+}
+
static void skx_mce_output_error(struct mem_ctl_info *mci,
const struct mce *m,
struct decoded_addr *res)
{
enum hw_event_mc_err_type tp_event;
- char *type, *optype, msg[256];
+ char *type, *optype;
bool ripv = GET_BITFIELD(m->mcgstatus, 0, 0);
bool overflow = GET_BITFIELD(m->status, 62, 62);
bool uncorrected_error = GET_BITFIELD(m->status, 61, 61);
break;
}
}
+ if (adxl_component_count) {
+ snprintf(skx_msg, MSG_SIZE, "%s%s err_code:%04x:%04x %s",
+ overflow ? " OVERFLOW" : "",
+ (uncorrected_error && recoverable) ? " recoverable" : "",
+ mscod, errcode, adxl_msg);
+ } else {
+ snprintf(skx_msg, MSG_SIZE,
+ "%s%s err_code:%04x:%04x socket:%d imc:%d rank:%d bg:%d ba:%d row:%x col:%x",
+ overflow ? " OVERFLOW" : "",
+ (uncorrected_error && recoverable) ? " recoverable" : "",
+ mscod, errcode,
+ res->socket, res->imc, res->rank,
+ res->bank_group, res->bank_address, res->row, res->column);
+ }
- snprintf(msg, sizeof(msg),
- "%s%s err_code:%04x:%04x socket:%d imc:%d rank:%d bg:%d ba:%d row:%x col:%x",
- overflow ? " OVERFLOW" : "",
- (uncorrected_error && recoverable) ? " recoverable" : "",
- mscod, errcode,
- res->socket, res->imc, res->rank,
- res->bank_group, res->bank_address, res->row, res->column);
-
- edac_dbg(0, "%s\n", msg);
+ edac_dbg(0, "%s\n", skx_msg);
/* Call the helper to output message */
edac_mc_handle_error(tp_event, mci, core_err_cnt,
m->addr >> PAGE_SHIFT, m->addr & ~PAGE_MASK, 0,
res->channel, res->dimm, -1,
- optype, msg);
+ optype, skx_msg);
+}
+
+static struct mem_ctl_info *get_mci(int src_id, int lmc)
+{
+ struct skx_dev *d;
+
+ if (lmc > NUM_IMC - 1) {
+ skx_printk(KERN_ERR, "Bad lmc %d\n", lmc);
+ return NULL;
+ }
+
+ list_for_each_entry(d, &skx_edac_list, list) {
+ if (d->imc[0].src_id == src_id)
+ return d->imc[lmc].mci;
+ }
+
+ skx_printk(KERN_ERR, "No mci for src_id %d lmc %d\n", src_id, lmc);
+
+ return NULL;
}
static int skx_mce_check_error(struct notifier_block *nb, unsigned long val,
if ((mce->status & 0xefff) >> 7 != 1 || !(mce->status & MCI_STATUS_ADDRV))
return NOTIFY_DONE;
+ memset(&res, 0, sizeof(res));
res.addr = mce->addr;
- if (!skx_decode(&res))
+
+ if (adxl_component_count) {
+ if (!skx_adxl_decode(&res))
+ return NOTIFY_DONE;
+
+ mci = get_mci(res.socket, res.imc);
+ } else {
+ if (!skx_decode(&res))
+ return NOTIFY_DONE;
+
+ mci = res.dev->imc[res.imc].mci;
+ }
+
+ if (!mci)
return NOTIFY_DONE;
- mci = res.dev->imc[res.imc].mci;
if (mce->mcgstatus & MCG_STATUS_MCIP)
type = "Exception";
}
}
+static void __init skx_adxl_get(void)
+{
+ const char * const *names;
+ int i, j;
+
+ names = adxl_get_component_names();
+ if (!names) {
+ skx_printk(KERN_NOTICE, "No firmware support for address translation.");
+ skx_printk(KERN_CONT, " Only decoding DDR4 address!\n");
+ return;
+ }
+
+ for (i = 0; i < INDEX_MAX; i++) {
+ for (j = 0; names[j]; j++) {
+ if (!strcmp(component_names[i], names[j])) {
+ component_indices[i] = j;
+ break;
+ }
+ }
+
+ if (!names[j])
+ goto err;
+ }
+
+ adxl_component_names = names;
+ while (*names++)
+ adxl_component_count++;
+
+ adxl_values = kcalloc(adxl_component_count, sizeof(*adxl_values),
+ GFP_KERNEL);
+ if (!adxl_values) {
+ adxl_component_count = 0;
+ return;
+ }
+
+ adxl_msg = kzalloc(MSG_SIZE, GFP_KERNEL);
+ if (!adxl_msg) {
+ adxl_component_count = 0;
+ kfree(adxl_values);
+ }
+
+ return;
+err:
+ skx_printk(KERN_ERR, "'%s' is not matched from DSM parameters: ",
+ component_names[i]);
+ for (j = 0; names[j]; j++)
+ skx_printk(KERN_CONT, "%s ", names[j]);
+ skx_printk(KERN_CONT, "\n");
+}
+
+static void __exit skx_adxl_put(void)
+{
+ kfree(adxl_values);
+ kfree(adxl_msg);
+}
+
/*
* skx_init:
* make sure we are running on the correct cpu model
}
}
+ skx_msg = kzalloc(MSG_SIZE, GFP_KERNEL);
+ if (!skx_msg) {
+ rc = -ENOMEM;
+ goto fail;
+ }
+
+ if (nvdimm_count)
+ skx_adxl_get();
+
/* Ensure that the OPSTATE is set correctly for POLL or NMI */
opstate_init();
edac_dbg(2, "\n");
mce_unregister_decode_chain(&skx_mce_dec);
skx_remove();
+ if (nvdimm_count)
+ skx_adxl_put();
+ kfree(skx_msg);
teardown_skx_debug();
}
(params.mmap & ~PAGE_MASK)));
init_screen_info();
+
+ /* ARM does not permit early mappings to persist across paging_init() */
+ if (IS_ENABLED(CONFIG_ARM))
+ efi_memmap_unmap();
}
static int __init register_gop_device(void)
{
u64 mapsize;
- if (!efi_enabled(EFI_BOOT) || !efi_enabled(EFI_MEMMAP)) {
+ if (!efi_enabled(EFI_BOOT)) {
pr_info("EFI services will not be available.\n");
return 0;
}
early_memunmap(tbl, sizeof(*tbl));
}
+ return 0;
+}
+int __init efi_apply_persistent_mem_reservations(void)
+{
if (efi.mem_reserve != EFI_INVALID_TABLE_ADDR) {
unsigned long prsv = efi.mem_reserve;
}
static DEFINE_SPINLOCK(efi_mem_reserve_persistent_lock);
+static struct linux_efi_memreserve *efi_memreserve_root __ro_after_init;
-int efi_mem_reserve_persistent(phys_addr_t addr, u64 size)
+static int __init efi_memreserve_map_root(void)
{
- struct linux_efi_memreserve *rsv, *parent;
-
if (efi.mem_reserve == EFI_INVALID_TABLE_ADDR)
return -ENODEV;
- rsv = kmalloc(sizeof(*rsv), GFP_KERNEL);
- if (!rsv)
+ efi_memreserve_root = memremap(efi.mem_reserve,
+ sizeof(*efi_memreserve_root),
+ MEMREMAP_WB);
+ if (WARN_ON_ONCE(!efi_memreserve_root))
return -ENOMEM;
+ return 0;
+}
- parent = memremap(efi.mem_reserve, sizeof(*rsv), MEMREMAP_WB);
- if (!parent) {
- kfree(rsv);
- return -ENOMEM;
+int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size)
+{
+ struct linux_efi_memreserve *rsv;
+ int rc;
+
+ if (efi_memreserve_root == (void *)ULONG_MAX)
+ return -ENODEV;
+
+ if (!efi_memreserve_root) {
+ rc = efi_memreserve_map_root();
+ if (rc)
+ return rc;
}
+ rsv = kmalloc(sizeof(*rsv), GFP_ATOMIC);
+ if (!rsv)
+ return -ENOMEM;
+
rsv->base = addr;
rsv->size = size;
spin_lock(&efi_mem_reserve_persistent_lock);
- rsv->next = parent->next;
- parent->next = __pa(rsv);
+ rsv->next = efi_memreserve_root->next;
+ efi_memreserve_root->next = __pa(rsv);
spin_unlock(&efi_mem_reserve_persistent_lock);
- memunmap(parent);
+ return 0;
+}
+static int __init efi_memreserve_root_init(void)
+{
+ if (efi_memreserve_root)
+ return 0;
+ if (efi_memreserve_map_root())
+ efi_memreserve_root = (void *)ULONG_MAX;
return 0;
}
+early_initcall(efi_memreserve_root_init);
#ifdef CONFIG_KEXEC
static int update_efi_random_seed(struct notifier_block *nb,
return 0;
}
-static inline bool is_compat(void)
-{
- if (IS_ENABLED(CONFIG_COMPAT) && in_compat_syscall())
- return true;
-
- return false;
-}
-
static void
copy_out_compat(struct efi_variable *dst, struct compat_efi_variable *src)
{
u8 *data;
int err;
- if (is_compat()) {
+ if (in_compat_syscall()) {
struct compat_efi_variable *compat;
if (count != sizeof(*compat))
&entry->var.DataSize, entry->var.Data))
return -EIO;
- if (is_compat()) {
+ if (in_compat_syscall()) {
compat = (struct compat_efi_variable *)buf;
size = sizeof(*compat);
struct compat_efi_variable *compat = (struct compat_efi_variable *)buf;
struct efi_variable *new_var = (struct efi_variable *)buf;
struct efivar_entry *new_entry;
- bool need_compat = is_compat();
+ bool need_compat = in_compat_syscall();
efi_char16_t *name;
unsigned long size;
u32 attributes;
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
- if (is_compat()) {
+ if (in_compat_syscall()) {
if (count != sizeof(*compat))
return -EINVAL;
efi_guid_t memreserve_table_guid = LINUX_EFI_MEMRESERVE_TABLE_GUID;
efi_status_t status;
+ if (IS_ENABLED(CONFIG_ARM))
+ return;
+
status = efi_call_early(allocate_pool, EFI_LOADER_DATA, sizeof(*rsv),
(void **)&rsv);
if (status != EFI_SUCCESS) {
return efi_status;
}
}
+
+ /* shrink the FDT back to its minimum size */
+ fdt_pack(fdt);
+
return EFI_SUCCESS;
fdt_set_fail:
void __init efi_memmap_unmap(void)
{
+ if (!efi_enabled(EFI_MEMMAP))
+ return;
+
if (!efi.memmap.late) {
unsigned long size;
} \
\
init_completion(&efi_rts_work.efi_rts_comp); \
- INIT_WORK_ONSTACK(&efi_rts_work.work, efi_call_rts); \
+ INIT_WORK(&efi_rts_work.work, efi_call_rts); \
efi_rts_work.arg1 = _arg1; \
efi_rts_work.arg2 = _arg2; \
efi_rts_work.arg3 = _arg3; \
tristate "FSI master based on Aspeed ColdFire coprocessor"
depends on GPIOLIB
depends on GPIO_ASPEED
+ select GENERIC_ALLOCATOR
---help---
This option enables a FSI master using the AST2400 and AST2500 GPIO
lines driven by the internal ColdFire coprocessor. This requires
}
ffdc_iov.iov_base = ffdc;
ffdc_iov.iov_len = SBEFIFO_MAX_FFDC_SIZE;
- iov_iter_kvec(&ffdc_iter, WRITE | ITER_KVEC, &ffdc_iov, 1, SBEFIFO_MAX_FFDC_SIZE);
+ iov_iter_kvec(&ffdc_iter, WRITE, &ffdc_iov, 1, SBEFIFO_MAX_FFDC_SIZE);
cmd[0] = cpu_to_be32(2);
cmd[1] = cpu_to_be32(SBEFIFO_CMD_GET_SBE_FFDC);
rc = sbefifo_do_command(sbefifo, cmd, 2, &ffdc_iter);
rbytes = (*resp_len) * sizeof(__be32);
resp_iov.iov_base = response;
resp_iov.iov_len = rbytes;
- iov_iter_kvec(&resp_iter, WRITE | ITER_KVEC, &resp_iov, 1, rbytes);
+ iov_iter_kvec(&resp_iter, WRITE, &resp_iov, 1, rbytes);
/* Perform the command */
mutex_lock(&sbefifo->lock);
#include <linux/fs.h>
#include <linux/uaccess.h>
#include <linux/slab.h>
-#include <linux/cdev.h>
#include <linux/list.h>
#include <uapi/linux/fsi.h>
#include <linux/of.h>
#include <linux/pm.h>
#include <linux/pm_runtime.h>
+#include <linux/sched.h>
#include <linux/serdev.h>
#include <linux/slab.h>
int ret;
/* write is only buffered synchronously */
- ret = serdev_device_write(serdev, buf, count, 0);
+ ret = serdev_device_write(serdev, buf, count, MAX_SCHEDULE_TIMEOUT);
if (ret < 0)
return ret;
#include <linux/pm.h>
#include <linux/pm_runtime.h>
#include <linux/regulator/consumer.h>
+#include <linux/sched.h>
#include <linux/serdev.h>
#include <linux/slab.h>
#include <linux/wait.h>
int ret;
/* write is only buffered synchronously */
- ret = serdev_device_write(serdev, buf, count, 0);
+ ret = serdev_device_write(serdev, buf, count, MAX_SCHEDULE_TIMEOUT);
if (ret < 0)
return ret;
else
timeout = SIRF_HIBERNATE_TIMEOUT;
- while (retries-- > 0) {
+ do {
sirf_pulse_on_off(data);
ret = sirf_wait_for_power_state(data, active, timeout);
if (ret < 0) {
}
break;
- }
+ } while (retries--);
- if (retries == 0)
+ if (retries < 0)
return -ETIMEDOUT;
return 0;
chips->chip.set = davinci_gpio_set;
chips->chip.ngpio = ngpio;
- chips->chip.base = -1;
+ chips->chip.base = pdata->no_auto_base ? pdata->base : -1;
#ifdef CONFIG_OF_GPIO
chips->chip.of_gpio_n_cells = 2;
#define gpio_mockup_err(...) pr_err(GPIO_MOCKUP_NAME ": " __VA_ARGS__)
enum {
- GPIO_MOCKUP_DIR_OUT = 0,
- GPIO_MOCKUP_DIR_IN = 1,
+ GPIO_MOCKUP_DIR_IN = 0,
+ GPIO_MOCKUP_DIR_OUT = 1,
};
/*
{
struct gpio_mockup_chip *chip = gpiochip_get_data(gc);
- return chip->lines[offset].dir;
+ return !chip->lines[offset].dir;
}
static int gpio_mockup_to_irq(struct gpio_chip *gc, unsigned int offset)
if (pxa_gpio_has_pinctrl()) {
ret = pinctrl_gpio_direction_input(chip->base + offset);
- if (!ret)
- return 0;
+ if (ret)
+ return ret;
}
spin_lock_irqsave(&gpio_lock, flags);
gdev->descs = kcalloc(chip->ngpio, sizeof(gdev->descs[0]), GFP_KERNEL);
if (!gdev->descs) {
status = -ENOMEM;
- goto err_free_gdev;
+ goto err_free_ida;
}
if (chip->ngpio == 0) {
kfree_const(gdev->label);
err_free_descs:
kfree(gdev->descs);
-err_free_gdev:
+err_free_ida:
ida_simple_remove(&gpio_ida, gdev->id);
+err_free_gdev:
/* failures here can mean systems won't boot... */
pr_err("%s: GPIOs %d..%d (%s) failed to register, %d\n", __func__,
gdev->base, gdev->base + gdev->ngpio - 1,
extern int amdgpu_gpu_recovery;
extern int amdgpu_emu_mode;
extern uint amdgpu_smu_memory_pool_size;
+extern uint amdgpu_dc_feature_mask;
extern struct amdgpu_mgpu_info mgpu_info;
#ifdef CONFIG_DRM_AMDGPU_SI
#define MAX_KIQ_REG_WAIT 5000 /* in usecs, 5ms */
#define MAX_KIQ_REG_BAILOUT_INTERVAL 5 /* in msecs, 5ms */
-#define MAX_KIQ_REG_TRY 20
+#define MAX_KIQ_REG_TRY 80 /* 20 -> 80 */
int amdgpu_device_ip_set_clockgating_state(void *dev,
enum amd_ip_block_type block_type,
* 2. power off the acp tiles
* 3. check and enter ulv state
*/
- if (adev->powerplay.pp_funcs->set_powergating_by_smu)
+ if (adev->powerplay.pp_funcs &&
+ adev->powerplay.pp_funcs->set_powergating_by_smu)
amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, true);
}
return 0;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
bool enable = state == AMD_PG_STATE_GATE ? true : false;
- if (adev->powerplay.pp_funcs->set_powergating_by_smu)
+ if (adev->powerplay.pp_funcs &&
+ adev->powerplay.pp_funcs->set_powergating_by_smu)
amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_ACP, enable);
return 0;
{
struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
- amdgpu_dpm_switch_power_profile(adev,
- PP_SMC_POWER_PROFILE_COMPUTE, !idle);
+ if (adev->powerplay.pp_funcs &&
+ adev->powerplay.pp_funcs->switch_power_profile)
+ amdgpu_dpm_switch_power_profile(adev,
+ PP_SMC_POWER_PROFILE_COMPUTE,
+ !idle);
}
bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)
[AMDGPU_HW_IP_UVD_ENC] = 1,
[AMDGPU_HW_IP_VCN_DEC] = 1,
[AMDGPU_HW_IP_VCN_ENC] = 1,
+ [AMDGPU_HW_IP_VCN_JPEG] = 1,
};
static int amdgput_ctx_total_num_entities(void)
}
adev->powerplay.pp_feature = amdgpu_pp_feature_mask;
- if (amdgpu_sriov_vf(adev))
- adev->powerplay.pp_feature &= ~PP_GFXOFF_MASK;
for (i = 0; i < adev->num_ip_blocks; i++) {
if ((amdgpu_ip_block_mask & (1 << i)) == 0) {
}
}
- if (adev->powerplay.pp_funcs->load_firmware) {
+ if (adev->powerplay.pp_funcs && adev->powerplay.pp_funcs->load_firmware) {
r = adev->powerplay.pp_funcs->load_firmware(adev->powerplay.pp_handle);
if (r) {
pr_err("firmware loading failed\n");
kthread_park(ring->sched.thread);
- if (job && job->base.sched == &ring->sched)
+ if (job && job->base.sched != &ring->sched)
continue;
drm_sched_hw_job_reset(&ring->sched, job ? &job->base : NULL);
"dither",
amdgpu_dither_enum_list, sz);
+ if (amdgpu_device_has_dc_support(adev)) {
+ adev->mode_info.max_bpc_property =
+ drm_property_create_range(adev->ddev, 0, "max bpc", 8, 16);
+ if (!adev->mode_info.max_bpc_property)
+ return -ENOMEM;
+ }
+
return 0;
}
uint amdgpu_sdma_phase_quantum = 32;
char *amdgpu_disable_cu = NULL;
char *amdgpu_virtual_display = NULL;
-/* OverDrive(bit 14) disabled by default*/
-uint amdgpu_pp_feature_mask = 0xffffbfff;
+/* OverDrive(bit 14),gfxoff(bit 15),stutter mode(bit 17) disabled by default*/
+uint amdgpu_pp_feature_mask = 0xfffd3fff;
int amdgpu_ngg = 0;
int amdgpu_prim_buf_per_se = 0;
int amdgpu_pos_buf_per_se = 0;
int amdgpu_gpu_recovery = -1; /* auto */
int amdgpu_emu_mode = 0;
uint amdgpu_smu_memory_pool_size = 0;
+/* FBC (bit 0) disabled by default*/
+uint amdgpu_dc_feature_mask = 0;
+
struct amdgpu_mgpu_info mgpu_info = {
.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
};
MODULE_PARM_DESC(halt_if_hws_hang, "Halt if HWS hang is detected (0 = off (default), 1 = on)");
#endif
+/**
+ * DOC: dcfeaturemask (uint)
+ * Override display features enabled. See enum DC_FEATURE_MASK in drivers/gpu/drm/amd/include/amd_shared.h.
+ * The default is the current set of stable display features.
+ */
+MODULE_PARM_DESC(dcfeaturemask, "all stable DC features enabled (default))");
+module_param_named(dcfeaturemask, amdgpu_dc_feature_mask, uint, 0444);
+
static const struct pci_device_id pciidlist[] = {
#ifdef CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
if (!(adev->powerplay.pp_feature & PP_GFXOFF_MASK))
return;
- if (!adev->powerplay.pp_funcs->set_powergating_by_smu)
+ if (!adev->powerplay.pp_funcs || !adev->powerplay.pp_funcs->set_powergating_by_smu)
return;
if (!info->return_size || !info->return_pointer)
return -EINVAL;
- /* Ensure IB tests are run on ring */
- flush_delayed_work(&adev->late_init_work);
-
switch (info->query) {
case AMDGPU_INFO_ACCEL_WORKING:
ui32 = adev->accel_working;
struct amdgpu_fpriv *fpriv;
int r, pasid;
+ /* Ensure IB tests are run on ring */
+ flush_delayed_work(&adev->late_init_work);
+
file_priv->driver_priv = NULL;
r = pm_runtime_get_sync(dev->dev);
struct drm_property *audio_property;
/* FMT dithering */
struct drm_property *dither_property;
+ /* maximum number of bits per channel for monitor color */
+ struct drm_property *max_bpc_property;
/* hardcoded DFP edid from BIOS */
struct edid *bios_hardcoded_edid;
int bios_hardcoded_edid_size;
return ret;
if (adev->powerplay.pp_funcs->force_clock_level)
- amdgpu_dpm_force_clock_level(adev, PP_SCLK, mask);
+ ret = amdgpu_dpm_force_clock_level(adev, PP_SCLK, mask);
+
+ if (ret)
+ return -EINVAL;
return count;
}
return ret;
if (adev->powerplay.pp_funcs->force_clock_level)
- amdgpu_dpm_force_clock_level(adev, PP_MCLK, mask);
+ ret = amdgpu_dpm_force_clock_level(adev, PP_MCLK, mask);
+
+ if (ret)
+ return -EINVAL;
return count;
}
return ret;
if (adev->powerplay.pp_funcs->force_clock_level)
- amdgpu_dpm_force_clock_level(adev, PP_PCIE, mask);
+ ret = amdgpu_dpm_force_clock_level(adev, PP_PCIE, mask);
+
+ if (ret)
+ return -EINVAL;
return count;
}
if (level == adev->vm_manager.root_level)
/* For the root directory */
- return round_up(adev->vm_manager.max_pfn, 1 << shift) >> shift;
+ return round_up(adev->vm_manager.max_pfn, 1ULL << shift) >> shift;
else if (level != AMDGPU_VM_PTB)
/* Everything in between */
return 512;
struct amdgpu_vm_pt_cursor *cursor)
{
amdgpu_vm_pt_next(adev, cursor);
- while (amdgpu_vm_pt_descendant(adev, cursor));
+ if (cursor->pfn != ~0ll)
+ while (amdgpu_vm_pt_descendant(adev, cursor));
}
/**
continue;
}
- /* First check if the entry is already handled */
- if (cursor.pfn < frag_start) {
- cursor.entry->huge = true;
- amdgpu_vm_pt_next(adev, &cursor);
- continue;
- }
-
/* If it isn't already handled it can't be a huge page */
if (cursor.entry->huge) {
/* Add the entry to the relocated list to update it. */
if (!amdgpu_vm_pt_descendant(adev, &cursor))
return -ENOENT;
continue;
- } else if (frag >= parent_shift) {
+ } else if (frag >= parent_shift &&
+ cursor.level - 1 != adev->vm_manager.root_level) {
/* If the fragment size is even larger than the parent
- * shift we should go up one level and check it again.
+ * shift we should go up one level and check it again
+ * unless one level up is the root level.
*/
if (!amdgpu_vm_pt_ancestor(&cursor))
return -ENOENT;
}
/* Looks good so far, calculate parameters for the update */
- incr = AMDGPU_GPU_PAGE_SIZE << shift;
+ incr = (uint64_t)AMDGPU_GPU_PAGE_SIZE << shift;
mask = amdgpu_vm_entries_mask(adev, cursor.level);
pe_start = ((cursor.pfn >> shift) & mask) * 8;
- entry_end = (mask + 1) << shift;
+ entry_end = (uint64_t)(mask + 1) << shift;
entry_end += cursor.pfn & ~(entry_end - 1);
entry_end = min(entry_end, end);
flags | AMDGPU_PTE_FRAG(frag));
pe_start += nptes * 8;
- dst += nptes * AMDGPU_GPU_PAGE_SIZE << shift;
+ dst += (uint64_t)nptes * AMDGPU_GPU_PAGE_SIZE << shift;
frag_start = upd_end;
if (frag_start >= frag_end) {
}
} while (frag_start < entry_end);
- if (frag >= shift)
+ if (amdgpu_vm_pt_descendant(adev, &cursor)) {
+ /* Mark all child entries as huge */
+ while (cursor.pfn < frag_start) {
+ cursor.entry->huge = true;
+ amdgpu_vm_pt_next(adev, &cursor);
+ }
+
+ } else if (frag >= shift) {
+ /* or just move on to the next on the same level. */
amdgpu_vm_pt_next(adev, &cursor);
+ }
}
return 0;
}
rbtree_postorder_for_each_entry_safe(mapping, tmp,
&vm->va.rb_root, rb) {
+ /* Don't remove the mapping here, we don't want to trigger a
+ * rebalance and the tree is about to be destroyed anyway.
+ */
list_del(&mapping->list);
- amdgpu_vm_it_remove(mapping, &vm->va);
kfree(mapping);
}
list_for_each_entry_safe(mapping, tmp, &vm->freed, list) {
if (r)
goto done;
- /* Test KCQs */
- for (i = 0; i < adev->gfx.num_compute_rings; i++) {
+ /* Test KCQs - reversing the order of rings seems to fix ring test failure
+ * after GPU reset
+ */
+ for (i = adev->gfx.num_compute_rings - 1; i >= 0; i--) {
ring = &adev->gfx.compute_ring[i];
ring->ready = true;
r = amdgpu_ring_test_ring(ring);
#endif
WREG32_FIELD15(GC, 0, RLC_CNTL, RLC_ENABLE_F32, 1);
+ udelay(50);
/* carrizo do enable cp interrupt after cp inited */
- if (!(adev->flags & AMD_IS_APU))
+ if (!(adev->flags & AMD_IS_APU)) {
gfx_v9_0_enable_gui_idle_interrupt(adev, true);
-
- udelay(50);
+ udelay(50);
+ }
#ifdef AMDGPU_RLC_DEBUG_RETRY
/* RLC_GPM_GENERAL_6 : RLC Ucode version */
/* Program the system aperture low logical page number. */
WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_LOW_ADDR,
- min(adev->gmc.vram_start, adev->gmc.agp_start) >> 18);
+ min(adev->gmc.fb_start, adev->gmc.agp_start) >> 18);
if (adev->asic_type == CHIP_RAVEN && adev->rev_id >= 0x8)
/*
* to get rid of the VM fault and hardware hang.
*/
WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
- max((adev->gmc.vram_end >> 18) + 0x1,
+ max((adev->gmc.fb_end >> 18) + 0x1,
adev->gmc.agp_end >> 18));
else
WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
- max(adev->gmc.vram_end, adev->gmc.agp_end) >> 18);
+ max(adev->gmc.fb_end, adev->gmc.agp_end) >> 18);
/* Set default page address. */
value = adev->vram_scratch.gpu_addr - adev->gmc.vram_start
MODULE_FIRMWARE("amdgpu/pitcairn_mc.bin");
MODULE_FIRMWARE("amdgpu/verde_mc.bin");
MODULE_FIRMWARE("amdgpu/oland_mc.bin");
+MODULE_FIRMWARE("amdgpu/hainan_mc.bin");
MODULE_FIRMWARE("amdgpu/si58_mc.bin");
#define MC_SEQ_MISC0__MT__MASK 0xf0000000
MODULE_FIRMWARE("amdgpu/polaris11_mc.bin");
MODULE_FIRMWARE("amdgpu/polaris10_mc.bin");
MODULE_FIRMWARE("amdgpu/polaris12_mc.bin");
+MODULE_FIRMWARE("amdgpu/polaris11_k_mc.bin");
+MODULE_FIRMWARE("amdgpu/polaris10_k_mc.bin");
+MODULE_FIRMWARE("amdgpu/polaris12_k_mc.bin");
static const u32 golden_settings_tonga_a11[] =
{
chip_name = "tonga";
break;
case CHIP_POLARIS11:
- chip_name = "polaris11";
+ if (((adev->pdev->device == 0x67ef) &&
+ ((adev->pdev->revision == 0xe0) ||
+ (adev->pdev->revision == 0xe5))) ||
+ ((adev->pdev->device == 0x67ff) &&
+ ((adev->pdev->revision == 0xcf) ||
+ (adev->pdev->revision == 0xef) ||
+ (adev->pdev->revision == 0xff))))
+ chip_name = "polaris11_k";
+ else if ((adev->pdev->device == 0x67ef) &&
+ (adev->pdev->revision == 0xe2))
+ chip_name = "polaris11_k";
+ else
+ chip_name = "polaris11";
break;
case CHIP_POLARIS10:
- chip_name = "polaris10";
+ if ((adev->pdev->device == 0x67df) &&
+ ((adev->pdev->revision == 0xe1) ||
+ (adev->pdev->revision == 0xf7)))
+ chip_name = "polaris10_k";
+ else
+ chip_name = "polaris10";
break;
case CHIP_POLARIS12:
- chip_name = "polaris12";
+ if (((adev->pdev->device == 0x6987) &&
+ ((adev->pdev->revision == 0xc0) ||
+ (adev->pdev->revision == 0xc3))) ||
+ ((adev->pdev->device == 0x6981) &&
+ ((adev->pdev->revision == 0x00) ||
+ (adev->pdev->revision == 0x01) ||
+ (adev->pdev->revision == 0x10))))
+ chip_name = "polaris12_k";
+ else
+ chip_name = "polaris12";
break;
case CHIP_FIJI:
case CHIP_CARRIZO:
const struct mc_firmware_header_v1_0 *hdr;
const __le32 *fw_data = NULL;
const __le32 *io_mc_regs = NULL;
- u32 data, vbios_version;
+ u32 data;
int i, ucode_size, regs_size;
/* Skip MC ucode loading on SR-IOV capable boards.
if (amdgpu_sriov_bios(adev))
return 0;
- WREG32(mmMC_SEQ_IO_DEBUG_INDEX, 0x9F);
- data = RREG32(mmMC_SEQ_IO_DEBUG_DATA);
- vbios_version = data & 0xf;
-
- if (vbios_version == 0)
- return 0;
-
if (!adev->gmc.fw)
return -EINVAL;
/* Program the system aperture low logical page number. */
WREG32_SOC15(MMHUB, 0, mmMC_VM_SYSTEM_APERTURE_LOW_ADDR,
- min(adev->gmc.vram_start, adev->gmc.agp_start) >> 18);
+ min(adev->gmc.fb_start, adev->gmc.agp_start) >> 18);
if (adev->asic_type == CHIP_RAVEN && adev->rev_id >= 0x8)
/*
* to get rid of the VM fault and hardware hang.
*/
WREG32_SOC15(MMHUB, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
- max((adev->gmc.vram_end >> 18) + 0x1,
+ max((adev->gmc.fb_end >> 18) + 0x1,
adev->gmc.agp_end >> 18));
else
WREG32_SOC15(MMHUB, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
- max(adev->gmc.vram_end, adev->gmc.agp_end) >> 18);
+ max(adev->gmc.fb_end, adev->gmc.agp_end) >> 18);
/* Set default page address. */
value = adev->vram_scratch.gpu_addr - adev->gmc.vram_start +
return;
if (enable && adev->pg_flags & AMD_PG_SUPPORT_MMHUB) {
- if (adev->powerplay.pp_funcs->set_powergating_by_smu)
+ if (adev->powerplay.pp_funcs && adev->powerplay.pp_funcs->set_powergating_by_smu)
amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GMC, true);
}
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
- if (adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs->set_powergating_by_smu)
+ if (adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs &&
+ adev->powerplay.pp_funcs->set_powergating_by_smu)
amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_SDMA, false);
sdma_v4_0_init_golden_registers(adev);
sdma_v4_0_ctx_switch_enable(adev, false);
sdma_v4_0_enable(adev, false);
- if (adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs->set_powergating_by_smu)
+ if (adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs
+ && adev->powerplay.pp_funcs->set_powergating_by_smu)
amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_SDMA, true);
return 0;
#define mmMP0_MISC_LIGHT_SLEEP_CTRL 0x01ba
#define mmMP0_MISC_LIGHT_SLEEP_CTRL_BASE_IDX 0
+/* for Vega20 register name change */
+#define mmHDP_MEM_POWER_CTRL 0x00d4
+#define HDP_MEM_POWER_CTRL__IPH_MEM_POWER_CTRL_EN_MASK 0x00000001L
+#define HDP_MEM_POWER_CTRL__IPH_MEM_POWER_LS_EN_MASK 0x00000002L
+#define HDP_MEM_POWER_CTRL__RC_MEM_POWER_CTRL_EN_MASK 0x00010000L
+#define HDP_MEM_POWER_CTRL__RC_MEM_POWER_LS_EN_MASK 0x00020000L
+#define mmHDP_MEM_POWER_CTRL_BASE_IDX 0
/*
* Indirect registers accessor
*/
{
uint32_t def, data;
- def = data = RREG32(SOC15_REG_OFFSET(HDP, 0, mmHDP_MEM_POWER_LS));
+ if (adev->asic_type == CHIP_VEGA20) {
+ def = data = RREG32(SOC15_REG_OFFSET(HDP, 0, mmHDP_MEM_POWER_CTRL));
- if (enable && (adev->cg_flags & AMD_CG_SUPPORT_HDP_LS))
- data |= HDP_MEM_POWER_LS__LS_ENABLE_MASK;
- else
- data &= ~HDP_MEM_POWER_LS__LS_ENABLE_MASK;
+ if (enable && (adev->cg_flags & AMD_CG_SUPPORT_HDP_LS))
+ data |= HDP_MEM_POWER_CTRL__IPH_MEM_POWER_CTRL_EN_MASK |
+ HDP_MEM_POWER_CTRL__IPH_MEM_POWER_LS_EN_MASK |
+ HDP_MEM_POWER_CTRL__RC_MEM_POWER_CTRL_EN_MASK |
+ HDP_MEM_POWER_CTRL__RC_MEM_POWER_LS_EN_MASK;
+ else
+ data &= ~(HDP_MEM_POWER_CTRL__IPH_MEM_POWER_CTRL_EN_MASK |
+ HDP_MEM_POWER_CTRL__IPH_MEM_POWER_LS_EN_MASK |
+ HDP_MEM_POWER_CTRL__RC_MEM_POWER_CTRL_EN_MASK |
+ HDP_MEM_POWER_CTRL__RC_MEM_POWER_LS_EN_MASK);
- if (def != data)
- WREG32(SOC15_REG_OFFSET(HDP, 0, mmHDP_MEM_POWER_LS), data);
+ if (def != data)
+ WREG32(SOC15_REG_OFFSET(HDP, 0, mmHDP_MEM_POWER_CTRL), data);
+ } else {
+ def = data = RREG32(SOC15_REG_OFFSET(HDP, 0, mmHDP_MEM_POWER_LS));
+
+ if (enable && (adev->cg_flags & AMD_CG_SUPPORT_HDP_LS))
+ data |= HDP_MEM_POWER_LS__LS_ENABLE_MASK;
+ else
+ data &= ~HDP_MEM_POWER_LS__LS_ENABLE_MASK;
+
+ if (def != data)
+ WREG32(SOC15_REG_OFFSET(HDP, 0, mmHDP_MEM_POWER_LS), data);
+ }
}
static void soc15_update_drm_clock_gating(struct amdgpu_device *adev, bool enable)
static void vcn_v1_0_set_jpeg_ring_funcs(struct amdgpu_device *adev);
static void vcn_v1_0_set_irq_funcs(struct amdgpu_device *adev);
static void vcn_v1_0_jpeg_ring_set_patch_ring(struct amdgpu_ring *ring, uint32_t ptr);
+static int vcn_v1_0_set_powergating_state(void *handle, enum amd_powergating_state state);
/**
* vcn_v1_0_early_init - set function pointers
struct amdgpu_ring *ring = &adev->vcn.ring_dec;
if (RREG32_SOC15(VCN, 0, mmUVD_STATUS))
- vcn_v1_0_stop(adev);
+ vcn_v1_0_set_powergating_state(adev, AMD_PG_STATE_GATE);
ring->ready = false;
else
wptr_off = adev->wb.gpu_addr + (adev->irq.ih.wptr_offs * 4);
WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_ADDR_LO, lower_32_bits(wptr_off));
- WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_ADDR_HI, upper_32_bits(wptr_off) & 0xFF);
+ WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_ADDR_HI, upper_32_bits(wptr_off) & 0xFFFF);
/* set rptr, wptr to 0 */
WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR, 0);
adev->reg_offset[SMUIO_HWIP][i] = (uint32_t *)(&(SMUIO_BASE.instance[i]));
adev->reg_offset[NBIF_HWIP][i] = (uint32_t *)(&(NBIO_BASE.instance[i]));
adev->reg_offset[THM_HWIP][i] = (uint32_t *)(&(THM_BASE.instance[i]));
+ adev->reg_offset[CLK_HWIP][i] = (uint32_t *)(&(CLK_BASE.instance[i]));
}
return 0;
}
adev->asic_type < CHIP_RAVEN)
init_data.flags.gpu_vm_support = true;
+ if (amdgpu_dc_feature_mask & DC_FBC_MASK)
+ init_data.flags.fbc_support = true;
+
/* Display Core create. */
adev->dm.dc = dc_create(&init_data);
static enum dc_color_depth
convert_color_depth_from_display_info(const struct drm_connector *connector)
{
+ struct dm_connector_state *dm_conn_state =
+ to_dm_connector_state(connector->state);
uint32_t bpc = connector->display_info.bpc;
+ /* TODO: Remove this when there's support for max_bpc in drm */
+ if (dm_conn_state && bpc > dm_conn_state->max_bpc)
+ /* Round down to nearest even number. */
+ bpc = dm_conn_state->max_bpc - (dm_conn_state->max_bpc & 1);
+
switch (bpc) {
case 0:
/*
cea_revision = drm_connector->display_info.cea_rev;
- strncpy(audio_info->display_name,
+ strscpy(audio_info->display_name,
edid_caps->display_name,
- AUDIO_INFO_DISPLAY_NAME_SIZE_IN_CHARS - 1);
+ AUDIO_INFO_DISPLAY_NAME_SIZE_IN_CHARS);
if (cea_revision >= 3) {
audio_info->mode_count = edid_caps->audio_mode_count;
drm_connector = &aconnector->base;
if (!aconnector->dc_sink) {
- /*
- * Create dc_sink when necessary to MST
- * Don't apply fake_sink to MST
- */
- if (aconnector->mst_port) {
- dm_dp_mst_dc_sink_create(drm_connector);
- return stream;
+ if (!aconnector->mst_port) {
+ sink = create_fake_sink(aconnector);
+ if (!sink)
+ return stream;
}
-
- sink = create_fake_sink(aconnector);
- if (!sink)
- return stream;
} else {
sink = aconnector->dc_sink;
}
} else if (property == adev->mode_info.underscan_property) {
dm_new_state->underscan_enable = val;
ret = 0;
+ } else if (property == adev->mode_info.max_bpc_property) {
+ dm_new_state->max_bpc = val;
+ ret = 0;
}
return ret;
} else if (property == adev->mode_info.underscan_property) {
*val = dm_state->underscan_enable;
ret = 0;
+ } else if (property == adev->mode_info.max_bpc_property) {
+ *val = dm_state->max_bpc;
+ ret = 0;
}
return ret;
}
state->underscan_enable = false;
state->underscan_hborder = 0;
state->underscan_vborder = 0;
+ state->max_bpc = 8;
__drm_atomic_helper_connector_reset(connector, &state->base);
}
new_state->freesync_capable = state->freesync_capable;
new_state->freesync_enable = state->freesync_enable;
+ new_state->max_bpc = state->max_bpc;
return &new_state->base;
}
static const struct drm_plane_funcs dm_plane_funcs = {
.update_plane = drm_atomic_helper_update_plane,
.disable_plane = drm_atomic_helper_disable_plane,
- .destroy = drm_plane_cleanup,
+ .destroy = drm_primary_helper_destroy,
.reset = dm_drm_plane_reset,
.atomic_duplicate_state = dm_drm_plane_duplicate_state,
.atomic_destroy_state = dm_drm_plane_destroy_state,
mode->hdisplay = hdisplay;
mode->vdisplay = vdisplay;
mode->type &= ~DRM_MODE_TYPE_PREFERRED;
- strncpy(mode->name, name, DRM_DISPLAY_MODE_LEN);
+ strscpy(mode->name, name, DRM_DISPLAY_MODE_LEN);
return mode;
drm_object_attach_property(&aconnector->base.base,
adev->mode_info.underscan_vborder_property,
0);
+ drm_object_attach_property(&aconnector->base.base,
+ adev->mode_info.max_bpc_property,
+ 0);
}
struct mutex hpd_lock;
bool fake_enable;
-
- bool mst_connected;
};
#define to_amdgpu_dm_connector(x) container_of(x, struct amdgpu_dm_connector, base)
enum amdgpu_rmx_type scaling;
uint8_t underscan_vborder;
uint8_t underscan_hborder;
+ uint8_t max_bpc;
bool underscan_enable;
bool freesync_enable;
bool freesync_capable;
.atomic_get_property = amdgpu_dm_connector_atomic_get_property
};
-void dm_dp_mst_dc_sink_create(struct drm_connector *connector)
-{
- struct amdgpu_dm_connector *aconnector = to_amdgpu_dm_connector(connector);
- struct dc_sink *dc_sink;
- struct dc_sink_init_data init_params = {
- .link = aconnector->dc_link,
- .sink_signal = SIGNAL_TYPE_DISPLAY_PORT_MST };
-
- /* FIXME none of this is safe. we shouldn't touch aconnector here in
- * atomic_check
- */
-
- /*
- * TODO: Need to further figure out why ddc.algo is NULL while MST port exists
- */
- if (!aconnector->port || !aconnector->port->aux.ddc.algo)
- return;
-
- ASSERT(aconnector->edid);
-
- dc_sink = dc_link_add_remote_sink(
- aconnector->dc_link,
- (uint8_t *)aconnector->edid,
- (aconnector->edid->extensions + 1) * EDID_LENGTH,
- &init_params);
-
- dc_sink->priv = aconnector;
- aconnector->dc_sink = dc_sink;
-
- if (aconnector->dc_sink)
- amdgpu_dm_update_freesync_caps(
- connector, aconnector->edid);
-}
-
static int dm_dp_mst_get_modes(struct drm_connector *connector)
{
struct amdgpu_dm_connector *aconnector = to_amdgpu_dm_connector(connector);
struct amdgpu_device *adev = dev->dev_private;
struct amdgpu_encoder *amdgpu_encoder;
struct drm_encoder *encoder;
- const struct drm_connector_helper_funcs *connector_funcs =
- connector->base.helper_private;
- struct drm_encoder *enc_master =
- connector_funcs->best_encoder(&connector->base);
- DRM_DEBUG_KMS("enc master is %p\n", enc_master);
amdgpu_encoder = kzalloc(sizeof(*amdgpu_encoder), GFP_KERNEL);
if (!amdgpu_encoder)
return NULL;
struct amdgpu_device *adev = dev->dev_private;
struct amdgpu_dm_connector *aconnector;
struct drm_connector *connector;
- struct drm_connector_list_iter conn_iter;
-
- drm_connector_list_iter_begin(dev, &conn_iter);
- drm_for_each_connector_iter(connector, &conn_iter) {
- aconnector = to_amdgpu_dm_connector(connector);
- if (aconnector->mst_port == master
- && !aconnector->port) {
- DRM_INFO("DM_MST: reusing connector: %p [id: %d] [master: %p]\n",
- aconnector, connector->base.id, aconnector->mst_port);
-
- aconnector->port = port;
- drm_connector_set_path_property(connector, pathprop);
-
- drm_connector_list_iter_end(&conn_iter);
- aconnector->mst_connected = true;
- return &aconnector->base;
- }
- }
- drm_connector_list_iter_end(&conn_iter);
aconnector = kzalloc(sizeof(*aconnector), GFP_KERNEL);
if (!aconnector)
master->connector_id);
aconnector->mst_encoder = dm_dp_create_fake_mst_encoder(master);
+ drm_connector_attach_encoder(&aconnector->base,
+ &aconnector->mst_encoder->base);
- /*
- * TODO: understand why this one is needed
- */
drm_object_attach_property(
&connector->base,
dev->mode_config.path_property,
*/
amdgpu_dm_connector_funcs_reset(connector);
- aconnector->mst_connected = true;
-
DRM_INFO("DM_MST: added connector: %p [id: %d] [master: %p]\n",
aconnector, connector->base.id, aconnector->mst_port);
static void dm_dp_destroy_mst_connector(struct drm_dp_mst_topology_mgr *mgr,
struct drm_connector *connector)
{
+ struct amdgpu_dm_connector *master = container_of(mgr, struct amdgpu_dm_connector, mst_mgr);
+ struct drm_device *dev = master->base.dev;
+ struct amdgpu_device *adev = dev->dev_private;
struct amdgpu_dm_connector *aconnector = to_amdgpu_dm_connector(connector);
DRM_INFO("DM_MST: Disabling connector: %p [id: %d] [master: %p]\n",
aconnector->dc_sink = NULL;
}
- aconnector->mst_connected = false;
+ drm_connector_unregister(connector);
+ if (adev->mode_info.rfbdev)
+ drm_fb_helper_remove_one_connector(&adev->mode_info.rfbdev->helper, connector);
+ drm_connector_put(connector);
}
static void dm_dp_mst_hotplug(struct drm_dp_mst_topology_mgr *mgr)
drm_kms_helper_hotplug_event(dev);
}
-static void dm_dp_mst_link_status_reset(struct drm_connector *connector)
-{
- mutex_lock(&connector->dev->mode_config.mutex);
- drm_connector_set_link_status_property(connector, DRM_MODE_LINK_STATUS_BAD);
- mutex_unlock(&connector->dev->mode_config.mutex);
-}
-
static void dm_dp_mst_register_connector(struct drm_connector *connector)
{
struct drm_device *dev = connector->dev;
struct amdgpu_device *adev = dev->dev_private;
- struct amdgpu_dm_connector *aconnector = to_amdgpu_dm_connector(connector);
if (adev->mode_info.rfbdev)
drm_fb_helper_add_one_connector(&adev->mode_info.rfbdev->helper, connector);
DRM_ERROR("adev->mode_info.rfbdev is NULL\n");
drm_connector_register(connector);
-
- if (aconnector->mst_connected)
- dm_dp_mst_link_status_reset(connector);
}
static const struct drm_dp_mst_topology_cbs dm_mst_cbs = {
void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
struct amdgpu_dm_connector *aconnector);
-void dm_dp_mst_dc_sink_create(struct drm_connector *connector);
#endif
adev->pm.pm_display_cfg.displays[i].controller_id = dc_cfg->pipe_idx + 1;
}
- if (adev->powerplay.pp_funcs->display_configuration_change)
+ if (adev->powerplay.pp_funcs && adev->powerplay.pp_funcs->display_configuration_change)
adev->powerplay.pp_funcs->display_configuration_change(
adev->powerplay.pp_handle,
&adev->pm.pm_display_cfg);
struct amd_pp_simple_clock_info validation_clks = { 0 };
uint32_t i;
- if (adev->powerplay.pp_funcs->get_clock_by_type) {
+ if (adev->powerplay.pp_funcs && adev->powerplay.pp_funcs->get_clock_by_type) {
if (adev->powerplay.pp_funcs->get_clock_by_type(pp_handle,
dc_to_pp_clock_type(clk_type), &pp_clks)) {
/* Error in pplib. Provide default values. */
pp_to_dc_clock_levels(&pp_clks, dc_clks, clk_type);
- if (adev->powerplay.pp_funcs->get_display_mode_validation_clocks) {
+ if (adev->powerplay.pp_funcs && adev->powerplay.pp_funcs->get_display_mode_validation_clocks) {
if (adev->powerplay.pp_funcs->get_display_mode_validation_clocks(
pp_handle, &validation_clks)) {
/* Error in pplib. Provide default values. */
struct pp_clock_levels_with_voltage pp_clk_info = {0};
const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
+ if (!pp_funcs || !pp_funcs->get_clock_by_type_with_voltage)
+ return false;
+
if (pp_funcs->get_clock_by_type_with_voltage(pp_handle,
dc_to_pp_clock_type(clk_type),
&pp_clk_info))
if (!pp_clock_request.clock_type)
return false;
- if (adev->powerplay.pp_funcs->display_clock_voltage_request)
+ if (adev->powerplay.pp_funcs && adev->powerplay.pp_funcs->display_clock_voltage_request)
ret = adev->powerplay.pp_funcs->display_clock_voltage_request(
adev->powerplay.pp_handle,
&pp_clock_request);
struct amd_pp_clock_info pp_clk_info = {0};
int ret = 0;
- if (adev->powerplay.pp_funcs->get_current_clocks)
+ if (adev->powerplay.pp_funcs && adev->powerplay.pp_funcs->get_current_clocks)
ret = adev->powerplay.pp_funcs->get_current_clocks(
adev->powerplay.pp_handle,
&pp_clk_info);
wm_with_clock_ranges.num_wm_dmif_sets = ranges->num_reader_wm_sets;
wm_with_clock_ranges.num_wm_mcif_sets = ranges->num_writer_wm_sets;
+ if (!pp_funcs || !pp_funcs->set_watermarks_for_clocks_ranges)
+ return;
+
for (i = 0; i < wm_with_clock_ranges.num_wm_dmif_sets; i++) {
if (ranges->reader_wm_sets[i].wm_inst > 3)
wm_dce_clocks[i].wm_set_id = WM_SET_A;
i2c_success = i2c_write(pipe_ctx, slave_address,
buffer, sizeof(buffer));
RETIMER_REDRIVER_INFO("retimer write to slave_address = 0x%x,\
- offset = 0x%d, reg_val = 0x%d, i2c_success = %d\n",
+ offset = 0x%x, reg_val = 0x%x, i2c_success = %d\n",
slave_address, buffer[0], buffer[1], i2c_success?1:0);
if (!i2c_success)
/* Write failure */
i2c_success = i2c_write(pipe_ctx, slave_address,
buffer, sizeof(buffer));
RETIMER_REDRIVER_INFO("retimer write to slave_address = 0x%x,\
- offset = 0x%d, reg_val = 0x%d, i2c_success = %d\n",
+ offset = 0x%x, reg_val = 0x%x, i2c_success = %d\n",
slave_address, buffer[0], buffer[1], i2c_success?1:0);
if (!i2c_success)
/* Write failure */
struct dc_config {
bool gpu_vm_support;
bool disable_disp_pll_sharing;
+ bool fbc_support;
};
enum visual_confirm {
if (events->force_trigger)
value |= 0x1;
- value |= 0x84;
+ if (num_pipes) {
+ struct dc *dc = pipe_ctx[0]->stream->ctx->dc;
+
+ if (dc->fbc_compressor)
+ value |= 0x84;
+ }
for (i = 0; i < num_pipes; i++)
pipe_ctx[i]->stream_res.tg->funcs->
dc,
context->bw.dce.sclk_khz);
+ pp_display_cfg->min_dcfclock_khz = pp_display_cfg->min_engine_clock_khz;
+
pp_display_cfg->min_engine_clock_deep_sleep_khz
= context->bw.dce.sclk_deep_sleep_khz;
static const struct encoder_feature_support link_enc_feature = {
.max_hdmi_deep_color = COLOR_DEPTH_121212,
- .max_hdmi_pixel_clock = 594000,
+ .max_hdmi_pixel_clock = 300000,
.flags.bits.IS_HBR2_CAPABLE = true,
.flags.bits.IS_TPS3_CAPABLE = true
};
pool->base.sw_i2cs[i] = NULL;
}
- dc->fbc_compressor = dce110_compressor_create(ctx);
+ if (dc->config.fbc_support)
+ dc->fbc_compressor = dce110_compressor_create(ctx);
if (!underlay_create(ctx, &pool->base))
goto res_create_fail;
#define LITTLEENDIAN_CPU
#endif
-#undef READ
-#undef WRITE
#undef FRAME_SIZE
#define dm_output_to_console(fmt, ...) DRM_DEBUG_KMS(fmt, ##__VA_ARGS__)
PP_AVFS_MASK = 0x40000,
};
+enum DC_FEATURE_MASK {
+ DC_FBC_MASK = 0x1,
+};
+
/**
* struct amd_ip_funcs - general hooks for managing amdgpu IP Blocks
*/
struct atom_common_table_header table_header;
uint8_t smuip_min_ver;
uint8_t smuip_max_ver;
- uint8_t smu_rsd1;
+ uint8_t waflclk_ss_mode;
uint8_t gpuclk_ss_mode;
uint16_t sclk_ss_percentage;
uint16_t sclk_ss_rate_10hz;
uint32_t syspll3_1_vco_freq_10khz;
uint32_t bootup_fclk_10khz;
uint32_t bootup_waflclk_10khz;
- uint32_t reserved[3];
+ uint32_t smu_info_caps;
+ uint16_t waflclk_ss_percentage; // in unit of 0.001%
+ uint16_t smuinitoffset;
+ uint32_t reserved;
};
/*
pr_info("%s was not implemented.\n", __func__);
return 0;
}
+
+ if (hwmgr->dpm_level != AMD_DPM_FORCED_LEVEL_MANUAL) {
+ pr_info("force clock level is for dpm manual mode only.\n");
+ return -EINVAL;
+ }
+
mutex_lock(&hwmgr->smu_lock);
- if (hwmgr->dpm_level == AMD_DPM_FORCED_LEVEL_MANUAL)
- ret = hwmgr->hwmgr_func->force_clock_level(hwmgr, type, mask);
- else
- ret = -EINVAL;
+ ret = hwmgr->hwmgr_func->force_clock_level(hwmgr, type, mask);
mutex_unlock(&hwmgr->smu_lock);
return ret;
}
static int pp_set_power_limit(void *handle, uint32_t limit)
{
struct pp_hwmgr *hwmgr = handle;
+ uint32_t max_power_limit;
if (!hwmgr || !hwmgr->pm_en)
return -EINVAL;
if (limit == 0)
limit = hwmgr->default_power_limit;
- if (limit > hwmgr->default_power_limit)
+ max_power_limit = hwmgr->default_power_limit;
+ if (hwmgr->od_enabled) {
+ max_power_limit *= (100 + hwmgr->platform_descriptor.TDPODLimit);
+ max_power_limit /= 100;
+ }
+
+ if (limit > max_power_limit)
return -EINVAL;
mutex_lock(&hwmgr->smu_lock);
mutex_lock(&hwmgr->smu_lock);
- if (default_limit)
+ if (default_limit) {
*limit = hwmgr->default_power_limit;
+ if (hwmgr->od_enabled) {
+ *limit *= (100 + hwmgr->platform_descriptor.TDPODLimit);
+ *limit /= 100;
+ }
+ }
else
*limit = hwmgr->power_limit;
{
struct pp_hwmgr *hwmgr = handle;
- if (!hwmgr || !hwmgr->pm_en)
+ if (!hwmgr)
return -EINVAL;
- if (hwmgr->hwmgr_func->enable_mgpu_fan_boost == NULL) {
+ if (!hwmgr->pm_en ||
+ hwmgr->hwmgr_func->enable_mgpu_fan_boost == NULL)
return 0;
- }
mutex_lock(&hwmgr->smu_lock);
hwmgr->hwmgr_func->enable_mgpu_fan_boost(hwmgr);
PHM_FUNC_CHECK(hwmgr);
adev = hwmgr->adev;
- if (smum_is_dpm_running(hwmgr) && !amdgpu_passthrough(adev)) {
+ /* Skip for suspend/resume case */
+ if (smum_is_dpm_running(hwmgr) && !amdgpu_passthrough(adev)
+ && adev->in_suspend) {
pr_info("dpm has been enabled\n");
return 0;
}
switch (task_id) {
case AMD_PP_TASK_DISPLAY_CONFIG_CHANGE:
+ ret = phm_pre_display_configuration_changed(hwmgr);
+ if (ret)
+ return ret;
ret = phm_set_cpu_power_state(hwmgr);
if (ret)
return ret;
if (skip)
return 0;
- phm_pre_display_configuration_changed(hwmgr);
-
phm_display_configuration_changed(hwmgr);
if (hwmgr->ps)
break;
}
- if (i >= sclk_table->count)
- data->need_update_smu7_dpm_table |= DPMTABLE_OD_UPDATE_SCLK;
- else {
+ if (i >= sclk_table->count) {
+ if (sclk > sclk_table->dpm_levels[i-1].value) {
+ data->need_update_smu7_dpm_table |= DPMTABLE_OD_UPDATE_SCLK;
+ sclk_table->dpm_levels[i-1].value = sclk;
+ }
+ } else {
/* TODO: Check SCLK in DAL's minimum clocks
* in case DeepSleep divider update is required.
*/
break;
}
- if (i >= mclk_table->count)
- data->need_update_smu7_dpm_table |= DPMTABLE_OD_UPDATE_MCLK;
-
+ if (i >= mclk_table->count) {
+ if (mclk > mclk_table->dpm_levels[i-1].value) {
+ data->need_update_smu7_dpm_table |= DPMTABLE_OD_UPDATE_MCLK;
+ mclk_table->dpm_levels[i-1].value = mclk;
+ }
+ }
if (data->display_timing.num_existing_displays != hwmgr->display_config->num_display)
data->need_update_smu7_dpm_table |= DPMTABLE_UPDATE_MCLK;
struct smu7_single_dpm_table *sclk_table = &(data->dpm_table.sclk_table);
struct smu7_single_dpm_table *golden_sclk_table =
&(data->golden_dpm_table.sclk_table);
- int value;
+ int value = sclk_table->dpm_levels[sclk_table->count - 1].value;
+ int golden_value = golden_sclk_table->dpm_levels
+ [golden_sclk_table->count - 1].value;
- value = (sclk_table->dpm_levels[sclk_table->count - 1].value -
- golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value) *
- 100 /
- golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value;
+ value -= golden_value;
+ value = DIV_ROUND_UP(value * 100, golden_value);
return value;
}
struct smu7_single_dpm_table *mclk_table = &(data->dpm_table.mclk_table);
struct smu7_single_dpm_table *golden_mclk_table =
&(data->golden_dpm_table.mclk_table);
- int value;
+ int value = mclk_table->dpm_levels[mclk_table->count - 1].value;
+ int golden_value = golden_mclk_table->dpm_levels
+ [golden_mclk_table->count - 1].value;
- value = (mclk_table->dpm_levels[mclk_table->count - 1].value -
- golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value) *
- 100 /
- golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value;
+ value -= golden_value;
+ value = DIV_ROUND_UP(value * 100, golden_value);
return value;
}
for (i = 0; i < wm_with_clock_ranges->num_wm_dmif_sets; i++) {
table->WatermarkRow[1][i].MinClock =
cpu_to_le16((uint16_t)
- (wm_with_clock_ranges->wm_dmif_clocks_ranges[i].wm_min_dcfclk_clk_in_khz) /
- 1000);
+ (wm_with_clock_ranges->wm_dmif_clocks_ranges[i].wm_min_dcfclk_clk_in_khz /
+ 1000));
table->WatermarkRow[1][i].MaxClock =
cpu_to_le16((uint16_t)
- (wm_with_clock_ranges->wm_dmif_clocks_ranges[i].wm_max_dcfclk_clk_in_khz) /
- 100);
+ (wm_with_clock_ranges->wm_dmif_clocks_ranges[i].wm_max_dcfclk_clk_in_khz /
+ 1000));
table->WatermarkRow[1][i].MinUclk =
cpu_to_le16((uint16_t)
- (wm_with_clock_ranges->wm_dmif_clocks_ranges[i].wm_min_mem_clk_in_khz) /
- 1000);
+ (wm_with_clock_ranges->wm_dmif_clocks_ranges[i].wm_min_mem_clk_in_khz /
+ 1000));
table->WatermarkRow[1][i].MaxUclk =
cpu_to_le16((uint16_t)
- (wm_with_clock_ranges->wm_dmif_clocks_ranges[i].wm_max_mem_clk_in_khz) /
- 1000);
+ (wm_with_clock_ranges->wm_dmif_clocks_ranges[i].wm_max_mem_clk_in_khz /
+ 1000));
table->WatermarkRow[1][i].WmSetting = (uint8_t)
wm_with_clock_ranges->wm_dmif_clocks_ranges[i].wm_set_id;
}
for (i = 0; i < wm_with_clock_ranges->num_wm_mcif_sets; i++) {
table->WatermarkRow[0][i].MinClock =
cpu_to_le16((uint16_t)
- (wm_with_clock_ranges->wm_mcif_clocks_ranges[i].wm_min_socclk_clk_in_khz) /
- 1000);
+ (wm_with_clock_ranges->wm_mcif_clocks_ranges[i].wm_min_socclk_clk_in_khz /
+ 1000));
table->WatermarkRow[0][i].MaxClock =
cpu_to_le16((uint16_t)
- (wm_with_clock_ranges->wm_mcif_clocks_ranges[i].wm_max_socclk_clk_in_khz) /
- 1000);
+ (wm_with_clock_ranges->wm_mcif_clocks_ranges[i].wm_max_socclk_clk_in_khz /
+ 1000));
table->WatermarkRow[0][i].MinUclk =
cpu_to_le16((uint16_t)
- (wm_with_clock_ranges->wm_mcif_clocks_ranges[i].wm_min_mem_clk_in_khz) /
- 1000);
+ (wm_with_clock_ranges->wm_mcif_clocks_ranges[i].wm_min_mem_clk_in_khz /
+ 1000));
table->WatermarkRow[0][i].MaxUclk =
cpu_to_le16((uint16_t)
- (wm_with_clock_ranges->wm_mcif_clocks_ranges[i].wm_max_mem_clk_in_khz) /
- 1000);
+ (wm_with_clock_ranges->wm_mcif_clocks_ranges[i].wm_max_mem_clk_in_khz /
+ 1000));
table->WatermarkRow[0][i].WmSetting = (uint8_t)
wm_with_clock_ranges->wm_mcif_clocks_ranges[i].wm_set_id;
}
if (hwmgr->platform_descriptor.overdriveLimit.memoryClock == 0)
hwmgr->platform_descriptor.overdriveLimit.memoryClock =
dpm_table->dpm_levels[dpm_table->count-1].value;
-
vega10_init_dpm_state(&(dpm_table->dpm_state));
data->dpm_table.eclk_table.count = 0;
static int vega10_find_dpm_states_clocks_in_dpm_table(struct pp_hwmgr *hwmgr, const void *input)
{
struct vega10_hwmgr *data = hwmgr->backend;
+ const struct phm_set_power_state_input *states =
+ (const struct phm_set_power_state_input *)input;
+ const struct vega10_power_state *vega10_ps =
+ cast_const_phw_vega10_power_state(states->pnew_state);
+ struct vega10_single_dpm_table *sclk_table = &(data->dpm_table.gfx_table);
+ uint32_t sclk = vega10_ps->performance_levels
+ [vega10_ps->performance_level_count - 1].gfx_clock;
+ struct vega10_single_dpm_table *mclk_table = &(data->dpm_table.mem_table);
+ uint32_t mclk = vega10_ps->performance_levels
+ [vega10_ps->performance_level_count - 1].mem_clock;
+ uint32_t i;
+
+ for (i = 0; i < sclk_table->count; i++) {
+ if (sclk == sclk_table->dpm_levels[i].value)
+ break;
+ }
+
+ if (i >= sclk_table->count) {
+ if (sclk > sclk_table->dpm_levels[i-1].value) {
+ data->need_update_dpm_table |= DPMTABLE_OD_UPDATE_SCLK;
+ sclk_table->dpm_levels[i-1].value = sclk;
+ }
+ }
+
+ for (i = 0; i < mclk_table->count; i++) {
+ if (mclk == mclk_table->dpm_levels[i].value)
+ break;
+ }
+
+ if (i >= mclk_table->count) {
+ if (mclk > mclk_table->dpm_levels[i-1].value) {
+ data->need_update_dpm_table |= DPMTABLE_OD_UPDATE_MCLK;
+ mclk_table->dpm_levels[i-1].value = mclk;
+ }
+ }
if (data->display_timing.num_existing_displays != hwmgr->display_config->num_display)
data->need_update_dpm_table |= DPMTABLE_UPDATE_MCLK;
struct vega10_single_dpm_table *sclk_table = &(data->dpm_table.gfx_table);
struct vega10_single_dpm_table *golden_sclk_table =
&(data->golden_dpm_table.gfx_table);
- int value;
-
- value = (sclk_table->dpm_levels[sclk_table->count - 1].value -
- golden_sclk_table->dpm_levels
- [golden_sclk_table->count - 1].value) *
- 100 /
- golden_sclk_table->dpm_levels
+ int value = sclk_table->dpm_levels[sclk_table->count - 1].value;
+ int golden_value = golden_sclk_table->dpm_levels
[golden_sclk_table->count - 1].value;
+ value -= golden_value;
+ value = DIV_ROUND_UP(value * 100, golden_value);
+
return value;
}
if (vega10_ps->performance_levels
[vega10_ps->performance_level_count - 1].gfx_clock >
- hwmgr->platform_descriptor.overdriveLimit.engineClock)
+ hwmgr->platform_descriptor.overdriveLimit.engineClock) {
vega10_ps->performance_levels
[vega10_ps->performance_level_count - 1].gfx_clock =
hwmgr->platform_descriptor.overdriveLimit.engineClock;
-
+ pr_warn("max sclk supported by vbios is %d\n",
+ hwmgr->platform_descriptor.overdriveLimit.engineClock);
+ }
return 0;
}
struct vega10_single_dpm_table *mclk_table = &(data->dpm_table.mem_table);
struct vega10_single_dpm_table *golden_mclk_table =
&(data->golden_dpm_table.mem_table);
- int value;
-
- value = (mclk_table->dpm_levels
- [mclk_table->count - 1].value -
- golden_mclk_table->dpm_levels
- [golden_mclk_table->count - 1].value) *
- 100 /
- golden_mclk_table->dpm_levels
+ int value = mclk_table->dpm_levels[mclk_table->count - 1].value;
+ int golden_value = golden_mclk_table->dpm_levels
[golden_mclk_table->count - 1].value;
+ value -= golden_value;
+ value = DIV_ROUND_UP(value * 100, golden_value);
+
return value;
}
if (vega10_ps->performance_levels
[vega10_ps->performance_level_count - 1].mem_clock >
- hwmgr->platform_descriptor.overdriveLimit.memoryClock)
+ hwmgr->platform_descriptor.overdriveLimit.memoryClock) {
vega10_ps->performance_levels
[vega10_ps->performance_level_count - 1].mem_clock =
hwmgr->platform_descriptor.overdriveLimit.memoryClock;
+ pr_warn("max mclk supported by vbios is %d\n",
+ hwmgr->platform_descriptor.overdriveLimit.memoryClock);
+ }
return 0;
}
struct vega12_single_dpm_table *sclk_table = &(data->dpm_table.gfx_table);
struct vega12_single_dpm_table *golden_sclk_table =
&(data->golden_dpm_table.gfx_table);
- int value;
+ int value = sclk_table->dpm_levels[sclk_table->count - 1].value;
+ int golden_value = golden_sclk_table->dpm_levels
+ [golden_sclk_table->count - 1].value;
- value = (sclk_table->dpm_levels[sclk_table->count - 1].value -
- golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value) *
- 100 /
- golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value;
+ value -= golden_value;
+ value = DIV_ROUND_UP(value * 100, golden_value);
return value;
}
struct vega12_single_dpm_table *mclk_table = &(data->dpm_table.mem_table);
struct vega12_single_dpm_table *golden_mclk_table =
&(data->golden_dpm_table.mem_table);
- int value;
-
- value = (mclk_table->dpm_levels
- [mclk_table->count - 1].value -
- golden_mclk_table->dpm_levels
- [golden_mclk_table->count - 1].value) *
- 100 /
- golden_mclk_table->dpm_levels
+ int value = mclk_table->dpm_levels[mclk_table->count - 1].value;
+ int golden_value = golden_mclk_table->dpm_levels
[golden_mclk_table->count - 1].value;
+ value -= golden_value;
+ value = DIV_ROUND_UP(value * 100, golden_value);
+
return value;
}
return vega12_disable_gfx_off(hwmgr);
}
+static int vega12_get_performance_level(struct pp_hwmgr *hwmgr, const struct pp_hw_power_state *state,
+ PHM_PerformanceLevelDesignation designation, uint32_t index,
+ PHM_PerformanceLevel *level)
+{
+ return 0;
+}
+
static const struct pp_hwmgr_func vega12_hwmgr_funcs = {
.backend_init = vega12_hwmgr_backend_init,
.backend_fini = vega12_hwmgr_backend_fini,
.register_irq_handlers = smu9_register_irq_handlers,
.start_thermal_controller = vega12_start_thermal_controller,
.powergate_gfx = vega12_gfx_off_control,
+ .get_performance_level = vega12_get_performance_level,
};
int vega12_hwmgr_init(struct pp_hwmgr *hwmgr)
data->phy_clk_quad_eqn_b = PPREGKEY_VEGA20QUADRATICEQUATION_DFLT;
data->phy_clk_quad_eqn_c = PPREGKEY_VEGA20QUADRATICEQUATION_DFLT;
- data->registry_data.disallowed_features = 0x0;
+ /*
+ * Disable the following features for now:
+ * GFXCLK DS
+ * SOCLK DS
+ * LCLK DS
+ * DCEFCLK DS
+ * FCLK DS
+ * MP1CLK DS
+ * MP0CLK DS
+ */
+ data->registry_data.disallowed_features = 0xE0041C00;
data->registry_data.od_state_in_dc_support = 0;
data->registry_data.thermal_support = 1;
data->registry_data.skip_baco_hardware = 0;
data->registry_data.disable_auto_wattman = 1;
data->registry_data.auto_wattman_debug = 0;
data->registry_data.auto_wattman_sample_period = 100;
+ data->registry_data.fclk_gfxclk_ratio = 0x3F6CCCCD;
data->registry_data.auto_wattman_threshold = 50;
data->registry_data.gfxoff_controlled_by_driver = 1;
data->gfxoff_allowed = false;
return 0;
}
+static int vega20_notify_smc_display_change(struct pp_hwmgr *hwmgr)
+{
+ struct vega20_hwmgr *data = (struct vega20_hwmgr *)(hwmgr->backend);
+
+ if (data->smu_features[GNLD_DPM_UCLK].enabled)
+ return smum_send_msg_to_smc_with_parameter(hwmgr,
+ PPSMC_MSG_SetUclkFastSwitch,
+ 1);
+
+ return 0;
+}
+
+static int vega20_send_clock_ratio(struct pp_hwmgr *hwmgr)
+{
+ struct vega20_hwmgr *data =
+ (struct vega20_hwmgr *)(hwmgr->backend);
+
+ return smum_send_msg_to_smc_with_parameter(hwmgr,
+ PPSMC_MSG_SetFclkGfxClkRatio,
+ data->registry_data.fclk_gfxclk_ratio);
+}
+
static int vega20_disable_all_smu_features(struct pp_hwmgr *hwmgr)
{
struct vega20_hwmgr *data =
&(data->dpm_table.gfx_table);
struct vega20_single_dpm_table *golden_sclk_table =
&(data->golden_dpm_table.gfx_table);
- int value;
+ int value = sclk_table->dpm_levels[sclk_table->count - 1].value;
+ int golden_value = golden_sclk_table->dpm_levels
+ [golden_sclk_table->count - 1].value;
/* od percentage */
- value = DIV_ROUND_UP((sclk_table->dpm_levels[sclk_table->count - 1].value -
- golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value) * 100,
- golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value);
+ value -= golden_value;
+ value = DIV_ROUND_UP(value * 100, golden_value);
return value;
}
&(data->dpm_table.mem_table);
struct vega20_single_dpm_table *golden_mclk_table =
&(data->golden_dpm_table.mem_table);
- int value;
+ int value = mclk_table->dpm_levels[mclk_table->count - 1].value;
+ int golden_value = golden_mclk_table->dpm_levels
+ [golden_mclk_table->count - 1].value;
/* od percentage */
- value = DIV_ROUND_UP((mclk_table->dpm_levels[mclk_table->count - 1].value -
- golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value) * 100,
- golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value);
+ value -= golden_value;
+ value = DIV_ROUND_UP(value * 100, golden_value);
return value;
}
"[EnableDPMTasks] Failed to enable all smu features!",
return result);
+ result = vega20_notify_smc_display_change(hwmgr);
+ PP_ASSERT_WITH_CODE(!result,
+ "[EnableDPMTasks] Failed to notify smc display change!",
+ return result);
+
+ result = vega20_send_clock_ratio(hwmgr);
+ PP_ASSERT_WITH_CODE(!result,
+ "[EnableDPMTasks] Failed to send clock ratio!",
+ return result);
+
/* Initialize UVD/VCE powergating state */
vega20_init_powergate_state(hwmgr);
return i;
}
-static int vega20_upload_dpm_min_level(struct pp_hwmgr *hwmgr)
+static int vega20_upload_dpm_min_level(struct pp_hwmgr *hwmgr, uint32_t feature_mask)
{
struct vega20_hwmgr *data =
(struct vega20_hwmgr *)(hwmgr->backend);
uint32_t min_freq;
int ret = 0;
- if (data->smu_features[GNLD_DPM_GFXCLK].enabled) {
+ if (data->smu_features[GNLD_DPM_GFXCLK].enabled &&
+ (feature_mask & FEATURE_DPM_GFXCLK_MASK)) {
min_freq = data->dpm_table.gfx_table.dpm_state.soft_min_level;
PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
hwmgr, PPSMC_MSG_SetSoftMinByFreq,
return ret);
}
- if (data->smu_features[GNLD_DPM_UCLK].enabled) {
+ if (data->smu_features[GNLD_DPM_UCLK].enabled &&
+ (feature_mask & FEATURE_DPM_UCLK_MASK)) {
min_freq = data->dpm_table.mem_table.dpm_state.soft_min_level;
PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
hwmgr, PPSMC_MSG_SetSoftMinByFreq,
return ret);
}
- if (data->smu_features[GNLD_DPM_UVD].enabled) {
+ if (data->smu_features[GNLD_DPM_UVD].enabled &&
+ (feature_mask & FEATURE_DPM_UVD_MASK)) {
min_freq = data->dpm_table.vclk_table.dpm_state.soft_min_level;
PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
return ret);
}
- if (data->smu_features[GNLD_DPM_VCE].enabled) {
+ if (data->smu_features[GNLD_DPM_VCE].enabled &&
+ (feature_mask & FEATURE_DPM_VCE_MASK)) {
min_freq = data->dpm_table.eclk_table.dpm_state.soft_min_level;
PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
return ret);
}
- if (data->smu_features[GNLD_DPM_SOCCLK].enabled) {
+ if (data->smu_features[GNLD_DPM_SOCCLK].enabled &&
+ (feature_mask & FEATURE_DPM_SOCCLK_MASK)) {
min_freq = data->dpm_table.soc_table.dpm_state.soft_min_level;
PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
return ret;
}
-static int vega20_upload_dpm_max_level(struct pp_hwmgr *hwmgr)
+static int vega20_upload_dpm_max_level(struct pp_hwmgr *hwmgr, uint32_t feature_mask)
{
struct vega20_hwmgr *data =
(struct vega20_hwmgr *)(hwmgr->backend);
uint32_t max_freq;
int ret = 0;
- if (data->smu_features[GNLD_DPM_GFXCLK].enabled) {
+ if (data->smu_features[GNLD_DPM_GFXCLK].enabled &&
+ (feature_mask & FEATURE_DPM_GFXCLK_MASK)) {
max_freq = data->dpm_table.gfx_table.dpm_state.soft_max_level;
PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
return ret);
}
- if (data->smu_features[GNLD_DPM_UCLK].enabled) {
+ if (data->smu_features[GNLD_DPM_UCLK].enabled &&
+ (feature_mask & FEATURE_DPM_UCLK_MASK)) {
max_freq = data->dpm_table.mem_table.dpm_state.soft_max_level;
PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
return ret);
}
- if (data->smu_features[GNLD_DPM_UVD].enabled) {
+ if (data->smu_features[GNLD_DPM_UVD].enabled &&
+ (feature_mask & FEATURE_DPM_UVD_MASK)) {
max_freq = data->dpm_table.vclk_table.dpm_state.soft_max_level;
PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
return ret);
}
- if (data->smu_features[GNLD_DPM_VCE].enabled) {
+ if (data->smu_features[GNLD_DPM_VCE].enabled &&
+ (feature_mask & FEATURE_DPM_VCE_MASK)) {
max_freq = data->dpm_table.eclk_table.dpm_state.soft_max_level;
PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
return ret);
}
- if (data->smu_features[GNLD_DPM_SOCCLK].enabled) {
+ if (data->smu_features[GNLD_DPM_SOCCLK].enabled &&
+ (feature_mask & FEATURE_DPM_SOCCLK_MASK)) {
max_freq = data->dpm_table.soc_table.dpm_state.soft_max_level;
PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
return ret;
}
-static int vega20_get_current_gfx_clk_freq(struct pp_hwmgr *hwmgr, uint32_t *gfx_freq)
-{
- uint32_t gfx_clk = 0;
- int ret = 0;
-
- *gfx_freq = 0;
-
- PP_ASSERT_WITH_CODE((ret = smum_send_msg_to_smc_with_parameter(hwmgr,
- PPSMC_MSG_GetDpmClockFreq, (PPCLK_GFXCLK << 16))) == 0,
- "[GetCurrentGfxClkFreq] Attempt to get Current GFXCLK Frequency Failed!",
- return ret);
- gfx_clk = smum_get_argument(hwmgr);
-
- *gfx_freq = gfx_clk * 100;
-
- return 0;
-}
-
-static int vega20_get_current_mclk_freq(struct pp_hwmgr *hwmgr, uint32_t *mclk_freq)
+static int vega20_get_current_clk_freq(struct pp_hwmgr *hwmgr,
+ PPCLK_e clk_id, uint32_t *clk_freq)
{
- uint32_t mem_clk = 0;
int ret = 0;
- *mclk_freq = 0;
+ *clk_freq = 0;
PP_ASSERT_WITH_CODE((ret = smum_send_msg_to_smc_with_parameter(hwmgr,
- PPSMC_MSG_GetDpmClockFreq, (PPCLK_UCLK << 16))) == 0,
- "[GetCurrentMClkFreq] Attempt to get Current MCLK Frequency Failed!",
+ PPSMC_MSG_GetDpmClockFreq, (clk_id << 16))) == 0,
+ "[GetCurrentClkFreq] Attempt to get Current Frequency Failed!",
return ret);
- mem_clk = smum_get_argument(hwmgr);
+ *clk_freq = smum_get_argument(hwmgr);
- *mclk_freq = mem_clk * 100;
+ *clk_freq = *clk_freq * 100;
return 0;
}
switch (idx) {
case AMDGPU_PP_SENSOR_GFX_SCLK:
- ret = vega20_get_current_gfx_clk_freq(hwmgr, (uint32_t *)value);
+ ret = vega20_get_current_clk_freq(hwmgr,
+ PPCLK_GFXCLK,
+ (uint32_t *)value);
if (!ret)
*size = 4;
break;
case AMDGPU_PP_SENSOR_GFX_MCLK:
- ret = vega20_get_current_mclk_freq(hwmgr, (uint32_t *)value);
+ ret = vega20_get_current_clk_freq(hwmgr,
+ PPCLK_UCLK,
+ (uint32_t *)value);
if (!ret)
*size = 4;
break;
return ret;
}
-static int vega20_notify_smc_display_change(struct pp_hwmgr *hwmgr,
- bool has_disp)
-{
- struct vega20_hwmgr *data = (struct vega20_hwmgr *)(hwmgr->backend);
-
- if (data->smu_features[GNLD_DPM_UCLK].enabled)
- return smum_send_msg_to_smc_with_parameter(hwmgr,
- PPSMC_MSG_SetUclkFastSwitch,
- has_disp ? 1 : 0);
-
- return 0;
-}
-
int vega20_display_clock_voltage_request(struct pp_hwmgr *hwmgr,
struct pp_display_clock_request *clock_req)
{
if (data->smu_features[GNLD_DPM_DCEFCLK].enabled) {
switch (clk_type) {
case amd_pp_dcef_clock:
- clk_freq = clock_req->clock_freq_in_khz / 100;
clk_select = PPCLK_DCEFCLK;
break;
case amd_pp_disp_clock:
return result;
}
+static int vega20_get_performance_level(struct pp_hwmgr *hwmgr, const struct pp_hw_power_state *state,
+ PHM_PerformanceLevelDesignation designation, uint32_t index,
+ PHM_PerformanceLevel *level)
+{
+ return 0;
+}
+
static int vega20_notify_smc_display_config_after_ps_adjustment(
struct pp_hwmgr *hwmgr)
{
struct vega20_hwmgr *data =
(struct vega20_hwmgr *)(hwmgr->backend);
+ struct vega20_single_dpm_table *dpm_table =
+ &data->dpm_table.mem_table;
struct PP_Clocks min_clocks = {0};
struct pp_display_clock_request clock_req;
int ret = 0;
- if ((hwmgr->display_config->num_display > 1) &&
- !hwmgr->display_config->multi_monitor_in_sync &&
- !hwmgr->display_config->nb_pstate_switch_disable)
- vega20_notify_smc_display_change(hwmgr, false);
- else
- vega20_notify_smc_display_change(hwmgr, true);
-
min_clocks.dcefClock = hwmgr->display_config->min_dcef_set_clk;
min_clocks.dcefClockInSR = hwmgr->display_config->min_dcef_deep_sleep_set_clk;
min_clocks.memoryClock = hwmgr->display_config->min_mem_set_clock;
if (data->smu_features[GNLD_DPM_DCEFCLK].supported) {
clock_req.clock_type = amd_pp_dcef_clock;
- clock_req.clock_freq_in_khz = min_clocks.dcefClock;
+ clock_req.clock_freq_in_khz = min_clocks.dcefClock * 10;
if (!vega20_display_clock_voltage_request(hwmgr, &clock_req)) {
if (data->smu_features[GNLD_DS_DCEFCLK].supported)
PP_ASSERT_WITH_CODE((ret = smum_send_msg_to_smc_with_parameter(
}
}
+ if (data->smu_features[GNLD_DPM_UCLK].enabled) {
+ dpm_table->dpm_state.hard_min_level = min_clocks.memoryClock / 100;
+ PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(hwmgr,
+ PPSMC_MSG_SetHardMinByFreq,
+ (PPCLK_UCLK << 16 ) | dpm_table->dpm_state.hard_min_level)),
+ "[SetHardMinFreq] Set hard min uclk failed!",
+ return ret);
+ }
+
return 0;
}
data->dpm_table.mem_table.dpm_state.soft_max_level =
data->dpm_table.mem_table.dpm_levels[soft_level].value;
- ret = vega20_upload_dpm_min_level(hwmgr);
+ ret = vega20_upload_dpm_min_level(hwmgr, 0xFFFFFFFF);
PP_ASSERT_WITH_CODE(!ret,
"Failed to upload boot level to highest!",
return ret);
- ret = vega20_upload_dpm_max_level(hwmgr);
+ ret = vega20_upload_dpm_max_level(hwmgr, 0xFFFFFFFF);
PP_ASSERT_WITH_CODE(!ret,
"Failed to upload dpm max level to highest!",
return ret);
data->dpm_table.mem_table.dpm_state.soft_max_level =
data->dpm_table.mem_table.dpm_levels[soft_level].value;
- ret = vega20_upload_dpm_min_level(hwmgr);
+ ret = vega20_upload_dpm_min_level(hwmgr, 0xFFFFFFFF);
PP_ASSERT_WITH_CODE(!ret,
"Failed to upload boot level to highest!",
return ret);
- ret = vega20_upload_dpm_max_level(hwmgr);
+ ret = vega20_upload_dpm_max_level(hwmgr, 0xFFFFFFFF);
PP_ASSERT_WITH_CODE(!ret,
"Failed to upload dpm max level to highest!",
return ret);
{
int ret = 0;
- ret = vega20_upload_dpm_min_level(hwmgr);
+ ret = vega20_upload_dpm_min_level(hwmgr, 0xFFFFFFFF);
PP_ASSERT_WITH_CODE(!ret,
"Failed to upload DPM Bootup Levels!",
return ret);
- ret = vega20_upload_dpm_max_level(hwmgr);
+ ret = vega20_upload_dpm_max_level(hwmgr, 0xFFFFFFFF);
PP_ASSERT_WITH_CODE(!ret,
"Failed to upload DPM Max Levels!",
return ret);
data->dpm_table.gfx_table.dpm_state.soft_max_level =
data->dpm_table.gfx_table.dpm_levels[soft_max_level].value;
- ret = vega20_upload_dpm_min_level(hwmgr);
+ ret = vega20_upload_dpm_min_level(hwmgr, FEATURE_DPM_GFXCLK_MASK);
PP_ASSERT_WITH_CODE(!ret,
"Failed to upload boot level to lowest!",
return ret);
- ret = vega20_upload_dpm_max_level(hwmgr);
+ ret = vega20_upload_dpm_max_level(hwmgr, FEATURE_DPM_GFXCLK_MASK);
PP_ASSERT_WITH_CODE(!ret,
"Failed to upload dpm max level to highest!",
return ret);
data->dpm_table.mem_table.dpm_state.soft_max_level =
data->dpm_table.mem_table.dpm_levels[soft_max_level].value;
- ret = vega20_upload_dpm_min_level(hwmgr);
+ ret = vega20_upload_dpm_min_level(hwmgr, FEATURE_DPM_UCLK_MASK);
PP_ASSERT_WITH_CODE(!ret,
"Failed to upload boot level to lowest!",
return ret);
- ret = vega20_upload_dpm_max_level(hwmgr);
+ ret = vega20_upload_dpm_max_level(hwmgr, FEATURE_DPM_UCLK_MASK);
PP_ASSERT_WITH_CODE(!ret,
"Failed to upload dpm max level to highest!",
return ret);
for (i = 0; i < count; i++) {
clocks->data[i].clocks_in_khz =
- dpm_table->dpm_levels[i].value * 100;
+ dpm_table->dpm_levels[i].value * 1000;
clocks->data[i].latency_in_us = 0;
}
for (i = 0; i < count; i++) {
clocks->data[i].clocks_in_khz =
data->mclk_latency_table.entries[i].frequency =
- dpm_table->dpm_levels[i].value * 100;
+ dpm_table->dpm_levels[i].value * 1000;
clocks->data[i].latency_in_us =
data->mclk_latency_table.entries[i].latency =
vega20_get_mem_latency(hwmgr, dpm_table->dpm_levels[i].value);
for (i = 0; i < count; i++) {
clocks->data[i].clocks_in_khz =
- dpm_table->dpm_levels[i].value * 100;
+ dpm_table->dpm_levels[i].value * 1000;
clocks->data[i].latency_in_us = 0;
}
for (i = 0; i < count; i++) {
clocks->data[i].clocks_in_khz =
- dpm_table->dpm_levels[i].value * 100;
+ dpm_table->dpm_levels[i].value * 1000;
clocks->data[i].latency_in_us = 0;
}
return -EINVAL;
}
- if (input_clk < clocks.data[0].clocks_in_khz / 100 ||
+ if (input_clk < clocks.data[0].clocks_in_khz / 1000 ||
input_clk > od8_settings[OD8_SETTING_UCLK_FMAX].max_value) {
pr_info("clock freq %d is not within allowed range [%d - %d]\n",
input_clk,
- clocks.data[0].clocks_in_khz / 100,
+ clocks.data[0].clocks_in_khz / 1000,
od8_settings[OD8_SETTING_UCLK_FMAX].max_value);
return -EINVAL;
}
switch (type) {
case PP_SCLK:
- ret = vega20_get_current_gfx_clk_freq(hwmgr, &now);
+ ret = vega20_get_current_clk_freq(hwmgr, PPCLK_GFXCLK, &now);
PP_ASSERT_WITH_CODE(!ret,
"Attempt to get current gfx clk Failed!",
return ret);
for (i = 0; i < clocks.num_levels; i++)
size += sprintf(buf + size, "%d: %uMhz %s\n",
- i, clocks.data[i].clocks_in_khz / 100,
+ i, clocks.data[i].clocks_in_khz / 1000,
(clocks.data[i].clocks_in_khz == now) ? "*" : "");
break;
case PP_MCLK:
- ret = vega20_get_current_mclk_freq(hwmgr, &now);
+ ret = vega20_get_current_clk_freq(hwmgr, PPCLK_UCLK, &now);
PP_ASSERT_WITH_CODE(!ret,
"Attempt to get current mclk freq Failed!",
return ret);
for (i = 0; i < clocks.num_levels; i++)
size += sprintf(buf + size, "%d: %uMhz %s\n",
- i, clocks.data[i].clocks_in_khz / 100,
+ i, clocks.data[i].clocks_in_khz / 1000,
(clocks.data[i].clocks_in_khz == now) ? "*" : "");
break;
return ret);
size += sprintf(buf + size, "MCLK: %7uMhz %10uMhz\n",
- clocks.data[0].clocks_in_khz / 100,
+ clocks.data[0].clocks_in_khz / 1000,
od8_settings[OD8_SETTING_UCLK_FMAX].max_value);
}
vega20_set_watermarks_for_clocks_ranges,
.display_clock_voltage_request =
vega20_display_clock_voltage_request,
+ .get_performance_level =
+ vega20_get_performance_level,
/* UMD pstate, profile related */
.force_dpm_level =
vega20_dpm_force_dpm_level,
uint8_t disable_auto_wattman;
uint32_t auto_wattman_debug;
uint32_t auto_wattman_sample_period;
+ uint32_t fclk_gfxclk_ratio;
uint8_t auto_wattman_threshold;
uint8_t log_avfs_param;
uint8_t enable_enginess;
"Unsupported PPTable format!", return -1);
PP_ASSERT_WITH_CODE(powerplay_table->sHeader.structuresize > 0,
"Invalid PowerPlay Table!", return -1);
- PP_ASSERT_WITH_CODE(powerplay_table->smcPPTable.Version == PPTABLE_V20_SMU_VERSION,
- "Unmatch PPTable version, vbios update may be needed!", return -1);
+
+ if (powerplay_table->smcPPTable.Version != PPTABLE_V20_SMU_VERSION) {
+ pr_info("Unmatch PPTable version: "
+ "pptable from VBIOS is V%d while driver supported is V%d!",
+ powerplay_table->smcPPTable.Version,
+ PPTABLE_V20_SMU_VERSION);
+ return -EINVAL;
+ }
//dump_pptable(&powerplay_table->smcPPTable);
"[appendVbiosPPTable] Failed to retrieve Smc Dpm Table from VBIOS!",
return -1);
- memset(ppsmc_pptable->Padding32,
- 0,
- sizeof(struct atom_smc_dpm_info_v4_4) -
- sizeof(struct atom_common_table_header));
ppsmc_pptable->MaxVoltageStepGfx = smc_dpm_table->maxvoltagestepgfx;
ppsmc_pptable->MaxVoltageStepSoc = smc_dpm_table->maxvoltagestepsoc;
ppsmc_pptable->FllGfxclkSpreadPercent = smc_dpm_table->fllgfxclkspreadpercent;
ppsmc_pptable->FllGfxclkSpreadFreq = smc_dpm_table->fllgfxclkspreadfreq;
- if ((smc_dpm_table->table_header.format_revision == 4) &&
- (smc_dpm_table->table_header.content_revision == 4)) {
- for (i = 0; i < I2C_CONTROLLER_NAME_COUNT; i++) {
- ppsmc_pptable->I2cControllers[i].Enabled =
- smc_dpm_table->i2ccontrollers[i].enabled;
- ppsmc_pptable->I2cControllers[i].SlaveAddress =
- smc_dpm_table->i2ccontrollers[i].slaveaddress;
- ppsmc_pptable->I2cControllers[i].ControllerPort =
- smc_dpm_table->i2ccontrollers[i].controllerport;
- ppsmc_pptable->I2cControllers[i].ThermalThrottler =
- smc_dpm_table->i2ccontrollers[i].thermalthrottler;
- ppsmc_pptable->I2cControllers[i].I2cProtocol =
- smc_dpm_table->i2ccontrollers[i].i2cprotocol;
- ppsmc_pptable->I2cControllers[i].I2cSpeed =
- smc_dpm_table->i2ccontrollers[i].i2cspeed;
- }
+ for (i = 0; i < I2C_CONTROLLER_NAME_COUNT; i++) {
+ ppsmc_pptable->I2cControllers[i].Enabled =
+ smc_dpm_table->i2ccontrollers[i].enabled;
+ ppsmc_pptable->I2cControllers[i].SlaveAddress =
+ smc_dpm_table->i2ccontrollers[i].slaveaddress;
+ ppsmc_pptable->I2cControllers[i].ControllerPort =
+ smc_dpm_table->i2ccontrollers[i].controllerport;
+ ppsmc_pptable->I2cControllers[i].ThermalThrottler =
+ smc_dpm_table->i2ccontrollers[i].thermalthrottler;
+ ppsmc_pptable->I2cControllers[i].I2cProtocol =
+ smc_dpm_table->i2ccontrollers[i].i2cprotocol;
+ ppsmc_pptable->I2cControllers[i].I2cSpeed =
+ smc_dpm_table->i2ccontrollers[i].i2cspeed;
}
return 0;
if (pptable_information->smc_pptable == NULL)
return -ENOMEM;
- if (powerplay_table->smcPPTable.Version <= 2)
- memcpy(pptable_information->smc_pptable,
- &(powerplay_table->smcPPTable),
- sizeof(PPTable_t) -
- sizeof(I2cControllerConfig_t) * I2C_CONTROLLER_NAME_COUNT);
- else
- memcpy(pptable_information->smc_pptable,
- &(powerplay_table->smcPPTable),
- sizeof(PPTable_t));
+ memcpy(pptable_information->smc_pptable,
+ &(powerplay_table->smcPPTable),
+ sizeof(PPTable_t));
+
result = append_vbios_pptable(hwmgr, (pptable_information->smc_pptable));
// any structure is changed in this file
#define SMU11_DRIVER_IF_VERSION 0x12
-#define PPTABLE_V20_SMU_VERSION 2
+#define PPTABLE_V20_SMU_VERSION 3
#define NUM_GFXCLK_DPM_LEVELS 16
#define NUM_VCLK_DPM_LEVELS 8
#define PPSMC_MSG_SetSystemVirtualDramAddrHigh 0x4B
#define PPSMC_MSG_SetSystemVirtualDramAddrLow 0x4C
#define PPSMC_MSG_WaflTest 0x4D
-// Unused ID 0x4E to 0x50
+#define PPSMC_MSG_SetFclkGfxClkRatio 0x4E
+// Unused ID 0x4F to 0x50
#define PPSMC_MSG_AllowGfxOff 0x51
#define PPSMC_MSG_DisallowGfxOff 0x52
#define PPSMC_MSG_GetPptLimit 0x53
result = PHM_WAIT_FIELD_UNEQUAL(hwmgr,
SMU_MP1_SRBM2P_RESP_0, CONTENT, 0);
if (result != 0) {
+ /* Read the last message to SMU, to report actual cause */
+ uint32_t val = cgs_read_register(hwmgr->device,
+ mmSMU_MP1_SRBM2P_MSG_0);
pr_err("smu8_send_msg_to_smc_async (0x%04x) failed\n", msg);
+ pr_err("SMU still servicing msg (0x%04x)\n", val);
return result;
}
MODULE_DEVICE_TABLE(pci, pciidlist);
+static void ast_kick_out_firmware_fb(struct pci_dev *pdev)
+{
+ struct apertures_struct *ap;
+ bool primary = false;
+
+ ap = alloc_apertures(1);
+ if (!ap)
+ return;
+
+ ap->ranges[0].base = pci_resource_start(pdev, 0);
+ ap->ranges[0].size = pci_resource_len(pdev, 0);
+
+#ifdef CONFIG_X86
+ primary = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
+#endif
+ drm_fb_helper_remove_conflicting_framebuffers(ap, "astdrmfb", primary);
+ kfree(ap);
+}
+
static int ast_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
{
+ ast_kick_out_firmware_fb(pdev);
+
return drm_get_pci_dev(pdev, ent, &driver);
}
{
struct ast_framebuffer *afb = &afbdev->afb;
+ drm_crtc_force_disable_all(dev);
drm_fb_helper_unregister_fbi(&afbdev->helper);
if (afb->obj) {
drm_mode_config_cleanup(dev);
ast_mm_fini(ast);
- pci_iounmap(dev->pdev, ast->ioregs);
+ if (ast->ioregs != ast->regs + AST_IO_MM_OFFSET)
+ pci_iounmap(dev->pdev, ast->ioregs);
pci_iounmap(dev->pdev, ast->regs);
kfree(ast);
}
}
ast_bo_unreserve(bo);
+ ast_set_offset_reg(crtc);
ast_set_start_address_crt1(crtc, (u32)gpu_addr);
return 0;
{
struct ast_i2c_chan *i2c = i2c_priv;
struct ast_private *ast = i2c->dev->dev_private;
- uint32_t val;
+ uint32_t val, val2, count, pass;
+
+ count = 0;
+ pass = 0;
+ val = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x10) >> 4) & 0x01;
+ do {
+ val2 = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x10) >> 4) & 0x01;
+ if (val == val2) {
+ pass++;
+ } else {
+ pass = 0;
+ val = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x10) >> 4) & 0x01;
+ }
+ } while ((pass < 5) && (count++ < 0x10000));
- val = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x10) >> 4;
return val & 1 ? 1 : 0;
}
{
struct ast_i2c_chan *i2c = i2c_priv;
struct ast_private *ast = i2c->dev->dev_private;
- uint32_t val;
+ uint32_t val, val2, count, pass;
+
+ count = 0;
+ pass = 0;
+ val = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x20) >> 5) & 0x01;
+ do {
+ val2 = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x20) >> 5) & 0x01;
+ if (val == val2) {
+ pass++;
+ } else {
+ pass = 0;
+ val = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x20) >> 5) & 0x01;
+ }
+ } while ((pass < 5) && (count++ < 0x10000));
- val = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x20) >> 5;
return val & 1 ? 1 : 0;
}
for (i = 0; i < 0x10000; i++) {
ujcrb7 = ((clock & 0x01) ? 0 : 1);
- ast_set_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0xfe, ujcrb7);
+ ast_set_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0xf4, ujcrb7);
jtemp = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x01);
if (ujcrb7 == jtemp)
break;
for (i = 0; i < 0x10000; i++) {
ujcrb7 = ((data & 0x01) ? 0 : 1) << 2;
- ast_set_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0xfb, ujcrb7);
+ ast_set_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0xf1, ujcrb7);
jtemp = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x04);
if (ujcrb7 == jtemp)
break;
ast_set_index_reg(ast, AST_IO_CRTC_PORT, 0xc7, ((y >> 8) & 0x07));
/* dummy write to fire HWC */
- ast_set_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xCB, 0xFF, 0x00);
+ ast_show_cursor(crtc);
return 0;
}
#define SN_AUX_ADDR_7_0_REG 0x76
#define SN_AUX_LENGTH_REG 0x77
#define SN_AUX_CMD_REG 0x78
-#define AUX_CMD_SEND BIT(1)
+#define AUX_CMD_SEND BIT(0)
#define AUX_CMD_REQ(x) ((x) << 4)
#define SN_AUX_RDATA_REG(x) (0x79 + (x))
#define SN_SSC_CONFIG_REG 0x93
unsigned int val;
int ret;
- /*
- * FIXME:
- * This 70ms was found necessary by experimentation. If it's not
- * present, link training fails. It seems like it can go anywhere from
- * pre_enable() up to semi-auto link training initiation below.
- *
- * Neither the datasheet for the bridge nor the panel tested mention a
- * delay of this magnitude in the timing requirements. So for now, add
- * the mystery delay until someone figures out a better fix.
- */
- msleep(70);
-
/* DSI_A lane config */
val = CHA_DSI_LANES(4 - pdata->dsi->lanes);
regmap_update_bits(pdata->regmap, SN_DSI_LANES_REG,
/* configure bridge ref_clk */
ti_sn_bridge_set_refclk_freq(pdata);
- /* in case drm_panel is connected then HPD is not supported */
+ /*
+ * HPD on this bridge chip is a bit useless. This is an eDP bridge
+ * so the HPD is an internal signal that's only there to signal that
+ * the panel is done powering up. ...but the bridge chip debounces
+ * this signal by between 100 ms and 400 ms (depending on process,
+ * voltage, and temperate--I measured it at about 200 ms). One
+ * particular panel asserted HPD 84 ms after it was powered on meaning
+ * that we saw HPD 284 ms after power on. ...but the same panel said
+ * that instead of looking at HPD you could just hardcode a delay of
+ * 200 ms. We'll assume that the panel driver will have the hardcoded
+ * delay in its prepare and always disable HPD.
+ *
+ * If HPD somehow makes sense on some future panel we'll have to
+ * change this to be conditional on someone specifying that HPD should
+ * be used.
+ */
regmap_update_bits(pdata->regmap, SN_HPD_DISABLE_REG, HPD_DISABLE,
HPD_DISABLE);
return 0;
}
+ crtc_state = drm_atomic_get_new_crtc_state(state,
+ new_connector_state->crtc);
+ /*
+ * For compatibility with legacy users, we want to make sure that
+ * we allow DPMS On->Off modesets on unregistered connectors. Modesets
+ * which would result in anything else must be considered invalid, to
+ * avoid turning on new displays on dead connectors.
+ *
+ * Since the connector can be unregistered at any point during an
+ * atomic check or commit, this is racy. But that's OK: all we care
+ * about is ensuring that userspace can't do anything but shut off the
+ * display on a connector that was destroyed after its been notified,
+ * not before.
+ */
+ if (drm_connector_is_unregistered(connector) && crtc_state->active) {
+ DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] is not registered\n",
+ connector->base.id, connector->name);
+ return -EINVAL;
+ }
+
funcs = connector->helper_private;
if (funcs->atomic_best_encoder)
set_best_encoder(state, new_connector_state, new_encoder);
- crtc_state = drm_atomic_get_new_crtc_state(state, new_connector_state->crtc);
crtc_state->connectors_changed = true;
DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] using [ENCODER:%d:%s] on [CRTC:%d:%s]\n",
lockdep_assert_held_once(&dev->master_mutex);
+ WARN_ON(fpriv->is_master);
old_master = fpriv->master;
fpriv->master = drm_master_create(dev);
if (!fpriv->master) {
/* drop references and restore old master on failure */
drm_master_put(&fpriv->master);
fpriv->master = old_master;
+ fpriv->is_master = 0;
return ret;
}
/* The connector should have been removed from userspace long before
* it is finally destroyed.
*/
- if (WARN_ON(connector->registered))
+ if (WARN_ON(connector->registration_state ==
+ DRM_CONNECTOR_REGISTERED))
drm_connector_unregister(connector);
if (connector->tile_group) {
return 0;
mutex_lock(&connector->mutex);
- if (connector->registered)
+ if (connector->registration_state != DRM_CONNECTOR_INITIALIZING)
goto unlock;
ret = drm_sysfs_connector_add(connector);
drm_mode_object_register(connector->dev, &connector->base);
- connector->registered = true;
+ connector->registration_state = DRM_CONNECTOR_REGISTERED;
goto unlock;
err_debugfs:
void drm_connector_unregister(struct drm_connector *connector)
{
mutex_lock(&connector->mutex);
- if (!connector->registered) {
+ if (connector->registration_state != DRM_CONNECTOR_REGISTERED) {
mutex_unlock(&connector->mutex);
return;
}
drm_sysfs_connector_remove(connector);
drm_debugfs_connector_remove(connector);
- connector->registered = false;
+ connector->registration_state = DRM_CONNECTOR_UNREGISTERED;
mutex_unlock(&connector->mutex);
}
EXPORT_SYMBOL(drm_connector_unregister);
mutex_lock(&mgr->lock);
mstb = mgr->mst_primary;
+ if (!mstb)
+ goto out;
+
for (i = 0; i < lct - 1; i++) {
int shift = (i % 2) ? 0 : 4;
int port_num = (rad[i / 2] >> shift) & 0xf;
/* SDC panel of Lenovo B50-80 reports 8 bpc, but is a 6 bpc panel */
{ "SDC", 0x3652, EDID_QUIRK_FORCE_6BPC },
+ /* BOE model 0x0771 reports 8 bpc, but is a 6 bpc panel */
+ { "BOE", 0x0771, EDID_QUIRK_FORCE_6BPC },
+
/* Belinea 10 15 55 */
{ "MAX", 1516, EDID_QUIRK_PREFER_LARGE_60 },
{ "MAX", 0x77e, EDID_QUIRK_PREFER_LARGE_60 },
#if IS_ENABLED(CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM)
static bool drm_leak_fbdev_smem = false;
module_param_unsafe(drm_leak_fbdev_smem, bool, 0600);
-MODULE_PARM_DESC(fbdev_emulation,
+MODULE_PARM_DESC(drm_leak_fbdev_smem,
"Allow unsafe leaking fbdev physical smem address [default=false]");
#endif
mutex_lock(&fb_helper->lock);
drm_connector_list_iter_begin(dev, &conn_iter);
drm_for_each_connector_iter(connector, &conn_iter) {
+ if (connector->connector_type == DRM_MODE_CONNECTOR_WRITEBACK)
+ continue;
+
ret = __drm_fb_helper_add_one_connector(fb_helper, connector);
if (ret)
goto fail;
/**
* drm_driver_legacy_fb_format - compute drm fourcc code from legacy description
+ * @dev: DRM device
* @bpp: bits per pixels
* @depth: bit depth per pixel
- * @native: use host native byte order
*
* Computes a drm fourcc pixel format code for the given @bpp/@depth values.
* Unlike drm_mode_legacy_fb_format() this looks at the drivers mode_config,
int drm_sysfs_connector_add(struct drm_connector *connector);
void drm_sysfs_connector_remove(struct drm_connector *connector);
+void drm_sysfs_lease_event(struct drm_device *dev);
+
/* drm_gem.c */
int drm_gem_init(struct drm_device *dev);
void drm_gem_destroy(struct drm_device *dev);
if (master->lessor) {
/* Tell the master to check the lessee list */
- drm_sysfs_hotplug_event(dev);
+ drm_sysfs_lease_event(dev);
drm_master_put(&master->lessor);
}
connector->kdev = NULL;
}
+void drm_sysfs_lease_event(struct drm_device *dev)
+{
+ char *event_string = "LEASE=1";
+ char *envp[] = { event_string, NULL };
+
+ DRM_DEBUG("generating lease event\n");
+
+ kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
+}
+
/**
* drm_sysfs_hotplug_event - generate a DRM uevent
* @dev: DRM device
* If the GPU managed to complete this jobs fence, the timout is
* spurious. Bail out.
*/
- if (fence_completed(gpu, submit->out_fence->seqno))
+ if (dma_fence_is_signaled(submit->out_fence))
return;
/*
return frm;
}
-static u32 decon_get_vblank_counter(struct exynos_drm_crtc *crtc)
-{
- struct decon_context *ctx = crtc->ctx;
-
- return decon_get_frame_count(ctx, false);
-}
-
static void decon_setup_trigger(struct decon_context *ctx)
{
if (!ctx->crtc->i80_mode && !(ctx->out_type & I80_HW_TRG))
.disable = decon_disable,
.enable_vblank = decon_enable_vblank,
.disable_vblank = decon_disable_vblank,
- .get_vblank_counter = decon_get_vblank_counter,
.atomic_begin = decon_atomic_begin,
.update_plane = decon_update_plane,
.disable_plane = decon_disable_plane,
int ret;
ctx->drm_dev = drm_dev;
- drm_dev->max_vblank_count = 0xffffffff;
for (win = ctx->first_win; win < WINDOWS_NR; win++) {
ctx->configs[win].pixel_formats = decon_formats;
exynos_crtc->ops->disable_vblank(exynos_crtc);
}
-static u32 exynos_drm_crtc_get_vblank_counter(struct drm_crtc *crtc)
-{
- struct exynos_drm_crtc *exynos_crtc = to_exynos_crtc(crtc);
-
- if (exynos_crtc->ops->get_vblank_counter)
- return exynos_crtc->ops->get_vblank_counter(exynos_crtc);
-
- return 0;
-}
-
static const struct drm_crtc_funcs exynos_crtc_funcs = {
.set_config = drm_atomic_helper_set_config,
.page_flip = drm_atomic_helper_page_flip,
.atomic_destroy_state = drm_atomic_helper_crtc_destroy_state,
.enable_vblank = exynos_drm_crtc_enable_vblank,
.disable_vblank = exynos_drm_crtc_disable_vblank,
- .get_vblank_counter = exynos_drm_crtc_get_vblank_counter,
};
struct exynos_drm_crtc *exynos_drm_crtc_create(struct drm_device *drm_dev,
void (*disable)(struct exynos_drm_crtc *crtc);
int (*enable_vblank)(struct exynos_drm_crtc *crtc);
void (*disable_vblank)(struct exynos_drm_crtc *crtc);
- u32 (*get_vblank_counter)(struct exynos_drm_crtc *crtc);
enum drm_mode_status (*mode_valid)(struct exynos_drm_crtc *crtc,
const struct drm_display_mode *mode);
bool (*mode_fixup)(struct exynos_drm_crtc *crtc,
#include <drm/drmP.h>
#include <drm/drm_crtc_helper.h>
+#include <drm/drm_fb_helper.h>
#include <drm/drm_mipi_dsi.h>
#include <drm/drm_panel.h>
#include <drm/drm_atomic_helper.h>
{
struct exynos_dsi *dsi = encoder_to_dsi(encoder);
struct drm_connector *connector = &dsi->connector;
+ struct drm_device *drm = encoder->dev;
int ret;
connector->polled = DRM_CONNECTOR_POLL_HPD;
- ret = drm_connector_init(encoder->dev, connector,
- &exynos_dsi_connector_funcs,
+ ret = drm_connector_init(drm, connector, &exynos_dsi_connector_funcs,
DRM_MODE_CONNECTOR_DSI);
if (ret) {
DRM_ERROR("Failed to initialize connector with drm\n");
connector->status = connector_status_disconnected;
drm_connector_helper_add(connector, &exynos_dsi_connector_helper_funcs);
drm_connector_attach_encoder(connector, encoder);
+ if (!drm->registered)
+ return 0;
+ connector->funcs->reset(connector);
+ drm_fb_helper_add_one_connector(drm->fb_helper, connector);
+ drm_connector_register(connector);
return 0;
}
}
dsi->panel = of_drm_find_panel(device->dev.of_node);
- if (dsi->panel) {
+ if (IS_ERR(dsi->panel)) {
+ dsi->panel = NULL;
+ } else {
drm_panel_attach(dsi->panel, &dsi->connector);
dsi->connector.status = connector_status_connected;
}
struct drm_fb_helper *helper;
int ret;
- if (!dev->mode_config.num_crtc || !dev->mode_config.num_connector)
+ if (!dev->mode_config.num_crtc)
return 0;
fbdev = kzalloc(sizeof(*fbdev), GFP_KERNEL);
}
mutex_lock(&dev_priv->drm.struct_mutex);
+ mmio_hw_access_pre(dev_priv);
ret = i915_gem_gtt_insert(&dev_priv->ggtt.vm, node,
size, I915_GTT_PAGE_SIZE,
I915_COLOR_UNEVICTABLE,
start, end, flags);
+ mmio_hw_access_post(dev_priv);
mutex_unlock(&dev_priv->drm.struct_mutex);
if (ret)
gvt_err("fail to alloc %s gm space from host\n",
vgpu_free_mm(mm);
return ERR_PTR(-ENOMEM);
}
- mm->ggtt_mm.last_partial_off = -1UL;
return mm;
}
invalidate_ppgtt_mm(mm);
} else {
vfree(mm->ggtt_mm.virtual_ggtt);
- mm->ggtt_mm.last_partial_off = -1UL;
}
vgpu_free_mm(mm);
struct intel_gvt_gtt_entry e, m;
dma_addr_t dma_addr;
int ret;
+ struct intel_gvt_partial_pte *partial_pte, *pos, *n;
+ bool partial_update = false;
if (bytes != 4 && bytes != 8)
return -EINVAL;
if (!vgpu_gmadr_is_valid(vgpu, gma))
return 0;
- ggtt_get_guest_entry(ggtt_mm, &e, g_gtt_index);
-
+ e.type = GTT_TYPE_GGTT_PTE;
memcpy((void *)&e.val64 + (off & (info->gtt_entry_size - 1)), p_data,
bytes);
/* If ggtt entry size is 8 bytes, and it's split into two 4 bytes
- * write, we assume the two 4 bytes writes are consecutive.
- * Otherwise, we abort and report error
+ * write, save the first 4 bytes in a list and update virtual
+ * PTE. Only update shadow PTE when the second 4 bytes comes.
*/
if (bytes < info->gtt_entry_size) {
- if (ggtt_mm->ggtt_mm.last_partial_off == -1UL) {
- /* the first partial part*/
- ggtt_mm->ggtt_mm.last_partial_off = off;
- ggtt_mm->ggtt_mm.last_partial_data = e.val64;
- return 0;
- } else if ((g_gtt_index ==
- (ggtt_mm->ggtt_mm.last_partial_off >>
- info->gtt_entry_size_shift)) &&
- (off != ggtt_mm->ggtt_mm.last_partial_off)) {
- /* the second partial part */
-
- int last_off = ggtt_mm->ggtt_mm.last_partial_off &
- (info->gtt_entry_size - 1);
-
- memcpy((void *)&e.val64 + last_off,
- (void *)&ggtt_mm->ggtt_mm.last_partial_data +
- last_off, bytes);
-
- ggtt_mm->ggtt_mm.last_partial_off = -1UL;
- } else {
- int last_offset;
-
- gvt_vgpu_err("failed to populate guest ggtt entry: abnormal ggtt entry write sequence, last_partial_off=%lx, offset=%x, bytes=%d, ggtt entry size=%d\n",
- ggtt_mm->ggtt_mm.last_partial_off, off,
- bytes, info->gtt_entry_size);
-
- /* set host ggtt entry to scratch page and clear
- * virtual ggtt entry as not present for last
- * partially write offset
- */
- last_offset = ggtt_mm->ggtt_mm.last_partial_off &
- (~(info->gtt_entry_size - 1));
-
- ggtt_get_host_entry(ggtt_mm, &m, last_offset);
- ggtt_invalidate_pte(vgpu, &m);
- ops->set_pfn(&m, gvt->gtt.scratch_mfn);
- ops->clear_present(&m);
- ggtt_set_host_entry(ggtt_mm, &m, last_offset);
- ggtt_invalidate(gvt->dev_priv);
-
- ggtt_get_guest_entry(ggtt_mm, &e, last_offset);
- ops->clear_present(&e);
- ggtt_set_guest_entry(ggtt_mm, &e, last_offset);
-
- ggtt_mm->ggtt_mm.last_partial_off = off;
- ggtt_mm->ggtt_mm.last_partial_data = e.val64;
+ bool found = false;
+
+ list_for_each_entry_safe(pos, n,
+ &ggtt_mm->ggtt_mm.partial_pte_list, list) {
+ if (g_gtt_index == pos->offset >>
+ info->gtt_entry_size_shift) {
+ if (off != pos->offset) {
+ /* the second partial part*/
+ int last_off = pos->offset &
+ (info->gtt_entry_size - 1);
+
+ memcpy((void *)&e.val64 + last_off,
+ (void *)&pos->data + last_off,
+ bytes);
+
+ list_del(&pos->list);
+ kfree(pos);
+ found = true;
+ break;
+ }
+
+ /* update of the first partial part */
+ pos->data = e.val64;
+ ggtt_set_guest_entry(ggtt_mm, &e, g_gtt_index);
+ return 0;
+ }
+ }
- return 0;
+ if (!found) {
+ /* the first partial part */
+ partial_pte = kzalloc(sizeof(*partial_pte), GFP_KERNEL);
+ if (!partial_pte)
+ return -ENOMEM;
+ partial_pte->offset = off;
+ partial_pte->data = e.val64;
+ list_add_tail(&partial_pte->list,
+ &ggtt_mm->ggtt_mm.partial_pte_list);
+ partial_update = true;
}
}
- if (ops->test_present(&e)) {
+ if (!partial_update && (ops->test_present(&e))) {
gfn = ops->get_pfn(&e);
m = e;
} else
ops->set_pfn(&m, dma_addr >> PAGE_SHIFT);
} else {
- ggtt_get_host_entry(ggtt_mm, &m, g_gtt_index);
- ggtt_invalidate_pte(vgpu, &m);
ops->set_pfn(&m, gvt->gtt.scratch_mfn);
ops->clear_present(&m);
}
out:
+ ggtt_set_guest_entry(ggtt_mm, &e, g_gtt_index);
+
+ ggtt_get_host_entry(ggtt_mm, &e, g_gtt_index);
+ ggtt_invalidate_pte(vgpu, &e);
+
ggtt_set_host_entry(ggtt_mm, &m, g_gtt_index);
ggtt_invalidate(gvt->dev_priv);
- ggtt_set_guest_entry(ggtt_mm, &e, g_gtt_index);
return 0;
}
intel_vgpu_reset_ggtt(vgpu, false);
+ INIT_LIST_HEAD(>t->ggtt_mm->ggtt_mm.partial_pte_list);
+
return create_scratch_page_tree(vgpu);
}
static void intel_vgpu_destroy_ggtt_mm(struct intel_vgpu *vgpu)
{
+ struct intel_gvt_partial_pte *pos, *next;
+
+ list_for_each_entry_safe(pos, next,
+ &vgpu->gtt.ggtt_mm->ggtt_mm.partial_pte_list,
+ list) {
+ gvt_dbg_mm("partial PTE update on hold 0x%lx : 0x%llx\n",
+ pos->offset, pos->data);
+ kfree(pos);
+ }
intel_vgpu_destroy_mm(vgpu->gtt.ggtt_mm);
vgpu->gtt.ggtt_mm = NULL;
}
#define _GVT_GTT_H_
#define I915_GTT_PAGE_SHIFT 12
-#define I915_GTT_PAGE_MASK (~(I915_GTT_PAGE_SIZE - 1))
struct intel_vgpu_mm;
#define GVT_RING_CTX_NR_PDPS GEN8_3LVL_PDPES
+struct intel_gvt_partial_pte {
+ unsigned long offset;
+ u64 data;
+ struct list_head list;
+};
+
struct intel_vgpu_mm {
enum intel_gvt_mm_type type;
struct intel_vgpu *vgpu;
} ppgtt_mm;
struct {
void *virtual_ggtt;
- unsigned long last_partial_off;
- u64 last_partial_data;
+ struct list_head partial_pte_list;
} ggtt_mm;
};
};
return 0;
}
-static int bxt_edp_psr_imr_iir_write(struct intel_vgpu *vgpu,
+static int edp_psr_imr_iir_write(struct intel_vgpu *vgpu,
unsigned int offset, void *p_data, unsigned int bytes)
{
vgpu_vreg(vgpu, offset) = 0;
MMIO_DFH(_MMIO(0x1a178), D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
MMIO_DFH(_MMIO(0x1a17c), D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
MMIO_DFH(_MMIO(0x2217c), D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
+
+ MMIO_DH(EDP_PSR_IMR, D_BDW_PLUS, NULL, edp_psr_imr_iir_write);
+ MMIO_DH(EDP_PSR_IIR, D_BDW_PLUS, NULL, edp_psr_imr_iir_write);
return 0;
}
MMIO_D(HSW_TVIDEO_DIP_GCP(TRANSCODER_B), D_BXT);
MMIO_D(HSW_TVIDEO_DIP_GCP(TRANSCODER_C), D_BXT);
- MMIO_DH(EDP_PSR_IMR, D_BXT, NULL, bxt_edp_psr_imr_iir_write);
- MMIO_DH(EDP_PSR_IIR, D_BXT, NULL, bxt_edp_psr_imr_iir_write);
-
MMIO_D(RC6_CTX_BASE, D_BXT);
MMIO_D(GEN8_PUSHBUS_CONTROL, D_BXT);
{RCS, GAMT_CHKN_BIT_REG, 0x0, false}, /* 0x4ab8 */
{RCS, GEN9_GAMT_ECO_REG_RW_IA, 0x0, false}, /* 0x4ab0 */
- {RCS, GEN9_CSFE_CHICKEN1_RCS, 0x0, false}, /* 0x20d4 */
+ {RCS, GEN9_CSFE_CHICKEN1_RCS, 0xffff, false}, /* 0x20d4 */
{RCS, GEN8_GARBCNTL, 0x0, false}, /* 0xb004 */
{RCS, GEN7_FF_THREAD_MODE, 0x0, false}, /* 0x20a0 */
int ring_id, i;
for (ring_id = 0; ring_id < ARRAY_SIZE(regs); ring_id++) {
+ if (!HAS_ENGINE(dev_priv, ring_id))
+ continue;
offset.reg = regs[ring_id];
for (i = 0; i < GEN9_MOCS_SIZE; i++) {
gen9_render_mocs.control_table[ring_id][i] =
return -EINVAL;
}
- dram_info->valid_dimm = true;
-
/*
* If any of the channel is single rank channel, worst case output
* will be same as if single rank memory, so consider single rank
return -EINVAL;
}
- if (ch0.is_16gb_dimm || ch1.is_16gb_dimm)
- dram_info->is_16gb_dimm = true;
+ dram_info->is_16gb_dimm = ch0.is_16gb_dimm || ch1.is_16gb_dimm;
dev_priv->dram_info.symmetric_memory = intel_is_dram_symmetric(val_ch0,
val_ch1,
return -EINVAL;
}
- dram_info->valid_dimm = true;
dram_info->valid = true;
return 0;
}
int ret;
dram_info->valid = false;
- dram_info->valid_dimm = false;
- dram_info->is_16gb_dimm = false;
dram_info->rank = I915_DRAM_RANK_INVALID;
dram_info->bandwidth_kbps = 0;
dram_info->num_channels = 0;
+ /*
+ * Assume 16Gb DIMMs are present until proven otherwise.
+ * This is only used for the level 0 watermark latency
+ * w/a which does not apply to bxt/glk.
+ */
+ dram_info->is_16gb_dimm = !IS_GEN9_LP(dev_priv);
+
if (INTEL_GEN(dev_priv) < 9 || IS_GEMINILAKE(dev_priv))
return;
struct dram_info {
bool valid;
- bool valid_dimm;
bool is_16gb_dimm;
u8 num_channels;
enum dram_rank {
* any non-page-aligned or non-canonical addresses.
*/
if (unlikely(entry->flags & EXEC_OBJECT_PINNED &&
- entry->offset != gen8_canonical_addr(entry->offset & PAGE_MASK)))
+ entry->offset != gen8_canonical_addr(entry->offset & I915_GTT_PAGE_MASK)))
return -EINVAL;
/* pad_to_size was once a reserved field, so sanitize it */
else if (gen >= 4)
len = 4;
else
- len = 3;
+ len = 6;
batch = reloc_gpu(eb, vma, len);
if (IS_ERR(batch))
*batch++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
*batch++ = addr;
*batch++ = target_offset;
+
+ /* And again for good measure (blb/pnv) */
+ *batch++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
+ *batch++ = addr;
+ *batch++ = target_offset;
}
goto out;
if (i == 4)
continue;
- seq_printf(m, "\t\t(%03d, %04d) %08lx: ",
+ seq_printf(m, "\t\t(%03d, %04d) %08llx: ",
pde, pte,
(pde * GEN6_PTES + pte) * I915_GTT_PAGE_SIZE);
for (i = 0; i < 4; i++) {
ggtt->vm.insert_page = bxt_vtd_ggtt_insert_page__BKL;
if (ggtt->vm.clear_range != nop_clear_range)
ggtt->vm.clear_range = bxt_vtd_ggtt_clear_range__BKL;
+
+ /* Prevent recursively calling stop_machine() and deadlocks. */
+ dev_info(dev_priv->drm.dev,
+ "Disabling error capture for VT-d workaround\n");
+ i915_disable_error_state(dev_priv, -ENODEV);
}
ggtt->invalidate = gen6_ggtt_invalidate;
#include "i915_selftest.h"
#include "i915_timeline.h"
-#define I915_GTT_PAGE_SIZE_4K BIT(12)
-#define I915_GTT_PAGE_SIZE_64K BIT(16)
-#define I915_GTT_PAGE_SIZE_2M BIT(21)
+#define I915_GTT_PAGE_SIZE_4K BIT_ULL(12)
+#define I915_GTT_PAGE_SIZE_64K BIT_ULL(16)
+#define I915_GTT_PAGE_SIZE_2M BIT_ULL(21)
#define I915_GTT_PAGE_SIZE I915_GTT_PAGE_SIZE_4K
#define I915_GTT_MAX_PAGE_SIZE I915_GTT_PAGE_SIZE_2M
+#define I915_GTT_PAGE_MASK -I915_GTT_PAGE_SIZE
+
#define I915_GTT_MIN_ALIGNMENT I915_GTT_PAGE_SIZE
#define I915_FENCE_REG_NONE -1
u64 start, u64 end, unsigned int flags);
/* Flags used by pin/bind&friends. */
-#define PIN_NONBLOCK BIT(0)
-#define PIN_MAPPABLE BIT(1)
-#define PIN_ZONE_4G BIT(2)
-#define PIN_NONFAULT BIT(3)
-#define PIN_NOEVICT BIT(4)
-
-#define PIN_MBZ BIT(5) /* I915_VMA_PIN_OVERFLOW */
-#define PIN_GLOBAL BIT(6) /* I915_VMA_GLOBAL_BIND */
-#define PIN_USER BIT(7) /* I915_VMA_LOCAL_BIND */
-#define PIN_UPDATE BIT(8)
-
-#define PIN_HIGH BIT(9)
-#define PIN_OFFSET_BIAS BIT(10)
-#define PIN_OFFSET_FIXED BIT(11)
+#define PIN_NONBLOCK BIT_ULL(0)
+#define PIN_MAPPABLE BIT_ULL(1)
+#define PIN_ZONE_4G BIT_ULL(2)
+#define PIN_NONFAULT BIT_ULL(3)
+#define PIN_NOEVICT BIT_ULL(4)
+
+#define PIN_MBZ BIT_ULL(5) /* I915_VMA_PIN_OVERFLOW */
+#define PIN_GLOBAL BIT_ULL(6) /* I915_VMA_GLOBAL_BIND */
+#define PIN_USER BIT_ULL(7) /* I915_VMA_LOCAL_BIND */
+#define PIN_UPDATE BIT_ULL(8)
+
+#define PIN_HIGH BIT_ULL(9)
+#define PIN_OFFSET_BIAS BIT_ULL(10)
+#define PIN_OFFSET_FIXED BIT_ULL(11)
#define PIN_OFFSET_MASK (-I915_GTT_PAGE_SIZE)
#endif
return 0;
}
+ if (IS_ERR(error))
+ return PTR_ERR(error);
+
if (*error->error_msg)
err_printf(m, "%s\n", error->error_msg);
err_printf(m, "Kernel: " UTS_RELEASE "\n");
error = i915_capture_gpu_state(i915);
if (!error) {
DRM_DEBUG_DRIVER("out of memory, not capturing error state\n");
+ i915_disable_error_state(i915, -ENOMEM);
return;
}
i915->gpu_error.first_error = NULL;
spin_unlock_irq(&i915->gpu_error.lock);
- i915_gpu_state_put(error);
+ if (!IS_ERR(error))
+ i915_gpu_state_put(error);
+}
+
+void i915_disable_error_state(struct drm_i915_private *i915, int err)
+{
+ spin_lock_irq(&i915->gpu_error.lock);
+ if (!i915->gpu_error.first_error)
+ i915->gpu_error.first_error = ERR_PTR(err);
+ spin_unlock_irq(&i915->gpu_error.lock);
}
struct i915_gpu_state *i915_first_error_state(struct drm_i915_private *i915);
void i915_reset_error_state(struct drm_i915_private *i915);
+void i915_disable_error_state(struct drm_i915_private *i915, int err);
#else
static inline struct i915_gpu_state *
i915_first_error_state(struct drm_i915_private *i915)
{
- return NULL;
+ return ERR_PTR(-ENODEV);
}
static inline void i915_reset_error_state(struct drm_i915_private *i915)
{
}
+static inline void i915_disable_error_state(struct drm_i915_private *i915,
+ int err)
+{
+}
+
#endif /* IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) */
#endif /* _I915_GPU_ERROR_H_ */
/* ICL PHY DFLEX registers */
#define PORT_TX_DFLEXDPMLE1 _MMIO(0x1638C0)
-#define DFLEXDPMLE1_DPMLETC_MASK(n) (0xf << (4 * (n)))
-#define DFLEXDPMLE1_DPMLETC(n, x) ((x) << (4 * (n)))
+#define DFLEXDPMLE1_DPMLETC_MASK(tc_port) (0xf << (4 * (tc_port)))
+#define DFLEXDPMLE1_DPMLETC_ML0(tc_port) (1 << (4 * (tc_port)))
+#define DFLEXDPMLE1_DPMLETC_ML1_0(tc_port) (3 << (4 * (tc_port)))
+#define DFLEXDPMLE1_DPMLETC_ML3(tc_port) (8 << (4 * (tc_port)))
+#define DFLEXDPMLE1_DPMLETC_ML3_2(tc_port) (12 << (4 * (tc_port)))
+#define DFLEXDPMLE1_DPMLETC_ML3_0(tc_port) (15 << (4 * (tc_port)))
/* BXT PHY Ref registers */
#define _PORT_REF_DW3_A 0x16218C
#define DRM_DIP_ENABLE (1 << 28)
#define PSR_VSC_BIT_7_SET (1 << 27)
-#define VSC_SELECT_MASK (0x3 << 26)
-#define VSC_SELECT_SHIFT 26
-#define VSC_DIP_HW_HEA_DATA (0 << 26)
-#define VSC_DIP_HW_HEA_SW_DATA (1 << 26)
-#define VSC_DIP_HW_DATA_SW_HEA (2 << 26)
-#define VSC_DIP_SW_HEA_DATA (3 << 26)
+#define VSC_SELECT_MASK (0x3 << 25)
+#define VSC_SELECT_SHIFT 25
+#define VSC_DIP_HW_HEA_DATA (0 << 25)
+#define VSC_DIP_HW_HEA_SW_DATA (1 << 25)
+#define VSC_DIP_HW_DATA_SW_HEA (2 << 25)
+#define VSC_DIP_SW_HEA_DATA (3 << 25)
#define VDIP_ENABLE_PPS (1 << 24)
/* Panel power sequencing */
/* HDMI N/CTS table */
#define TMDS_297M 297000
#define TMDS_296M 296703
+#define TMDS_594M 594000
+#define TMDS_593M 593407
+
static const struct {
int sample_rate;
int clock;
{ 176400, TMDS_297M, 18816, 247500 },
{ 192000, TMDS_296M, 23296, 281250 },
{ 192000, TMDS_297M, 20480, 247500 },
+ { 44100, TMDS_593M, 8918, 937500 },
+ { 44100, TMDS_594M, 9408, 990000 },
+ { 48000, TMDS_593M, 5824, 562500 },
+ { 48000, TMDS_594M, 6144, 594000 },
+ { 32000, TMDS_593M, 5824, 843750 },
+ { 32000, TMDS_594M, 3072, 445500 },
+ { 88200, TMDS_593M, 17836, 937500 },
+ { 88200, TMDS_594M, 18816, 990000 },
+ { 96000, TMDS_593M, 11648, 562500 },
+ { 96000, TMDS_594M, 12288, 594000 },
+ { 176400, TMDS_593M, 35672, 937500 },
+ { 176400, TMDS_594M, 37632, 990000 },
+ { 192000, TMDS_593M, 23296, 562500 },
+ { 192000, TMDS_594M, 24576, 594000 },
};
/* get AUD_CONFIG_PIXEL_CLOCK_HDMI_* value for mode */
static int intel_pixel_rate_to_cdclk(struct drm_i915_private *dev_priv,
int pixel_rate)
{
- if (INTEL_GEN(dev_priv) >= 10)
+ if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
return DIV_ROUND_UP(pixel_rate, 2);
- else if (IS_GEMINILAKE(dev_priv))
- /*
- * FIXME: Avoid using a pixel clock that is more than 99% of the cdclk
- * as a temporary workaround. Use a higher cdclk instead. (Note that
- * intel_compute_max_dotclk() limits the max pixel clock to 99% of max
- * cdclk.)
- */
- return DIV_ROUND_UP(pixel_rate * 100, 2 * 99);
else if (IS_GEN9(dev_priv) ||
IS_BROADWELL(dev_priv) || IS_HASWELL(dev_priv))
return pixel_rate;
{
int max_cdclk_freq = dev_priv->max_cdclk_freq;
- if (INTEL_GEN(dev_priv) >= 10)
+ if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
return 2 * max_cdclk_freq;
- else if (IS_GEMINILAKE(dev_priv))
- /*
- * FIXME: Limiting to 99% as a temporary workaround. See
- * intel_min_cdclk() for details.
- */
- return 2 * max_cdclk_freq * 99 / 100;
else if (IS_GEN9(dev_priv) ||
IS_BROADWELL(dev_priv) || IS_HASWELL(dev_priv))
return max_cdclk_freq;
u8 eu_disabled_mask;
u32 n_disabled;
- if (!(sseu->subslice_mask[ss] & BIT(ss)))
+ if (!(sseu->subslice_mask[s] & BIT(ss)))
/* skip disabled subslice */
continue;
return;
valid_fb:
+ intel_state->base.rotation = plane_config->rotation;
intel_fill_fb_ggtt_view(&intel_state->view, fb,
intel_state->base.rotation);
intel_state->color_plane[0].stride =
* chroma samples for both of the luma samples, and thus we don't
* actually get the expected MPEG2 chroma siting convention :(
* The same behaviour is observed on pre-SKL platforms as well.
+ *
+ * Theory behind the formula (note that we ignore sub-pixel
+ * source coordinates):
+ * s = source sample position
+ * d = destination sample position
+ *
+ * Downscaling 4:1:
+ * -0.5
+ * | 0.0
+ * | | 1.5 (initial phase)
+ * | | |
+ * v v v
+ * | s | s | s | s |
+ * | d |
+ *
+ * Upscaling 1:4:
+ * -0.5
+ * | -0.375 (initial phase)
+ * | | 0.0
+ * | | |
+ * v v v
+ * | s |
+ * | d | d | d | d |
*/
-u16 skl_scaler_calc_phase(int sub, bool chroma_cosited)
+u16 skl_scaler_calc_phase(int sub, int scale, bool chroma_cosited)
{
int phase = -0x8000;
u16 trip = 0;
if (chroma_cosited)
phase += (sub - 1) * 0x8000 / sub;
+ phase += scale / (2 * sub);
+
+ /*
+ * Hardware initial phase limited to [-0.5:1.5].
+ * Since the max hardware scale factor is 3.0, we
+ * should never actually excdeed 1.0 here.
+ */
+ WARN_ON(phase < -0x8000 || phase > 0x18000);
+
if (phase < 0)
phase = 0x10000 + phase;
else
if (crtc->config->pch_pfit.enabled) {
u16 uv_rgb_hphase, uv_rgb_vphase;
+ int pfit_w, pfit_h, hscale, vscale;
int id;
if (WARN_ON(crtc->config->scaler_state.scaler_id < 0))
return;
- uv_rgb_hphase = skl_scaler_calc_phase(1, false);
- uv_rgb_vphase = skl_scaler_calc_phase(1, false);
+ pfit_w = (crtc->config->pch_pfit.size >> 16) & 0xFFFF;
+ pfit_h = crtc->config->pch_pfit.size & 0xFFFF;
+
+ hscale = (crtc->config->pipe_src_w << 16) / pfit_w;
+ vscale = (crtc->config->pipe_src_h << 16) / pfit_h;
+
+ uv_rgb_hphase = skl_scaler_calc_phase(1, hscale, false);
+ uv_rgb_vphase = skl_scaler_calc_phase(1, vscale, false);
id = scaler_state->scaler_id;
I915_WRITE(SKL_PS_CTRL(pipe, id), PS_SCALER_EN |
plane_config->tiling = I915_TILING_X;
fb->modifier = I915_FORMAT_MOD_X_TILED;
}
+
+ if (val & DISPPLANE_ROTATE_180)
+ plane_config->rotation = DRM_MODE_ROTATE_180;
}
+ if (IS_CHERRYVIEW(dev_priv) && pipe == PIPE_B &&
+ val & DISPPLANE_MIRROR)
+ plane_config->rotation |= DRM_MODE_REFLECT_X;
+
pixel_format = val & DISPPLANE_PIXFORMAT_MASK;
fourcc = i9xx_format_to_fourcc(pixel_format);
fb->format = drm_format_info(fourcc);
goto error;
}
+ /*
+ * DRM_MODE_ROTATE_ is counter clockwise to stay compatible with Xrandr
+ * while i915 HW rotation is clockwise, thats why this swapping.
+ */
+ switch (val & PLANE_CTL_ROTATE_MASK) {
+ case PLANE_CTL_ROTATE_0:
+ plane_config->rotation = DRM_MODE_ROTATE_0;
+ break;
+ case PLANE_CTL_ROTATE_90:
+ plane_config->rotation = DRM_MODE_ROTATE_270;
+ break;
+ case PLANE_CTL_ROTATE_180:
+ plane_config->rotation = DRM_MODE_ROTATE_180;
+ break;
+ case PLANE_CTL_ROTATE_270:
+ plane_config->rotation = DRM_MODE_ROTATE_90;
+ break;
+ }
+
+ if (INTEL_GEN(dev_priv) >= 10 &&
+ val & PLANE_CTL_FLIP_HORIZONTAL)
+ plane_config->rotation |= DRM_MODE_REFLECT_X;
+
base = I915_READ(PLANE_SURF(pipe, plane_id)) & 0xfffff000;
plane_config->base = base;
intel_check_cpu_fifo_underruns(dev_priv);
intel_check_pch_fifo_underruns(dev_priv);
- if (!new_crtc_state->active) {
- /*
- * Make sure we don't call initial_watermarks
- * for ILK-style watermark updates.
- *
- * No clue what this is supposed to achieve.
- */
- if (INTEL_GEN(dev_priv) >= 9)
- dev_priv->display.initial_watermarks(intel_state,
- to_intel_crtc_state(new_crtc_state));
- }
+ /* FIXME unify this for all platforms */
+ if (!new_crtc_state->active &&
+ !HAS_GMCH_DISPLAY(dev_priv) &&
+ dev_priv->display.initial_watermarks)
+ dev_priv->display.initial_watermarks(intel_state,
+ to_intel_crtc_state(new_crtc_state));
}
}
fb->height < SKL_MIN_YUV_420_SRC_H ||
(fb->width % 4) != 0 || (fb->height % 4) != 0)) {
DRM_DEBUG_KMS("src dimensions not correct for NV12\n");
- return -EINVAL;
+ goto err;
}
for (i = 0; i < fb->format->num_planes; i++) {
ret = drm_atomic_add_affected_planes(state, crtc);
if (ret)
goto out;
+
+ /*
+ * FIXME hack to force a LUT update to avoid the
+ * plane update forcing the pipe gamma on without
+ * having a proper LUT loaded. Remove once we
+ * have readout for pipe gamma enable.
+ */
+ crtc_state->color_mgmt_changed = true;
}
}
*/
status = connector_status_disconnected;
goto out;
- } else {
- /*
- * If display is now connected check links status,
- * there has been known issues of link loss triggering
- * long pulse.
- *
- * Some sinks (eg. ASUS PB287Q) seem to perform some
- * weird HPD ping pong during modesets. So we can apparently
- * end up with HPD going low during a modeset, and then
- * going back up soon after. And once that happens we must
- * retrain the link to get a picture. That's in case no
- * userspace component reacted to intermittent HPD dip.
- */
+ }
+
+ /*
+ * Some external monitors do not signal loss of link synchronization
+ * with an IRQ_HPD, so force a link status check.
+ */
+ if (!intel_dp_is_edp(intel_dp)) {
struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base;
intel_dp_retrain_link(encoder, ctx);
pipe_config->pbn = mst_pbn;
/* Zombie connectors can't have VCPI slots */
- if (READ_ONCE(connector->registered)) {
+ if (!drm_connector_is_unregistered(connector)) {
slots = drm_dp_atomic_find_vcpi_slots(state,
&intel_dp->mst_mgr,
port,
struct edid *edid;
int ret;
- if (!READ_ONCE(connector->registered))
+ if (drm_connector_is_unregistered(connector))
return intel_connector_update_modes(connector, NULL);
edid = drm_dp_mst_get_edid(connector, &intel_dp->mst_mgr, intel_connector->port);
struct intel_connector *intel_connector = to_intel_connector(connector);
struct intel_dp *intel_dp = intel_connector->mst_port;
- if (!READ_ONCE(connector->registered))
+ if (drm_connector_is_unregistered(connector))
return connector_status_disconnected;
return drm_dp_mst_detect_port(connector, &intel_dp->mst_mgr,
intel_connector->port);
int bpp = 24; /* MST uses fixed bpp */
int max_rate, mode_rate, max_lanes, max_link_clock;
- if (!READ_ONCE(connector->registered))
+ if (drm_connector_is_unregistered(connector))
return MODE_ERROR;
if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
if (!intel_connector)
return NULL;
+ intel_connector->get_hw_state = intel_dp_mst_get_hw_state;
+ intel_connector->mst_port = intel_dp;
+ intel_connector->port = port;
+
connector = &intel_connector->base;
ret = drm_connector_init(dev, connector, &intel_dp_mst_connector_funcs,
DRM_MODE_CONNECTOR_DisplayPort);
drm_connector_helper_add(connector, &intel_dp_mst_connector_helper_funcs);
- intel_connector->get_hw_state = intel_dp_mst_get_hw_state;
- intel_connector->mst_port = intel_dp;
- intel_connector->port = port;
-
for_each_pipe(dev_priv, pipe) {
struct drm_encoder *enc =
&intel_dp->mst_encoders[pipe]->base.base;
unsigned int tiling;
int size;
u32 base;
+ u8 rotation;
};
#define SKL_MIN_SRC_W 8
void intel_crtc_arm_fifo_underrun(struct intel_crtc *crtc,
struct intel_crtc_state *crtc_state);
-u16 skl_scaler_calc_phase(int sub, bool chroma_center);
+u16 skl_scaler_calc_phase(int sub, int scale, bool chroma_center);
int skl_update_scaler_crtc(struct intel_crtc_state *crtc_state);
int skl_max_scale(const struct intel_crtc_state *crtc_state,
u32 pixel_format);
drm_for_each_connector_iter(connector, &conn_iter) {
struct intel_connector *intel_connector = to_intel_connector(connector);
- if (intel_connector->encoder->hpd_pin == pin) {
+ /* Don't check MST ports, they don't have pins */
+ if (!intel_connector->mst_port &&
+ intel_connector->encoder->hpd_pin == pin) {
if (connector->polled != intel_connector->polled)
DRM_DEBUG_DRIVER("Reenabling HPD on connector %s\n",
connector->name);
struct intel_encoder *encoder;
bool storm_detected = false;
bool queue_dig = false, queue_hp = false;
+ u32 long_hpd_pulse_mask = 0;
+ u32 short_hpd_pulse_mask = 0;
+ enum hpd_pin pin;
if (!pin_mask)
return;
spin_lock(&dev_priv->irq_lock);
+
+ /*
+ * Determine whether ->hpd_pulse() exists for each pin, and
+ * whether we have a short or a long pulse. This is needed
+ * as each pin may have up to two encoders (HDMI and DP) and
+ * only the one of them (DP) will have ->hpd_pulse().
+ */
for_each_intel_encoder(&dev_priv->drm, encoder) {
- enum hpd_pin pin = encoder->hpd_pin;
bool has_hpd_pulse = intel_encoder_has_hpd_pulse(encoder);
+ enum port port = encoder->port;
+ bool long_hpd;
+ pin = encoder->hpd_pin;
if (!(BIT(pin) & pin_mask))
continue;
- if (has_hpd_pulse) {
- bool long_hpd = long_mask & BIT(pin);
- enum port port = encoder->port;
+ if (!has_hpd_pulse)
+ continue;
- DRM_DEBUG_DRIVER("digital hpd port %c - %s\n", port_name(port),
- long_hpd ? "long" : "short");
- /*
- * For long HPD pulses we want to have the digital queue happen,
- * but we still want HPD storm detection to function.
- */
- queue_dig = true;
- if (long_hpd) {
- dev_priv->hotplug.long_port_mask |= (1 << port);
- } else {
- /* for short HPD just trigger the digital queue */
- dev_priv->hotplug.short_port_mask |= (1 << port);
- continue;
- }
+ long_hpd = long_mask & BIT(pin);
+
+ DRM_DEBUG_DRIVER("digital hpd port %c - %s\n", port_name(port),
+ long_hpd ? "long" : "short");
+ queue_dig = true;
+
+ if (long_hpd) {
+ long_hpd_pulse_mask |= BIT(pin);
+ dev_priv->hotplug.long_port_mask |= BIT(port);
+ } else {
+ short_hpd_pulse_mask |= BIT(pin);
+ dev_priv->hotplug.short_port_mask |= BIT(port);
}
+ }
+
+ /* Now process each pin just once */
+ for_each_hpd_pin(pin) {
+ bool long_hpd;
+
+ if (!(BIT(pin) & pin_mask))
+ continue;
if (dev_priv->hotplug.stats[pin].state == HPD_DISABLED) {
/*
if (dev_priv->hotplug.stats[pin].state != HPD_ENABLED)
continue;
- if (!has_hpd_pulse) {
+ /*
+ * Delegate to ->hpd_pulse() if one of the encoders for this
+ * pin has it, otherwise let the hotplug_work deal with this
+ * pin directly.
+ */
+ if (((short_hpd_pulse_mask | long_hpd_pulse_mask) & BIT(pin))) {
+ long_hpd = long_hpd_pulse_mask & BIT(pin);
+ } else {
dev_priv->hotplug.event_bits |= BIT(pin);
+ long_hpd = true;
queue_hp = true;
}
+ if (!long_hpd)
+ continue;
+
if (intel_hpd_irq_storm_detect(dev_priv, pin)) {
dev_priv->hotplug.event_bits &= ~BIT(pin);
storm_detected = true;
lpe_audio_platdev_destroy(dev_priv);
irq_free_desc(dev_priv->lpe_audio.irq);
-}
+ dev_priv->lpe_audio.irq = -1;
+ dev_priv->lpe_audio.platdev = NULL;
+}
/**
* intel_lpe_audio_notify() - notify lpe audio event
reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
- /* True 32b PPGTT with dynamic page allocation: update PDP
+ /*
+ * True 32b PPGTT with dynamic page allocation: update PDP
* registers and point the unallocated PDPs to scratch page.
* PML4 is allocated during ppgtt init, so this is not needed
* in 48-bit mode.
if (ppgtt && !i915_vm_is_48bit(&ppgtt->vm))
execlists_update_context_pdps(ppgtt, reg_state);
+ /*
+ * Make sure the context image is complete before we submit it to HW.
+ *
+ * Ostensibly, writes (including the WCB) should be flushed prior to
+ * an uncached write such as our mmio register access, the empirical
+ * evidence (esp. on Braswell) suggests that the WC write into memory
+ * may not be visible to the HW prior to the completion of the UC
+ * register write and that we may begin execution from the context
+ * before its image is complete leading to invalid PD chasing.
+ */
+ wmb();
return ce->lrc_desc;
}
uint32_t method1, method2;
int cpp;
+ if (mem_value == 0)
+ return U32_MAX;
+
if (!intel_wm_plane_visible(cstate, pstate))
return 0;
uint32_t method1, method2;
int cpp;
+ if (mem_value == 0)
+ return U32_MAX;
+
if (!intel_wm_plane_visible(cstate, pstate))
return 0;
{
int cpp;
+ if (mem_value == 0)
+ return U32_MAX;
+
if (!intel_wm_plane_visible(cstate, pstate))
return 0;
* any underrun. If not able to get Dimm info assume 16GB dimm
* to avoid any underrun.
*/
- if (!dev_priv->dram_info.valid_dimm ||
- dev_priv->dram_info.is_16gb_dimm)
+ if (dev_priv->dram_info.is_16gb_dimm)
wm[0] += 1;
} else if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) {
intel_print_wm_latency(dev_priv, "Cursor", dev_priv->wm.cur_latency);
}
+static void snb_wm_lp3_irq_quirk(struct drm_i915_private *dev_priv)
+{
+ /*
+ * On some SNB machines (Thinkpad X220 Tablet at least)
+ * LP3 usage can cause vblank interrupts to be lost.
+ * The DEIIR bit will go high but it looks like the CPU
+ * never gets interrupted.
+ *
+ * It's not clear whether other interrupt source could
+ * be affected or if this is somehow limited to vblank
+ * interrupts only. To play it safe we disable LP3
+ * watermarks entirely.
+ */
+ if (dev_priv->wm.pri_latency[3] == 0 &&
+ dev_priv->wm.spr_latency[3] == 0 &&
+ dev_priv->wm.cur_latency[3] == 0)
+ return;
+
+ dev_priv->wm.pri_latency[3] = 0;
+ dev_priv->wm.spr_latency[3] = 0;
+ dev_priv->wm.cur_latency[3] = 0;
+
+ DRM_DEBUG_KMS("LP3 watermarks disabled due to potential for lost interrupts\n");
+ intel_print_wm_latency(dev_priv, "Primary", dev_priv->wm.pri_latency);
+ intel_print_wm_latency(dev_priv, "Sprite", dev_priv->wm.spr_latency);
+ intel_print_wm_latency(dev_priv, "Cursor", dev_priv->wm.cur_latency);
+}
+
static void ilk_setup_wm_latency(struct drm_i915_private *dev_priv)
{
intel_read_wm_latency(dev_priv, dev_priv->wm.pri_latency);
intel_print_wm_latency(dev_priv, "Sprite", dev_priv->wm.spr_latency);
intel_print_wm_latency(dev_priv, "Cursor", dev_priv->wm.cur_latency);
- if (IS_GEN6(dev_priv))
+ if (IS_GEN6(dev_priv)) {
snb_wm_latency_quirk(dev_priv);
+ snb_wm_lp3_irq_quirk(dev_priv);
+ }
}
static void skl_setup_wm_latency(struct drm_i915_private *dev_priv)
gen4_render_ring_flush(struct i915_request *rq, u32 mode)
{
u32 cmd, *cs;
+ int i;
/*
* read/write caches:
cmd |= MI_INVALIDATE_ISP;
}
- cs = intel_ring_begin(rq, 2);
+ i = 2;
+ if (mode & EMIT_INVALIDATE)
+ i += 20;
+
+ cs = intel_ring_begin(rq, i);
if (IS_ERR(cs))
return PTR_ERR(cs);
*cs++ = cmd;
- *cs++ = MI_NOOP;
+
+ /*
+ * A random delay to let the CS invalidate take effect? Without this
+ * delay, the GPU relocation path fails as the CS does not see
+ * the updated contents. Just as important, if we apply the flushes
+ * to the EMIT_FLUSH branch (i.e. immediately after the relocation
+ * write and before the invalidate on the next batch), the relocations
+ * still fail. This implies that is a delay following invalidation
+ * that is required to reset the caches as opposed to a delay to
+ * ensure the memory is written.
+ */
+ if (mode & EMIT_INVALIDATE) {
+ *cs++ = GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE;
+ *cs++ = i915_ggtt_offset(rq->engine->scratch) |
+ PIPE_CONTROL_GLOBAL_GTT;
+ *cs++ = 0;
+ *cs++ = 0;
+
+ for (i = 0; i < 12; i++)
+ *cs++ = MI_FLUSH;
+
+ *cs++ = GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE;
+ *cs++ = i915_ggtt_offset(rq->engine->scratch) |
+ PIPE_CONTROL_GLOBAL_GTT;
+ *cs++ = 0;
+ *cs++ = 0;
+ }
+
+ *cs++ = cmd;
+
intel_ring_advance(rq, cs);
return 0;
.hsw.has_fuses = true,
},
},
+ {
+ .name = "DC off",
+ .domains = ICL_DISPLAY_DC_OFF_POWER_DOMAINS,
+ .ops = &gen9_dc_off_power_well_ops,
+ .id = DISP_PW_ID_NONE,
+ },
{
.name = "power well 2",
.domains = ICL_PW_2_POWER_DOMAINS,
.hsw.has_fuses = true,
},
},
- {
- .name = "DC off",
- .domains = ICL_DISPLAY_DC_OFF_POWER_DOMAINS,
- .ops = &gen9_dc_off_power_well_ops,
- .id = DISP_PW_ID_NONE,
- },
{
.name = "power well 3",
.domains = ICL_PW_3_POWER_DOMAINS,
void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
u8 req_slices)
{
- u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
- u32 val;
+ const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
bool ret;
if (req_slices > intel_dbuf_max_slices(dev_priv)) {
if (req_slices == hw_enabled_slices || req_slices == 0)
return;
- val = I915_READ(DBUF_CTL_S2);
if (req_slices > hw_enabled_slices)
ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, true);
else
return min(8192 * cpp, 32768);
}
+static void
+skl_program_scaler(struct intel_plane *plane,
+ const struct intel_crtc_state *crtc_state,
+ const struct intel_plane_state *plane_state)
+{
+ struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
+ enum pipe pipe = plane->pipe;
+ int scaler_id = plane_state->scaler_id;
+ const struct intel_scaler *scaler =
+ &crtc_state->scaler_state.scalers[scaler_id];
+ int crtc_x = plane_state->base.dst.x1;
+ int crtc_y = plane_state->base.dst.y1;
+ uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
+ uint32_t crtc_h = drm_rect_height(&plane_state->base.dst);
+ u16 y_hphase, uv_rgb_hphase;
+ u16 y_vphase, uv_rgb_vphase;
+ int hscale, vscale;
+
+ hscale = drm_rect_calc_hscale(&plane_state->base.src,
+ &plane_state->base.dst,
+ 0, INT_MAX);
+ vscale = drm_rect_calc_vscale(&plane_state->base.src,
+ &plane_state->base.dst,
+ 0, INT_MAX);
+
+ /* TODO: handle sub-pixel coordinates */
+ if (plane_state->base.fb->format->format == DRM_FORMAT_NV12) {
+ y_hphase = skl_scaler_calc_phase(1, hscale, false);
+ y_vphase = skl_scaler_calc_phase(1, vscale, false);
+
+ /* MPEG2 chroma siting convention */
+ uv_rgb_hphase = skl_scaler_calc_phase(2, hscale, true);
+ uv_rgb_vphase = skl_scaler_calc_phase(2, vscale, false);
+ } else {
+ /* not used */
+ y_hphase = 0;
+ y_vphase = 0;
+
+ uv_rgb_hphase = skl_scaler_calc_phase(1, hscale, false);
+ uv_rgb_vphase = skl_scaler_calc_phase(1, vscale, false);
+ }
+
+ I915_WRITE_FW(SKL_PS_CTRL(pipe, scaler_id),
+ PS_SCALER_EN | PS_PLANE_SEL(plane->id) | scaler->mode);
+ I915_WRITE_FW(SKL_PS_PWR_GATE(pipe, scaler_id), 0);
+ I915_WRITE_FW(SKL_PS_VPHASE(pipe, scaler_id),
+ PS_Y_PHASE(y_vphase) | PS_UV_RGB_PHASE(uv_rgb_vphase));
+ I915_WRITE_FW(SKL_PS_HPHASE(pipe, scaler_id),
+ PS_Y_PHASE(y_hphase) | PS_UV_RGB_PHASE(uv_rgb_hphase));
+ I915_WRITE_FW(SKL_PS_WIN_POS(pipe, scaler_id), (crtc_x << 16) | crtc_y);
+ I915_WRITE_FW(SKL_PS_WIN_SZ(pipe, scaler_id), (crtc_w << 16) | crtc_h);
+}
+
void
skl_update_plane(struct intel_plane *plane,
const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state)
{
struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
- const struct drm_framebuffer *fb = plane_state->base.fb;
enum plane_id plane_id = plane->id;
enum pipe pipe = plane->pipe;
u32 plane_ctl = plane_state->ctl;
u32 aux_stride = skl_plane_stride(plane_state, 1);
int crtc_x = plane_state->base.dst.x1;
int crtc_y = plane_state->base.dst.y1;
- uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
- uint32_t crtc_h = drm_rect_height(&plane_state->base.dst);
uint32_t x = plane_state->color_plane[0].x;
uint32_t y = plane_state->color_plane[0].y;
uint32_t src_w = drm_rect_width(&plane_state->base.src) >> 16;
/* Sizes are 0 based */
src_w--;
src_h--;
- crtc_w--;
- crtc_h--;
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
(plane_state->color_plane[1].y << 16) |
plane_state->color_plane[1].x);
- /* program plane scaler */
if (plane_state->scaler_id >= 0) {
- int scaler_id = plane_state->scaler_id;
- const struct intel_scaler *scaler =
- &crtc_state->scaler_state.scalers[scaler_id];
- u16 y_hphase, uv_rgb_hphase;
- u16 y_vphase, uv_rgb_vphase;
-
- /* TODO: handle sub-pixel coordinates */
- if (fb->format->format == DRM_FORMAT_NV12) {
- y_hphase = skl_scaler_calc_phase(1, false);
- y_vphase = skl_scaler_calc_phase(1, false);
-
- /* MPEG2 chroma siting convention */
- uv_rgb_hphase = skl_scaler_calc_phase(2, true);
- uv_rgb_vphase = skl_scaler_calc_phase(2, false);
- } else {
- /* not used */
- y_hphase = 0;
- y_vphase = 0;
-
- uv_rgb_hphase = skl_scaler_calc_phase(1, false);
- uv_rgb_vphase = skl_scaler_calc_phase(1, false);
- }
-
- I915_WRITE_FW(SKL_PS_CTRL(pipe, scaler_id),
- PS_SCALER_EN | PS_PLANE_SEL(plane_id) | scaler->mode);
- I915_WRITE_FW(SKL_PS_PWR_GATE(pipe, scaler_id), 0);
- I915_WRITE_FW(SKL_PS_VPHASE(pipe, scaler_id),
- PS_Y_PHASE(y_vphase) | PS_UV_RGB_PHASE(uv_rgb_vphase));
- I915_WRITE_FW(SKL_PS_HPHASE(pipe, scaler_id),
- PS_Y_PHASE(y_hphase) | PS_UV_RGB_PHASE(uv_rgb_hphase));
- I915_WRITE_FW(SKL_PS_WIN_POS(pipe, scaler_id), (crtc_x << 16) | crtc_y);
- I915_WRITE_FW(SKL_PS_WIN_SZ(pipe, scaler_id),
- ((crtc_w + 1) << 16)|(crtc_h + 1));
+ skl_program_scaler(plane, crtc_state, plane_state);
I915_WRITE_FW(PLANE_POS(pipe, plane_id), 0);
} else {
err = igt_check_page_sizes(vma);
if (vma->page_sizes.gtt != I915_GTT_PAGE_SIZE_4K) {
- pr_err("page_sizes.gtt=%u, expected %lu\n",
+ pr_err("page_sizes.gtt=%u, expected %llu\n",
vma->page_sizes.gtt, I915_GTT_PAGE_SIZE_4K);
err = -EINVAL;
}
GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
if (vma->node.start != total ||
vma->node.size != 2*I915_GTT_PAGE_SIZE) {
- pr_err("i915_gem_gtt_reserve (pass 1) placement failed, found (%llx + %llx), expected (%llx + %lx)\n",
+ pr_err("i915_gem_gtt_reserve (pass 1) placement failed, found (%llx + %llx), expected (%llx + %llx)\n",
vma->node.start, vma->node.size,
total, 2*I915_GTT_PAGE_SIZE);
err = -EINVAL;
GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
if (vma->node.start != total ||
vma->node.size != 2*I915_GTT_PAGE_SIZE) {
- pr_err("i915_gem_gtt_reserve (pass 2) placement failed, found (%llx + %llx), expected (%llx + %lx)\n",
+ pr_err("i915_gem_gtt_reserve (pass 2) placement failed, found (%llx + %llx), expected (%llx + %llx)\n",
vma->node.start, vma->node.size,
total, 2*I915_GTT_PAGE_SIZE);
err = -EINVAL;
GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
if (vma->node.start != offset ||
vma->node.size != 2*I915_GTT_PAGE_SIZE) {
- pr_err("i915_gem_gtt_reserve (pass 3) placement failed, found (%llx + %llx), expected (%llx + %lx)\n",
+ pr_err("i915_gem_gtt_reserve (pass 3) placement failed, found (%llx + %llx), expected (%llx + %llx)\n",
vma->node.start, vma->node.size,
offset, 2*I915_GTT_PAGE_SIZE);
err = -EINVAL;
struct drm_crtc base;
struct drm_pending_vblank_event *event;
struct meson_drm *priv;
+ bool enabled;
};
#define to_meson_crtc(x) container_of(x, struct meson_crtc, base)
};
-static void meson_crtc_atomic_enable(struct drm_crtc *crtc,
- struct drm_crtc_state *old_state)
+static void meson_crtc_enable(struct drm_crtc *crtc)
{
struct meson_crtc *meson_crtc = to_meson_crtc(crtc);
struct drm_crtc_state *crtc_state = crtc->state;
writel_bits_relaxed(VPP_POSTBLEND_ENABLE, VPP_POSTBLEND_ENABLE,
priv->io_base + _REG(VPP_MISC));
+ drm_crtc_vblank_on(crtc);
+
+ meson_crtc->enabled = true;
+}
+
+static void meson_crtc_atomic_enable(struct drm_crtc *crtc,
+ struct drm_crtc_state *old_state)
+{
+ struct meson_crtc *meson_crtc = to_meson_crtc(crtc);
+ struct meson_drm *priv = meson_crtc->priv;
+
+ DRM_DEBUG_DRIVER("\n");
+
+ if (!meson_crtc->enabled)
+ meson_crtc_enable(crtc);
+
priv->viu.osd1_enabled = true;
}
struct meson_crtc *meson_crtc = to_meson_crtc(crtc);
struct meson_drm *priv = meson_crtc->priv;
+ drm_crtc_vblank_off(crtc);
+
priv->viu.osd1_enabled = false;
priv->viu.osd1_commit = false;
crtc->state->event = NULL;
}
+
+ meson_crtc->enabled = false;
}
static void meson_crtc_atomic_begin(struct drm_crtc *crtc,
struct meson_crtc *meson_crtc = to_meson_crtc(crtc);
unsigned long flags;
+ if (crtc->state->enable && !meson_crtc->enabled)
+ meson_crtc_enable(crtc);
+
if (crtc->state->event) {
WARN_ON(drm_crtc_vblank_get(crtc) != 0);
.reg_read = meson_dw_hdmi_reg_read,
.reg_write = meson_dw_hdmi_reg_write,
.max_register = 0x10000,
+ .fast_io = true,
};
static bool meson_hdmi_connector_is_available(struct device *dev)
*/
/* HHI Registers */
+#define HHI_GCLK_MPEG2 0x148 /* 0x52 offset in data sheet */
#define HHI_VDAC_CNTL0 0x2F4 /* 0xbd offset in data sheet */
#define HHI_VDAC_CNTL1 0x2F8 /* 0xbe offset in data sheet */
#define HHI_HDMI_PHY_CNTL0 0x3a0 /* 0xe8 offset in data sheet */
{ 5, &meson_hdmi_encp_mode_1080i60 },
{ 20, &meson_hdmi_encp_mode_1080i50 },
{ 32, &meson_hdmi_encp_mode_1080p24 },
+ { 33, &meson_hdmi_encp_mode_1080p50 },
{ 34, &meson_hdmi_encp_mode_1080p30 },
{ 31, &meson_hdmi_encp_mode_1080p50 },
{ 16, &meson_hdmi_encp_mode_1080p60 },
unsigned int sof_lines;
unsigned int vsync_lines;
+ /* Use VENCI for 480i and 576i and double HDMI pixels */
+ if (mode->flags & DRM_MODE_FLAG_DBLCLK) {
+ hdmi_repeat = true;
+ use_enci = true;
+ venc_hdmi_latency = 1;
+ }
+
if (meson_venc_hdmi_supported_vic(vic)) {
vmode = meson_venc_hdmi_get_vic_vmode(vic);
if (!vmode) {
} else {
meson_venc_hdmi_get_dmt_vmode(mode, &vmode_dmt);
vmode = &vmode_dmt;
- }
-
- /* Use VENCI for 480i and 576i and double HDMI pixels */
- if (mode->flags & DRM_MODE_FLAG_DBLCLK) {
- hdmi_repeat = true;
- use_enci = true;
- venc_hdmi_latency = 1;
+ use_enci = false;
}
/* Repeat VENC pixels for 480/576i/p, 720p50/60 and 1080p50/60 */
void meson_venc_enable_vsync(struct meson_drm *priv)
{
writel_relaxed(2, priv->io_base + _REG(VENC_INTCTRL));
+ regmap_update_bits(priv->hhi, HHI_GCLK_MPEG2, BIT(25), BIT(25));
}
void meson_venc_disable_vsync(struct meson_drm *priv)
{
+ regmap_update_bits(priv->hhi, HHI_GCLK_MPEG2, BIT(25), 0);
writel_relaxed(0, priv->io_base + _REG(VENC_INTCTRL));
}
if (lut_sel == VIU_LUT_OSD_OETF) {
writel(0, priv->io_base + _REG(addr_port));
- for (i = 0; i < 20; i++)
+ for (i = 0; i < (OSD_OETF_LUT_SIZE / 2); i++)
writel(r_map[i * 2] | (r_map[i * 2 + 1] << 16),
priv->io_base + _REG(data_port));
writel(r_map[OSD_OETF_LUT_SIZE - 1] | (g_map[0] << 16),
priv->io_base + _REG(data_port));
- for (i = 0; i < 20; i++)
+ for (i = 0; i < (OSD_OETF_LUT_SIZE / 2); i++)
writel(g_map[i * 2 + 1] | (g_map[i * 2 + 2] << 16),
priv->io_base + _REG(data_port));
- for (i = 0; i < 20; i++)
+ for (i = 0; i < (OSD_OETF_LUT_SIZE / 2); i++)
writel(b_map[i * 2] | (b_map[i * 2 + 1] << 16),
priv->io_base + _REG(data_port));
} else if (lut_sel == VIU_LUT_OSD_EOTF) {
writel(0, priv->io_base + _REG(addr_port));
- for (i = 0; i < 20; i++)
+ for (i = 0; i < (OSD_EOTF_LUT_SIZE / 2); i++)
writel(r_map[i * 2] | (r_map[i * 2 + 1] << 16),
priv->io_base + _REG(data_port));
writel(r_map[OSD_EOTF_LUT_SIZE - 1] | (g_map[0] << 16),
priv->io_base + _REG(data_port));
- for (i = 0; i < 20; i++)
+ for (i = 0; i < (OSD_EOTF_LUT_SIZE / 2); i++)
writel(g_map[i * 2 + 1] | (g_map[i * 2 + 2] << 16),
priv->io_base + _REG(data_port));
- for (i = 0; i < 20; i++)
+ for (i = 0; i < (OSD_EOTF_LUT_SIZE / 2); i++)
writel(b_map[i * 2] | (b_map[i * 2 + 1] << 16),
priv->io_base + _REG(data_port));
NULL);
drm_crtc_helper_add(crtc, &dpu_crtc_helper_funcs);
- plane->crtc = crtc;
/* save user friendly CRTC name for later */
snprintf(dpu_crtc->name, DPU_CRTC_NAME_SIZE, "crtc%u", crtc->base.id);
drm_encoder_cleanup(drm_enc);
mutex_destroy(&dpu_enc->enc_lock);
-
- kfree(dpu_enc);
}
void dpu_encoder_helper_split_config(
INTERLEAVED_RGB_FMT(XBGR8888,
COLOR_8BIT, COLOR_8BIT, COLOR_8BIT, COLOR_8BIT,
C2_R_Cr, C0_G_Y, C1_B_Cb, C3_ALPHA, 4,
- true, 4, 0,
+ false, 4, 0,
DPU_FETCH_LINEAR, 1),
INTERLEAVED_RGB_FMT(RGBA8888,
#define DSI_PIXEL_PLL_CLK 1
#define NUM_PROVIDED_CLKS 2
+#define VCO_REF_CLK_RATE 19200000
+
struct dsi_pll_regs {
u32 pll_prop_gain_rate;
u32 pll_lockdet_rate;
parent_rate);
pll_10nm->vco_current_rate = rate;
- pll_10nm->vco_ref_clk_rate = parent_rate;
+ pll_10nm->vco_ref_clk_rate = VCO_REF_CLK_RATE;
dsi_pll_setup_config(pll_10nm);
goto fail;
}
+ ret = msm_hdmi_hpd_enable(hdmi->connector);
+ if (ret < 0) {
+ DRM_DEV_ERROR(&hdmi->pdev->dev, "failed to enable HPD: %d\n", ret);
+ goto fail;
+ }
+
encoder->bridge = hdmi->bridge;
priv->bridges[priv->num_bridges++] = hdmi->bridge;
{
struct drm_device *drm = dev_get_drvdata(master);
struct msm_drm_private *priv = drm->dev_private;
- static struct hdmi_platform_config *hdmi_cfg;
+ struct hdmi_platform_config *hdmi_cfg;
struct hdmi *hdmi;
struct device_node *of_node = dev->of_node;
int i, err;
void msm_hdmi_connector_irq(struct drm_connector *connector);
struct drm_connector *msm_hdmi_connector_init(struct hdmi *hdmi);
+int msm_hdmi_hpd_enable(struct drm_connector *connector);
/*
* i2c adapter for ddc:
}
}
-static int hpd_enable(struct hdmi_connector *hdmi_connector)
+int msm_hdmi_hpd_enable(struct drm_connector *connector)
{
+ struct hdmi_connector *hdmi_connector = to_hdmi_connector(connector);
struct hdmi *hdmi = hdmi_connector->hdmi;
const struct hdmi_platform_config *config = hdmi->config;
struct device *dev = &hdmi->pdev->dev;
{
struct drm_connector *connector = NULL;
struct hdmi_connector *hdmi_connector;
- int ret;
hdmi_connector = kzalloc(sizeof(*hdmi_connector), GFP_KERNEL);
if (!hdmi_connector)
connector->interlace_allowed = 0;
connector->doublescan_allowed = 0;
- ret = hpd_enable(hdmi_connector);
- if (ret) {
- dev_err(&hdmi->pdev->dev, "failed to enable HPD: %d\n", ret);
- return ERR_PTR(ret);
- }
-
drm_connector_attach_encoder(connector, hdmi->encoder);
return connector;
if (!new_crtc_state->active)
continue;
+ if (drm_crtc_vblank_get(crtc))
+ continue;
+
kms->funcs->wait_for_crtc_commit_done(kms, crtc);
+
+ drm_crtc_vblank_put(crtc);
}
}
ret = mutex_lock_interruptible(&dev->struct_mutex);
if (ret)
- return ret;
+ goto free_priv;
pm_runtime_get_sync(&gpu->pdev->dev);
show_priv->state = gpu->funcs->gpu_state_get(gpu);
if (IS_ERR(show_priv->state)) {
ret = PTR_ERR(show_priv->state);
- kfree(show_priv);
- return ret;
+ goto free_priv;
}
show_priv->dev = dev;
- return single_open(file, msm_gpu_show, show_priv);
+ ret = single_open(file, msm_gpu_show, show_priv);
+ if (ret)
+ goto free_priv;
+
+ return 0;
+
+free_priv:
+ kfree(show_priv);
+ return ret;
}
static const struct file_operations msm_gpu_fops = {
kthread_run(kthread_worker_fn,
&priv->disp_thread[i].worker,
"crtc_commit:%d", priv->disp_thread[i].crtc_id);
- ret = sched_setscheduler(priv->disp_thread[i].thread,
- SCHED_FIFO, ¶m);
- if (ret)
- pr_warn("display thread priority update failed: %d\n",
- ret);
-
if (IS_ERR(priv->disp_thread[i].thread)) {
dev_err(dev, "failed to create crtc_commit kthread\n");
priv->disp_thread[i].thread = NULL;
+ goto err_msm_uninit;
}
+ ret = sched_setscheduler(priv->disp_thread[i].thread,
+ SCHED_FIFO, ¶m);
+ if (ret)
+ dev_warn(dev, "disp_thread set priority failed: %d\n",
+ ret);
+
/* initialize event thread */
priv->event_thread[i].crtc_id = priv->crtcs[i]->base.id;
kthread_init_worker(&priv->event_thread[i].worker);
kthread_run(kthread_worker_fn,
&priv->event_thread[i].worker,
"crtc_event:%d", priv->event_thread[i].crtc_id);
+ if (IS_ERR(priv->event_thread[i].thread)) {
+ dev_err(dev, "failed to create crtc_event kthread\n");
+ priv->event_thread[i].thread = NULL;
+ goto err_msm_uninit;
+ }
+
/**
* event thread should also run at same priority as disp_thread
* because it is handling frame_done events. A lower priority
* failure at crtc commit level.
*/
ret = sched_setscheduler(priv->event_thread[i].thread,
- SCHED_FIFO, ¶m);
+ SCHED_FIFO, ¶m);
if (ret)
- pr_warn("display event thread priority update failed: %d\n",
- ret);
-
- if (IS_ERR(priv->event_thread[i].thread)) {
- dev_err(dev, "failed to create crtc_event kthread\n");
- priv->event_thread[i].thread = NULL;
- }
-
- if ((!priv->disp_thread[i].thread) ||
- !priv->event_thread[i].thread) {
- /* clean up previously created threads if any */
- for ( ; i >= 0; i--) {
- if (priv->disp_thread[i].thread) {
- kthread_stop(
- priv->disp_thread[i].thread);
- priv->disp_thread[i].thread = NULL;
- }
-
- if (priv->event_thread[i].thread) {
- kthread_stop(
- priv->event_thread[i].thread);
- priv->event_thread[i].thread = NULL;
- }
- }
- goto err_msm_uninit;
- }
+ dev_warn(dev, "event_thread set priority failed:%d\n",
+ ret);
}
ret = drm_vblank_init(ddev, priv->num_crtcs);
uint32_t *ptr;
int ret = 0;
+ if (!nr_relocs)
+ return 0;
+
if (offset % 4) {
DRM_ERROR("non-aligned cmdstream buffer: %u\n", offset);
return -EINVAL;
struct msm_file_private *ctx = file->driver_priv;
struct msm_gem_submit *submit;
struct msm_gpu *gpu = priv->gpu;
- struct dma_fence *in_fence = NULL;
struct sync_file *sync_file = NULL;
struct msm_gpu_submitqueue *queue;
struct msm_ringbuffer *ring;
ring = gpu->rb[queue->prio];
if (args->flags & MSM_SUBMIT_FENCE_FD_IN) {
+ struct dma_fence *in_fence;
+
in_fence = sync_file_get_fence(args->fence_fd);
if (!in_fence)
* Wait if the fence is from a foreign context, or if the fence
* array contains any fence from a foreign context.
*/
- if (!dma_fence_match_context(in_fence, ring->fctx->context)) {
+ ret = 0;
+ if (!dma_fence_match_context(in_fence, ring->fctx->context))
ret = dma_fence_wait(in_fence, true);
- if (ret)
- return ret;
- }
+
+ dma_fence_put(in_fence);
+ if (ret)
+ return ret;
}
ret = mutex_lock_interruptible(&dev->struct_mutex);
}
out:
- if (in_fence)
- dma_fence_put(in_fence);
submit_cleanup(submit);
if (ret)
msm_gem_submit_free(submit);
{
struct msm_gpu_state *state;
+ /* Check if the target supports capturing crash state */
+ if (!gpu->funcs->gpu_state_get)
+ return;
+
/* Only save one crash state at a time */
if (gpu->crashstate)
return;
if (submit) {
struct task_struct *task;
- rcu_read_lock();
- task = pid_task(submit->pid, PIDTYPE_PID);
+ task = get_pid_task(submit->pid, PIDTYPE_PID);
if (task) {
- comm = kstrdup(task->comm, GFP_ATOMIC);
+ comm = kstrdup(task->comm, GFP_KERNEL);
/*
* So slightly annoying, in other paths like
* about the submit going away.
*/
mutex_unlock(&dev->struct_mutex);
- cmd = kstrdup_quotable_cmdline(task, GFP_ATOMIC);
+ cmd = kstrdup_quotable_cmdline(task, GFP_KERNEL);
+ put_task_struct(task);
mutex_lock(&dev->struct_mutex);
}
- rcu_read_unlock();
if (comm && cmd) {
dev_err(dev->dev, "%s: offending task: %s (%s)\n",
// pm_runtime_get_sync(mmu->dev);
ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
// pm_runtime_put_sync(mmu->dev);
- WARN_ON(ret < 0);
+ WARN_ON(!ret);
return (ret == len) ? 0 : -EINVAL;
}
uint64_t iova, uint32_t size)
{
struct msm_gem_object *obj = submit->bos[idx].obj;
+ unsigned offset = 0;
const char *buf;
if (iova) {
- buf += iova - submit->bos[idx].iova;
+ offset = iova - submit->bos[idx].iova;
} else {
iova = submit->bos[idx].iova;
size = obj->base.size;
if (IS_ERR(buf))
return;
+ buf += offset;
+
rd_write_section(rd, RD_BUFFER_CONTENTS, buf, size);
msm_gem_put_vaddr(&obj->base);
{
struct nv50_head *head = nv50_head(connector_state->crtc);
struct nv50_mstc *mstc = nv50_mstc(connector);
- if (mstc->port) {
- struct nv50_mstm *mstm = mstc->mstm;
- return &mstm->msto[head->base.index]->encoder;
- }
- return NULL;
+
+ return &mstc->mstm->msto[head->base.index]->encoder;
}
static struct drm_encoder *
nv50_mstc_best_encoder(struct drm_connector *connector)
{
struct nv50_mstc *mstc = nv50_mstc(connector);
- if (mstc->port) {
- struct nv50_mstm *mstm = mstc->mstm;
- return &mstm->msto[0]->encoder;
- }
- return NULL;
+
+ return &mstc->mstm->msto[0]->encoder;
}
static enum drm_mode_status
dssdev->type = OMAP_DISPLAY_TYPE_DPI;
dssdev->owner = THIS_MODULE;
dssdev->of_ports = BIT(0);
+ drm_bus_flags_from_videomode(&ddata->vm, &dssdev->bus_flags);
omapdss_display_init(dssdev);
omapdss_device_register(dssdev);
/* DSI on OMAP3 doesn't have register DSI_GNQ, set number
* of data to 3 by default */
- if (dsi->data->quirks & DSI_QUIRK_GNQ)
+ if (dsi->data->quirks & DSI_QUIRK_GNQ) {
+ dsi_runtime_get(dsi);
/* NB_DATA_LANES */
dsi->num_lanes_supported = 1 + REG_GET(dsi, DSI_GNQ, 11, 9);
- else
+ dsi_runtime_put(dsi);
+ } else {
dsi->num_lanes_supported = 3;
+ }
+
+ r = of_platform_populate(dev->of_node, NULL, NULL, dev);
+ if (r) {
+ DSSERR("Failed to populate DSI child devices: %d\n", r);
+ goto err_pm_disable;
+ }
r = dsi_init_output(dsi);
if (r)
- goto err_pm_disable;
+ goto err_of_depopulate;
r = dsi_probe_of(dsi);
if (r) {
goto err_uninit_output;
}
- r = of_platform_populate(dev->of_node, NULL, NULL, dev);
- if (r)
- DSSERR("Failed to populate DSI child devices: %d\n", r);
-
r = component_add(&pdev->dev, &dsi_component_ops);
if (r)
goto err_uninit_output;
err_uninit_output:
dsi_uninit_output(dsi);
+err_of_depopulate:
+ of_platform_depopulate(dev);
err_pm_disable:
pm_runtime_disable(dev);
return r;
/* wait for current handler to finish before turning the DSI off */
synchronize_irq(dsi->irq);
- dispc_runtime_put(dsi->dss->dispc);
-
return 0;
}
static int dsi_runtime_resume(struct device *dev)
{
struct dsi_data *dsi = dev_get_drvdata(dev);
- int r;
-
- r = dispc_runtime_get(dsi->dss->dispc);
- if (r)
- return r;
dsi->is_enabled = true;
/* ensure the irq handler sees the is_enabled value */
dss);
/* Add all the child devices as components. */
+ r = of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev);
+ if (r)
+ goto err_uninit_debugfs;
+
omapdss_gather_components(&pdev->dev);
device_for_each_child(&pdev->dev, &match, dss_add_child_component);
r = component_master_add_with_match(&pdev->dev, &dss_component_ops, match);
if (r)
- goto err_uninit_debugfs;
+ goto err_of_depopulate;
return 0;
+err_of_depopulate:
+ of_platform_depopulate(&pdev->dev);
+
err_uninit_debugfs:
dss_debugfs_remove_file(dss->debugfs.clk);
dss_debugfs_remove_file(dss->debugfs.dss);
{
struct dss_device *dss = platform_get_drvdata(pdev);
+ of_platform_depopulate(&pdev->dev);
+
component_master_del(&pdev->dev, &dss_component_ops);
dss_debugfs_remove_file(dss->debugfs.clk);
hdmi->dss = dss;
- r = hdmi_pll_init(dss, hdmi->pdev, &hdmi->pll, &hdmi->wp);
+ r = hdmi_runtime_get(hdmi);
if (r)
return r;
+ r = hdmi_pll_init(dss, hdmi->pdev, &hdmi->pll, &hdmi->wp);
+ if (r)
+ goto err_runtime_put;
+
r = hdmi4_cec_init(hdmi->pdev, &hdmi->core, &hdmi->wp);
if (r)
goto err_pll_uninit;
hdmi->debugfs = dss_debugfs_create_file(dss, "hdmi", hdmi_dump_regs,
hdmi);
+ hdmi_runtime_put(hdmi);
+
return 0;
err_cec_uninit:
hdmi4_cec_uninit(&hdmi->core);
err_pll_uninit:
hdmi_pll_uninit(&hdmi->pll);
+err_runtime_put:
+ hdmi_runtime_put(hdmi);
return r;
}
return 0;
}
-static int hdmi_runtime_suspend(struct device *dev)
-{
- struct omap_hdmi *hdmi = dev_get_drvdata(dev);
-
- dispc_runtime_put(hdmi->dss->dispc);
-
- return 0;
-}
-
-static int hdmi_runtime_resume(struct device *dev)
-{
- struct omap_hdmi *hdmi = dev_get_drvdata(dev);
- int r;
-
- r = dispc_runtime_get(hdmi->dss->dispc);
- if (r < 0)
- return r;
-
- return 0;
-}
-
-static const struct dev_pm_ops hdmi_pm_ops = {
- .runtime_suspend = hdmi_runtime_suspend,
- .runtime_resume = hdmi_runtime_resume,
-};
-
static const struct of_device_id hdmi_of_match[] = {
{ .compatible = "ti,omap4-hdmi", },
{},
.remove = hdmi4_remove,
.driver = {
.name = "omapdss_hdmi",
- .pm = &hdmi_pm_ops,
.of_match_table = hdmi_of_match,
.suppress_bind_attrs = true,
},
return 0;
}
-static int hdmi_runtime_suspend(struct device *dev)
-{
- struct omap_hdmi *hdmi = dev_get_drvdata(dev);
-
- dispc_runtime_put(hdmi->dss->dispc);
-
- return 0;
-}
-
-static int hdmi_runtime_resume(struct device *dev)
-{
- struct omap_hdmi *hdmi = dev_get_drvdata(dev);
- int r;
-
- r = dispc_runtime_get(hdmi->dss->dispc);
- if (r < 0)
- return r;
-
- return 0;
-}
-
-static const struct dev_pm_ops hdmi_pm_ops = {
- .runtime_suspend = hdmi_runtime_suspend,
- .runtime_resume = hdmi_runtime_resume,
-};
-
static const struct of_device_id hdmi_of_match[] = {
{ .compatible = "ti,omap5-hdmi", },
{ .compatible = "ti,dra7-hdmi", },
.remove = hdmi5_remove,
.driver = {
.name = "omapdss_hdmi5",
- .pm = &hdmi_pm_ops,
.of_match_table = hdmi_of_match,
.suppress_bind_attrs = true,
},
const struct omap_dss_driver *driver;
const struct omap_dss_device_ops *ops;
unsigned long ops_flags;
- unsigned long bus_flags;
+ u32 bus_flags;
/* helper variable for driver suspend/resume */
bool activate_after_resume;
if (venc->tv_dac_clk)
clk_disable_unprepare(venc->tv_dac_clk);
- dispc_runtime_put(venc->dss->dispc);
-
return 0;
}
static int venc_runtime_resume(struct device *dev)
{
struct venc_device *venc = dev_get_drvdata(dev);
- int r;
-
- r = dispc_runtime_get(venc->dss->dispc);
- if (r < 0)
- return r;
if (venc->tv_dac_clk)
clk_prepare_enable(venc->tv_dac_clk);
static void omap_crtc_atomic_enable(struct drm_crtc *crtc,
struct drm_crtc_state *old_state)
{
+ struct omap_drm_private *priv = crtc->dev->dev_private;
struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
int ret;
DBG("%s", omap_crtc->name);
+ priv->dispc_ops->runtime_get(priv->dispc);
+
spin_lock_irq(&crtc->dev->event_lock);
drm_crtc_vblank_on(crtc);
ret = drm_crtc_vblank_get(crtc);
static void omap_crtc_atomic_disable(struct drm_crtc *crtc,
struct drm_crtc_state *old_state)
{
+ struct omap_drm_private *priv = crtc->dev->dev_private;
struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
DBG("%s", omap_crtc->name);
spin_unlock_irq(&crtc->dev->event_lock);
drm_crtc_vblank_off(crtc);
+
+ priv->dispc_ops->runtime_put(priv->dispc);
}
static enum drm_mode_status omap_crtc_mode_valid(struct drm_crtc *crtc,
.destroy = omap_encoder_destroy,
};
+static void omap_encoder_hdmi_mode_set(struct drm_encoder *encoder,
+ struct drm_display_mode *adjusted_mode)
+{
+ struct drm_device *dev = encoder->dev;
+ struct omap_encoder *omap_encoder = to_omap_encoder(encoder);
+ struct omap_dss_device *dssdev = omap_encoder->output;
+ struct drm_connector *connector;
+ bool hdmi_mode;
+
+ hdmi_mode = false;
+ list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
+ if (connector->encoder == encoder) {
+ hdmi_mode = omap_connector_get_hdmi_mode(connector);
+ break;
+ }
+ }
+
+ if (dssdev->ops->hdmi.set_hdmi_mode)
+ dssdev->ops->hdmi.set_hdmi_mode(dssdev, hdmi_mode);
+
+ if (hdmi_mode && dssdev->ops->hdmi.set_infoframe) {
+ struct hdmi_avi_infoframe avi;
+ int r;
+
+ r = drm_hdmi_avi_infoframe_from_display_mode(&avi, adjusted_mode,
+ false);
+ if (r == 0)
+ dssdev->ops->hdmi.set_infoframe(dssdev, &avi);
+ }
+}
+
static void omap_encoder_mode_set(struct drm_encoder *encoder,
struct drm_display_mode *mode,
struct drm_display_mode *adjusted_mode)
{
- struct drm_device *dev = encoder->dev;
struct omap_encoder *omap_encoder = to_omap_encoder(encoder);
- struct drm_connector *connector;
struct omap_dss_device *dssdev;
struct videomode vm = { 0 };
- bool hdmi_mode;
- int r;
drm_display_mode_to_videomode(adjusted_mode, &vm);
}
/* Set the HDMI mode and HDMI infoframe if applicable. */
- hdmi_mode = false;
- list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
- if (connector->encoder == encoder) {
- hdmi_mode = omap_connector_get_hdmi_mode(connector);
- break;
- }
- }
-
- dssdev = omap_encoder->output;
-
- if (dssdev->ops->hdmi.set_hdmi_mode)
- dssdev->ops->hdmi.set_hdmi_mode(dssdev, hdmi_mode);
-
- if (hdmi_mode && dssdev->ops->hdmi.set_infoframe) {
- struct hdmi_avi_infoframe avi;
-
- r = drm_hdmi_avi_infoframe_from_display_mode(&avi, adjusted_mode,
- false);
- if (r == 0)
- dssdev->ops->hdmi.set_infoframe(dssdev, &avi);
- }
+ if (omap_encoder->output->output_type == OMAP_DISPLAY_TYPE_HDMI)
+ omap_encoder_hdmi_mode_set(encoder, adjusted_mode);
}
static void omap_encoder_disable(struct drm_encoder *encoder)
/**
* @prepare: the time (in milliseconds) that it takes for the panel to
* become ready and start receiving video data
+ * @hpd_absent_delay: Add this to the prepare delay if we know Hot
+ * Plug Detect isn't used.
* @enable: the time (in milliseconds) that it takes for the panel to
* display the first valid frame after starting to receive
* video data
*/
struct {
unsigned int prepare;
+ unsigned int hpd_absent_delay;
unsigned int enable;
unsigned int disable;
unsigned int unprepare;
struct drm_panel base;
bool prepared;
bool enabled;
+ bool no_hpd;
const struct panel_desc *desc;
static int panel_simple_prepare(struct drm_panel *panel)
{
struct panel_simple *p = to_panel_simple(panel);
+ unsigned int delay;
int err;
if (p->prepared)
gpiod_set_value_cansleep(p->enable_gpio, 1);
- if (p->desc->delay.prepare)
- msleep(p->desc->delay.prepare);
+ delay = p->desc->delay.prepare;
+ if (p->no_hpd)
+ delay += p->desc->delay.hpd_absent_delay;
+ if (delay)
+ msleep(delay);
p->prepared = true;
panel->prepared = false;
panel->desc = desc;
+ panel->no_hpd = of_property_read_bool(dev->of_node, "no-hpd");
+
panel->supply = devm_regulator_get(dev, "power");
if (IS_ERR(panel->supply))
return PTR_ERR(panel->supply);
},
};
-static const struct drm_display_mode innolux_tv123wam_mode = {
+static const struct drm_display_mode innolux_p120zdg_bf1_mode = {
.clock = 206016,
.hdisplay = 2160,
.hsync_start = 2160 + 48,
.flags = DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC,
};
-static const struct panel_desc innolux_tv123wam = {
- .modes = &innolux_tv123wam_mode,
+static const struct panel_desc innolux_p120zdg_bf1 = {
+ .modes = &innolux_p120zdg_bf1_mode,
.num_modes = 1,
.bpc = 8,
.size = {
- .width = 259,
- .height = 173,
+ .width = 254,
+ .height = 169,
},
.delay = {
+ .hpd_absent_delay = 200,
.unprepare = 500,
},
};
.compatible = "innolux,n156bge-l21",
.data = &innolux_n156bge_l21,
}, {
- .compatible = "innolux,tv123wam",
- .data = &innolux_tv123wam,
+ .compatible = "innolux,p120zdg-bf1",
+ .data = &innolux_p120zdg_bf1,
}, {
.compatible = "innolux,zj070na-01p",
.data = &innolux_zj070na_01p,
static void __rcar_du_group_start_stop(struct rcar_du_group *rgrp, bool start)
{
- struct rcar_du_crtc *rcrtc = &rgrp->dev->crtcs[rgrp->index * 2];
+ struct rcar_du_device *rcdu = rgrp->dev;
+
+ /*
+ * Group start/stop is controlled by the DRES and DEN bits of DSYSR0
+ * for the first group and DSYSR2 for the second group. On most DU
+ * instances, this maps to the first CRTC of the group, and we can just
+ * use rcar_du_crtc_dsysr_clr_set() to access the correct DSYSR. On
+ * M3-N, however, DU2 doesn't exist, but DSYSR2 does. We thus need to
+ * access the register directly using group read/write.
+ */
+ if (rcdu->info->channels_mask & BIT(rgrp->index * 2)) {
+ struct rcar_du_crtc *rcrtc = &rgrp->dev->crtcs[rgrp->index * 2];
- rcar_du_crtc_dsysr_clr_set(rcrtc, DSYSR_DRES | DSYSR_DEN,
- start ? DSYSR_DEN : DSYSR_DRES);
+ rcar_du_crtc_dsysr_clr_set(rcrtc, DSYSR_DRES | DSYSR_DEN,
+ start ? DSYSR_DEN : DSYSR_DRES);
+ } else {
+ rcar_du_group_write(rgrp, DSYSR,
+ start ? DSYSR_DEN : DSYSR_DRES);
+ }
}
void rcar_du_group_start_stop(struct rcar_du_group *rgrp, bool start)
DRM_DEBUG_DRIVER("Enabling LVDS output\n");
- if (!IS_ERR(tcon->panel)) {
+ if (tcon->panel) {
drm_panel_prepare(tcon->panel);
drm_panel_enable(tcon->panel);
}
DRM_DEBUG_DRIVER("Disabling LVDS output\n");
- if (!IS_ERR(tcon->panel)) {
+ if (tcon->panel) {
drm_panel_disable(tcon->panel);
drm_panel_unprepare(tcon->panel);
}
DRM_DEBUG_DRIVER("Enabling RGB output\n");
- if (!IS_ERR(tcon->panel)) {
+ if (tcon->panel) {
drm_panel_prepare(tcon->panel);
drm_panel_enable(tcon->panel);
}
DRM_DEBUG_DRIVER("Disabling RGB output\n");
- if (!IS_ERR(tcon->panel)) {
+ if (tcon->panel) {
drm_panel_disable(tcon->panel);
drm_panel_unprepare(tcon->panel);
}
sun4i_tcon0_mode_set_common(tcon, mode);
/* Set dithering if needed */
- sun4i_tcon0_mode_set_dithering(tcon, tcon->panel->connector);
+ if (tcon->panel)
+ sun4i_tcon0_mode_set_dithering(tcon, tcon->panel->connector);
/* Adjust clock delay */
clk_delay = sun4i_tcon_get_clk_delay(mode, 0);
* Following code is a way to avoid quirks all around TCON
* and DOTCLOCK drivers.
*/
- if (!IS_ERR(tcon->panel)) {
+ if (tcon->panel) {
struct drm_panel *panel = tcon->panel;
struct drm_connector *connector = panel->connector;
struct drm_display_info display_info = connector->display_info;
if (!fbo)
return -ENOMEM;
- ttm_bo_get(bo);
fbo->base = *bo;
+ fbo->base.mem.placement |= TTM_PL_FLAG_NO_EVICT;
+
+ ttm_bo_get(bo);
fbo->bo = bo;
/**
return 0;
}
+ /* We know for sure we don't want an async update here. Set
+ * state->legacy_cursor_update to false to prevent
+ * drm_atomic_helper_setup_commit() from auto-completing
+ * commit->flip_done.
+ */
+ state->legacy_cursor_update = false;
ret = drm_atomic_helper_setup_commit(state, nonblock);
if (ret)
return ret;
static void vc4_plane_atomic_async_update(struct drm_plane *plane,
struct drm_plane_state *state)
{
- struct vc4_plane_state *vc4_state = to_vc4_plane_state(plane->state);
+ struct vc4_plane_state *vc4_state, *new_vc4_state;
if (plane->state->fb != state->fb) {
vc4_plane_async_set_fb(plane, state->fb);
plane->state->src_y = state->src_y;
/* Update the display list based on the new crtc_x/y. */
- vc4_plane_atomic_check(plane, plane->state);
+ vc4_plane_atomic_check(plane, state);
+
+ new_vc4_state = to_vc4_plane_state(state);
+ vc4_state = to_vc4_plane_state(plane->state);
+
+ /* Update the current vc4_state pos0, pos2 and ptr0 dlist entries. */
+ vc4_state->dlist[vc4_state->pos0_offset] =
+ new_vc4_state->dlist[vc4_state->pos0_offset];
+ vc4_state->dlist[vc4_state->pos2_offset] =
+ new_vc4_state->dlist[vc4_state->pos2_offset];
+ vc4_state->dlist[vc4_state->ptr0_offset] =
+ new_vc4_state->dlist[vc4_state->ptr0_offset];
/* Note that we can't just call vc4_plane_write_dlist()
* because that would smash the context data that the HVS is
mutex_unlock(&vgasr_mutex);
return -EINVAL;
}
+ /* notify if GPU has been already bound */
+ if (ops->gpu_bound)
+ ops->gpu_bound(pdev, id);
}
mutex_unlock(&vgasr_mutex);
#define QUIRK_T100_KEYBOARD BIT(6)
#define QUIRK_T100CHI BIT(7)
#define QUIRK_G752_KEYBOARD BIT(8)
+#define QUIRK_T101HA_DOCK BIT(9)
#define I2C_KEYBOARD_QUIRKS (QUIRK_FIX_NOTEBOOK_REPORT | \
QUIRK_NO_INIT_REPORTS | \
return 1;
}
+static int asus_event(struct hid_device *hdev, struct hid_field *field,
+ struct hid_usage *usage, __s32 value)
+{
+ if ((usage->hid & HID_USAGE_PAGE) == 0xff310000 &&
+ (usage->hid & HID_USAGE) != 0x00 && !usage->type) {
+ hid_warn(hdev, "Unmapped Asus vendor usagepage code 0x%02x\n",
+ usage->hid & HID_USAGE);
+ }
+
+ return 0;
+}
+
static int asus_raw_event(struct hid_device *hdev,
struct hid_report *report, u8 *data, int size)
{
case 0x20: asus_map_key_clear(KEY_BRIGHTNESSUP); break;
case 0x35: asus_map_key_clear(KEY_DISPLAY_OFF); break;
case 0x6c: asus_map_key_clear(KEY_SLEEP); break;
+ case 0x7c: asus_map_key_clear(KEY_MICMUTE); break;
case 0x82: asus_map_key_clear(KEY_CAMERA); break;
case 0x88: asus_map_key_clear(KEY_RFKILL); break;
case 0xb5: asus_map_key_clear(KEY_CALC); break;
/* Fn+Space Power4Gear Hybrid */
case 0x5c: asus_map_key_clear(KEY_PROG3); break;
+ /* Fn+F5 "fan" symbol on FX503VD */
+ case 0x99: asus_map_key_clear(KEY_PROG4); break;
+
default:
/* ASUS lazily declares 256 usages, ignore the rest,
* as some make the keyboard appear as a pointer device. */
return ret;
}
+ /* use hid-multitouch for T101HA touchpad */
+ if (id->driver_data & QUIRK_T101HA_DOCK &&
+ hdev->collection->usage == HID_GD_MOUSE)
+ return -ENODEV;
+
ret = hid_hw_start(hdev, HID_CONNECT_DEFAULT);
if (ret) {
hid_err(hdev, "Asus hw start failed: %d\n", ret);
USB_DEVICE_ID_ASUSTEK_ROG_KEYBOARD2), QUIRK_USE_KBD_BACKLIGHT },
{ HID_USB_DEVICE(USB_VENDOR_ID_ASUSTEK,
USB_DEVICE_ID_ASUSTEK_ROG_KEYBOARD3), QUIRK_G752_KEYBOARD },
+ { HID_USB_DEVICE(USB_VENDOR_ID_ASUSTEK,
+ USB_DEVICE_ID_ASUSTEK_FX503VD_KEYBOARD),
+ QUIRK_USE_KBD_BACKLIGHT },
{ HID_USB_DEVICE(USB_VENDOR_ID_ASUSTEK,
USB_DEVICE_ID_ASUSTEK_T100TA_KEYBOARD),
QUIRK_T100_KEYBOARD | QUIRK_NO_CONSUMER_USAGES },
{ HID_USB_DEVICE(USB_VENDOR_ID_ASUSTEK,
USB_DEVICE_ID_ASUSTEK_T100TAF_KEYBOARD),
QUIRK_T100_KEYBOARD | QUIRK_NO_CONSUMER_USAGES },
+ { HID_USB_DEVICE(USB_VENDOR_ID_ASUSTEK,
+ USB_DEVICE_ID_ASUSTEK_T101HA_KEYBOARD), QUIRK_T101HA_DOCK },
{ HID_USB_DEVICE(USB_VENDOR_ID_CHICONY, USB_DEVICE_ID_ASUS_AK1D) },
{ HID_USB_DEVICE(USB_VENDOR_ID_TURBOX, USB_DEVICE_ID_ASUS_MD_5110) },
{ HID_USB_DEVICE(USB_VENDOR_ID_JESS, USB_DEVICE_ID_ASUS_MD_5112) },
#ifdef CONFIG_PM
.reset_resume = asus_reset_resume,
#endif
+ .event = asus_event,
.raw_event = asus_raw_event
};
module_hid_driver(asus_driver);
collection->type = type;
collection->usage = usage;
collection->level = parser->collection_stack_ptr - 1;
+ collection->parent = parser->active_collection;
+ parser->active_collection = collection;
if (type == HID_COLLECTION_APPLICATION)
parser->device->maxapplication++;
return -EINVAL;
}
parser->collection_stack_ptr--;
+ if (parser->active_collection)
+ parser->active_collection = parser->active_collection->parent;
return 0;
}
field->usage[i].collection_index =
parser->local.collection_index[j];
field->usage[i].usage_index = i;
+ field->usage[i].resolution_multiplier = 1;
}
field->maxusage = usages;
}
EXPORT_SYMBOL_GPL(hid_validate_values);
+static int hid_calculate_multiplier(struct hid_device *hid,
+ struct hid_field *multiplier)
+{
+ int m;
+ __s32 v = *multiplier->value;
+ __s32 lmin = multiplier->logical_minimum;
+ __s32 lmax = multiplier->logical_maximum;
+ __s32 pmin = multiplier->physical_minimum;
+ __s32 pmax = multiplier->physical_maximum;
+
+ /*
+ * "Because OS implementations will generally divide the control's
+ * reported count by the Effective Resolution Multiplier, designers
+ * should take care not to establish a potential Effective
+ * Resolution Multiplier of zero."
+ * HID Usage Table, v1.12, Section 4.3.1, p31
+ */
+ if (lmax - lmin == 0)
+ return 1;
+ /*
+ * Handling the unit exponent is left as an exercise to whoever
+ * finds a device where that exponent is not 0.
+ */
+ m = ((v - lmin)/(lmax - lmin) * (pmax - pmin) + pmin);
+ if (unlikely(multiplier->unit_exponent != 0)) {
+ hid_warn(hid,
+ "unsupported Resolution Multiplier unit exponent %d\n",
+ multiplier->unit_exponent);
+ }
+
+ /* There are no devices with an effective multiplier > 255 */
+ if (unlikely(m == 0 || m > 255 || m < -255)) {
+ hid_warn(hid, "unsupported Resolution Multiplier %d\n", m);
+ m = 1;
+ }
+
+ return m;
+}
+
+static void hid_apply_multiplier_to_field(struct hid_device *hid,
+ struct hid_field *field,
+ struct hid_collection *multiplier_collection,
+ int effective_multiplier)
+{
+ struct hid_collection *collection;
+ struct hid_usage *usage;
+ int i;
+
+ /*
+ * If multiplier_collection is NULL, the multiplier applies
+ * to all fields in the report.
+ * Otherwise, it is the Logical Collection the multiplier applies to
+ * but our field may be in a subcollection of that collection.
+ */
+ for (i = 0; i < field->maxusage; i++) {
+ usage = &field->usage[i];
+
+ collection = &hid->collection[usage->collection_index];
+ while (collection && collection != multiplier_collection)
+ collection = collection->parent;
+
+ if (collection || multiplier_collection == NULL)
+ usage->resolution_multiplier = effective_multiplier;
+
+ }
+}
+
+static void hid_apply_multiplier(struct hid_device *hid,
+ struct hid_field *multiplier)
+{
+ struct hid_report_enum *rep_enum;
+ struct hid_report *rep;
+ struct hid_field *field;
+ struct hid_collection *multiplier_collection;
+ int effective_multiplier;
+ int i;
+
+ /*
+ * "The Resolution Multiplier control must be contained in the same
+ * Logical Collection as the control(s) to which it is to be applied.
+ * If no Resolution Multiplier is defined, then the Resolution
+ * Multiplier defaults to 1. If more than one control exists in a
+ * Logical Collection, the Resolution Multiplier is associated with
+ * all controls in the collection. If no Logical Collection is
+ * defined, the Resolution Multiplier is associated with all
+ * controls in the report."
+ * HID Usage Table, v1.12, Section 4.3.1, p30
+ *
+ * Thus, search from the current collection upwards until we find a
+ * logical collection. Then search all fields for that same parent
+ * collection. Those are the fields the multiplier applies to.
+ *
+ * If we have more than one multiplier, it will overwrite the
+ * applicable fields later.
+ */
+ multiplier_collection = &hid->collection[multiplier->usage->collection_index];
+ while (multiplier_collection &&
+ multiplier_collection->type != HID_COLLECTION_LOGICAL)
+ multiplier_collection = multiplier_collection->parent;
+
+ effective_multiplier = hid_calculate_multiplier(hid, multiplier);
+
+ rep_enum = &hid->report_enum[HID_INPUT_REPORT];
+ list_for_each_entry(rep, &rep_enum->report_list, list) {
+ for (i = 0; i < rep->maxfield; i++) {
+ field = rep->field[i];
+ hid_apply_multiplier_to_field(hid, field,
+ multiplier_collection,
+ effective_multiplier);
+ }
+ }
+}
+
+/*
+ * hid_setup_resolution_multiplier - set up all resolution multipliers
+ *
+ * @device: hid device
+ *
+ * Search for all Resolution Multiplier Feature Reports and apply their
+ * value to all matching Input items. This only updates the internal struct
+ * fields.
+ *
+ * The Resolution Multiplier is applied by the hardware. If the multiplier
+ * is anything other than 1, the hardware will send pre-multiplied events
+ * so that the same physical interaction generates an accumulated
+ * accumulated_value = value * * multiplier
+ * This may be achieved by sending
+ * - "value * multiplier" for each event, or
+ * - "value" but "multiplier" times as frequently, or
+ * - a combination of the above
+ * The only guarantee is that the same physical interaction always generates
+ * an accumulated 'value * multiplier'.
+ *
+ * This function must be called before any event processing and after
+ * any SetRequest to the Resolution Multiplier.
+ */
+void hid_setup_resolution_multiplier(struct hid_device *hid)
+{
+ struct hid_report_enum *rep_enum;
+ struct hid_report *rep;
+ struct hid_usage *usage;
+ int i, j;
+
+ rep_enum = &hid->report_enum[HID_FEATURE_REPORT];
+ list_for_each_entry(rep, &rep_enum->report_list, list) {
+ for (i = 0; i < rep->maxfield; i++) {
+ /* Ignore if report count is out of bounds. */
+ if (rep->field[i]->report_count < 1)
+ continue;
+
+ for (j = 0; j < rep->field[i]->maxusage; j++) {
+ usage = &rep->field[i]->usage[j];
+ if (usage->hid == HID_GD_RESOLUTION_MULTIPLIER)
+ hid_apply_multiplier(hid,
+ rep->field[i]);
+ }
+ }
+ }
+}
+EXPORT_SYMBOL_GPL(hid_setup_resolution_multiplier);
+
/**
* hid_open_report - open a driver-specific device report
*
hid_err(device, "unbalanced delimiter at end of report description\n");
goto err;
}
+
+ /*
+ * fetch initial values in case the device's
+ * default multiplier isn't the recommended 1
+ */
+ hid_setup_resolution_multiplier(device);
+
kfree(parser->collection_stack);
vfree(parser);
device->status |= HID_STAT_PARSED;
+
return 0;
}
}
static struct hid_device_id cougar_id_table[] = {
{ HID_USB_DEVICE(USB_VENDOR_ID_SOLID_YEAR,
USB_DEVICE_ID_COUGAR_500K_GAMING_KEYBOARD) },
+ { HID_USB_DEVICE(USB_VENDOR_ID_SOLID_YEAR,
+ USB_DEVICE_ID_COUGAR_700K_GAMING_KEYBOARD) },
{}
};
MODULE_DEVICE_TABLE(hid, cougar_id_table);
return 0;
}
-static int hid_debug_rdesc_open(struct inode *inode, struct file *file)
-{
- return single_open(file, hid_debug_rdesc_show, inode->i_private);
-}
-
static int hid_debug_events_open(struct inode *inode, struct file *file)
{
int err = 0;
return 0;
}
-static const struct file_operations hid_debug_rdesc_fops = {
- .open = hid_debug_rdesc_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(hid_debug_rdesc);
static const struct file_operations hid_debug_events_fops = {
.owner = THIS_MODULE,
hid_input_report(input_dev->hid_device, HID_INPUT_REPORT,
input_dev->input_buf, len, 1);
- pm_wakeup_event(&input_dev->device->device, 0);
+ pm_wakeup_hard_event(&input_dev->device->device);
break;
default:
#define USB_DEVICE_ID_ASUSTEK_T100TA_KEYBOARD 0x17e0
#define USB_DEVICE_ID_ASUSTEK_T100TAF_KEYBOARD 0x1807
#define USB_DEVICE_ID_ASUSTEK_T100CHI_KEYBOARD 0x8502
+#define USB_DEVICE_ID_ASUSTEK_T101HA_KEYBOARD 0x183d
#define USB_DEVICE_ID_ASUSTEK_T304_KEYBOARD 0x184a
#define USB_DEVICE_ID_ASUSTEK_I2C_KEYBOARD 0x8585
#define USB_DEVICE_ID_ASUSTEK_I2C_TOUCHPAD 0x0101
#define USB_DEVICE_ID_ASUSTEK_ROG_KEYBOARD1 0x1854
#define USB_DEVICE_ID_ASUSTEK_ROG_KEYBOARD2 0x1837
#define USB_DEVICE_ID_ASUSTEK_ROG_KEYBOARD3 0x1822
+#define USB_DEVICE_ID_ASUSTEK_FX503VD_KEYBOARD 0x1869
#define USB_VENDOR_ID_ATEN 0x0557
#define USB_DEVICE_ID_ATEN_UC100KM 0x2004
#define USB_VENDOR_ID_SOLID_YEAR 0x060b
#define USB_DEVICE_ID_COUGAR_500K_GAMING_KEYBOARD 0x500a
+#define USB_DEVICE_ID_COUGAR_700K_GAMING_KEYBOARD 0x700a
#define USB_VENDOR_ID_SOUNDGRAPH 0x15c2
#define USB_DEVICE_ID_SOUNDGRAPH_IMON_FIRST 0x0034
map_abs_clear(usage->hid & 0xf);
break;
- case HID_GD_SLIDER: case HID_GD_DIAL: case HID_GD_WHEEL:
+ case HID_GD_WHEEL:
+ if (field->flags & HID_MAIN_ITEM_RELATIVE) {
+ set_bit(REL_WHEEL, input->relbit);
+ map_rel(REL_WHEEL_HI_RES);
+ } else {
+ map_abs(usage->hid & 0xf);
+ }
+ break;
+ case HID_GD_SLIDER: case HID_GD_DIAL:
if (field->flags & HID_MAIN_ITEM_RELATIVE)
map_rel(usage->hid & 0xf);
else
case 0x22f: map_key_clear(KEY_ZOOMRESET); break;
case 0x233: map_key_clear(KEY_SCROLLUP); break;
case 0x234: map_key_clear(KEY_SCROLLDOWN); break;
- case 0x238: map_rel(REL_HWHEEL); break;
+ case 0x238: /* AC Pan */
+ set_bit(REL_HWHEEL, input->relbit);
+ map_rel(REL_HWHEEL_HI_RES);
+ break;
case 0x23d: map_key_clear(KEY_EDIT); break;
case 0x25f: map_key_clear(KEY_CANCEL); break;
case 0x269: map_key_clear(KEY_INSERT); break;
}
+static void hidinput_handle_scroll(struct hid_usage *usage,
+ struct input_dev *input,
+ __s32 value)
+{
+ int code;
+ int hi_res, lo_res;
+
+ if (value == 0)
+ return;
+
+ if (usage->code == REL_WHEEL_HI_RES)
+ code = REL_WHEEL;
+ else
+ code = REL_HWHEEL;
+
+ /*
+ * Windows reports one wheel click as value 120. Where a high-res
+ * scroll wheel is present, a fraction of 120 is reported instead.
+ * Our REL_WHEEL_HI_RES axis does the same because all HW must
+ * adhere to the 120 expectation.
+ */
+ hi_res = value * 120/usage->resolution_multiplier;
+
+ usage->wheel_accumulated += hi_res;
+ lo_res = usage->wheel_accumulated/120;
+ if (lo_res)
+ usage->wheel_accumulated -= lo_res * 120;
+
+ input_event(input, EV_REL, code, lo_res);
+ input_event(input, EV_REL, usage->code, hi_res);
+}
+
void hidinput_hid_event(struct hid_device *hid, struct hid_field *field, struct hid_usage *usage, __s32 value)
{
struct input_dev *input;
if ((usage->type == EV_KEY) && (usage->code == 0)) /* Key 0 is "unassigned", not KEY_UNKNOWN */
return;
+ if ((usage->type == EV_REL) && (usage->code == REL_WHEEL_HI_RES ||
+ usage->code == REL_HWHEEL_HI_RES)) {
+ hidinput_handle_scroll(usage, input, value);
+ return;
+ }
+
if ((usage->type == EV_ABS) && (field->flags & HID_MAIN_ITEM_RELATIVE) &&
(usage->code == ABS_VOLUME)) {
int count = abs(value);
hid_hw_close(hid);
}
+static void hidinput_change_resolution_multipliers(struct hid_device *hid)
+{
+ struct hid_report_enum *rep_enum;
+ struct hid_report *rep;
+ struct hid_usage *usage;
+ int i, j;
+
+ rep_enum = &hid->report_enum[HID_FEATURE_REPORT];
+ list_for_each_entry(rep, &rep_enum->report_list, list) {
+ bool update_needed = false;
+
+ if (rep->maxfield == 0)
+ continue;
+
+ /*
+ * If we have more than one feature within this report we
+ * need to fill in the bits from the others before we can
+ * overwrite the ones for the Resolution Multiplier.
+ */
+ if (rep->maxfield > 1) {
+ hid_hw_request(hid, rep, HID_REQ_GET_REPORT);
+ hid_hw_wait(hid);
+ }
+
+ for (i = 0; i < rep->maxfield; i++) {
+ __s32 logical_max = rep->field[i]->logical_maximum;
+
+ /* There is no good reason for a Resolution
+ * Multiplier to have a count other than 1.
+ * Ignore that case.
+ */
+ if (rep->field[i]->report_count != 1)
+ continue;
+
+ for (j = 0; j < rep->field[i]->maxusage; j++) {
+ usage = &rep->field[i]->usage[j];
+
+ if (usage->hid != HID_GD_RESOLUTION_MULTIPLIER)
+ continue;
+
+ *rep->field[i]->value = logical_max;
+ update_needed = true;
+ }
+ }
+ if (update_needed)
+ hid_hw_request(hid, rep, HID_REQ_SET_REPORT);
+ }
+
+ /* refresh our structs */
+ hid_setup_resolution_multiplier(hid);
+}
+
static void report_features(struct hid_device *hid)
{
struct hid_driver *drv = hid->driver;
}
}
+ hidinput_change_resolution_multipliers(hid);
+
list_for_each_entry_safe(hidinput, next, &hid->inputs, list) {
if (drv->input_configured &&
drv->input_configured(hid, hidinput))
cancel_work_sync(&hid->led_work);
}
EXPORT_SYMBOL_GPL(hidinput_disconnect);
-
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/sched.h>
+#include <linux/sched/clock.h>
#include <linux/kfifo.h>
#include <linux/input/mt.h>
#include <linux/workqueue.h>
#define HIDPP_QUIRK_NO_HIDINPUT BIT(23)
#define HIDPP_QUIRK_FORCE_OUTPUT_REPORTS BIT(24)
#define HIDPP_QUIRK_UNIFYING BIT(25)
+#define HIDPP_QUIRK_HI_RES_SCROLL_1P0 BIT(26)
+#define HIDPP_QUIRK_HI_RES_SCROLL_X2120 BIT(27)
+#define HIDPP_QUIRK_HI_RES_SCROLL_X2121 BIT(28)
+
+/* Convenience constant to check for any high-res support. */
+#define HIDPP_QUIRK_HI_RES_SCROLL (HIDPP_QUIRK_HI_RES_SCROLL_1P0 | \
+ HIDPP_QUIRK_HI_RES_SCROLL_X2120 | \
+ HIDPP_QUIRK_HI_RES_SCROLL_X2121)
#define HIDPP_QUIRK_DELAYED_INIT HIDPP_QUIRK_NO_HIDINPUT
bool online;
};
+/**
+ * struct hidpp_scroll_counter - Utility class for processing high-resolution
+ * scroll events.
+ * @dev: the input device for which events should be reported.
+ * @wheel_multiplier: the scalar multiplier to be applied to each wheel event
+ * @remainder: counts the number of high-resolution units moved since the last
+ * low-resolution event (REL_WHEEL or REL_HWHEEL) was sent. Should
+ * only be used by class methods.
+ * @direction: direction of last movement (1 or -1)
+ * @last_time: last event time, used to reset remainder after inactivity
+ */
+struct hidpp_scroll_counter {
+ struct input_dev *dev;
+ int wheel_multiplier;
+ int remainder;
+ int direction;
+ unsigned long long last_time;
+};
+
struct hidpp_device {
struct hid_device *hid_dev;
struct mutex send_mutex;
unsigned long capabilities;
struct hidpp_battery battery;
+ struct hidpp_scroll_counter vertical_wheel_counter;
};
/* HID++ 1.0 error codes */
*name = new_name;
}
+/**
+ * hidpp_scroll_counter_handle_scroll() - Send high- and low-resolution scroll
+ * events given a high-resolution wheel
+ * movement.
+ * @counter: a hid_scroll_counter struct describing the wheel.
+ * @hi_res_value: the movement of the wheel, in the mouse's high-resolution
+ * units.
+ *
+ * Given a high-resolution movement, this function converts the movement into
+ * fractions of 120 and emits high-resolution scroll events for the input
+ * device. It also uses the multiplier from &struct hid_scroll_counter to
+ * emit low-resolution scroll events when appropriate for
+ * backwards-compatibility with userspace input libraries.
+ */
+static void hidpp_scroll_counter_handle_scroll(struct hidpp_scroll_counter *counter,
+ int hi_res_value)
+{
+ int low_res_value, remainder, direction;
+ unsigned long long now, previous;
+
+ hi_res_value = hi_res_value * 120/counter->wheel_multiplier;
+ input_report_rel(counter->dev, REL_WHEEL_HI_RES, hi_res_value);
+
+ remainder = counter->remainder;
+ direction = hi_res_value > 0 ? 1 : -1;
+
+ now = sched_clock();
+ previous = counter->last_time;
+ counter->last_time = now;
+ /*
+ * Reset the remainder after a period of inactivity or when the
+ * direction changes. This prevents the REL_WHEEL emulation point
+ * from sliding for devices that don't always provide the same
+ * number of movements per detent.
+ */
+ if (now - previous > 1000000000 || direction != counter->direction)
+ remainder = 0;
+
+ counter->direction = direction;
+ remainder += hi_res_value;
+
+ /* Some wheels will rest 7/8ths of a detent from the previous detent
+ * after slow movement, so we want the threshold for low-res events to
+ * be in the middle between two detents (e.g. after 4/8ths) as
+ * opposed to on the detents themselves (8/8ths).
+ */
+ if (abs(remainder) >= 60) {
+ /* Add (or subtract) 1 because we want to trigger when the wheel
+ * is half-way to the next detent (i.e. scroll 1 detent after a
+ * 1/2 detent movement, 2 detents after a 1 1/2 detent movement,
+ * etc.).
+ */
+ low_res_value = remainder / 120;
+ if (low_res_value == 0)
+ low_res_value = (hi_res_value > 0 ? 1 : -1);
+ input_report_rel(counter->dev, REL_WHEEL, low_res_value);
+ remainder -= low_res_value * 120;
+ }
+ counter->remainder = remainder;
+}
+
/* -------------------------------------------------------------------------- */
/* HIDP++ 1.0 commands */
/* -------------------------------------------------------------------------- */
#define HIDPP_SET_LONG_REGISTER 0x82
#define HIDPP_GET_LONG_REGISTER 0x83
-#define HIDPP_REG_GENERAL 0x00
-
-static int hidpp10_enable_battery_reporting(struct hidpp_device *hidpp_dev)
+/**
+ * hidpp10_set_register_bit() - Sets a single bit in a HID++ 1.0 register.
+ * @hidpp_dev: the device to set the register on.
+ * @register_address: the address of the register to modify.
+ * @byte: the byte of the register to modify. Should be less than 3.
+ * Return: 0 if successful, otherwise a negative error code.
+ */
+static int hidpp10_set_register_bit(struct hidpp_device *hidpp_dev,
+ u8 register_address, u8 byte, u8 bit)
{
struct hidpp_report response;
int ret;
u8 params[3] = { 0 };
ret = hidpp_send_rap_command_sync(hidpp_dev,
- REPORT_ID_HIDPP_SHORT,
- HIDPP_GET_REGISTER,
- HIDPP_REG_GENERAL,
- NULL, 0, &response);
+ REPORT_ID_HIDPP_SHORT,
+ HIDPP_GET_REGISTER,
+ register_address,
+ NULL, 0, &response);
if (ret)
return ret;
memcpy(params, response.rap.params, 3);
- /* Set the battery bit */
- params[0] |= BIT(4);
+ params[byte] |= BIT(bit);
return hidpp_send_rap_command_sync(hidpp_dev,
- REPORT_ID_HIDPP_SHORT,
- HIDPP_SET_REGISTER,
- HIDPP_REG_GENERAL,
- params, 3, &response);
+ REPORT_ID_HIDPP_SHORT,
+ HIDPP_SET_REGISTER,
+ register_address,
+ params, 3, &response);
+}
+
+
+#define HIDPP_REG_GENERAL 0x00
+
+static int hidpp10_enable_battery_reporting(struct hidpp_device *hidpp_dev)
+{
+ return hidpp10_set_register_bit(hidpp_dev, HIDPP_REG_GENERAL, 0, 4);
+}
+
+#define HIDPP_REG_FEATURES 0x01
+
+/* On HID++ 1.0 devices, high-res scroll was called "scrolling acceleration". */
+static int hidpp10_enable_scrolling_acceleration(struct hidpp_device *hidpp_dev)
+{
+ return hidpp10_set_register_bit(hidpp_dev, HIDPP_REG_FEATURES, 0, 6);
}
#define HIDPP_REG_BATTERY_STATUS 0x07
return ret;
}
+/* -------------------------------------------------------------------------- */
+/* 0x2120: Hi-resolution scrolling */
+/* -------------------------------------------------------------------------- */
+
+#define HIDPP_PAGE_HI_RESOLUTION_SCROLLING 0x2120
+
+#define CMD_HI_RESOLUTION_SCROLLING_SET_HIGHRES_SCROLLING_MODE 0x10
+
+static int hidpp_hrs_set_highres_scrolling_mode(struct hidpp_device *hidpp,
+ bool enabled, u8 *multiplier)
+{
+ u8 feature_index;
+ u8 feature_type;
+ int ret;
+ u8 params[1];
+ struct hidpp_report response;
+
+ ret = hidpp_root_get_feature(hidpp,
+ HIDPP_PAGE_HI_RESOLUTION_SCROLLING,
+ &feature_index,
+ &feature_type);
+ if (ret)
+ return ret;
+
+ params[0] = enabled ? BIT(0) : 0;
+ ret = hidpp_send_fap_command_sync(hidpp, feature_index,
+ CMD_HI_RESOLUTION_SCROLLING_SET_HIGHRES_SCROLLING_MODE,
+ params, sizeof(params), &response);
+ if (ret)
+ return ret;
+ *multiplier = response.fap.params[1];
+ return 0;
+}
+
+/* -------------------------------------------------------------------------- */
+/* 0x2121: HiRes Wheel */
+/* -------------------------------------------------------------------------- */
+
+#define HIDPP_PAGE_HIRES_WHEEL 0x2121
+
+#define CMD_HIRES_WHEEL_GET_WHEEL_CAPABILITY 0x00
+#define CMD_HIRES_WHEEL_SET_WHEEL_MODE 0x20
+
+static int hidpp_hrw_get_wheel_capability(struct hidpp_device *hidpp,
+ u8 *multiplier)
+{
+ u8 feature_index;
+ u8 feature_type;
+ int ret;
+ struct hidpp_report response;
+
+ ret = hidpp_root_get_feature(hidpp, HIDPP_PAGE_HIRES_WHEEL,
+ &feature_index, &feature_type);
+ if (ret)
+ goto return_default;
+
+ ret = hidpp_send_fap_command_sync(hidpp, feature_index,
+ CMD_HIRES_WHEEL_GET_WHEEL_CAPABILITY,
+ NULL, 0, &response);
+ if (ret)
+ goto return_default;
+
+ *multiplier = response.fap.params[0];
+ return 0;
+return_default:
+ hid_warn(hidpp->hid_dev,
+ "Couldn't get wheel multiplier (error %d)\n", ret);
+ return ret;
+}
+
+static int hidpp_hrw_set_wheel_mode(struct hidpp_device *hidpp, bool invert,
+ bool high_resolution, bool use_hidpp)
+{
+ u8 feature_index;
+ u8 feature_type;
+ int ret;
+ u8 params[1];
+ struct hidpp_report response;
+
+ ret = hidpp_root_get_feature(hidpp, HIDPP_PAGE_HIRES_WHEEL,
+ &feature_index, &feature_type);
+ if (ret)
+ return ret;
+
+ params[0] = (invert ? BIT(2) : 0) |
+ (high_resolution ? BIT(1) : 0) |
+ (use_hidpp ? BIT(0) : 0);
+
+ return hidpp_send_fap_command_sync(hidpp, feature_index,
+ CMD_HIRES_WHEEL_SET_WHEEL_MODE,
+ params, sizeof(params), &response);
+}
+
/* -------------------------------------------------------------------------- */
/* 0x4301: Solar Keyboard */
/* -------------------------------------------------------------------------- */
u8 size;
};
-static const signed short hiddpp_ff_effects[] = {
+static const signed short hidpp_ff_effects[] = {
FF_CONSTANT,
FF_PERIODIC,
FF_SINE,
-1
};
-static const signed short hiddpp_ff_effects_v2[] = {
+static const signed short hidpp_ff_effects_v2[] = {
FF_RAMP,
FF_FRICTION,
FF_INERTIA,
version = bcdDevice & 255;
/* Set supported force feedback capabilities */
- for (j = 0; hiddpp_ff_effects[j] >= 0; j++)
- set_bit(hiddpp_ff_effects[j], dev->ffbit);
+ for (j = 0; hidpp_ff_effects[j] >= 0; j++)
+ set_bit(hidpp_ff_effects[j], dev->ffbit);
if (version > 1)
- for (j = 0; hiddpp_ff_effects_v2[j] >= 0; j++)
- set_bit(hiddpp_ff_effects_v2[j], dev->ffbit);
+ for (j = 0; hidpp_ff_effects_v2[j] >= 0; j++)
+ set_bit(hidpp_ff_effects_v2[j], dev->ffbit);
/* Read number of slots available in device */
error = hidpp_send_fap_command_sync(hidpp, feature_index,
input_report_key(mydata->input, BTN_RIGHT,
!!(data[1] & M560_MOUSE_BTN_RIGHT));
- if (data[1] & M560_MOUSE_BTN_WHEEL_LEFT)
+ if (data[1] & M560_MOUSE_BTN_WHEEL_LEFT) {
input_report_rel(mydata->input, REL_HWHEEL, -1);
- else if (data[1] & M560_MOUSE_BTN_WHEEL_RIGHT)
+ input_report_rel(mydata->input, REL_HWHEEL_HI_RES,
+ -120);
+ } else if (data[1] & M560_MOUSE_BTN_WHEEL_RIGHT) {
input_report_rel(mydata->input, REL_HWHEEL, 1);
+ input_report_rel(mydata->input, REL_HWHEEL_HI_RES,
+ 120);
+ }
v = hid_snto32(hid_field_extract(hdev, data+3, 0, 12), 12);
input_report_rel(mydata->input, REL_X, v);
input_report_rel(mydata->input, REL_Y, v);
v = hid_snto32(data[6], 8);
- input_report_rel(mydata->input, REL_WHEEL, v);
+ hidpp_scroll_counter_handle_scroll(
+ &hidpp->vertical_wheel_counter, v);
input_sync(mydata->input);
}
__set_bit(REL_Y, mydata->input->relbit);
__set_bit(REL_WHEEL, mydata->input->relbit);
__set_bit(REL_HWHEEL, mydata->input->relbit);
+ __set_bit(REL_WHEEL_HI_RES, mydata->input->relbit);
+ __set_bit(REL_HWHEEL_HI_RES, mydata->input->relbit);
}
static int m560_input_mapping(struct hid_device *hdev, struct hid_input *hi,
return 0;
}
+/* -------------------------------------------------------------------------- */
+/* High-resolution scroll wheels */
+/* -------------------------------------------------------------------------- */
+
+static int hi_res_scroll_enable(struct hidpp_device *hidpp)
+{
+ int ret;
+ u8 multiplier = 1;
+
+ if (hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL_X2121) {
+ ret = hidpp_hrw_set_wheel_mode(hidpp, false, true, false);
+ if (ret == 0)
+ ret = hidpp_hrw_get_wheel_capability(hidpp, &multiplier);
+ } else if (hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL_X2120) {
+ ret = hidpp_hrs_set_highres_scrolling_mode(hidpp, true,
+ &multiplier);
+ } else /* if (hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL_1P0) */ {
+ ret = hidpp10_enable_scrolling_acceleration(hidpp);
+ multiplier = 8;
+ }
+ if (ret)
+ return ret;
+
+ if (multiplier == 0)
+ multiplier = 1;
+
+ hidpp->vertical_wheel_counter.wheel_multiplier = multiplier;
+ hid_info(hidpp->hid_dev, "multiplier = %d\n", multiplier);
+ return 0;
+}
+
/* -------------------------------------------------------------------------- */
/* Generic HID++ devices */
/* -------------------------------------------------------------------------- */
wtp_populate_input(hidpp, input, origin_is_hid_core);
else if (hidpp->quirks & HIDPP_QUIRK_CLASS_M560)
m560_populate_input(hidpp, input, origin_is_hid_core);
+
+ if (hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL)
+ hidpp->vertical_wheel_counter.dev = input;
}
static int hidpp_input_configured(struct hid_device *hdev,
return 0;
}
+static int hidpp_event(struct hid_device *hdev, struct hid_field *field,
+ struct hid_usage *usage, __s32 value)
+{
+ /* This function will only be called for scroll events, due to the
+ * restriction imposed in hidpp_usages.
+ */
+ struct hidpp_device *hidpp = hid_get_drvdata(hdev);
+ struct hidpp_scroll_counter *counter = &hidpp->vertical_wheel_counter;
+ /* A scroll event may occur before the multiplier has been retrieved or
+ * the input device set, or high-res scroll enabling may fail. In such
+ * cases we must return early (falling back to default behaviour) to
+ * avoid a crash in hidpp_scroll_counter_handle_scroll.
+ */
+ if (!(hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL) || value == 0
+ || counter->dev == NULL || counter->wheel_multiplier == 0)
+ return 0;
+
+ hidpp_scroll_counter_handle_scroll(counter, value);
+ return 1;
+}
+
static int hidpp_initialize_battery(struct hidpp_device *hidpp)
{
static atomic_t battery_no = ATOMIC_INIT(0);
if (hidpp->battery.ps)
power_supply_changed(hidpp->battery.ps);
+ if (hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL)
+ hi_res_scroll_enable(hidpp);
+
if (!(hidpp->quirks & HIDPP_QUIRK_NO_HIDINPUT) || hidpp->delayed_input)
/* if the input nodes are already created, we can stop now */
return;
mutex_destroy(&hidpp->send_mutex);
}
+#define LDJ_DEVICE(product) \
+ HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE, \
+ USB_VENDOR_ID_LOGITECH, (product))
+
static const struct hid_device_id hidpp_devices[] = {
{ /* wireless touchpad */
- HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE,
- USB_VENDOR_ID_LOGITECH, 0x4011),
+ LDJ_DEVICE(0x4011),
.driver_data = HIDPP_QUIRK_CLASS_WTP | HIDPP_QUIRK_DELAYED_INIT |
HIDPP_QUIRK_WTP_PHYSICAL_BUTTONS },
{ /* wireless touchpad T650 */
- HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE,
- USB_VENDOR_ID_LOGITECH, 0x4101),
+ LDJ_DEVICE(0x4101),
.driver_data = HIDPP_QUIRK_CLASS_WTP | HIDPP_QUIRK_DELAYED_INIT },
{ /* wireless touchpad T651 */
HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH,
USB_DEVICE_ID_LOGITECH_T651),
.driver_data = HIDPP_QUIRK_CLASS_WTP },
+ { /* Mouse Logitech Anywhere MX */
+ LDJ_DEVICE(0x1017), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_1P0 },
+ { /* Mouse Logitech Cube */
+ LDJ_DEVICE(0x4010), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2120 },
+ { /* Mouse Logitech M335 */
+ LDJ_DEVICE(0x4050), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { /* Mouse Logitech M515 */
+ LDJ_DEVICE(0x4007), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2120 },
{ /* Mouse logitech M560 */
- HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE,
- USB_VENDOR_ID_LOGITECH, 0x402d),
- .driver_data = HIDPP_QUIRK_DELAYED_INIT | HIDPP_QUIRK_CLASS_M560 },
+ LDJ_DEVICE(0x402d),
+ .driver_data = HIDPP_QUIRK_DELAYED_INIT | HIDPP_QUIRK_CLASS_M560
+ | HIDPP_QUIRK_HI_RES_SCROLL_X2120 },
+ { /* Mouse Logitech M705 (firmware RQM17) */
+ LDJ_DEVICE(0x101b), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_1P0 },
+ { /* Mouse Logitech M705 (firmware RQM67) */
+ LDJ_DEVICE(0x406d), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { /* Mouse Logitech M720 */
+ LDJ_DEVICE(0x405e), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { /* Mouse Logitech MX Anywhere 2 */
+ LDJ_DEVICE(0x404a), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { LDJ_DEVICE(0xb013), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { LDJ_DEVICE(0xb018), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { LDJ_DEVICE(0xb01f), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { /* Mouse Logitech MX Anywhere 2S */
+ LDJ_DEVICE(0x406a), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { /* Mouse Logitech MX Master */
+ LDJ_DEVICE(0x4041), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { LDJ_DEVICE(0x4060), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { LDJ_DEVICE(0x4071), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { /* Mouse Logitech MX Master 2S */
+ LDJ_DEVICE(0x4069), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 },
+ { /* Mouse Logitech Performance MX */
+ LDJ_DEVICE(0x101a), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_1P0 },
{ /* Keyboard logitech K400 */
- HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE,
- USB_VENDOR_ID_LOGITECH, 0x4024),
+ LDJ_DEVICE(0x4024),
.driver_data = HIDPP_QUIRK_CLASS_K400 },
{ /* Solar Keyboard Logitech K750 */
- HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE,
- USB_VENDOR_ID_LOGITECH, 0x4002),
+ LDJ_DEVICE(0x4002),
.driver_data = HIDPP_QUIRK_CLASS_K750 },
- { HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE,
- USB_VENDOR_ID_LOGITECH, HID_ANY_ID)},
+ { LDJ_DEVICE(HID_ANY_ID) },
{ HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_G920_WHEEL),
.driver_data = HIDPP_QUIRK_CLASS_G920 | HIDPP_QUIRK_FORCE_OUTPUT_REPORTS},
MODULE_DEVICE_TABLE(hid, hidpp_devices);
+static const struct hid_usage_id hidpp_usages[] = {
+ { HID_GD_WHEEL, EV_REL, REL_WHEEL_HI_RES },
+ { HID_ANY_ID - 1, HID_ANY_ID - 1, HID_ANY_ID - 1}
+};
+
static struct hid_driver hidpp_driver = {
.name = "logitech-hidpp-device",
.id_table = hidpp_devices,
.probe = hidpp_probe,
.remove = hidpp_remove,
.raw_event = hidpp_raw_event,
+ .usage_table = hidpp_usages,
+ .event = hidpp_event,
.input_configured = hidpp_input_configured,
.input_mapping = hidpp_input_mapping,
.input_mapped = hidpp_input_mapped,
sensor_inst->hsdev,
sensor_inst->hsdev->usage,
usage, report_id,
- SENSOR_HUB_SYNC);
+ SENSOR_HUB_SYNC, false);
} else if (!strncmp(name, "units", strlen("units")))
value = sensor_inst->fields[field_index].attribute.units;
else if (!strncmp(name, "unit-expo", strlen("unit-expo")))
int sensor_hub_input_attr_get_raw_value(struct hid_sensor_hub_device *hsdev,
u32 usage_id,
u32 attr_usage_id, u32 report_id,
- enum sensor_hub_read_flags flag)
+ enum sensor_hub_read_flags flag,
+ bool is_signed)
{
struct sensor_hub_data *data = hid_get_drvdata(hsdev->hdev);
unsigned long flags;
&hsdev->pending.ready, HZ*5);
switch (hsdev->pending.raw_size) {
case 1:
- ret_val = *(u8 *)hsdev->pending.raw_data;
+ if (is_signed)
+ ret_val = *(s8 *)hsdev->pending.raw_data;
+ else
+ ret_val = *(u8 *)hsdev->pending.raw_data;
break;
case 2:
- ret_val = *(u16 *)hsdev->pending.raw_data;
+ if (is_signed)
+ ret_val = *(s16 *)hsdev->pending.raw_data;
+ else
+ ret_val = *(u16 *)hsdev->pending.raw_data;
break;
case 4:
ret_val = *(u32 *)hsdev->pending.raw_data;
/*
* The first byte of the report buffer is expected to be a report number.
- *
- * This function is to be called with the minors_lock mutex held.
*/
static ssize_t hidraw_send_report(struct file *file, const char __user *buffer, size_t count, unsigned char report_type)
{
__u8 *buf;
int ret = 0;
+ lockdep_assert_held(&minors_lock);
+
if (!hidraw_table[minor] || !hidraw_table[minor]->exist) {
ret = -ENODEV;
goto out;
* of buffer is the report number to request, or 0x0 if the defice does not
* use numbered reports. The report_type parameter can be HID_FEATURE_REPORT
* or HID_INPUT_REPORT.
- *
- * This function is to be called with the minors_lock mutex held.
*/
static ssize_t hidraw_get_report(struct file *file, char __user *buffer, size_t count, unsigned char report_type)
{
int ret = 0, len;
unsigned char report_number;
+ lockdep_assert_held(&minors_lock);
+
if (!hidraw_table[minor] || !hidraw_table[minor]->exist) {
ret = -ENODEV;
goto out;
{
int ret;
struct ish_hw *hw;
+ unsigned long irq_flag = 0;
struct ishtp_device *ishtp;
struct device *dev = &pdev->dev;
pdev->dev_flags |= PCI_DEV_FLAGS_NO_D3;
/* request and enable interrupt */
+ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+ if (!pdev->msi_enabled && !pdev->msix_enabled)
+ irq_flag = IRQF_SHARED;
+
ret = devm_request_irq(dev, pdev->irq, ish_irq_handler,
- IRQF_SHARED, KBUILD_MODNAME, ishtp);
+ irq_flag, KBUILD_MODNAME, ishtp);
if (ret) {
dev_err(dev, "ISH: request IRQ %d failed\n", pdev->irq);
return ret;
}
wait_for_completion(&msginfo->waitevent);
+ if (msginfo->response.gpadl_created.creation_status != 0) {
+ pr_err("Failed to establish GPADL: err = 0x%x\n",
+ msginfo->response.gpadl_created.creation_status);
+
+ ret = -EDQUOT;
+ goto cleanup;
+ }
+
if (channel->rescind) {
ret = -ENODEV;
goto cleanup;
}
}
-/*
- * vmbus_process_offer - Process the offer by creating a channel/device
- * associated with this offer
- */
-static void vmbus_process_offer(struct vmbus_channel *newchannel)
+/* Note: the function can run concurrently for primary/sub channels. */
+static void vmbus_add_channel_work(struct work_struct *work)
{
- struct vmbus_channel *channel;
- bool fnew = true;
+ struct vmbus_channel *newchannel =
+ container_of(work, struct vmbus_channel, add_channel_work);
+ struct vmbus_channel *primary_channel = newchannel->primary_channel;
unsigned long flags;
u16 dev_type;
int ret;
- /* Make sure this is a new offer */
- mutex_lock(&vmbus_connection.channel_mutex);
-
- /*
- * Now that we have acquired the channel_mutex,
- * we can release the potentially racing rescind thread.
- */
- atomic_dec(&vmbus_connection.offer_in_progress);
-
- list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
- if (!uuid_le_cmp(channel->offermsg.offer.if_type,
- newchannel->offermsg.offer.if_type) &&
- !uuid_le_cmp(channel->offermsg.offer.if_instance,
- newchannel->offermsg.offer.if_instance)) {
- fnew = false;
- break;
- }
- }
-
- if (fnew)
- list_add_tail(&newchannel->listentry,
- &vmbus_connection.chn_list);
-
- mutex_unlock(&vmbus_connection.channel_mutex);
-
- if (!fnew) {
- /*
- * Check to see if this is a sub-channel.
- */
- if (newchannel->offermsg.offer.sub_channel_index != 0) {
- /*
- * Process the sub-channel.
- */
- newchannel->primary_channel = channel;
- spin_lock_irqsave(&channel->lock, flags);
- list_add_tail(&newchannel->sc_list, &channel->sc_list);
- channel->num_sc++;
- spin_unlock_irqrestore(&channel->lock, flags);
- } else {
- goto err_free_chan;
- }
- }
-
dev_type = hv_get_dev_type(newchannel);
init_vp_index(newchannel, dev_type);
/*
* This state is used to indicate a successful open
* so that when we do close the channel normally, we
- * can cleanup properly
+ * can cleanup properly.
*/
newchannel->state = CHANNEL_OPEN_STATE;
- if (!fnew) {
- struct hv_device *dev
- = newchannel->primary_channel->device_obj;
+ if (primary_channel != NULL) {
+ /* newchannel is a sub-channel. */
+ struct hv_device *dev = primary_channel->device_obj;
if (vmbus_add_channel_kobj(dev, newchannel))
- goto err_free_chan;
+ goto err_deq_chan;
+
+ if (primary_channel->sc_creation_callback != NULL)
+ primary_channel->sc_creation_callback(newchannel);
- if (channel->sc_creation_callback != NULL)
- channel->sc_creation_callback(newchannel);
newchannel->probe_done = true;
return;
}
/*
- * Start the process of binding this offer to the driver
- * We need to set the DeviceObject field before calling
- * vmbus_child_dev_add()
+ * Start the process of binding the primary channel to the driver
*/
newchannel->device_obj = vmbus_device_create(
&newchannel->offermsg.offer.if_type,
err_deq_chan:
mutex_lock(&vmbus_connection.channel_mutex);
- list_del(&newchannel->listentry);
+
+ /*
+ * We need to set the flag, otherwise
+ * vmbus_onoffer_rescind() can be blocked.
+ */
+ newchannel->probe_done = true;
+
+ if (primary_channel == NULL) {
+ list_del(&newchannel->listentry);
+ } else {
+ spin_lock_irqsave(&primary_channel->lock, flags);
+ list_del(&newchannel->sc_list);
+ spin_unlock_irqrestore(&primary_channel->lock, flags);
+ }
+
mutex_unlock(&vmbus_connection.channel_mutex);
if (newchannel->target_cpu != get_cpu()) {
put_cpu();
smp_call_function_single(newchannel->target_cpu,
- percpu_channel_deq, newchannel, true);
+ percpu_channel_deq,
+ newchannel, true);
} else {
percpu_channel_deq(newchannel);
put_cpu();
vmbus_release_relid(newchannel->offermsg.child_relid);
-err_free_chan:
free_channel(newchannel);
}
+/*
+ * vmbus_process_offer - Process the offer by creating a channel/device
+ * associated with this offer
+ */
+static void vmbus_process_offer(struct vmbus_channel *newchannel)
+{
+ struct vmbus_channel *channel;
+ struct workqueue_struct *wq;
+ unsigned long flags;
+ bool fnew = true;
+
+ mutex_lock(&vmbus_connection.channel_mutex);
+
+ /*
+ * Now that we have acquired the channel_mutex,
+ * we can release the potentially racing rescind thread.
+ */
+ atomic_dec(&vmbus_connection.offer_in_progress);
+
+ list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
+ if (!uuid_le_cmp(channel->offermsg.offer.if_type,
+ newchannel->offermsg.offer.if_type) &&
+ !uuid_le_cmp(channel->offermsg.offer.if_instance,
+ newchannel->offermsg.offer.if_instance)) {
+ fnew = false;
+ break;
+ }
+ }
+
+ if (fnew)
+ list_add_tail(&newchannel->listentry,
+ &vmbus_connection.chn_list);
+ else {
+ /*
+ * Check to see if this is a valid sub-channel.
+ */
+ if (newchannel->offermsg.offer.sub_channel_index == 0) {
+ mutex_unlock(&vmbus_connection.channel_mutex);
+ /*
+ * Don't call free_channel(), because newchannel->kobj
+ * is not initialized yet.
+ */
+ kfree(newchannel);
+ WARN_ON_ONCE(1);
+ return;
+ }
+ /*
+ * Process the sub-channel.
+ */
+ newchannel->primary_channel = channel;
+ spin_lock_irqsave(&channel->lock, flags);
+ list_add_tail(&newchannel->sc_list, &channel->sc_list);
+ spin_unlock_irqrestore(&channel->lock, flags);
+ }
+
+ mutex_unlock(&vmbus_connection.channel_mutex);
+
+ /*
+ * vmbus_process_offer() mustn't call channel->sc_creation_callback()
+ * directly for sub-channels, because sc_creation_callback() ->
+ * vmbus_open() may never get the host's response to the
+ * OPEN_CHANNEL message (the host may rescind a channel at any time,
+ * e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind()
+ * may not wake up the vmbus_open() as it's blocked due to a non-zero
+ * vmbus_connection.offer_in_progress, and finally we have a deadlock.
+ *
+ * The above is also true for primary channels, if the related device
+ * drivers use sync probing mode by default.
+ *
+ * And, usually the handling of primary channels and sub-channels can
+ * depend on each other, so we should offload them to different
+ * workqueues to avoid possible deadlock, e.g. in sync-probing mode,
+ * NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() ->
+ * rtnl_lock(), and causes deadlock: the former gets the rtnl_lock
+ * and waits for all the sub-channels to appear, but the latter
+ * can't get the rtnl_lock and this blocks the handling of
+ * sub-channels.
+ */
+ INIT_WORK(&newchannel->add_channel_work, vmbus_add_channel_work);
+ wq = fnew ? vmbus_connection.handle_primary_chan_wq :
+ vmbus_connection.handle_sub_chan_wq;
+ queue_work(wq, &newchannel->add_channel_work);
+}
+
/*
* We use this state to statically distribute the channel interrupt load.
*/
static int next_numa_node_id;
+/*
+ * init_vp_index() accesses global variables like next_numa_node_id, and
+ * it can run concurrently for primary channels and sub-channels: see
+ * vmbus_process_offer(), so we need the lock to protect the global
+ * variables.
+ */
+static DEFINE_SPINLOCK(bind_channel_to_cpu_lock);
/*
* Starting with Win8, we can statically distribute the incoming
return;
}
+ spin_lock(&bind_channel_to_cpu_lock);
+
/*
* Based on the channel affinity policy, we will assign the NUMA
* nodes.
channel->target_cpu = cur_cpu;
channel->target_vp = hv_cpu_number_to_vp_number(cur_cpu);
+ spin_unlock(&bind_channel_to_cpu_lock);
+
free_cpumask_var(available_mask);
}
goto cleanup;
}
+ vmbus_connection.handle_primary_chan_wq =
+ create_workqueue("hv_pri_chan");
+ if (!vmbus_connection.handle_primary_chan_wq) {
+ ret = -ENOMEM;
+ goto cleanup;
+ }
+
+ vmbus_connection.handle_sub_chan_wq =
+ create_workqueue("hv_sub_chan");
+ if (!vmbus_connection.handle_sub_chan_wq) {
+ ret = -ENOMEM;
+ goto cleanup;
+ }
+
INIT_LIST_HEAD(&vmbus_connection.chn_msg_list);
spin_lock_init(&vmbus_connection.channelmsg_lock);
*/
vmbus_initiate_unload(false);
- if (vmbus_connection.work_queue) {
- drain_workqueue(vmbus_connection.work_queue);
+ if (vmbus_connection.handle_sub_chan_wq)
+ destroy_workqueue(vmbus_connection.handle_sub_chan_wq);
+
+ if (vmbus_connection.handle_primary_chan_wq)
+ destroy_workqueue(vmbus_connection.handle_primary_chan_wq);
+
+ if (vmbus_connection.work_queue)
destroy_workqueue(vmbus_connection.work_queue);
- }
if (vmbus_connection.int_page) {
free_pages((unsigned long)vmbus_connection.int_page, 0);
out->body.kvp_ip_val.dhcp_enabled = in->kvp_ip_val.dhcp_enabled;
+ /* fallthrough */
+
+ case KVP_OP_GET_IP_INFO:
utf16s_to_utf8s((wchar_t *)in->kvp_ip_val.adapter_id,
MAX_ADAPTER_ID_SIZE,
UTF16_LITTLE_ENDIAN,
process_ib_ipinfo(in_msg, message, KVP_OP_SET_IP_INFO);
break;
case KVP_OP_GET_IP_INFO:
- /* We only need to pass on message->kvp_hdr.operation. */
+ /*
+ * We only need to pass on the info of operation, adapter_id
+ * and addr_family to the userland kvp daemon.
+ */
+ process_ib_ipinfo(in_msg, message, KVP_OP_GET_IP_INFO);
break;
case KVP_OP_SET:
switch (in_msg->body.kvp_set.data.value_type) {
}
- break;
-
- case KVP_OP_GET:
+ /*
+ * The key is always a string - utf16 encoding.
+ */
message->body.kvp_set.data.key_size =
utf16s_to_utf8s(
(wchar_t *)in_msg->body.kvp_set.data.key,
UTF16_LITTLE_ENDIAN,
message->body.kvp_set.data.key,
HV_KVP_EXCHANGE_MAX_KEY_SIZE - 1) + 1;
+
+ break;
+
+ case KVP_OP_GET:
+ message->body.kvp_get.data.key_size =
+ utf16s_to_utf8s(
+ (wchar_t *)in_msg->body.kvp_get.data.key,
+ in_msg->body.kvp_get.data.key_size,
+ UTF16_LITTLE_ENDIAN,
+ message->body.kvp_get.data.key,
+ HV_KVP_EXCHANGE_MAX_KEY_SIZE - 1) + 1;
break;
case KVP_OP_DELETE:
struct list_head chn_list;
struct mutex channel_mutex;
+ /*
+ * An offer message is handled first on the work_queue, and then
+ * is further handled on handle_primary_chan_wq or
+ * handle_sub_chan_wq.
+ */
struct workqueue_struct *work_queue;
+ struct workqueue_struct *handle_primary_chan_wq;
+ struct workqueue_struct *handle_sub_chan_wq;
};
if (info[i]->config[j] & HWMON_T_INPUT) {
err = hwmon_thermal_add_sensor(dev,
hwdev, j);
- if (err)
- goto free_device;
+ if (err) {
+ device_unregister(hdev);
+ goto ida_remove;
+ }
}
}
}
return hdev;
-free_device:
- device_unregister(hdev);
free_hwmon:
kfree(hwdev);
ida_remove:
return sprintf(buf, "%s\n", sdata->label);
}
-static int __init get_logical_cpu(int hwcpu)
+static int get_logical_cpu(int hwcpu)
{
int cpu;
return -ENOENT;
}
-static void __init make_sensor_label(struct device_node *np,
- struct sensor_data *sdata,
- const char *label)
+static void make_sensor_label(struct device_node *np,
+ struct sensor_data *sdata, const char *label)
{
u32 id;
size_t n;
break;
case INA2XX_CURRENT:
/* signed register, result in mA */
- val = regval * data->current_lsb_uA;
+ val = (s16)regval * data->current_lsb_uA;
val = DIV_ROUND_CLOSEST(val, 1000);
break;
case INA2XX_CALIBRATION:
}
data->groups[group++] = &ina2xx_group;
- if (id->driver_data == ina226)
+ if (chip == ina226)
data->groups[group++] = &ina226_group;
hwmon_dev = devm_hwmon_device_register_with_groups(dev, client->name,
return PTR_ERR(hwmon_dev);
dev_info(dev, "power monitor %s (Rshunt = %li uOhm)\n",
- id->name, data->rshunt);
+ client->name, data->rshunt);
return 0;
}
*/
#define MLXREG_FAN_GET_RPM(rval, d, s) (DIV_ROUND_CLOSEST(15000000 * 100, \
((rval) + (s)) * (d)))
-#define MLXREG_FAN_GET_FAULT(val, mask) (!!((val) ^ (mask)))
+#define MLXREG_FAN_GET_FAULT(val, mask) (!((val) ^ (mask)))
#define MLXREG_FAN_PWM_DUTY2STATE(duty) (DIV_ROUND_CLOSEST((duty) * \
MLXREG_FAN_MAX_STATE, \
MLXREG_FAN_MAX_DUTY))
{
struct device *dev = &pdev->dev;
struct rpi_hwmon_data *data;
- int ret;
data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
if (!data)
/* Parent driver assure that firmware is correct */
data->fw = dev_get_drvdata(dev->parent);
- /* Init throttled */
- ret = rpi_firmware_property(data->fw, RPI_FIRMWARE_GET_THROTTLED,
- &data->last_throttled,
- sizeof(data->last_throttled));
-
data->hwmon_dev = devm_hwmon_device_register_with_info(dev, "rpi_volt",
data,
&rpi_chip_info,
* somewhere else in the code
*/
#define SENSOR_ATTR_TEMP(index) { \
- SENSOR_ATTR_2(temp##index##_type, S_IRUGO | (index < 4 ? S_IWUSR : 0), \
+ SENSOR_ATTR_2(temp##index##_type, S_IRUGO | (index < 5 ? S_IWUSR : 0), \
show_temp_mode, store_temp_mode, NOT_USED, index - 1), \
SENSOR_ATTR_2(temp##index##_input, S_IRUGO, show_temp, \
NULL, TEMP_READ, index - 1), \
This driver can also be built as a module. If so, the module
will be called i2c-nforce2-s4985.
+config I2C_NVIDIA_GPU
+ tristate "NVIDIA GPU I2C controller"
+ depends on PCI
+ help
+ If you say yes to this option, support will be included for the
+ NVIDIA GPU I2C controller which is used to communicate with the GPU's
+ Type-C controller. This driver can also be built as a module called
+ i2c-nvidia-gpu.
+
config I2C_SIS5595
tristate "SiS 5595"
depends on PCI
config I2C_OMAP
tristate "OMAP I2C adapter"
- depends on ARCH_OMAP
+ depends on ARCH_OMAP || ARCH_K3
default y if MACH_OMAP_H3 || MACH_OMAP_OSK
help
If you say yes to this option, support will be included for the
obj-$(CONFIG_I2C_ISMT) += i2c-ismt.o
obj-$(CONFIG_I2C_NFORCE2) += i2c-nforce2.o
obj-$(CONFIG_I2C_NFORCE2_S4985) += i2c-nforce2-s4985.o
+obj-$(CONFIG_I2C_NVIDIA_GPU) += i2c-nvidia-gpu.o
obj-$(CONFIG_I2C_PIIX4) += i2c-piix4.o
obj-$(CONFIG_I2C_SIS5595) += i2c-sis5595.o
obj-$(CONFIG_I2C_SIS630) += i2c-sis630.o
MST_STATUS_ND)
#define MST_STATUS_ERR (MST_STATUS_NAK | \
MST_STATUS_AL | \
- MST_STATUS_IP | \
- MST_STATUS_TSS)
+ MST_STATUS_IP)
#define MST_TX_BYTES_XFRD 0x50
#define MST_RX_BYTES_XFRD 0x54
#define SCL_HIGH_PERIOD 0x80
*/
if (c <= 0 || c > I2C_SMBUS_BLOCK_MAX) {
idev->msg_err = -EPROTO;
- i2c_int_disable(idev, ~0);
+ i2c_int_disable(idev, ~MST_STATUS_TSS);
complete(&idev->msg_complete);
break;
}
if (status & MST_STATUS_SCC) {
/* Stop completed */
- i2c_int_disable(idev, ~0);
+ i2c_int_disable(idev, ~MST_STATUS_TSS);
complete(&idev->msg_complete);
} else if (status & MST_STATUS_SNS) {
/* Transfer done */
- i2c_int_disable(idev, ~0);
+ i2c_int_disable(idev, ~MST_STATUS_TSS);
if (i2c_m_rd(idev->msg) && idev->msg_xfrd < idev->msg->len)
axxia_i2c_empty_rx_fifo(idev);
complete(&idev->msg_complete);
+ } else if (status & MST_STATUS_TSS) {
+ /* Transfer timeout */
+ idev->msg_err = -ETIMEDOUT;
+ i2c_int_disable(idev, ~MST_STATUS_TSS);
+ complete(&idev->msg_complete);
} else if (unlikely(status & MST_STATUS_ERR)) {
/* Transfer error */
i2c_int_disable(idev, ~0);
u32 rx_xfer, tx_xfer;
u32 addr_1, addr_2;
unsigned long time_left;
+ unsigned int wt_value;
idev->msg = msg;
idev->msg_xfrd = 0;
- idev->msg_err = 0;
reinit_completion(&idev->msg_complete);
if (i2c_m_ten(msg)) {
else if (axxia_i2c_fill_tx_fifo(idev) != 0)
int_mask |= MST_STATUS_TFL;
+ wt_value = WT_VALUE(readl(idev->base + WAIT_TIMER_CONTROL));
+ /* Disable wait timer temporarly */
+ writel(wt_value, idev->base + WAIT_TIMER_CONTROL);
+ /* Check if timeout error happened */
+ if (idev->msg_err)
+ goto out;
+
/* Start manual mode */
writel(CMD_MANUAL, idev->base + MST_COMMAND);
+ writel(WT_EN | wt_value, idev->base + WAIT_TIMER_CONTROL);
+
i2c_int_enable(idev, int_mask);
time_left = wait_for_completion_timeout(&idev->msg_complete,
if (readl(idev->base + MST_COMMAND) & CMD_BUSY)
dev_warn(idev->dev, "busy after xfer\n");
- if (time_left == 0)
+ if (time_left == 0) {
idev->msg_err = -ETIMEDOUT;
-
- if (idev->msg_err == -ETIMEDOUT)
i2c_recover_bus(&idev->adapter);
+ axxia_i2c_init(idev);
+ }
- if (unlikely(idev->msg_err) && idev->msg_err != -ENXIO)
+out:
+ if (unlikely(idev->msg_err) && idev->msg_err != -ENXIO &&
+ idev->msg_err != -ETIMEDOUT)
axxia_i2c_init(idev);
return idev->msg_err;
static int axxia_i2c_stop(struct axxia_i2c_dev *idev)
{
- u32 int_mask = MST_STATUS_ERR | MST_STATUS_SCC;
+ u32 int_mask = MST_STATUS_ERR | MST_STATUS_SCC | MST_STATUS_TSS;
unsigned long time_left;
reinit_completion(&idev->msg_complete);
int i;
int ret = 0;
+ idev->msg_err = 0;
+ i2c_int_enable(idev, MST_STATUS_TSS);
+
for (i = 0; ret == 0 && i < num; ++i)
ret = axxia_i2c_xfer_msg(idev, &msgs[i]);
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Nvidia GPU I2C controller Driver
+ *
+ * Copyright (C) 2018 NVIDIA Corporation. All rights reserved.
+ * Author: Ajay Gupta <ajayg@nvidia.com>
+ */
+#include <linux/delay.h>
+#include <linux/i2c.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+
+#include <asm/unaligned.h>
+
+/* I2C definitions */
+#define I2C_MST_CNTL 0x00
+#define I2C_MST_CNTL_GEN_START BIT(0)
+#define I2C_MST_CNTL_GEN_STOP BIT(1)
+#define I2C_MST_CNTL_CMD_READ (1 << 2)
+#define I2C_MST_CNTL_CMD_WRITE (2 << 2)
+#define I2C_MST_CNTL_BURST_SIZE_SHIFT 6
+#define I2C_MST_CNTL_GEN_NACK BIT(28)
+#define I2C_MST_CNTL_STATUS GENMASK(30, 29)
+#define I2C_MST_CNTL_STATUS_OKAY (0 << 29)
+#define I2C_MST_CNTL_STATUS_NO_ACK (1 << 29)
+#define I2C_MST_CNTL_STATUS_TIMEOUT (2 << 29)
+#define I2C_MST_CNTL_STATUS_BUS_BUSY (3 << 29)
+#define I2C_MST_CNTL_CYCLE_TRIGGER BIT(31)
+
+#define I2C_MST_ADDR 0x04
+
+#define I2C_MST_I2C0_TIMING 0x08
+#define I2C_MST_I2C0_TIMING_SCL_PERIOD_100KHZ 0x10e
+#define I2C_MST_I2C0_TIMING_TIMEOUT_CLK_CNT 16
+#define I2C_MST_I2C0_TIMING_TIMEOUT_CLK_CNT_MAX 255
+#define I2C_MST_I2C0_TIMING_TIMEOUT_CHECK BIT(24)
+
+#define I2C_MST_DATA 0x0c
+
+#define I2C_MST_HYBRID_PADCTL 0x20
+#define I2C_MST_HYBRID_PADCTL_MODE_I2C BIT(0)
+#define I2C_MST_HYBRID_PADCTL_I2C_SCL_INPUT_RCV BIT(14)
+#define I2C_MST_HYBRID_PADCTL_I2C_SDA_INPUT_RCV BIT(15)
+
+struct gpu_i2c_dev {
+ struct device *dev;
+ void __iomem *regs;
+ struct i2c_adapter adapter;
+ struct i2c_board_info *gpu_ccgx_ucsi;
+};
+
+static void gpu_enable_i2c_bus(struct gpu_i2c_dev *i2cd)
+{
+ u32 val;
+
+ /* enable I2C */
+ val = readl(i2cd->regs + I2C_MST_HYBRID_PADCTL);
+ val |= I2C_MST_HYBRID_PADCTL_MODE_I2C |
+ I2C_MST_HYBRID_PADCTL_I2C_SCL_INPUT_RCV |
+ I2C_MST_HYBRID_PADCTL_I2C_SDA_INPUT_RCV;
+ writel(val, i2cd->regs + I2C_MST_HYBRID_PADCTL);
+
+ /* enable 100KHZ mode */
+ val = I2C_MST_I2C0_TIMING_SCL_PERIOD_100KHZ;
+ val |= (I2C_MST_I2C0_TIMING_TIMEOUT_CLK_CNT_MAX
+ << I2C_MST_I2C0_TIMING_TIMEOUT_CLK_CNT);
+ val |= I2C_MST_I2C0_TIMING_TIMEOUT_CHECK;
+ writel(val, i2cd->regs + I2C_MST_I2C0_TIMING);
+}
+
+static int gpu_i2c_check_status(struct gpu_i2c_dev *i2cd)
+{
+ unsigned long target = jiffies + msecs_to_jiffies(1000);
+ u32 val;
+
+ do {
+ val = readl(i2cd->regs + I2C_MST_CNTL);
+ if (!(val & I2C_MST_CNTL_CYCLE_TRIGGER))
+ break;
+ if ((val & I2C_MST_CNTL_STATUS) !=
+ I2C_MST_CNTL_STATUS_BUS_BUSY)
+ break;
+ usleep_range(500, 600);
+ } while (time_is_after_jiffies(target));
+
+ if (time_is_before_jiffies(target)) {
+ dev_err(i2cd->dev, "i2c timeout error %x\n", val);
+ return -ETIMEDOUT;
+ }
+
+ val = readl(i2cd->regs + I2C_MST_CNTL);
+ switch (val & I2C_MST_CNTL_STATUS) {
+ case I2C_MST_CNTL_STATUS_OKAY:
+ return 0;
+ case I2C_MST_CNTL_STATUS_NO_ACK:
+ return -ENXIO;
+ case I2C_MST_CNTL_STATUS_TIMEOUT:
+ return -ETIMEDOUT;
+ default:
+ return 0;
+ }
+}
+
+static int gpu_i2c_read(struct gpu_i2c_dev *i2cd, u8 *data, u16 len)
+{
+ int status;
+ u32 val;
+
+ val = I2C_MST_CNTL_GEN_START | I2C_MST_CNTL_CMD_READ |
+ (len << I2C_MST_CNTL_BURST_SIZE_SHIFT) |
+ I2C_MST_CNTL_CYCLE_TRIGGER | I2C_MST_CNTL_GEN_NACK;
+ writel(val, i2cd->regs + I2C_MST_CNTL);
+
+ status = gpu_i2c_check_status(i2cd);
+ if (status < 0)
+ return status;
+
+ val = readl(i2cd->regs + I2C_MST_DATA);
+ switch (len) {
+ case 1:
+ data[0] = val;
+ break;
+ case 2:
+ put_unaligned_be16(val, data);
+ break;
+ case 3:
+ put_unaligned_be16(val >> 8, data);
+ data[2] = val;
+ break;
+ case 4:
+ put_unaligned_be32(val, data);
+ break;
+ default:
+ break;
+ }
+ return status;
+}
+
+static int gpu_i2c_start(struct gpu_i2c_dev *i2cd)
+{
+ writel(I2C_MST_CNTL_GEN_START, i2cd->regs + I2C_MST_CNTL);
+ return gpu_i2c_check_status(i2cd);
+}
+
+static int gpu_i2c_stop(struct gpu_i2c_dev *i2cd)
+{
+ writel(I2C_MST_CNTL_GEN_STOP, i2cd->regs + I2C_MST_CNTL);
+ return gpu_i2c_check_status(i2cd);
+}
+
+static int gpu_i2c_write(struct gpu_i2c_dev *i2cd, u8 data)
+{
+ u32 val;
+
+ writel(data, i2cd->regs + I2C_MST_DATA);
+
+ val = I2C_MST_CNTL_CMD_WRITE | (1 << I2C_MST_CNTL_BURST_SIZE_SHIFT);
+ writel(val, i2cd->regs + I2C_MST_CNTL);
+
+ return gpu_i2c_check_status(i2cd);
+}
+
+static int gpu_i2c_master_xfer(struct i2c_adapter *adap,
+ struct i2c_msg *msgs, int num)
+{
+ struct gpu_i2c_dev *i2cd = i2c_get_adapdata(adap);
+ int status, status2;
+ int i, j;
+
+ /*
+ * The controller supports maximum 4 byte read due to known
+ * limitation of sending STOP after every read.
+ */
+ for (i = 0; i < num; i++) {
+ if (msgs[i].flags & I2C_M_RD) {
+ /* program client address before starting read */
+ writel(msgs[i].addr, i2cd->regs + I2C_MST_ADDR);
+ /* gpu_i2c_read has implicit start */
+ status = gpu_i2c_read(i2cd, msgs[i].buf, msgs[i].len);
+ if (status < 0)
+ goto stop;
+ } else {
+ u8 addr = i2c_8bit_addr_from_msg(msgs + i);
+
+ status = gpu_i2c_start(i2cd);
+ if (status < 0) {
+ if (i == 0)
+ return status;
+ goto stop;
+ }
+
+ status = gpu_i2c_write(i2cd, addr);
+ if (status < 0)
+ goto stop;
+
+ for (j = 0; j < msgs[i].len; j++) {
+ status = gpu_i2c_write(i2cd, msgs[i].buf[j]);
+ if (status < 0)
+ goto stop;
+ }
+ }
+ }
+ status = gpu_i2c_stop(i2cd);
+ if (status < 0)
+ return status;
+
+ return i;
+stop:
+ status2 = gpu_i2c_stop(i2cd);
+ if (status2 < 0)
+ dev_err(i2cd->dev, "i2c stop failed %d\n", status2);
+ return status;
+}
+
+static const struct i2c_adapter_quirks gpu_i2c_quirks = {
+ .max_read_len = 4,
+ .max_comb_2nd_msg_len = 4,
+ .flags = I2C_AQ_COMB_WRITE_THEN_READ,
+};
+
+static u32 gpu_i2c_functionality(struct i2c_adapter *adap)
+{
+ return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL;
+}
+
+static const struct i2c_algorithm gpu_i2c_algorithm = {
+ .master_xfer = gpu_i2c_master_xfer,
+ .functionality = gpu_i2c_functionality,
+};
+
+/*
+ * This driver is for Nvidia GPU cards with USB Type-C interface.
+ * We want to identify the cards using vendor ID and class code only
+ * to avoid dependency of adding product id for any new card which
+ * requires this driver.
+ * Currently there is no class code defined for UCSI device over PCI
+ * so using UNKNOWN class for now and it will be updated when UCSI
+ * over PCI gets a class code.
+ * There is no other NVIDIA cards with UNKNOWN class code. Even if the
+ * driver gets loaded for an undesired card then eventually i2c_read()
+ * (initiated from UCSI i2c_client) will timeout or UCSI commands will
+ * timeout.
+ */
+#define PCI_CLASS_SERIAL_UNKNOWN 0x0c80
+static const struct pci_device_id gpu_i2c_ids[] = {
+ { PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
+ PCI_CLASS_SERIAL_UNKNOWN << 8, 0xffffff00},
+ { }
+};
+MODULE_DEVICE_TABLE(pci, gpu_i2c_ids);
+
+static int gpu_populate_client(struct gpu_i2c_dev *i2cd, int irq)
+{
+ struct i2c_client *ccgx_client;
+
+ i2cd->gpu_ccgx_ucsi = devm_kzalloc(i2cd->dev,
+ sizeof(*i2cd->gpu_ccgx_ucsi),
+ GFP_KERNEL);
+ if (!i2cd->gpu_ccgx_ucsi)
+ return -ENOMEM;
+
+ strlcpy(i2cd->gpu_ccgx_ucsi->type, "ccgx-ucsi",
+ sizeof(i2cd->gpu_ccgx_ucsi->type));
+ i2cd->gpu_ccgx_ucsi->addr = 0x8;
+ i2cd->gpu_ccgx_ucsi->irq = irq;
+ ccgx_client = i2c_new_device(&i2cd->adapter, i2cd->gpu_ccgx_ucsi);
+ if (!ccgx_client)
+ return -ENODEV;
+
+ return 0;
+}
+
+static int gpu_i2c_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct gpu_i2c_dev *i2cd;
+ int status;
+
+ i2cd = devm_kzalloc(&pdev->dev, sizeof(*i2cd), GFP_KERNEL);
+ if (!i2cd)
+ return -ENOMEM;
+
+ i2cd->dev = &pdev->dev;
+ dev_set_drvdata(&pdev->dev, i2cd);
+
+ status = pcim_enable_device(pdev);
+ if (status < 0) {
+ dev_err(&pdev->dev, "pcim_enable_device failed %d\n", status);
+ return status;
+ }
+
+ pci_set_master(pdev);
+
+ i2cd->regs = pcim_iomap(pdev, 0, 0);
+ if (!i2cd->regs) {
+ dev_err(&pdev->dev, "pcim_iomap failed\n");
+ return -ENOMEM;
+ }
+
+ status = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
+ if (status < 0) {
+ dev_err(&pdev->dev, "pci_alloc_irq_vectors err %d\n", status);
+ return status;
+ }
+
+ gpu_enable_i2c_bus(i2cd);
+
+ i2c_set_adapdata(&i2cd->adapter, i2cd);
+ i2cd->adapter.owner = THIS_MODULE;
+ strlcpy(i2cd->adapter.name, "NVIDIA GPU I2C adapter",
+ sizeof(i2cd->adapter.name));
+ i2cd->adapter.algo = &gpu_i2c_algorithm;
+ i2cd->adapter.quirks = &gpu_i2c_quirks;
+ i2cd->adapter.dev.parent = &pdev->dev;
+ status = i2c_add_adapter(&i2cd->adapter);
+ if (status < 0)
+ goto free_irq_vectors;
+
+ status = gpu_populate_client(i2cd, pdev->irq);
+ if (status < 0) {
+ dev_err(&pdev->dev, "gpu_populate_client failed %d\n", status);
+ goto del_adapter;
+ }
+
+ return 0;
+
+del_adapter:
+ i2c_del_adapter(&i2cd->adapter);
+free_irq_vectors:
+ pci_free_irq_vectors(pdev);
+ return status;
+}
+
+static void gpu_i2c_remove(struct pci_dev *pdev)
+{
+ struct gpu_i2c_dev *i2cd = dev_get_drvdata(&pdev->dev);
+
+ i2c_del_adapter(&i2cd->adapter);
+ pci_free_irq_vectors(pdev);
+}
+
+static int gpu_i2c_resume(struct device *dev)
+{
+ struct gpu_i2c_dev *i2cd = dev_get_drvdata(dev);
+
+ gpu_enable_i2c_bus(i2cd);
+ return 0;
+}
+
+static UNIVERSAL_DEV_PM_OPS(gpu_i2c_driver_pm, NULL, gpu_i2c_resume, NULL);
+
+static struct pci_driver gpu_i2c_driver = {
+ .name = "nvidia-gpu",
+ .id_table = gpu_i2c_ids,
+ .probe = gpu_i2c_probe,
+ .remove = gpu_i2c_remove,
+ .driver = {
+ .pm = &gpu_i2c_driver_pm,
+ },
+};
+
+module_pci_driver(gpu_i2c_driver);
+
+MODULE_AUTHOR("Ajay Gupta <ajayg@nvidia.com>");
+MODULE_DESCRIPTION("Nvidia GPU I2C controller Driver");
+MODULE_LICENSE("GPL v2");
dev_dbg(&pdev->dev, "i2c fifo/se-dma mode. fifo depth:%d\n", tx_depth);
- ret = i2c_add_adapter(&gi2c->adap);
- if (ret) {
- dev_err(&pdev->dev, "Error adding i2c adapter %d\n", ret);
- return ret;
- }
-
gi2c->suspended = 1;
pm_runtime_set_suspended(gi2c->se.dev);
pm_runtime_set_autosuspend_delay(gi2c->se.dev, I2C_AUTO_SUSPEND_DELAY);
pm_runtime_use_autosuspend(gi2c->se.dev);
pm_runtime_enable(gi2c->se.dev);
+ ret = i2c_add_adapter(&gi2c->adap);
+ if (ret) {
+ dev_err(&pdev->dev, "Error adding i2c adapter %d\n", ret);
+ pm_runtime_disable(gi2c->se.dev);
+ return ret;
+ }
+
return 0;
}
{
struct geni_i2c_dev *gi2c = platform_get_drvdata(pdev);
- pm_runtime_disable(gi2c->se.dev);
i2c_del_adapter(&gi2c->adap);
+ pm_runtime_disable(gi2c->se.dev);
return 0;
}
pm_runtime_get_sync(dev);
+ /* Check bus state before init otherwise bus busy info will be lost */
+ ret = rcar_i2c_bus_barrier(priv);
+ if (ret < 0)
+ goto out;
+
/* Gen3 needs a reset before allowing RXDMA once */
if (priv->devtype == I2C_RCAR_GEN3) {
priv->flags |= ID_P_NO_RXDMA;
rcar_i2c_init(priv);
- ret = rcar_i2c_bus_barrier(priv);
- if (ret < 0)
- goto out;
-
for (i = 0; i < num; i++)
rcar_i2c_request_dma(priv, msgs + i);
{
struct acpi_smbus_cmi *smbus_cmi;
const struct acpi_device_id *id;
+ int ret;
smbus_cmi = kzalloc(sizeof(struct acpi_smbus_cmi), GFP_KERNEL);
if (!smbus_cmi)
acpi_walk_namespace(ACPI_TYPE_METHOD, smbus_cmi->handle, 1,
acpi_smbus_cmi_query_methods, NULL, smbus_cmi, NULL);
- if (smbus_cmi->cap_info == 0)
+ if (smbus_cmi->cap_info == 0) {
+ ret = -ENODEV;
goto err;
+ }
snprintf(smbus_cmi->adapter.name, sizeof(smbus_cmi->adapter.name),
"SMBus CMI adapter %s",
smbus_cmi->adapter.class = I2C_CLASS_HWMON | I2C_CLASS_SPD;
smbus_cmi->adapter.dev.parent = &device->dev;
- if (i2c_add_adapter(&smbus_cmi->adapter)) {
+ ret = i2c_add_adapter(&smbus_cmi->adapter);
+ if (ret) {
dev_err(&device->dev, "Couldn't register adapter!\n");
goto err;
}
err:
kfree(smbus_cmi);
device->driver_data = NULL;
- return -EIO;
+ return ret;
}
static int acpi_smbus_cmi_remove(struct acpi_device *device)
"interrupt: enabled_irqs=%04x, irq_status=%04x\n",
priv->enabled_irqs, irq_status);
- uniphier_fi2c_clear_irqs(priv, irq_status);
-
if (irq_status & UNIPHIER_FI2C_INT_STOP)
goto complete;
if (irq_status & (UNIPHIER_FI2C_INT_RF | UNIPHIER_FI2C_INT_RB)) {
uniphier_fi2c_drain_rxfifo(priv);
- if (!priv->len)
+ /*
+ * If the number of bytes to read is multiple of the FIFO size
+ * (msg->len == 8, 16, 24, ...), the INT_RF bit is set a little
+ * earlier than INT_RB. We wait for INT_RB to confirm the
+ * completion of the current message.
+ */
+ if (!priv->len && (irq_status & UNIPHIER_FI2C_INT_RB))
goto data_done;
if (unlikely(priv->flags & UNIPHIER_FI2C_MANUAL_NACK)) {
}
handled:
+ /*
+ * This controller makes a pause while any bit of the IRQ status is
+ * asserted. Clear the asserted bit to kick the controller just before
+ * exiting the handler.
+ */
+ uniphier_fi2c_clear_irqs(priv, irq_status);
+
spin_unlock(&priv->lock);
return IRQ_HANDLED;
}
-static void uniphier_fi2c_tx_init(struct uniphier_fi2c_priv *priv, u16 addr)
+static void uniphier_fi2c_tx_init(struct uniphier_fi2c_priv *priv, u16 addr,
+ bool repeat)
{
priv->enabled_irqs |= UNIPHIER_FI2C_INT_TE;
uniphier_fi2c_set_irqs(priv);
/* set slave address */
writel(UNIPHIER_FI2C_DTTX_CMD | addr << 1,
priv->membase + UNIPHIER_FI2C_DTTX);
- /* first chunk of data */
- uniphier_fi2c_fill_txfifo(priv, true);
+ /*
+ * First chunk of data. For a repeated START condition, do not write
+ * data to the TX fifo here to avoid the timing issue.
+ */
+ if (!repeat)
+ uniphier_fi2c_fill_txfifo(priv, true);
}
static void uniphier_fi2c_rx_init(struct uniphier_fi2c_priv *priv, u16 addr)
if (is_read)
uniphier_fi2c_rx_init(priv, msg->addr);
else
- uniphier_fi2c_tx_init(priv, msg->addr);
+ uniphier_fi2c_tx_init(priv, msg->addr, repeat);
dev_dbg(&adap->dev, "start condition\n");
/*
uniphier_fi2c_reset(priv);
+ /*
+ * Standard-mode: tLOW + tHIGH = 10 us
+ * Fast-mode: tLOW + tHIGH = 2.5 us
+ */
writel(cyc, priv->membase + UNIPHIER_FI2C_CYC);
- writel(cyc / 2, priv->membase + UNIPHIER_FI2C_LCTL);
+ /*
+ * Standard-mode: tLOW = 4.7 us, tHIGH = 4.0 us, tBUF = 4.7 us
+ * Fast-mode: tLOW = 1.3 us, tHIGH = 0.6 us, tBUF = 1.3 us
+ * "tLow/tHIGH = 5/4" meets both.
+ */
+ writel(cyc * 5 / 9, priv->membase + UNIPHIER_FI2C_LCTL);
+ /*
+ * Standard-mode: tHD;STA = 4.0 us, tSU;STA = 4.7 us, tSU;STO = 4.0 us
+ * Fast-mode: tHD;STA = 0.6 us, tSU;STA = 0.6 us, tSU;STO = 0.6 us
+ */
writel(cyc / 2, priv->membase + UNIPHIER_FI2C_SSUT);
+ /*
+ * Standard-mode: tSU;DAT = 250 ns
+ * Fast-mode: tSU;DAT = 100 ns
+ */
writel(cyc / 16, priv->membase + UNIPHIER_FI2C_DSUT);
uniphier_fi2c_prepare_operation(priv);
uniphier_i2c_reset(priv, true);
- writel((cyc / 2 << 16) | cyc, priv->membase + UNIPHIER_I2C_CLK);
+ /*
+ * Bit30-16: clock cycles of tLOW.
+ * Standard-mode: tLOW = 4.7 us, tHIGH = 4.0 us
+ * Fast-mode: tLOW = 1.3 us, tHIGH = 0.6 us
+ * "tLow/tHIGH = 5/4" meets both.
+ */
+ writel((cyc * 5 / 9 << 16) | cyc, priv->membase + UNIPHIER_I2C_CLK);
uniphier_i2c_reset(priv, false);
}
if (client->flags & I2C_CLIENT_TEN)
return -EINVAL;
- irq = irq_find_mapping(adap->host_notify_domain, client->addr);
- if (!irq)
- irq = irq_create_mapping(adap->host_notify_domain,
- client->addr);
+ irq = irq_create_mapping(adap->host_notify_domain, client->addr);
return irq > 0 ? irq : -ENXIO;
}
dev_pm_clear_wake_irq(&client->dev);
device_init_wakeup(&client->dev, false);
+ client->irq = 0;
+
return status;
}
return 0;
}
-static int ide_drivers_open(struct inode *inode, struct file *file)
-{
- return single_open(file, &ide_drivers_show, NULL);
-}
-
-static const struct file_operations ide_drivers_operations = {
- .owner = THIS_MODULE,
- .open = ide_drivers_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(ide_drivers);
void proc_ide_create(void)
{
if (!proc_ide_root)
return;
- proc_create("drivers", 0, proc_ide_root, &ide_drivers_operations);
+ proc_create("drivers", 0, proc_ide_root, &ide_drivers_fops);
}
void proc_ide_destroy(void)
struct device_node *root = of_find_node_by_path("/");
const char *model = of_get_property(root, "model", NULL);
+ of_node_put(root);
/* Get cable type from device-tree. */
if (cable && !strncmp(cable, "80-", 3)) {
/* Some drives fail to detect 80c cable in PowerBook */
int report_id = -1;
u32 address;
int ret_type;
+ s32 min;
struct hid_sensor_hub_device *hsdev =
accel_state->common_attributes.hsdev;
case IIO_CHAN_INFO_RAW:
hid_sensor_power_state(&accel_state->common_attributes, true);
report_id = accel_state->accel[chan->scan_index].report_id;
+ min = accel_state->accel[chan->scan_index].logical_minimum;
address = accel_3d_addresses[chan->scan_index];
if (report_id >= 0)
*val = sensor_hub_input_attr_get_raw_value(
accel_state->common_attributes.hsdev,
hsdev->usage, address, report_id,
- SENSOR_HUB_SYNC);
+ SENSOR_HUB_SYNC,
+ min < 0);
else {
*val = 0;
hid_sensor_power_state(&accel_state->common_attributes,
int report_id = -1;
u32 address;
int ret_type;
+ s32 min;
*val = 0;
*val2 = 0;
case IIO_CHAN_INFO_RAW:
hid_sensor_power_state(&gyro_state->common_attributes, true);
report_id = gyro_state->gyro[chan->scan_index].report_id;
+ min = gyro_state->gyro[chan->scan_index].logical_minimum;
address = gyro_3d_addresses[chan->scan_index];
if (report_id >= 0)
*val = sensor_hub_input_attr_get_raw_value(
gyro_state->common_attributes.hsdev,
HID_USAGE_SENSOR_GYRO_3D, address,
report_id,
- SENSOR_HUB_SYNC);
+ SENSOR_HUB_SYNC,
+ min < 0);
else {
*val = 0;
hid_sensor_power_state(&gyro_state->common_attributes,
HID_USAGE_SENSOR_HUMIDITY,
HID_USAGE_SENSOR_ATMOSPHERIC_HUMIDITY,
humid_st->humidity_attr.report_id,
- SENSOR_HUB_SYNC);
+ SENSOR_HUB_SYNC,
+ humid_st->humidity_attr.logical_minimum < 0);
hid_sensor_power_state(&humid_st->common_attributes, false);
return IIO_VAL_INT;
int report_id = -1;
u32 address;
int ret_type;
+ s32 min;
*val = 0;
*val2 = 0;
case CHANNEL_SCAN_INDEX_INTENSITY:
case CHANNEL_SCAN_INDEX_ILLUM:
report_id = als_state->als_illum.report_id;
- address =
- HID_USAGE_SENSOR_LIGHT_ILLUM;
+ min = als_state->als_illum.logical_minimum;
+ address = HID_USAGE_SENSOR_LIGHT_ILLUM;
break;
default:
report_id = -1;
als_state->common_attributes.hsdev,
HID_USAGE_SENSOR_ALS, address,
report_id,
- SENSOR_HUB_SYNC);
+ SENSOR_HUB_SYNC,
+ min < 0);
hid_sensor_power_state(&als_state->common_attributes,
false);
} else {
int report_id = -1;
u32 address;
int ret_type;
+ s32 min;
*val = 0;
*val2 = 0;
switch (chan->scan_index) {
case CHANNEL_SCAN_INDEX_PRESENCE:
report_id = prox_state->prox_attr.report_id;
- address =
- HID_USAGE_SENSOR_HUMAN_PRESENCE;
+ min = prox_state->prox_attr.logical_minimum;
+ address = HID_USAGE_SENSOR_HUMAN_PRESENCE;
break;
default:
report_id = -1;
prox_state->common_attributes.hsdev,
HID_USAGE_SENSOR_PROX, address,
report_id,
- SENSOR_HUB_SYNC);
+ SENSOR_HUB_SYNC,
+ min < 0);
hid_sensor_power_state(&prox_state->common_attributes,
false);
} else {
int report_id = -1;
u32 address;
int ret_type;
+ s32 min;
*val = 0;
*val2 = 0;
switch (mask) {
case IIO_CHAN_INFO_RAW:
hid_sensor_power_state(&magn_state->magn_flux_attributes, true);
- report_id =
- magn_state->magn[chan->address].report_id;
+ report_id = magn_state->magn[chan->address].report_id;
+ min = magn_state->magn[chan->address].logical_minimum;
address = magn_3d_addresses[chan->address];
if (report_id >= 0)
*val = sensor_hub_input_attr_get_raw_value(
magn_state->magn_flux_attributes.hsdev,
HID_USAGE_SENSOR_COMPASS_3D, address,
report_id,
- SENSOR_HUB_SYNC);
+ SENSOR_HUB_SYNC,
+ min < 0);
else {
*val = 0;
hid_sensor_power_state(
return st_sensors_set_dataready_irq(indio_dev, state);
}
-static int st_magn_buffer_preenable(struct iio_dev *indio_dev)
-{
- return st_sensors_set_enable(indio_dev, true);
-}
-
static int st_magn_buffer_postenable(struct iio_dev *indio_dev)
{
int err;
if (err < 0)
goto st_magn_buffer_postenable_error;
- return err;
+ return st_sensors_set_enable(indio_dev, true);
st_magn_buffer_postenable_error:
kfree(mdata->buffer_data);
int err;
struct st_sensor_data *mdata = iio_priv(indio_dev);
- err = iio_triggered_buffer_predisable(indio_dev);
+ err = st_sensors_set_enable(indio_dev, false);
if (err < 0)
goto st_magn_buffer_predisable_error;
- err = st_sensors_set_enable(indio_dev, false);
+ err = iio_triggered_buffer_predisable(indio_dev);
st_magn_buffer_predisable_error:
kfree(mdata->buffer_data);
}
static const struct iio_buffer_setup_ops st_magn_buffer_setup_ops = {
- .preenable = &st_magn_buffer_preenable,
.postenable = &st_magn_buffer_postenable,
.predisable = &st_magn_buffer_predisable,
};
int report_id = -1;
u32 address;
int ret_type;
+ s32 min;
*val = 0;
*val2 = 0;
switch (mask) {
case IIO_CHAN_INFO_RAW:
hid_sensor_power_state(&incl_state->common_attributes, true);
- report_id =
- incl_state->incl[chan->scan_index].report_id;
+ report_id = incl_state->incl[chan->scan_index].report_id;
+ min = incl_state->incl[chan->scan_index].logical_minimum;
address = incl_3d_addresses[chan->scan_index];
if (report_id >= 0)
*val = sensor_hub_input_attr_get_raw_value(
incl_state->common_attributes.hsdev,
HID_USAGE_SENSOR_INCLINOMETER_3D, address,
report_id,
- SENSOR_HUB_SYNC);
+ SENSOR_HUB_SYNC,
+ min < 0);
else {
hid_sensor_power_state(&incl_state->common_attributes,
false);
int report_id = -1;
u32 address;
int ret_type;
+ s32 min;
*val = 0;
*val2 = 0;
switch (chan->scan_index) {
case CHANNEL_SCAN_INDEX_PRESSURE:
report_id = press_state->press_attr.report_id;
- address =
- HID_USAGE_SENSOR_ATMOSPHERIC_PRESSURE;
+ min = press_state->press_attr.logical_minimum;
+ address = HID_USAGE_SENSOR_ATMOSPHERIC_PRESSURE;
break;
default:
report_id = -1;
press_state->common_attributes.hsdev,
HID_USAGE_SENSOR_PRESSURE, address,
report_id,
- SENSOR_HUB_SYNC);
+ SENSOR_HUB_SYNC,
+ min < 0);
hid_sensor_power_state(&press_state->common_attributes,
false);
} else {
HID_USAGE_SENSOR_TEMPERATURE,
HID_USAGE_SENSOR_DATA_ENVIRONMENTAL_TEMPERATURE,
temp_st->temperature_attr.report_id,
- SENSOR_HUB_SYNC);
+ SENSOR_HUB_SYNC,
+ temp_st->temperature_attr.logical_minimum < 0);
hid_sensor_power_state(
&temp_st->common_attributes,
false);
case NETDEV_CHANGEADDR:
cmds[0] = netdev_del_cmd;
- cmds[1] = add_default_gid_cmd;
- cmds[2] = add_cmd;
+ if (ndev->reg_state == NETREG_REGISTERED) {
+ cmds[1] = add_default_gid_cmd;
+ cmds[2] = add_cmd;
+ }
break;
case NETDEV_CHANGEUPPER:
up_read(&per_mm->umem_rwsem);
}
-static int invalidate_page_trampoline(struct ib_umem_odp *item, u64 start,
- u64 end, void *cookie)
-{
- ib_umem_notifier_start_account(item);
- item->umem.context->invalidate_range(item, start, start + PAGE_SIZE);
- ib_umem_notifier_end_account(item);
- return 0;
-}
-
static int invalidate_range_start_trampoline(struct ib_umem_odp *item,
u64 start, u64 end, void *cookie)
{
put_page(page);
if (remove_existing_mapping && umem->context->invalidate_range) {
- invalidate_page_trampoline(
+ ib_umem_notifier_start_account(umem_odp);
+ umem->context->invalidate_range(
umem_odp,
- ib_umem_start(umem) + (page_index >> umem->page_shift),
- ib_umem_start(umem) + ((page_index + 1) >>
- umem->page_shift),
- NULL);
+ ib_umem_start(umem) + (page_index << umem->page_shift),
+ ib_umem_start(umem) +
+ ((page_index + 1) << umem->page_shift));
+ ib_umem_notifier_end_account(umem_odp);
ret = -EAGAIN;
}
/* Registered a new RoCE device instance to netdev */
rc = bnxt_re_register_netdev(rdev);
if (rc) {
+ rtnl_unlock();
pr_err("Failed to register with netedev: %#x\n", rc);
return -EINVAL;
}
"Failed to register with IB: %#x", rc);
bnxt_re_remove_one(rdev);
bnxt_re_dev_unreg(rdev);
+ goto exit;
}
break;
case NETDEV_UP:
}
smp_mb__before_atomic();
atomic_dec(&rdev->sched_count);
+exit:
kfree(re_work);
}
return hns_roce_cmq_send(hr_dev, &desc, 1);
}
-static int hns_roce_v2_write_mtpt(void *mb_buf, struct hns_roce_mr *mr,
- unsigned long mtpt_idx)
+static int set_mtpt_pbl(struct hns_roce_v2_mpt_entry *mpt_entry,
+ struct hns_roce_mr *mr)
{
- struct hns_roce_v2_mpt_entry *mpt_entry;
struct scatterlist *sg;
u64 page_addr;
u64 *pages;
int len;
int entry;
+ mpt_entry->pbl_size = cpu_to_le32(mr->pbl_size);
+ mpt_entry->pbl_ba_l = cpu_to_le32(lower_32_bits(mr->pbl_ba >> 3));
+ roce_set_field(mpt_entry->byte_48_mode_ba,
+ V2_MPT_BYTE_48_PBL_BA_H_M, V2_MPT_BYTE_48_PBL_BA_H_S,
+ upper_32_bits(mr->pbl_ba >> 3));
+
+ pages = (u64 *)__get_free_page(GFP_KERNEL);
+ if (!pages)
+ return -ENOMEM;
+
+ i = 0;
+ for_each_sg(mr->umem->sg_head.sgl, sg, mr->umem->nmap, entry) {
+ len = sg_dma_len(sg) >> PAGE_SHIFT;
+ for (j = 0; j < len; ++j) {
+ page_addr = sg_dma_address(sg) +
+ (j << mr->umem->page_shift);
+ pages[i] = page_addr >> 6;
+ /* Record the first 2 entry directly to MTPT table */
+ if (i >= HNS_ROCE_V2_MAX_INNER_MTPT_NUM - 1)
+ goto found;
+ i++;
+ }
+ }
+found:
+ mpt_entry->pa0_l = cpu_to_le32(lower_32_bits(pages[0]));
+ roce_set_field(mpt_entry->byte_56_pa0_h, V2_MPT_BYTE_56_PA0_H_M,
+ V2_MPT_BYTE_56_PA0_H_S, upper_32_bits(pages[0]));
+
+ mpt_entry->pa1_l = cpu_to_le32(lower_32_bits(pages[1]));
+ roce_set_field(mpt_entry->byte_64_buf_pa1, V2_MPT_BYTE_64_PA1_H_M,
+ V2_MPT_BYTE_64_PA1_H_S, upper_32_bits(pages[1]));
+ roce_set_field(mpt_entry->byte_64_buf_pa1,
+ V2_MPT_BYTE_64_PBL_BUF_PG_SZ_M,
+ V2_MPT_BYTE_64_PBL_BUF_PG_SZ_S,
+ mr->pbl_buf_pg_sz + PG_SHIFT_OFFSET);
+
+ free_page((unsigned long)pages);
+
+ return 0;
+}
+
+static int hns_roce_v2_write_mtpt(void *mb_buf, struct hns_roce_mr *mr,
+ unsigned long mtpt_idx)
+{
+ struct hns_roce_v2_mpt_entry *mpt_entry;
+ int ret;
+
mpt_entry = mb_buf;
memset(mpt_entry, 0, sizeof(*mpt_entry));
mr->pbl_ba_pg_sz + PG_SHIFT_OFFSET);
roce_set_field(mpt_entry->byte_4_pd_hop_st, V2_MPT_BYTE_4_PD_M,
V2_MPT_BYTE_4_PD_S, mr->pd);
- mpt_entry->byte_4_pd_hop_st = cpu_to_le32(mpt_entry->byte_4_pd_hop_st);
roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_RA_EN_S, 0);
roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_R_INV_EN_S, 1);
(mr->access & IB_ACCESS_REMOTE_WRITE ? 1 : 0));
roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_LW_EN_S,
(mr->access & IB_ACCESS_LOCAL_WRITE ? 1 : 0));
- mpt_entry->byte_8_mw_cnt_en = cpu_to_le32(mpt_entry->byte_8_mw_cnt_en);
roce_set_bit(mpt_entry->byte_12_mw_pa, V2_MPT_BYTE_12_PA_S,
mr->type == MR_TYPE_MR ? 0 : 1);
roce_set_bit(mpt_entry->byte_12_mw_pa, V2_MPT_BYTE_12_INNER_PA_VLD_S,
1);
- mpt_entry->byte_12_mw_pa = cpu_to_le32(mpt_entry->byte_12_mw_pa);
mpt_entry->len_l = cpu_to_le32(lower_32_bits(mr->size));
mpt_entry->len_h = cpu_to_le32(upper_32_bits(mr->size));
if (mr->type == MR_TYPE_DMA)
return 0;
- mpt_entry->pbl_size = cpu_to_le32(mr->pbl_size);
-
- mpt_entry->pbl_ba_l = cpu_to_le32(lower_32_bits(mr->pbl_ba >> 3));
- roce_set_field(mpt_entry->byte_48_mode_ba, V2_MPT_BYTE_48_PBL_BA_H_M,
- V2_MPT_BYTE_48_PBL_BA_H_S,
- upper_32_bits(mr->pbl_ba >> 3));
- mpt_entry->byte_48_mode_ba = cpu_to_le32(mpt_entry->byte_48_mode_ba);
-
- pages = (u64 *)__get_free_page(GFP_KERNEL);
- if (!pages)
- return -ENOMEM;
-
- i = 0;
- for_each_sg(mr->umem->sg_head.sgl, sg, mr->umem->nmap, entry) {
- len = sg_dma_len(sg) >> PAGE_SHIFT;
- for (j = 0; j < len; ++j) {
- page_addr = sg_dma_address(sg) +
- (j << mr->umem->page_shift);
- pages[i] = page_addr >> 6;
-
- /* Record the first 2 entry directly to MTPT table */
- if (i >= HNS_ROCE_V2_MAX_INNER_MTPT_NUM - 1)
- goto found;
- i++;
- }
- }
-
-found:
- mpt_entry->pa0_l = cpu_to_le32(lower_32_bits(pages[0]));
- roce_set_field(mpt_entry->byte_56_pa0_h, V2_MPT_BYTE_56_PA0_H_M,
- V2_MPT_BYTE_56_PA0_H_S,
- upper_32_bits(pages[0]));
- mpt_entry->byte_56_pa0_h = cpu_to_le32(mpt_entry->byte_56_pa0_h);
-
- mpt_entry->pa1_l = cpu_to_le32(lower_32_bits(pages[1]));
- roce_set_field(mpt_entry->byte_64_buf_pa1, V2_MPT_BYTE_64_PA1_H_M,
- V2_MPT_BYTE_64_PA1_H_S, upper_32_bits(pages[1]));
+ ret = set_mtpt_pbl(mpt_entry, mr);
- free_page((unsigned long)pages);
-
- roce_set_field(mpt_entry->byte_64_buf_pa1,
- V2_MPT_BYTE_64_PBL_BUF_PG_SZ_M,
- V2_MPT_BYTE_64_PBL_BUF_PG_SZ_S,
- mr->pbl_buf_pg_sz + PG_SHIFT_OFFSET);
- mpt_entry->byte_64_buf_pa1 = cpu_to_le32(mpt_entry->byte_64_buf_pa1);
-
- return 0;
+ return ret;
}
static int hns_roce_v2_rereg_write_mtpt(struct hns_roce_dev *hr_dev,
u64 size, void *mb_buf)
{
struct hns_roce_v2_mpt_entry *mpt_entry = mb_buf;
+ int ret = 0;
if (flags & IB_MR_REREG_PD) {
roce_set_field(mpt_entry->byte_4_pd_hop_st, V2_MPT_BYTE_4_PD_M,
V2_MPT_BYTE_8_BIND_EN_S,
(mr_access_flags & IB_ACCESS_MW_BIND ? 1 : 0));
roce_set_bit(mpt_entry->byte_8_mw_cnt_en,
- V2_MPT_BYTE_8_ATOMIC_EN_S,
- (mr_access_flags & IB_ACCESS_REMOTE_ATOMIC ? 1 : 0));
+ V2_MPT_BYTE_8_ATOMIC_EN_S,
+ mr_access_flags & IB_ACCESS_REMOTE_ATOMIC ? 1 : 0);
roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_RR_EN_S,
- (mr_access_flags & IB_ACCESS_REMOTE_READ ? 1 : 0));
+ mr_access_flags & IB_ACCESS_REMOTE_READ ? 1 : 0);
roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_RW_EN_S,
- (mr_access_flags & IB_ACCESS_REMOTE_WRITE ? 1 : 0));
+ mr_access_flags & IB_ACCESS_REMOTE_WRITE ? 1 : 0);
roce_set_bit(mpt_entry->byte_8_mw_cnt_en, V2_MPT_BYTE_8_LW_EN_S,
- (mr_access_flags & IB_ACCESS_LOCAL_WRITE ? 1 : 0));
+ mr_access_flags & IB_ACCESS_LOCAL_WRITE ? 1 : 0);
}
if (flags & IB_MR_REREG_TRANS) {
mpt_entry->len_l = cpu_to_le32(lower_32_bits(size));
mpt_entry->len_h = cpu_to_le32(upper_32_bits(size));
- mpt_entry->pbl_size = cpu_to_le32(mr->pbl_size);
- mpt_entry->pbl_ba_l =
- cpu_to_le32(lower_32_bits(mr->pbl_ba >> 3));
- roce_set_field(mpt_entry->byte_48_mode_ba,
- V2_MPT_BYTE_48_PBL_BA_H_M,
- V2_MPT_BYTE_48_PBL_BA_H_S,
- upper_32_bits(mr->pbl_ba >> 3));
- mpt_entry->byte_48_mode_ba =
- cpu_to_le32(mpt_entry->byte_48_mode_ba);
-
mr->iova = iova;
mr->size = size;
+
+ ret = set_mtpt_pbl(mpt_entry, mr);
}
- return 0;
+ return ret;
}
static int hns_roce_v2_frmr_write_mtpt(void *mb_buf, struct hns_roce_mr *mr)
MLX5_IB_WIDTH_12X = 1 << 4
};
-static int translate_active_width(struct ib_device *ibdev, u8 active_width,
+static void translate_active_width(struct ib_device *ibdev, u8 active_width,
u8 *ib_width)
{
struct mlx5_ib_dev *dev = to_mdev(ibdev);
- int err = 0;
- if (active_width & MLX5_IB_WIDTH_1X) {
+ if (active_width & MLX5_IB_WIDTH_1X)
*ib_width = IB_WIDTH_1X;
- } else if (active_width & MLX5_IB_WIDTH_2X) {
- mlx5_ib_dbg(dev, "active_width %d is not supported by IB spec\n",
- (int)active_width);
- err = -EINVAL;
- } else if (active_width & MLX5_IB_WIDTH_4X) {
+ else if (active_width & MLX5_IB_WIDTH_4X)
*ib_width = IB_WIDTH_4X;
- } else if (active_width & MLX5_IB_WIDTH_8X) {
+ else if (active_width & MLX5_IB_WIDTH_8X)
*ib_width = IB_WIDTH_8X;
- } else if (active_width & MLX5_IB_WIDTH_12X) {
+ else if (active_width & MLX5_IB_WIDTH_12X)
*ib_width = IB_WIDTH_12X;
- } else {
- mlx5_ib_dbg(dev, "Invalid active_width %d\n",
+ else {
+ mlx5_ib_dbg(dev, "Invalid active_width %d, setting width to default value: 4x\n",
(int)active_width);
- err = -EINVAL;
+ *ib_width = IB_WIDTH_4X;
}
- return err;
+ return;
}
static int mlx5_mtu_to_ib_mtu(int mtu)
if (err)
goto out;
- err = translate_active_width(ibdev, ib_link_width_oper,
- &props->active_width);
- if (err)
- goto out;
+ translate_active_width(ibdev, ib_link_width_oper, &props->active_width);
+
err = mlx5_query_port_ib_proto_oper(mdev, &props->active_speed, port);
if (err)
goto out;
goto srcu_unlock;
}
+ if (!mr->umem->is_odp) {
+ mlx5_ib_dbg(dev, "skipping non ODP MR (lkey=0x%06x) in page fault handler.\n",
+ key);
+ if (bytes_mapped)
+ *bytes_mapped += bcnt;
+ ret = 0;
+ goto srcu_unlock;
+ }
+
ret = pagefault_mr(dev, mr, io_virt, bcnt, bytes_mapped);
if (ret < 0)
goto srcu_unlock;
head = frame;
bcnt -= frame->bcnt;
+ offset = 0;
}
break;
if (access_flags & IB_ACCESS_REMOTE_READ)
*hw_access_flags |= MLX5_QP_BIT_RRE;
- if ((access_flags & IB_ACCESS_REMOTE_ATOMIC) &&
- qp->ibqp.qp_type == IB_QPT_RC) {
+ if (access_flags & IB_ACCESS_REMOTE_ATOMIC) {
int atomic_mode;
atomic_mode = get_atomic_mode(dev, qp->ibqp.qp_type);
goto out;
}
- if (wr->opcode == IB_WR_LOCAL_INV ||
- wr->opcode == IB_WR_REG_MR) {
+ if (wr->opcode == IB_WR_REG_MR) {
fence = dev->umr_fence;
next_fence = MLX5_FENCE_MODE_INITIATOR_SMALL;
- } else if (wr->send_flags & IB_SEND_FENCE) {
- if (qp->next_fence)
- fence = MLX5_FENCE_MODE_SMALL_AND_FENCE;
- else
- fence = MLX5_FENCE_MODE_FENCE;
- } else {
- fence = qp->next_fence;
+ } else {
+ if (wr->send_flags & IB_SEND_FENCE) {
+ if (qp->next_fence)
+ fence = MLX5_FENCE_MODE_SMALL_AND_FENCE;
+ else
+ fence = MLX5_FENCE_MODE_FENCE;
+ } else {
+ fence = qp->next_fence;
+ }
}
switch (ibqp->qp_type) {
* rvt_create_ah - create an address handle
* @pd: the protection domain
* @ah_attr: the attributes of the AH
+ * @udata: pointer to user's input output buffer information.
*
* This may be called from interrupt context.
*
* Return: newly allocated ah
*/
struct ib_ah *rvt_create_ah(struct ib_pd *pd,
- struct rdma_ah_attr *ah_attr)
+ struct rdma_ah_attr *ah_attr,
+ struct ib_udata *udata)
{
struct rvt_ah *ah;
struct rvt_dev_info *dev = ib_to_rvt(pd->device);
#include <rdma/rdma_vt.h>
struct ib_ah *rvt_create_ah(struct ib_pd *pd,
- struct rdma_ah_attr *ah_attr);
+ struct rdma_ah_attr *ah_attr,
+ struct ib_udata *udata);
int rvt_destroy_ah(struct ib_ah *ibah);
int rvt_modify_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
int rvt_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
IB_MR_CHECK_SIG_STATUS, &mr_status);
if (ret) {
pr_err("ib_check_mr_status failed, ret %d\n", ret);
- goto err;
+ /* Not a lot we can do, return ambiguous guard error */
+ *sector = 0;
+ return 0x1;
}
if (mr_status.fail_status & IB_MR_CHECK_SIG_STATUS) {
}
return 0;
-err:
- /* Not alot we can do here, return ambiguous guard error */
- return 0x1;
}
void iser_err_comp(struct ib_wc *wc, const char *type)
};
/*
- * This packet is required for some of the PDP pads to start
+ * This packet is required for most (all?) of the PDP pads to start
* sending input reports. These pads include: (0x0e6f:0x02ab),
- * (0x0e6f:0x02a4).
+ * (0x0e6f:0x02a4), (0x0e6f:0x02a6).
*/
static const u8 xboxone_pdp_init1[] = {
0x0a, 0x20, 0x00, 0x03, 0x00, 0x01, 0x14
};
/*
- * This packet is required for some of the PDP pads to start
+ * This packet is required for most (all?) of the PDP pads to start
* sending input reports. These pads include: (0x0e6f:0x02ab),
- * (0x0e6f:0x02a4).
+ * (0x0e6f:0x02a4), (0x0e6f:0x02a6).
*/
static const u8 xboxone_pdp_init2[] = {
0x06, 0x20, 0x00, 0x02, 0x01, 0x00
XBOXONE_INIT_PKT(0x0e6f, 0x0165, xboxone_hori_init),
XBOXONE_INIT_PKT(0x0f0d, 0x0067, xboxone_hori_init),
XBOXONE_INIT_PKT(0x0000, 0x0000, xboxone_fw2015_init),
- XBOXONE_INIT_PKT(0x0e6f, 0x02ab, xboxone_pdp_init1),
- XBOXONE_INIT_PKT(0x0e6f, 0x02ab, xboxone_pdp_init2),
- XBOXONE_INIT_PKT(0x0e6f, 0x02a4, xboxone_pdp_init1),
- XBOXONE_INIT_PKT(0x0e6f, 0x02a4, xboxone_pdp_init2),
- XBOXONE_INIT_PKT(0x0e6f, 0x02a6, xboxone_pdp_init1),
- XBOXONE_INIT_PKT(0x0e6f, 0x02a6, xboxone_pdp_init2),
+ XBOXONE_INIT_PKT(0x0e6f, 0x0000, xboxone_pdp_init1),
+ XBOXONE_INIT_PKT(0x0e6f, 0x0000, xboxone_pdp_init2),
XBOXONE_INIT_PKT(0x24c6, 0x541a, xboxone_rumblebegin_init),
XBOXONE_INIT_PKT(0x24c6, 0x542a, xboxone_rumblebegin_init),
XBOXONE_INIT_PKT(0x24c6, 0x543a, xboxone_rumblebegin_init),
if (param[0] != 3) {
param[0] = 2;
if (ps2_command(ps2dev, param, ATKBD_CMD_SSCANSET))
- return 2;
+ return 2;
}
ps2_command(ps2dev, param, ATKBD_CMD_SETALL_MBR);
for (i = 0; i < ARRAY_SIZE(cros_ec_keyb_bs); i++) {
const struct cros_ec_bs_map *map = &cros_ec_keyb_bs[i];
- if (buttons & BIT(map->bit))
+ if ((map->ev_type == EV_KEY && (buttons & BIT(map->bit))) ||
+ (map->ev_type == EV_SW && (switches & BIT(map->bit))))
input_set_capability(idev, map->ev_type, map->code);
}
struct matrix_keypad_platform_data *pdata;
struct device_node *np = dev->of_node;
unsigned int *gpios;
- int i, nrow, ncol;
+ int ret, i, nrow, ncol;
if (!np) {
dev_err(dev, "device lacks DT data\n");
return ERR_PTR(-ENOMEM);
}
- for (i = 0; i < pdata->num_row_gpios; i++)
- gpios[i] = of_get_named_gpio(np, "row-gpios", i);
+ for (i = 0; i < nrow; i++) {
+ ret = of_get_named_gpio(np, "row-gpios", i);
+ if (ret < 0)
+ return ERR_PTR(ret);
+ gpios[i] = ret;
+ }
- for (i = 0; i < pdata->num_col_gpios; i++)
- gpios[pdata->num_row_gpios + i] =
- of_get_named_gpio(np, "col-gpios", i);
+ for (i = 0; i < ncol; i++) {
+ ret = of_get_named_gpio(np, "col-gpios", i);
+ if (ret < 0)
+ return ERR_PTR(ret);
+ gpios[nrow + i] = ret;
+ }
pdata->row_gpios = gpios;
pdata->col_gpios = &gpios[pdata->num_row_gpios];
pdata = dev_get_platdata(&pdev->dev);
if (!pdata) {
pdata = matrix_keypad_parse_dt(&pdev->dev);
- if (IS_ERR(pdata)) {
- dev_err(&pdev->dev, "no platform data defined\n");
+ if (IS_ERR(pdata))
return PTR_ERR(pdata);
- }
} else if (!pdata->keymap_data) {
dev_err(&pdev->dev, "no keymap data defined\n");
return -EINVAL;
/* OMAP4 values */
#define OMAP4_VAL_IRQDISABLE 0x0
-#define OMAP4_VAL_DEBOUNCINGTIME 0x7
-#define OMAP4_VAL_PVT 0x7
+
+/*
+ * Errata i689: If a key is released for a time shorter than debounce time,
+ * the keyboard will idle and never detect the key release. The workaround
+ * is to use at least a 12ms debounce time. See omap5432 TRM chapter
+ * "26.4.6.2 Keyboard Controller Timer" for more information.
+ */
+#define OMAP4_KEYPAD_PTV_DIV_128 0x6
+#define OMAP4_KEYPAD_DEBOUNCINGTIME_MS(dbms, ptv) \
+ ((((dbms) * 1000) / ((1 << ((ptv) + 1)) * (1000000 / 32768))) - 1)
+#define OMAP4_VAL_DEBOUNCINGTIME_16MS \
+ OMAP4_KEYPAD_DEBOUNCINGTIME_MS(16, OMAP4_KEYPAD_PTV_DIV_128)
enum {
KBD_REVISION_OMAP4 = 0,
kbd_writel(keypad_data, OMAP4_KBD_CTRL,
OMAP4_DEF_CTRL_NOSOFTMODE |
- (OMAP4_VAL_PVT << OMAP4_DEF_CTRL_PTV_SHIFT));
+ (OMAP4_KEYPAD_PTV_DIV_128 << OMAP4_DEF_CTRL_PTV_SHIFT));
kbd_writel(keypad_data, OMAP4_KBD_DEBOUNCINGTIME,
- OMAP4_VAL_DEBOUNCINGTIME);
+ OMAP4_VAL_DEBOUNCINGTIME_16MS);
/* clear pending interrupts */
kbd_write_irqreg(keypad_data, OMAP4_KBD_IRQSTATUS,
kbd_read_irqreg(keypad_data, OMAP4_KBD_IRQSTATUS));
{ "ELAN0618", 0 },
{ "ELAN061C", 0 },
{ "ELAN061D", 0 },
+ { "ELAN061E", 0 },
+ { "ELAN0620", 0 },
+ { "ELAN0621", 0 },
{ "ELAN0622", 0 },
{ "ELAN1000", 0 },
{ }
"LEN0048", /* X1 Carbon 3 */
"LEN0046", /* X250 */
"LEN004a", /* W541 */
+ "LEN005b", /* P50 */
"LEN0071", /* T480 */
"LEN0072", /* X1 Carbon Gen 5 (2017) - Elan/ALPS trackpoint */
"LEN0073", /* X1 Carbon G5 (Elantech) */
"LEN0096", /* X280 */
"LEN0097", /* X280 -> ALPS trackpoint */
"LEN200f", /* T450s */
+ "SYN3221", /* HP 15-ay000 */
NULL
};
* state because the Enter-UP can trigger a wakeup at once.
*/
if (!(info & IS_BREAK))
- pm_wakeup_event(&hv_dev->device, 0);
+ pm_wakeup_hard_event(&hv_dev->device);
break;
+// SPDX-License-Identifier: GPL-2.0+
/*
* Touch Screen driver for Renesas MIGO-R Platform
*
* Copyright (c) 2008 Magnus Damm
* Copyright (c) 2007 Ujjwal Pande <ujjwal@kenati.com>,
* Kenati Technologies Pvt Ltd.
- *
- * This file is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public
- * License as published by the Free Software Foundation; either
- * version 2 of the License, or (at your option) any later version.
- *
- * This file is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public
- * License along with this library; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <linux/module.h>
#include <linux/kernel.h>
+// SPDX-License-Identifier: GPL-2.0
/*
* ST1232 Touchscreen Controller Driver
*
* Using code from:
* - android.git.kernel.org: projects/kernel/common.git: synaptics_i2c_rmi.c
* Copyright (C) 2007 Google, Inc.
- *
- * This software is licensed under the terms of the GNU General Public
- * License version 2, as published by the Free Software Foundation, and
- * may be copied, distributed, and modified under those terms.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
*/
#include <linux/delay.h>
MODULE_AUTHOR("Tony SIM <chinyeow.sim.xt@renesas.com>");
MODULE_DESCRIPTION("SITRONIX ST1232 Touchscreen Controller Driver");
-MODULE_LICENSE("GPL");
+MODULE_LICENSE("GPL v2");
entry = iommu_virt_to_phys(iommu->ga_log) | GA_LOG_SIZE_512;
memcpy_toio(iommu->mmio_base + MMIO_GA_LOG_BASE_OFFSET,
&entry, sizeof(entry));
- entry = (iommu_virt_to_phys(iommu->ga_log) & 0xFFFFFFFFFFFFFULL) & ~7ULL;
+ entry = (iommu_virt_to_phys(iommu->ga_log_tail) &
+ (BIT_ULL(52)-1)) & ~7ULL;
memcpy_toio(iommu->mmio_base + MMIO_GA_LOG_TAIL_OFFSET,
&entry, sizeof(entry));
writel(0x00, iommu->mmio_base + MMIO_GA_HEAD_OFFSET);
}
if (old_ce)
- iounmap(old_ce);
+ memunmap(old_ce);
ret = 0;
if (devfn < 0x80)
pr_err("%s: Page request without PASID: %08llx %08llx\n",
iommu->name, ((unsigned long long *)req)[0],
((unsigned long long *)req)[1]);
- goto bad_req;
+ goto no_pasid;
}
if (!svm || svm->pasid != req->pasid) {
static void ipmmu_domain_destroy_context(struct ipmmu_vmsa_domain *domain)
{
+ if (!domain->mmu)
+ return;
+
/*
* Disable the context. Flush the TLB as required when modifying the
* context registers.
sei->res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
sei->base = devm_ioremap_resource(sei->dev, sei->res);
- if (!sei->base) {
+ if (IS_ERR(sei->base)) {
dev_err(sei->dev, "Failed to remap SEI resource\n");
- return -ENODEV;
+ return PTR_ERR(sei->base);
}
/* Retrieve the SEI capabilities with the interrupt ranges */
printk(KERN_DEBUG "%s: socket created and open\n",
__func__);
while (!signal_pending(current)) {
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &iov, 1,
- recvbuf_size);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, recvbuf_size);
recvlen = sock_recvmsg(socket, &msg, 0);
if (recvlen > 0) {
l1oip_socket_parse(hc, &sin_rx, recvbuf, recvlen);
{
struct pattern_trig_data *data = from_timer(data, t, timer);
- mutex_lock(&data->lock);
-
for (;;) {
if (!data->is_indefinite && !data->repeat)
break;
data->curr->brightness);
mod_timer(&data->timer,
jiffies + msecs_to_jiffies(data->curr->delta_t));
-
- /* Skip the tuple with zero duration */
- pattern_trig_update_patterns(data);
+ if (!data->next->delta_t) {
+ /* Skip the tuple with zero duration */
+ pattern_trig_update_patterns(data);
+ }
/* Select next tuple */
pattern_trig_update_patterns(data);
} else {
break;
}
-
- mutex_unlock(&data->lock);
}
static int pattern_trig_start_pattern(struct led_classdev *led_cdev)
if (res < -1 || res == 0)
return -EINVAL;
- /*
- * Clear previous patterns' performence firstly, and remove the timer
- * without mutex lock to avoid dead lock.
- */
- del_timer_sync(&data->timer);
-
mutex_lock(&data->lock);
+ del_timer_sync(&data->timer);
+
if (data->is_hw_pattern)
led_cdev->pattern_clear(led_cdev);
struct pattern_trig_data *data = led_cdev->trigger_data;
int ccount, cr, offset = 0, err = 0;
- /*
- * Clear previous patterns' performence firstly, and remove the timer
- * without mutex lock to avoid dead lock.
- */
- del_timer_sync(&data->timer);
-
mutex_lock(&data->lock);
+ del_timer_sync(&data->timer);
+
if (data->is_hw_pattern)
led_cdev->pattern_clear(led_cdev);
!discard_bio)
continue;
bio_chain(discard_bio, bio);
- bio_clone_blkg_association(discard_bio, bio);
+ bio_clone_blkcg_association(discard_bio, bio);
if (mddev->gendisk)
trace_block_bio_remap(bdev_get_queue(rdev->bdev),
discard_bio, disk_devt(mddev->gendisk),
}
if (adap->transmit_queue_sz >= CEC_MAX_MSG_TX_QUEUE_SZ) {
- dprintk(1, "%s: transmit queue full\n", __func__);
+ dprintk(2, "%s: transmit queue full\n", __func__);
return -EBUSY;
}
{
struct cec_log_addrs *las = &adap->log_addrs;
struct cec_msg msg = { };
+ const unsigned int max_retries = 2;
+ unsigned int i;
int err;
if (cec_has_log_addr(adap, log_addr))
/* Send poll message */
msg.len = 1;
msg.msg[0] = (log_addr << 4) | log_addr;
- err = cec_transmit_msg_fh(adap, &msg, NULL, true);
- /*
- * While trying to poll the physical address was reset
- * and the adapter was unconfigured, so bail out.
- */
- if (!adap->is_configuring)
- return -EINTR;
+ for (i = 0; i < max_retries; i++) {
+ err = cec_transmit_msg_fh(adap, &msg, NULL, true);
- if (err)
- return err;
+ /*
+ * While trying to poll the physical address was reset
+ * and the adapter was unconfigured, so bail out.
+ */
+ if (!adap->is_configuring)
+ return -EINTR;
+
+ if (err)
+ return err;
- if (msg.tx_status & CEC_TX_STATUS_OK)
+ /*
+ * The message was aborted due to a disconnect or
+ * unconfigure, just bail out.
+ */
+ if (msg.tx_status & CEC_TX_STATUS_ABORTED)
+ return -EINTR;
+ if (msg.tx_status & CEC_TX_STATUS_OK)
+ return 0;
+ if (msg.tx_status & CEC_TX_STATUS_NACK)
+ break;
+ /*
+ * Retry up to max_retries times if the message was neither
+ * OKed or NACKed. This can happen due to e.g. a Lost
+ * Arbitration condition.
+ */
+ }
+
+ /*
+ * If we are unable to get an OK or a NACK after max_retries attempts
+ * (and note that each attempt already consists of four polls), then
+ * then we assume that something is really weird and that it is not a
+ * good idea to try and claim this logical address.
+ */
+ if (i == max_retries)
return 0;
/*
static const struct dvb_pll_desc dvb_pll_thomson_dtt7579 = {
.name = "Thomson dtt7579",
- .min = 177000000,
- .max = 858000000,
+ .min = 177 * MHz,
+ .max = 858 * MHz,
.iffreq= 36166667,
.sleepdata = (u8[]){ 2, 0xb4, 0x03 },
.count = 4,
static const struct dvb_pll_desc dvb_pll_thomson_dtt759x = {
.name = "Thomson dtt759x",
- .min = 177000000,
- .max = 896000000,
+ .min = 177 * MHz,
+ .max = 896 * MHz,
.set = thomson_dtt759x_bw,
.iffreq= 36166667,
.sleepdata = (u8[]){ 2, 0x84, 0x03 },
static const struct dvb_pll_desc dvb_pll_thomson_dtt7520x = {
.name = "Thomson dtt7520x",
- .min = 185000000,
- .max = 900000000,
+ .min = 185 * MHz,
+ .max = 900 * MHz,
.set = thomson_dtt7520x_bw,
.iffreq = 36166667,
.count = 7,
static const struct dvb_pll_desc dvb_pll_lg_z201 = {
.name = "LG z201",
- .min = 174000000,
- .max = 862000000,
+ .min = 174 * MHz,
+ .max = 862 * MHz,
.iffreq= 36166667,
.sleepdata = (u8[]){ 2, 0xbc, 0x03 },
.count = 5,
static const struct dvb_pll_desc dvb_pll_unknown_1 = {
.name = "unknown 1", /* used by dntv live dvb-t */
- .min = 174000000,
- .max = 862000000,
+ .min = 174 * MHz,
+ .max = 862 * MHz,
.iffreq= 36166667,
.count = 9,
.entries = {
*/
static const struct dvb_pll_desc dvb_pll_tua6010xs = {
.name = "Infineon TUA6010XS",
- .min = 44250000,
- .max = 858000000,
+ .min = 44250 * kHz,
+ .max = 858 * MHz,
.iffreq= 36125000,
.count = 3,
.entries = {
/* Panasonic env57h1xd5 (some Philips PLL ?) */
static const struct dvb_pll_desc dvb_pll_env57h1xd5 = {
.name = "Panasonic ENV57H1XD5",
- .min = 44250000,
- .max = 858000000,
+ .min = 44250 * kHz,
+ .max = 858 * MHz,
.iffreq= 36125000,
.count = 4,
.entries = {
static const struct dvb_pll_desc dvb_pll_tda665x = {
.name = "Philips TDA6650/TDA6651",
- .min = 44250000,
- .max = 858000000,
+ .min = 44250 * kHz,
+ .max = 858 * MHz,
.set = tda665x_bw,
.iffreq= 36166667,
.initdata = (u8[]){ 4, 0x0b, 0xf5, 0x85, 0xab },
static const struct dvb_pll_desc dvb_pll_tua6034 = {
.name = "Infineon TUA6034",
- .min = 44250000,
- .max = 858000000,
+ .min = 44250 * kHz,
+ .max = 858 * MHz,
.iffreq= 36166667,
.count = 3,
.set = tua6034_bw,
static const struct dvb_pll_desc dvb_pll_tded4 = {
.name = "ALPS TDED4",
- .min = 47000000,
- .max = 863000000,
+ .min = 47 * MHz,
+ .max = 863 * MHz,
.iffreq= 36166667,
.set = tded4_bw,
.count = 4,
*/
static const struct dvb_pll_desc dvb_pll_tdhu2 = {
.name = "ALPS TDHU2",
- .min = 54000000,
- .max = 864000000,
+ .min = 54 * MHz,
+ .max = 864 * MHz,
.iffreq= 44000000,
.count = 4,
.entries = {
*/
static const struct dvb_pll_desc dvb_pll_samsung_tbmv = {
.name = "Samsung TBMV30111IN / TBMV30712IN1",
- .min = 54000000,
- .max = 860000000,
+ .min = 54 * MHz,
+ .max = 860 * MHz,
.iffreq= 44000000,
.count = 6,
.entries = {
*/
static const struct dvb_pll_desc dvb_pll_philips_sd1878_tda8261 = {
.name = "Philips SD1878",
- .min = 950000,
- .max = 2150000,
+ .min = 950 * MHz,
+ .max = 2150 * MHz,
.iffreq= 249, /* zero-IF, offset 249 is to round up */
.count = 4,
.entries = {
static const struct dvb_pll_desc dvb_pll_opera1 = {
.name = "Opera Tuner",
- .min = 900000,
- .max = 2250000,
+ .min = 900 * MHz,
+ .max = 2250 * MHz,
.initdata = (u8[]){ 4, 0x08, 0xe5, 0xe1, 0x00 },
.initdata2 = (u8[]){ 4, 0x08, 0xe5, 0xe5, 0x00 },
.iffreq= 0,
/* unknown pll used in Samsung DTOS403IH102A DVB-C tuner */
static const struct dvb_pll_desc dvb_pll_samsung_dtos403ih102a = {
.name = "Samsung DTOS403IH102A",
- .min = 44250000,
- .max = 858000000,
+ .min = 44250 * kHz,
+ .max = 858 * MHz,
.iffreq = 36125000,
.count = 8,
.set = samsung_dtos403ih102a_set,
/* Samsung TDTC9251DH0 DVB-T NIM, as used on AirStar 2 */
static const struct dvb_pll_desc dvb_pll_samsung_tdtc9251dh0 = {
.name = "Samsung TDTC9251DH0",
- .min = 48000000,
- .max = 863000000,
+ .min = 48 * MHz,
+ .max = 863 * MHz,
.iffreq = 36166667,
.count = 3,
.entries = {
/* Samsung TBDU18132 DVB-S NIM with TSA5059 PLL, used in SkyStar2 DVB-S 2.3 */
static const struct dvb_pll_desc dvb_pll_samsung_tbdu18132 = {
.name = "Samsung TBDU18132",
- .min = 950000,
- .max = 2150000, /* guesses */
+ .min = 950 * MHz,
+ .max = 2150 * MHz, /* guesses */
.iffreq = 0,
.count = 2,
.entries = {
/* Samsung TBMU24112 DVB-S NIM with SL1935 zero-IF tuner */
static const struct dvb_pll_desc dvb_pll_samsung_tbmu24112 = {
.name = "Samsung TBMU24112",
- .min = 950000,
- .max = 2150000, /* guesses */
+ .min = 950 * MHz,
+ .max = 2150 * MHz, /* guesses */
.iffreq = 0,
.count = 2,
.entries = {
* 822 - 862 1 * 0 0 1 0 0 0 0x88 */
static const struct dvb_pll_desc dvb_pll_alps_tdee4 = {
.name = "ALPS TDEE4",
- .min = 47000000,
- .max = 862000000,
+ .min = 47 * MHz,
+ .max = 862 * MHz,
.iffreq = 36125000,
.count = 4,
.entries = {
/* CP cur. 50uA, AGC takeover: 103dBuV, PORT3 on */
static const struct dvb_pll_desc dvb_pll_tua6034_friio = {
.name = "Infineon TUA6034 ISDB-T (Friio)",
- .min = 90000000,
- .max = 770000000,
+ .min = 90 * MHz,
+ .max = 770 * MHz,
.iffreq = 57000000,
.initdata = (u8[]){ 4, 0x9a, 0x50, 0xb2, 0x08 },
.sleepdata = (u8[]){ 4, 0x9a, 0x70, 0xb3, 0x0b },
/* Philips TDA6651 ISDB-T, used in Earthsoft PT1 */
static const struct dvb_pll_desc dvb_pll_tda665x_earth_pt1 = {
.name = "Philips TDA6651 ISDB-T (EarthSoft PT1)",
- .min = 90000000,
- .max = 770000000,
+ .min = 90 * MHz,
+ .max = 770 * MHz,
.iffreq = 57000000,
.initdata = (u8[]){ 5, 0x0e, 0x7f, 0xc1, 0x80, 0x80 },
.count = 10,
u32 div;
int i;
- if (frequency && (frequency < desc->min || frequency > desc->max))
- return -EINVAL;
-
for (i = 0; i < desc->count; i++) {
if (frequency > desc->entries[i].limit)
continue;
struct dvb_pll_priv *priv = NULL;
int ret;
const struct dvb_pll_desc *desc;
- struct dtv_frontend_properties *c = &fe->dtv_property_cache;
b1 = kmalloc(1, GFP_KERNEL);
if (!b1)
strncpy(fe->ops.tuner_ops.info.name, desc->name,
sizeof(fe->ops.tuner_ops.info.name));
- switch (c->delivery_system) {
- case SYS_DVBS:
- case SYS_DVBS2:
- case SYS_TURBO:
- case SYS_ISDBS:
- fe->ops.tuner_ops.info.frequency_min_hz = desc->min * kHz;
- fe->ops.tuner_ops.info.frequency_max_hz = desc->max * kHz;
- break;
- default:
- fe->ops.tuner_ops.info.frequency_min_hz = desc->min;
- fe->ops.tuner_ops.info.frequency_max_hz = desc->max;
- }
+
+ fe->ops.tuner_ops.info.frequency_min_hz = desc->min;
+ fe->ops.tuner_ops.info.frequency_max_hz = desc->max;
+
+ dprintk("%s tuner, frequency range: %u...%u\n",
+ desc->name, desc->min, desc->max);
if (!desc->initdata)
fe->ops.tuner_ops.init = NULL;
ret = v4l2_fwnode_endpoint_alloc_parse(of_fwnode_handle(ep), &endpoint);
if (ret) {
dev_err(dev, "failed to parse endpoint\n");
- ret = ret;
goto put_node;
}
.owner = THIS_MODULE,
.poll = media_request_poll,
.unlocked_ioctl = media_request_ioctl,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = media_request_ioctl,
+#endif /* CONFIG_COMPAT */
.release = media_request_close,
};
static void cio2_pci_remove(struct pci_dev *pci_dev)
{
struct cio2_device *cio2 = pci_get_drvdata(pci_dev);
- unsigned int i;
+ media_device_unregister(&cio2->media_dev);
cio2_notifier_exit(cio2);
+ cio2_queues_exit(cio2);
cio2_fbpt_exit_dummy(cio2);
- for (i = 0; i < CIO2_QUEUES; i++)
- cio2_queue_exit(cio2, &cio2->queue[i]);
v4l2_device_unregister(&cio2->v4l2_dev);
- media_device_unregister(&cio2->media_dev);
media_device_cleanup(&cio2->media_dev);
mutex_destroy(&cio2->lock);
}
static void isp_unregister_entities(struct isp_device *isp)
{
+ media_device_unregister(&isp->media_dev);
+
omap3isp_csi2_unregister_entities(&isp->isp_csi2a);
omap3isp_ccp2_unregister_entities(&isp->isp_ccp2);
omap3isp_ccdc_unregister_entities(&isp->isp_ccdc);
omap3isp_stat_unregister_entities(&isp->isp_hist);
v4l2_device_unregister(&isp->v4l2_dev);
- media_device_unregister(&isp->media_dev);
media_device_cleanup(&isp->media_dev);
}
#define MAX_WIDTH 4096U
#define MIN_WIDTH 640U
#define MAX_HEIGHT 2160U
-#define MIN_HEIGHT 480U
+#define MIN_HEIGHT 360U
#define dprintk(dev, fmt, arg...) \
v4l2_dbg(1, debug, &dev->v4l2_dev, "%s: " fmt, __func__, ## arg)
for (; p < p_out + sz; p++) {
u32 copy;
- p = memchr(p, magic[ctx->comp_magic_cnt], sz);
+ p = memchr(p, magic[ctx->comp_magic_cnt],
+ p_out + sz - p);
if (!p) {
ctx->comp_magic_cnt = 0;
break;
static const struct media_device_ops m2m_media_ops = {
.req_validate = vb2_request_validate,
- .req_queue = vb2_m2m_request_queue,
+ .req_queue = v4l2_m2m_request_queue,
};
static int vim2m_probe(struct platform_device *pdev)
/* append the packet to the frame buffer */
if (len > 0) {
- if (gspca_dev->image_len + len > gspca_dev->pixfmt.sizeimage) {
+ if (gspca_dev->image_len + len > PAGE_ALIGN(gspca_dev->pixfmt.sizeimage)) {
gspca_err(gspca_dev, "frame overflow %d > %d\n",
gspca_dev->image_len + len,
- gspca_dev->pixfmt.sizeimage);
+ PAGE_ALIGN(gspca_dev->pixfmt.sizeimage));
packet_type = DISCARD_PACKET;
} else {
/* !! image is NULL only when last pkt is LAST or DISCARD
unsigned int sizes[], struct device *alloc_devs[])
{
struct gspca_dev *gspca_dev = vb2_get_drv_priv(vq);
+ unsigned int size = PAGE_ALIGN(gspca_dev->pixfmt.sizeimage);
if (*nplanes)
- return sizes[0] < gspca_dev->pixfmt.sizeimage ? -EINVAL : 0;
+ return sizes[0] < size ? -EINVAL : 0;
*nplanes = 1;
- sizes[0] = gspca_dev->pixfmt.sizeimage;
+ sizes[0] = size;
return 0;
}
static int gspca_buffer_prepare(struct vb2_buffer *vb)
{
struct gspca_dev *gspca_dev = vb2_get_drv_priv(vb->vb2_queue);
- unsigned long size = gspca_dev->pixfmt.sizeimage;
+ unsigned long size = PAGE_ALIGN(gspca_dev->pixfmt.sizeimage);
if (vb2_plane_size(vb, 0) < size) {
gspca_err(gspca_dev, "buffer too small (%lu < %lu)\n",
p_mpeg2_slice_params->forward_ref_index >= VIDEO_MAX_FRAME)
return -EINVAL;
+ if (p_mpeg2_slice_params->pad ||
+ p_mpeg2_slice_params->picture.pad ||
+ p_mpeg2_slice_params->sequence.pad)
+ return -EINVAL;
+
return 0;
case V4L2_CTRL_TYPE_MPEG2_QUANTIZATION:
}
EXPORT_SYMBOL_GPL(v4l2_event_pending);
+static void __v4l2_event_unsubscribe(struct v4l2_subscribed_event *sev)
+{
+ struct v4l2_fh *fh = sev->fh;
+ unsigned int i;
+
+ lockdep_assert_held(&fh->subscribe_lock);
+ assert_spin_locked(&fh->vdev->fh_lock);
+
+ /* Remove any pending events for this subscription */
+ for (i = 0; i < sev->in_use; i++) {
+ list_del(&sev->events[sev_pos(sev, i)].list);
+ fh->navailable--;
+ }
+ list_del(&sev->list);
+}
+
int v4l2_event_subscribe(struct v4l2_fh *fh,
const struct v4l2_event_subscription *sub, unsigned elems,
const struct v4l2_subscribed_event_ops *ops)
spin_lock_irqsave(&fh->vdev->fh_lock, flags);
found_ev = v4l2_event_subscribed(fh, sub->type, sub->id);
+ if (!found_ev)
+ list_add(&sev->list, &fh->subscribed);
spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
if (found_ev) {
/* Already listening */
kvfree(sev);
- goto out_unlock;
- }
-
- if (sev->ops && sev->ops->add) {
+ } else if (sev->ops && sev->ops->add) {
ret = sev->ops->add(sev, elems);
if (ret) {
+ spin_lock_irqsave(&fh->vdev->fh_lock, flags);
+ __v4l2_event_unsubscribe(sev);
+ spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
kvfree(sev);
- goto out_unlock;
}
}
- spin_lock_irqsave(&fh->vdev->fh_lock, flags);
- list_add(&sev->list, &fh->subscribed);
- spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
-
-out_unlock:
mutex_unlock(&fh->subscribe_lock);
return ret;
{
struct v4l2_subscribed_event *sev;
unsigned long flags;
- int i;
if (sub->type == V4L2_EVENT_ALL) {
v4l2_event_unsubscribe_all(fh);
spin_lock_irqsave(&fh->vdev->fh_lock, flags);
sev = v4l2_event_subscribed(fh, sub->type, sub->id);
- if (sev != NULL) {
- /* Remove any pending events for this subscription */
- for (i = 0; i < sev->in_use; i++) {
- list_del(&sev->events[sev_pos(sev, i)].list);
- fh->navailable--;
- }
- list_del(&sev->list);
- }
+ if (sev != NULL)
+ __v4l2_event_unsubscribe(sev);
spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
}
EXPORT_SYMBOL_GPL(v4l2_m2m_buf_queue);
-void vb2_m2m_request_queue(struct media_request *req)
+void v4l2_m2m_request_queue(struct media_request *req)
{
struct media_request_object *obj, *obj_safe;
struct v4l2_m2m_ctx *m2m_ctx = NULL;
if (m2m_ctx)
v4l2_m2m_try_schedule(m2m_ctx);
}
-EXPORT_SYMBOL_GPL(vb2_m2m_request_queue);
+EXPORT_SYMBOL_GPL(v4l2_m2m_request_queue);
/* Videobuf2 ioctl helpers */
#endif
};
+static void cros_ec_class_release(struct device *dev)
+{
+ kfree(to_cros_ec_dev(dev));
+}
+
static void cros_ec_sensors_register(struct cros_ec_dev *ec)
{
/*
int retval = -ENOMEM;
struct device *dev = &pdev->dev;
struct cros_ec_platform *ec_platform = dev_get_platdata(dev);
- struct cros_ec_dev *ec = devm_kzalloc(dev, sizeof(*ec), GFP_KERNEL);
+ struct cros_ec_dev *ec = kzalloc(sizeof(*ec), GFP_KERNEL);
if (!ec)
return retval;
ec->class_dev.devt = MKDEV(ec_major, pdev->id);
ec->class_dev.class = &cros_class;
ec->class_dev.parent = dev;
+ ec->class_dev.release = cros_ec_class_release;
retval = dev_set_name(&ec->class_dev, "%s", ec_platform->ec_name);
if (retval) {
MODULE_DEVICE_TABLE(of, atmel_ssc_dt_ids);
#endif
-static inline const struct atmel_ssc_platform_data * __init
+static inline const struct atmel_ssc_platform_data *
atmel_ssc_get_driver_data(struct platform_device *pdev)
{
if (pdev->dev.of_node) {
lkdtm-$(CONFIG_LKDTM) += refcount.o
lkdtm-$(CONFIG_LKDTM) += rodata_objcopy.o
lkdtm-$(CONFIG_LKDTM) += usercopy.o
+lkdtm-$(CONFIG_LKDTM) += stackleak.o
+KASAN_SANITIZE_stackleak.o := n
KCOV_INSTRUMENT_rodata.o := n
OBJCOPYFLAGS :=
CRASHTYPE(USERCOPY_STACK_BEYOND),
CRASHTYPE(USERCOPY_KERNEL),
CRASHTYPE(USERCOPY_KERNEL_DS),
+ CRASHTYPE(STACKLEAK_ERASING),
};
void lkdtm_USERCOPY_KERNEL(void);
void lkdtm_USERCOPY_KERNEL_DS(void);
+/* lkdtm_stackleak.c */
+void lkdtm_STACKLEAK_ERASING(void);
+
#endif
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This code tests that the current task stack is properly erased (filled
+ * with STACKLEAK_POISON).
+ *
+ * Authors:
+ * Alexander Popov <alex.popov@linux.com>
+ * Tycho Andersen <tycho@tycho.ws>
+ */
+
+#include "lkdtm.h"
+#include <linux/stackleak.h>
+
+void lkdtm_STACKLEAK_ERASING(void)
+{
+ unsigned long *sp, left, found, i;
+ const unsigned long check_depth =
+ STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
+
+ /*
+ * For the details about the alignment of the poison values, see
+ * the comment in stackleak_track_stack().
+ */
+ sp = PTR_ALIGN(&i, sizeof(unsigned long));
+
+ left = ((unsigned long)sp & (THREAD_SIZE - 1)) / sizeof(unsigned long);
+ sp--;
+
+ /*
+ * One 'long int' at the bottom of the thread stack is reserved
+ * and not poisoned.
+ */
+ if (left > 1) {
+ left--;
+ } else {
+ pr_err("FAIL: not enough stack space for the test\n");
+ return;
+ }
+
+ pr_info("checking unused part of the thread stack (%lu bytes)...\n",
+ left * sizeof(unsigned long));
+
+ /*
+ * Search for 'check_depth' poison values in a row (just like
+ * stackleak_erase() does).
+ */
+ for (i = 0, found = 0; i < left && found <= check_depth; i++) {
+ if (*(sp - i) == STACKLEAK_POISON)
+ found++;
+ else
+ found = 0;
+ }
+
+ if (found <= check_depth) {
+ pr_err("FAIL: thread stack is not erased (checked %lu bytes)\n",
+ i * sizeof(unsigned long));
+ return;
+ }
+
+ pr_info("first %lu bytes are unpoisoned\n",
+ (i - found) * sizeof(unsigned long));
+
+ /* The rest of thread stack should be erased */
+ for (; i < left; i++) {
+ if (*(sp - i) != STACKLEAK_POISON) {
+ pr_err("FAIL: thread stack is NOT properly erased\n");
+ return;
+ }
+ }
+
+ pr_info("OK: the rest of the thread stack is properly erased\n");
+ return;
+}
if (err)
goto error_window;
err = scif_map_page(&window->num_pages_lookup.lookup[j],
- vmalloc_dma_phys ?
+ vmalloc_num_pages ?
vmalloc_to_page(&window->num_pages[i]) :
virt_to_page(&window->num_pages[i]),
remote_dev);
#include <linux/delay.h>
#include <linux/bitops.h>
#include <asm/uv/uv_hub.h>
+
+#include <linux/nospec.h>
+
#include "gru.h"
#include "grutables.h"
#include "gruhandles.h"
/* Currently, only dump by gid is implemented */
if (req.gid >= gru_max_gids)
return -EINVAL;
+ req.gid = array_index_nospec(req.gid, gru_max_gids);
gru = GID_TO_GRU(req.gid);
ubuf = req.buf;
if (!qpair || !buf)
return VMCI_ERROR_INVALID_ARGS;
- iov_iter_kvec(&from, WRITE | ITER_KVEC, &v, 1, buf_size);
+ iov_iter_kvec(&from, WRITE, &v, 1, buf_size);
qp_lock(qpair);
if (!qpair || !buf)
return VMCI_ERROR_INVALID_ARGS;
- iov_iter_kvec(&to, READ | ITER_KVEC, &v, 1, buf_size);
+ iov_iter_kvec(&to, READ, &v, 1, buf_size);
qp_lock(qpair);
if (!qpair || !buf)
return VMCI_ERROR_INVALID_ARGS;
- iov_iter_kvec(&to, READ | ITER_KVEC, &v, 1, buf_size);
+ iov_iter_kvec(&to, READ, &v, 1, buf_size);
qp_lock(qpair);
* - JMicron (hardware and technical support)
*/
+#include <linux/bitfield.h>
#include <linux/string.h>
#include <linux/delay.h>
#include <linux/highmem.h>
u32 dsm_fns;
int drv_strength;
bool d3_retune;
+ bool rpm_retune_ok;
+ u32 glk_rx_ctrl1;
+ u32 glk_tun_val;
};
static const guid_t intel_dsm_guid =
return ret;
}
+#ifdef CONFIG_PM
+#define GLK_RX_CTRL1 0x834
+#define GLK_TUN_VAL 0x840
+#define GLK_PATH_PLL GENMASK(13, 8)
+#define GLK_DLY GENMASK(6, 0)
+/* Workaround firmware failing to restore the tuning value */
+static void glk_rpm_retune_wa(struct sdhci_pci_chip *chip, bool susp)
+{
+ struct sdhci_pci_slot *slot = chip->slots[0];
+ struct intel_host *intel_host = sdhci_pci_priv(slot);
+ struct sdhci_host *host = slot->host;
+ u32 glk_rx_ctrl1;
+ u32 glk_tun_val;
+ u32 dly;
+
+ if (intel_host->rpm_retune_ok || !mmc_can_retune(host->mmc))
+ return;
+
+ glk_rx_ctrl1 = sdhci_readl(host, GLK_RX_CTRL1);
+ glk_tun_val = sdhci_readl(host, GLK_TUN_VAL);
+
+ if (susp) {
+ intel_host->glk_rx_ctrl1 = glk_rx_ctrl1;
+ intel_host->glk_tun_val = glk_tun_val;
+ return;
+ }
+
+ if (!intel_host->glk_tun_val)
+ return;
+
+ if (glk_rx_ctrl1 != intel_host->glk_rx_ctrl1) {
+ intel_host->rpm_retune_ok = true;
+ return;
+ }
+
+ dly = FIELD_PREP(GLK_DLY, FIELD_GET(GLK_PATH_PLL, glk_rx_ctrl1) +
+ (intel_host->glk_tun_val << 1));
+ if (dly == FIELD_GET(GLK_DLY, glk_rx_ctrl1))
+ return;
+
+ glk_rx_ctrl1 = (glk_rx_ctrl1 & ~GLK_DLY) | dly;
+ sdhci_writel(host, glk_rx_ctrl1, GLK_RX_CTRL1);
+
+ intel_host->rpm_retune_ok = true;
+ chip->rpm_retune = true;
+ mmc_retune_needed(host->mmc);
+ pr_info("%s: Requiring re-tune after rpm resume", mmc_hostname(host->mmc));
+}
+
+static void glk_rpm_retune_chk(struct sdhci_pci_chip *chip, bool susp)
+{
+ if (chip->pdev->device == PCI_DEVICE_ID_INTEL_GLK_EMMC &&
+ !chip->rpm_retune)
+ glk_rpm_retune_wa(chip, susp);
+}
+
+static int glk_runtime_suspend(struct sdhci_pci_chip *chip)
+{
+ glk_rpm_retune_chk(chip, true);
+
+ return sdhci_cqhci_runtime_suspend(chip);
+}
+
+static int glk_runtime_resume(struct sdhci_pci_chip *chip)
+{
+ glk_rpm_retune_chk(chip, false);
+
+ return sdhci_cqhci_runtime_resume(chip);
+}
+#endif
+
#ifdef CONFIG_ACPI
static int ni_set_max_freq(struct sdhci_pci_slot *slot)
{
.resume = sdhci_cqhci_resume,
#endif
#ifdef CONFIG_PM
- .runtime_suspend = sdhci_cqhci_runtime_suspend,
- .runtime_resume = sdhci_cqhci_runtime_resume,
+ .runtime_suspend = glk_runtime_suspend,
+ .runtime_resume = glk_runtime_resume,
#endif
.quirks = SDHCI_QUIRK_NO_ENDATTR_IN_NOPDESC,
.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
device_init_wakeup(&pdev->dev, true);
if (slot->cd_idx >= 0) {
- ret = mmc_gpiod_request_cd(host->mmc, NULL, slot->cd_idx,
+ ret = mmc_gpiod_request_cd(host->mmc, "cd", slot->cd_idx,
slot->cd_override_level, 0, NULL);
+ if (ret && ret != -EPROBE_DEFER)
+ ret = mmc_gpiod_request_cd(host->mmc, NULL,
+ slot->cd_idx,
+ slot->cd_override_level,
+ 0, NULL);
if (ret == -EPROBE_DEFER)
goto remove;
config MTD_DOCG3
tristate "M-Systems Disk-On-Chip G3"
select BCH
- select BCH_CONST_PARAMS
+ select BCH_CONST_PARAMS if !MTD_NAND_BCH
select BITREVERSE
help
This provides an MTD device driver for the M-Systems DiskOnChip
info->mtd = info->subdev[0].mtd;
ret = 0;
} else if (info->num_subdev > 1) {
- struct mtd_info *cdev[nr];
+ struct mtd_info **cdev;
+
+ cdev = kmalloc_array(nr, sizeof(*cdev), GFP_KERNEL);
+ if (!cdev) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
/*
* We detected multiple devices. Concatenate them together.
*/
info->mtd = mtd_concat_create(cdev, info->num_subdev,
plat->name);
+ kfree(cdev);
if (info->mtd == NULL) {
ret = -ENXIO;
goto err;
unsigned int nwords = DIV_ROUND_UP(nblocks * bits_per_block,
BITS_PER_LONG);
- nand->bbt.cache = kzalloc(nwords, GFP_KERNEL);
+ nand->bbt.cache = kcalloc(nwords, sizeof(*nand->bbt.cache),
+ GFP_KERNEL);
if (!nand->bbt.cache)
return -ENOMEM;
int ret;
nand_np = dev->of_node;
- nfc_np = of_find_compatible_node(dev->of_node, NULL,
- "atmel,sama5d3-nfc");
+ nfc_np = of_get_compatible_child(dev->of_node, "atmel,sama5d3-nfc");
if (!nfc_np) {
dev_err(dev, "Could not find device node for sama5d3-nfc\n");
return -ENODEV;
}
if (caps->legacy_of_bindings) {
+ struct device_node *nfc_node;
u32 ale_offs = 21;
/*
* If we are parsing legacy DT props and the DT contains a
* valid NFC node, forward the request to the sama5 logic.
*/
- if (of_find_compatible_node(pdev->dev.of_node, NULL,
- "atmel,sama5d3-nfc"))
+ nfc_node = of_get_compatible_child(pdev->dev.of_node,
+ "atmel,sama5d3-nfc");
+ if (nfc_node) {
caps = &atmel_sama5_nand_caps;
+ of_node_put(nfc_node);
+ }
/*
* Even if the compatible says we are dealing with an
/**
* panic_nand_wait - [GENERIC] wait until the command is done
- * @mtd: MTD device structure
* @chip: NAND chip structure
* @timeo: timeout
*
#define NAND_VERSION_MINOR_SHIFT 16
/* NAND OP_CMDs */
-#define PAGE_READ 0x2
-#define PAGE_READ_WITH_ECC 0x3
-#define PAGE_READ_WITH_ECC_SPARE 0x4
-#define PROGRAM_PAGE 0x6
-#define PAGE_PROGRAM_WITH_ECC 0x7
-#define PROGRAM_PAGE_SPARE 0x9
-#define BLOCK_ERASE 0xa
-#define FETCH_ID 0xb
-#define RESET_DEVICE 0xd
+#define OP_PAGE_READ 0x2
+#define OP_PAGE_READ_WITH_ECC 0x3
+#define OP_PAGE_READ_WITH_ECC_SPARE 0x4
+#define OP_PROGRAM_PAGE 0x6
+#define OP_PAGE_PROGRAM_WITH_ECC 0x7
+#define OP_PROGRAM_PAGE_SPARE 0x9
+#define OP_BLOCK_ERASE 0xa
+#define OP_FETCH_ID 0xb
+#define OP_RESET_DEVICE 0xd
/* Default Value for NAND_DEV_CMD_VLD */
#define NAND_DEV_CMD_VLD_VAL (READ_START_VLD | WRITE_START_VLD | \
if (read) {
if (host->use_ecc)
- cmd = PAGE_READ_WITH_ECC | PAGE_ACC | LAST_PAGE;
+ cmd = OP_PAGE_READ_WITH_ECC | PAGE_ACC | LAST_PAGE;
else
- cmd = PAGE_READ | PAGE_ACC | LAST_PAGE;
+ cmd = OP_PAGE_READ | PAGE_ACC | LAST_PAGE;
} else {
- cmd = PROGRAM_PAGE | PAGE_ACC | LAST_PAGE;
+ cmd = OP_PROGRAM_PAGE | PAGE_ACC | LAST_PAGE;
}
if (host->use_ecc) {
* in use. we configure the controller to perform a raw read of 512
* bytes to read onfi params
*/
- nandc_set_reg(nandc, NAND_FLASH_CMD, PAGE_READ | PAGE_ACC | LAST_PAGE);
+ nandc_set_reg(nandc, NAND_FLASH_CMD, OP_PAGE_READ | PAGE_ACC | LAST_PAGE);
nandc_set_reg(nandc, NAND_ADDR0, 0);
nandc_set_reg(nandc, NAND_ADDR1, 0);
nandc_set_reg(nandc, NAND_DEV0_CFG0, 0 << CW_PER_PAGE
struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
nandc_set_reg(nandc, NAND_FLASH_CMD,
- BLOCK_ERASE | PAGE_ACC | LAST_PAGE);
+ OP_BLOCK_ERASE | PAGE_ACC | LAST_PAGE);
nandc_set_reg(nandc, NAND_ADDR0, page_addr);
nandc_set_reg(nandc, NAND_ADDR1, 0);
nandc_set_reg(nandc, NAND_DEV0_CFG0,
if (column == -1)
return 0;
- nandc_set_reg(nandc, NAND_FLASH_CMD, FETCH_ID);
+ nandc_set_reg(nandc, NAND_FLASH_CMD, OP_FETCH_ID);
nandc_set_reg(nandc, NAND_ADDR0, column);
nandc_set_reg(nandc, NAND_ADDR1, 0);
nandc_set_reg(nandc, NAND_FLASH_CHIP_SELECT,
struct nand_chip *chip = &host->chip;
struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
- nandc_set_reg(nandc, NAND_FLASH_CMD, RESET_DEVICE);
+ nandc_set_reg(nandc, NAND_FLASH_CMD, OP_RESET_DEVICE);
nandc_set_reg(nandc, NAND_EXEC_CMD, 1);
write_reg_dma(nandc, NAND_FLASH_CMD, 1, NAND_BAM_NEXT_SGL);
ndelay(cqspi->wr_delay);
while (remaining > 0) {
+ size_t write_words, mod_bytes;
+
write_bytes = remaining > page_size ? page_size : remaining;
- iowrite32_rep(cqspi->ahb_base, txbuf,
- DIV_ROUND_UP(write_bytes, 4));
+ write_words = write_bytes / 4;
+ mod_bytes = write_bytes % 4;
+ /* Write 4 bytes at a time then single bytes. */
+ if (write_words) {
+ iowrite32_rep(cqspi->ahb_base, txbuf, write_words);
+ txbuf += (write_words * 4);
+ }
+ if (mod_bytes) {
+ unsigned int temp = 0xFFFFFFFF;
+
+ memcpy(&temp, txbuf, mod_bytes);
+ iowrite32(temp, cqspi->ahb_base);
+ txbuf += mod_bytes;
+ }
if (!wait_for_completion_timeout(&cqspi->transfer_complete,
msecs_to_jiffies(CQSPI_TIMEOUT_MS))) {
goto failwr;
}
- txbuf += write_bytes;
remaining -= write_bytes;
if (remaining > 0)
err_unmap:
dma_unmap_single(nor->dev, dma_dst, len, DMA_FROM_DEVICE);
- return 0;
+ return ret;
}
static ssize_t cqspi_read(struct spi_nor *nor, loff_t from,
* @nor: pointer to a 'struct spi_nor'
* @addr: offset in the serial flash memory
* @len: number of bytes to read
- * @buf: buffer where the data is copied into
+ * @buf: buffer where the data is copied into (dma-safe memory)
*
* Return: 0 on success, -errno otherwise.
*/
return left->size - right->size;
}
+/**
+ * spi_nor_sort_erase_mask() - sort erase mask
+ * @map: the erase map of the SPI NOR
+ * @erase_mask: the erase type mask to be sorted
+ *
+ * Replicate the sort done for the map's erase types in BFPT: sort the erase
+ * mask in ascending order with the smallest erase type size starting from
+ * BIT(0) in the sorted erase mask.
+ *
+ * Return: sorted erase mask.
+ */
+static u8 spi_nor_sort_erase_mask(struct spi_nor_erase_map *map, u8 erase_mask)
+{
+ struct spi_nor_erase_type *erase_type = map->erase_type;
+ int i;
+ u8 sorted_erase_mask = 0;
+
+ if (!erase_mask)
+ return 0;
+
+ /* Replicate the sort done for the map's erase types. */
+ for (i = 0; i < SNOR_ERASE_TYPE_MAX; i++)
+ if (erase_type[i].size && erase_mask & BIT(erase_type[i].idx))
+ sorted_erase_mask |= BIT(i);
+
+ return sorted_erase_mask;
+}
+
/**
* spi_nor_regions_sort_erase_types() - sort erase types in each region
* @map: the erase map of the SPI NOR
static void spi_nor_regions_sort_erase_types(struct spi_nor_erase_map *map)
{
struct spi_nor_erase_region *region = map->regions;
- struct spi_nor_erase_type *erase_type = map->erase_type;
- int i;
u8 region_erase_mask, sorted_erase_mask;
while (region) {
region_erase_mask = region->offset & SNOR_ERASE_TYPE_MASK;
- /* Replicate the sort done for the map's erase types. */
- sorted_erase_mask = 0;
- for (i = 0; i < SNOR_ERASE_TYPE_MAX; i++)
- if (erase_type[i].size &&
- region_erase_mask & BIT(erase_type[i].idx))
- sorted_erase_mask |= BIT(i);
+ sorted_erase_mask = spi_nor_sort_erase_mask(map,
+ region_erase_mask);
/* Overwrite erase mask. */
region->offset = (region->offset & ~SNOR_ERASE_TYPE_MASK) |
* spi_nor_get_map_in_use() - get the configuration map in use
* @nor: pointer to a 'struct spi_nor'
* @smpt: pointer to the sector map parameter table
+ * @smpt_len: sector map parameter table length
+ *
+ * Return: pointer to the map in use, ERR_PTR(-errno) otherwise.
*/
-static const u32 *spi_nor_get_map_in_use(struct spi_nor *nor, const u32 *smpt)
+static const u32 *spi_nor_get_map_in_use(struct spi_nor *nor, const u32 *smpt,
+ u8 smpt_len)
{
- const u32 *ret = NULL;
- u32 i, addr;
+ const u32 *ret;
+ u8 *buf;
+ u32 addr;
int err;
+ u8 i;
u8 addr_width, read_opcode, read_dummy;
- u8 read_data_mask, data_byte, map_id;
+ u8 read_data_mask, map_id;
+
+ /* Use a kmalloc'ed bounce buffer to guarantee it is DMA-able. */
+ buf = kmalloc(sizeof(*buf), GFP_KERNEL);
+ if (!buf)
+ return ERR_PTR(-ENOMEM);
addr_width = nor->addr_width;
read_dummy = nor->read_dummy;
read_opcode = nor->read_opcode;
map_id = 0;
- i = 0;
/* Determine if there are any optional Detection Command Descriptors */
- while (!(smpt[i] & SMPT_DESC_TYPE_MAP)) {
+ for (i = 0; i < smpt_len; i += 2) {
+ if (smpt[i] & SMPT_DESC_TYPE_MAP)
+ break;
+
read_data_mask = SMPT_CMD_READ_DATA(smpt[i]);
nor->addr_width = spi_nor_smpt_addr_width(nor, smpt[i]);
nor->read_dummy = spi_nor_smpt_read_dummy(nor, smpt[i]);
nor->read_opcode = SMPT_CMD_OPCODE(smpt[i]);
addr = smpt[i + 1];
- err = spi_nor_read_raw(nor, addr, 1, &data_byte);
- if (err)
+ err = spi_nor_read_raw(nor, addr, 1, buf);
+ if (err) {
+ ret = ERR_PTR(err);
goto out;
+ }
/*
* Build an index value that is used to select the Sector Map
* Configuration that is currently in use.
*/
- map_id = map_id << 1 | !!(data_byte & read_data_mask);
- i = i + 2;
+ map_id = map_id << 1 | !!(*buf & read_data_mask);
}
- /* Find the matching configuration map */
- while (SMPT_MAP_ID(smpt[i]) != map_id) {
+ /*
+ * If command descriptors are provided, they always precede map
+ * descriptors in the table. There is no need to start the iteration
+ * over smpt array all over again.
+ *
+ * Find the matching configuration map.
+ */
+ ret = ERR_PTR(-EINVAL);
+ while (i < smpt_len) {
+ if (SMPT_MAP_ID(smpt[i]) == map_id) {
+ ret = smpt + i;
+ break;
+ }
+
+ /*
+ * If there are no more configuration map descriptors and no
+ * configuration ID matched the configuration identifier, the
+ * sector address map is unknown.
+ */
if (smpt[i] & SMPT_DESC_END)
- goto out;
+ break;
+
/* increment the table index to the next map */
i += SMPT_MAP_REGION_COUNT(smpt[i]) + 1;
}
- ret = smpt + i;
/* fall through */
out:
+ kfree(buf);
nor->addr_width = addr_width;
nor->read_dummy = read_dummy;
nor->read_opcode = read_opcode;
const u32 *smpt)
{
struct spi_nor_erase_map *map = &nor->erase_map;
- const struct spi_nor_erase_type *erase = map->erase_type;
+ struct spi_nor_erase_type *erase = map->erase_type;
struct spi_nor_erase_region *region;
u64 offset;
u32 region_count;
int i, j;
- u8 erase_type;
+ u8 uniform_erase_type, save_uniform_erase_type;
+ u8 erase_type, regions_erase_type;
region_count = SMPT_MAP_REGION_COUNT(*smpt);
/*
return -ENOMEM;
map->regions = region;
- map->uniform_erase_type = 0xff;
+ uniform_erase_type = 0xff;
+ regions_erase_type = 0;
offset = 0;
/* Populate regions. */
for (i = 0; i < region_count; i++) {
* Save the erase types that are supported in all regions and
* can erase the entire flash memory.
*/
- map->uniform_erase_type &= erase_type;
+ uniform_erase_type &= erase_type;
+
+ /*
+ * regions_erase_type mask will indicate all the erase types
+ * supported in this configuration map.
+ */
+ regions_erase_type |= erase_type;
offset = (region[i].offset & ~SNOR_ERASE_FLAGS_MASK) +
region[i].size;
}
+ save_uniform_erase_type = map->uniform_erase_type;
+ map->uniform_erase_type = spi_nor_sort_erase_mask(map,
+ uniform_erase_type);
+
+ if (!regions_erase_type) {
+ /*
+ * Roll back to the previous uniform_erase_type mask, SMPT is
+ * broken.
+ */
+ map->uniform_erase_type = save_uniform_erase_type;
+ return -EINVAL;
+ }
+
+ /*
+ * BFPT advertises all the erase types supported by all the possible
+ * map configurations. Mask out the erase types that are not supported
+ * by the current map configuration.
+ */
+ for (i = 0; i < SNOR_ERASE_TYPE_MAX; i++)
+ if (!(regions_erase_type & BIT(erase[i].idx)))
+ spi_nor_set_erase_type(&erase[i], 0, 0xFF);
+
spi_nor_region_mark_end(®ion[i - 1]);
return 0;
for (i = 0; i < smpt_header->length; i++)
smpt[i] = le32_to_cpu(smpt[i]);
- sector_map = spi_nor_get_map_in_use(nor, smpt);
- if (!sector_map) {
- ret = -EINVAL;
+ sector_map = spi_nor_get_map_in_use(nor, smpt, smpt_header->length);
+ if (IS_ERR(sector_map)) {
+ ret = PTR_ERR(sector_map);
goto out;
}
if (err)
goto exit;
- /* Parse other parameter headers. */
+ /* Parse optional parameter tables. */
for (i = 0; i < header.nph; i++) {
param_header = ¶m_headers[i];
break;
}
- if (err)
- goto exit;
+ if (err) {
+ dev_warn(dev, "Failed to parse optional parameter table: %04x\n",
+ SFDP_PARAM_HEADER_ID(param_header));
+ /*
+ * Let's not drop all information we extracted so far
+ * if optional table parsers fail. In case of failing,
+ * each optional parser is responsible to roll back to
+ * the previously known spi_nor data.
+ */
+ err = 0;
+ }
}
exit:
memcpy(&sfdp_params, params, sizeof(sfdp_params));
memcpy(&prev_map, &nor->erase_map, sizeof(prev_map));
- if (spi_nor_parse_sfdp(nor, &sfdp_params))
+ if (spi_nor_parse_sfdp(nor, &sfdp_params)) {
+ nor->addr_width = 0;
/* restore previous erase map */
memcpy(&nor->erase_map, &prev_map,
sizeof(nor->erase_map));
- else
+ } else {
memcpy(params, &sfdp_params, sizeof(*params));
+ }
}
return 0;
* be a result of power cut during erasure.
*/
ai->maybe_bad_peb_count += 1;
+ /* fall through */
case UBI_IO_BAD_HDR:
/*
* If we're facing a bad VID header we have to drop *all*
switch (*endp) {
case 'G':
result *= 1024;
+ /* fall through */
case 'M':
result *= 1024;
+ /* fall through */
case 'K':
result *= 1024;
if (endp[1] == 'i' && endp[2] == 'B')
aggregator->aggregator_identifier);
/* Tell the partner that this port is not suitable for aggregation */
+ port->actor_oper_port_state &= ~AD_STATE_SYNCHRONIZATION;
+ port->actor_oper_port_state &= ~AD_STATE_COLLECTING;
+ port->actor_oper_port_state &= ~AD_STATE_DISTRIBUTING;
port->actor_oper_port_state &= ~AD_STATE_AGGREGATION;
__update_lacpdu_from_port(port);
ad_lacpdu_send(port);
case NETDEV_CHANGE:
/* For 802.3ad mode only:
* Getting invalid Speed/Duplex values here will put slave
- * in weird state. So mark it as link-down for the time
+ * in weird state. So mark it as link-fail for the time
* being and let link-monitoring (miimon) set it right when
* correct speeds/duplex are available.
*/
if (bond_update_speed_duplex(slave) &&
BOND_MODE(bond) == BOND_MODE_8023AD)
- slave->link = BOND_LINK_DOWN;
+ slave->link = BOND_LINK_FAIL;
if (BOND_MODE(bond) == BOND_MODE_8023AD)
bond_3ad_adapter_speed_duplex_changed(slave);
goto nla_put_failure;
if (nla_put(skb, IFLA_BOND_AD_ACTOR_SYSTEM,
- sizeof(bond->params.ad_actor_system),
- &bond->params.ad_actor_system))
+ ETH_ALEN, &bond->params.ad_actor_system))
goto nla_put_failure;
}
if (!bond_3ad_get_active_agg_info(bond, &info)) {
}
EXPORT_SYMBOL_GPL(can_put_echo_skb);
+struct sk_buff *__can_get_echo_skb(struct net_device *dev, unsigned int idx, u8 *len_ptr)
+{
+ struct can_priv *priv = netdev_priv(dev);
+ struct sk_buff *skb = priv->echo_skb[idx];
+ struct canfd_frame *cf;
+
+ if (idx >= priv->echo_skb_max) {
+ netdev_err(dev, "%s: BUG! Trying to access can_priv::echo_skb out of bounds (%u/max %u)\n",
+ __func__, idx, priv->echo_skb_max);
+ return NULL;
+ }
+
+ if (!skb) {
+ netdev_err(dev, "%s: BUG! Trying to echo non existing skb: can_priv::echo_skb[%u]\n",
+ __func__, idx);
+ return NULL;
+ }
+
+ /* Using "struct canfd_frame::len" for the frame
+ * length is supported on both CAN and CANFD frames.
+ */
+ cf = (struct canfd_frame *)skb->data;
+ *len_ptr = cf->len;
+ priv->echo_skb[idx] = NULL;
+
+ return skb;
+}
+
/*
* Get the skb from the stack and loop it back locally
*
*/
unsigned int can_get_echo_skb(struct net_device *dev, unsigned int idx)
{
- struct can_priv *priv = netdev_priv(dev);
-
- BUG_ON(idx >= priv->echo_skb_max);
-
- if (priv->echo_skb[idx]) {
- struct sk_buff *skb = priv->echo_skb[idx];
- struct can_frame *cf = (struct can_frame *)skb->data;
- u8 dlc = cf->can_dlc;
+ struct sk_buff *skb;
+ u8 len;
- netif_rx(priv->echo_skb[idx]);
- priv->echo_skb[idx] = NULL;
+ skb = __can_get_echo_skb(dev, idx, &len);
+ if (!skb)
+ return 0;
- return dlc;
- }
+ netif_rx(skb);
- return 0;
+ return len;
}
EXPORT_SYMBOL_GPL(can_get_echo_skb);
/* FLEXCAN interrupt flag register (IFLAG) bits */
/* Errata ERR005829 step7: Reserve first valid MB */
-#define FLEXCAN_TX_MB_RESERVED_OFF_FIFO 8
-#define FLEXCAN_TX_MB_OFF_FIFO 9
+#define FLEXCAN_TX_MB_RESERVED_OFF_FIFO 8
#define FLEXCAN_TX_MB_RESERVED_OFF_TIMESTAMP 0
-#define FLEXCAN_TX_MB_OFF_TIMESTAMP 1
-#define FLEXCAN_RX_MB_OFF_TIMESTAMP_FIRST (FLEXCAN_TX_MB_OFF_TIMESTAMP + 1)
-#define FLEXCAN_RX_MB_OFF_TIMESTAMP_LAST 63
-#define FLEXCAN_IFLAG_MB(x) BIT(x)
+#define FLEXCAN_TX_MB 63
+#define FLEXCAN_RX_MB_OFF_TIMESTAMP_FIRST (FLEXCAN_TX_MB_RESERVED_OFF_TIMESTAMP + 1)
+#define FLEXCAN_RX_MB_OFF_TIMESTAMP_LAST (FLEXCAN_TX_MB - 1)
+#define FLEXCAN_IFLAG_MB(x) BIT(x & 0x1f)
#define FLEXCAN_IFLAG_RX_FIFO_OVERFLOW BIT(7)
#define FLEXCAN_IFLAG_RX_FIFO_WARN BIT(6)
#define FLEXCAN_IFLAG_RX_FIFO_AVAILABLE BIT(5)
struct can_rx_offload offload;
struct flexcan_regs __iomem *regs;
- struct flexcan_mb __iomem *tx_mb;
struct flexcan_mb __iomem *tx_mb_reserved;
- u8 tx_mb_idx;
u32 reg_ctrl_default;
u32 reg_imask1_default;
u32 reg_imask2_default;
static netdev_tx_t flexcan_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
const struct flexcan_priv *priv = netdev_priv(dev);
+ struct flexcan_regs __iomem *regs = priv->regs;
struct can_frame *cf = (struct can_frame *)skb->data;
u32 can_id;
u32 data;
if (cf->can_dlc > 0) {
data = be32_to_cpup((__be32 *)&cf->data[0]);
- priv->write(data, &priv->tx_mb->data[0]);
+ priv->write(data, ®s->mb[FLEXCAN_TX_MB].data[0]);
}
if (cf->can_dlc > 4) {
data = be32_to_cpup((__be32 *)&cf->data[4]);
- priv->write(data, &priv->tx_mb->data[1]);
+ priv->write(data, ®s->mb[FLEXCAN_TX_MB].data[1]);
}
can_put_echo_skb(skb, dev, 0);
- priv->write(can_id, &priv->tx_mb->can_id);
- priv->write(ctrl, &priv->tx_mb->can_ctrl);
+ priv->write(can_id, ®s->mb[FLEXCAN_TX_MB].can_id);
+ priv->write(ctrl, ®s->mb[FLEXCAN_TX_MB].can_ctrl);
/* Errata ERR005829 step8:
* Write twice INACTIVE(0x8) code to first MB.
static void flexcan_irq_bus_err(struct net_device *dev, u32 reg_esr)
{
struct flexcan_priv *priv = netdev_priv(dev);
+ struct flexcan_regs __iomem *regs = priv->regs;
struct sk_buff *skb;
struct can_frame *cf;
bool rx_errors = false, tx_errors = false;
+ u32 timestamp;
+
+ timestamp = priv->read(®s->timer) << 16;
skb = alloc_can_err_skb(dev, &cf);
if (unlikely(!skb))
if (tx_errors)
dev->stats.tx_errors++;
- can_rx_offload_irq_queue_err_skb(&priv->offload, skb);
+ can_rx_offload_queue_sorted(&priv->offload, skb, timestamp);
}
static void flexcan_irq_state(struct net_device *dev, u32 reg_esr)
{
struct flexcan_priv *priv = netdev_priv(dev);
+ struct flexcan_regs __iomem *regs = priv->regs;
struct sk_buff *skb;
struct can_frame *cf;
enum can_state new_state, rx_state, tx_state;
int flt;
struct can_berr_counter bec;
+ u32 timestamp;
+
+ timestamp = priv->read(®s->timer) << 16;
flt = reg_esr & FLEXCAN_ESR_FLT_CONF_MASK;
if (likely(flt == FLEXCAN_ESR_FLT_CONF_ACTIVE)) {
if (unlikely(new_state == CAN_STATE_BUS_OFF))
can_bus_off(dev);
- can_rx_offload_irq_queue_err_skb(&priv->offload, skb);
+ can_rx_offload_queue_sorted(&priv->offload, skb, timestamp);
}
static inline struct flexcan_priv *rx_offload_to_priv(struct can_rx_offload *offload)
priv->write(BIT(n - 32), ®s->iflag2);
} else {
priv->write(FLEXCAN_IFLAG_RX_FIFO_AVAILABLE, ®s->iflag1);
- priv->read(®s->timer);
}
+ /* Read the Free Running Timer. It is optional but recommended
+ * to unlock Mailbox as soon as possible and make it available
+ * for reception.
+ */
+ priv->read(®s->timer);
+
return 1;
}
struct flexcan_regs __iomem *regs = priv->regs;
u32 iflag1, iflag2;
- iflag2 = priv->read(®s->iflag2) & priv->reg_imask2_default;
- iflag1 = priv->read(®s->iflag1) & priv->reg_imask1_default &
- ~FLEXCAN_IFLAG_MB(priv->tx_mb_idx);
+ iflag2 = priv->read(®s->iflag2) & priv->reg_imask2_default &
+ ~FLEXCAN_IFLAG_MB(FLEXCAN_TX_MB);
+ iflag1 = priv->read(®s->iflag1) & priv->reg_imask1_default;
return (u64)iflag2 << 32 | iflag1;
}
struct flexcan_priv *priv = netdev_priv(dev);
struct flexcan_regs __iomem *regs = priv->regs;
irqreturn_t handled = IRQ_NONE;
- u32 reg_iflag1, reg_esr;
+ u32 reg_iflag2, reg_esr;
enum can_state last_state = priv->can.state;
- reg_iflag1 = priv->read(®s->iflag1);
-
/* reception interrupt */
if (priv->devtype_data->quirks & FLEXCAN_QUIRK_USE_OFF_TIMESTAMP) {
u64 reg_iflag;
break;
}
} else {
+ u32 reg_iflag1;
+
+ reg_iflag1 = priv->read(®s->iflag1);
if (reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_AVAILABLE) {
handled = IRQ_HANDLED;
can_rx_offload_irq_offload_fifo(&priv->offload);
}
}
+ reg_iflag2 = priv->read(®s->iflag2);
+
/* transmission complete interrupt */
- if (reg_iflag1 & FLEXCAN_IFLAG_MB(priv->tx_mb_idx)) {
+ if (reg_iflag2 & FLEXCAN_IFLAG_MB(FLEXCAN_TX_MB)) {
+ u32 reg_ctrl = priv->read(®s->mb[FLEXCAN_TX_MB].can_ctrl);
+
handled = IRQ_HANDLED;
- stats->tx_bytes += can_get_echo_skb(dev, 0);
+ stats->tx_bytes += can_rx_offload_get_echo_skb(&priv->offload,
+ 0, reg_ctrl << 16);
stats->tx_packets++;
can_led_event(dev, CAN_LED_EVENT_TX);
/* after sending a RTR frame MB is in RX mode */
priv->write(FLEXCAN_MB_CODE_TX_INACTIVE,
- &priv->tx_mb->can_ctrl);
- priv->write(FLEXCAN_IFLAG_MB(priv->tx_mb_idx), ®s->iflag1);
+ ®s->mb[FLEXCAN_TX_MB].can_ctrl);
+ priv->write(FLEXCAN_IFLAG_MB(FLEXCAN_TX_MB), ®s->iflag2);
netif_wake_queue(dev);
}
reg_mcr &= ~FLEXCAN_MCR_MAXMB(0xff);
reg_mcr |= FLEXCAN_MCR_FRZ | FLEXCAN_MCR_HALT | FLEXCAN_MCR_SUPV |
FLEXCAN_MCR_WRN_EN | FLEXCAN_MCR_SRX_DIS | FLEXCAN_MCR_IRMQ |
- FLEXCAN_MCR_IDAM_C;
+ FLEXCAN_MCR_IDAM_C | FLEXCAN_MCR_MAXMB(FLEXCAN_TX_MB);
- if (priv->devtype_data->quirks & FLEXCAN_QUIRK_USE_OFF_TIMESTAMP) {
+ if (priv->devtype_data->quirks & FLEXCAN_QUIRK_USE_OFF_TIMESTAMP)
reg_mcr &= ~FLEXCAN_MCR_FEN;
- reg_mcr |= FLEXCAN_MCR_MAXMB(priv->offload.mb_last);
- } else {
- reg_mcr |= FLEXCAN_MCR_FEN |
- FLEXCAN_MCR_MAXMB(priv->tx_mb_idx);
- }
+ else
+ reg_mcr |= FLEXCAN_MCR_FEN;
+
netdev_dbg(dev, "%s: writing mcr=0x%08x", __func__, reg_mcr);
priv->write(reg_mcr, ®s->mcr);
priv->write(reg_ctrl2, ®s->ctrl2);
}
- /* clear and invalidate all mailboxes first */
- for (i = priv->tx_mb_idx; i < ARRAY_SIZE(regs->mb); i++) {
- priv->write(FLEXCAN_MB_CODE_RX_INACTIVE,
- ®s->mb[i].can_ctrl);
- }
-
if (priv->devtype_data->quirks & FLEXCAN_QUIRK_USE_OFF_TIMESTAMP) {
- for (i = priv->offload.mb_first; i <= priv->offload.mb_last; i++)
+ for (i = priv->offload.mb_first; i <= priv->offload.mb_last; i++) {
priv->write(FLEXCAN_MB_CODE_RX_EMPTY,
®s->mb[i].can_ctrl);
+ }
+ } else {
+ /* clear and invalidate unused mailboxes first */
+ for (i = FLEXCAN_TX_MB_RESERVED_OFF_FIFO; i <= ARRAY_SIZE(regs->mb); i++) {
+ priv->write(FLEXCAN_MB_CODE_RX_INACTIVE,
+ ®s->mb[i].can_ctrl);
+ }
}
/* Errata ERR005829: mark first TX mailbox as INACTIVE */
/* mark TX mailbox as INACTIVE */
priv->write(FLEXCAN_MB_CODE_TX_INACTIVE,
- &priv->tx_mb->can_ctrl);
+ ®s->mb[FLEXCAN_TX_MB].can_ctrl);
/* acceptance mask/acceptance code (accept everything) */
priv->write(0x0, ®s->rxgmask);
priv->devtype_data = devtype_data;
priv->reg_xceiver = reg_xceiver;
- if (priv->devtype_data->quirks & FLEXCAN_QUIRK_USE_OFF_TIMESTAMP) {
- priv->tx_mb_idx = FLEXCAN_TX_MB_OFF_TIMESTAMP;
+ if (priv->devtype_data->quirks & FLEXCAN_QUIRK_USE_OFF_TIMESTAMP)
priv->tx_mb_reserved = ®s->mb[FLEXCAN_TX_MB_RESERVED_OFF_TIMESTAMP];
- } else {
- priv->tx_mb_idx = FLEXCAN_TX_MB_OFF_FIFO;
+ else
priv->tx_mb_reserved = ®s->mb[FLEXCAN_TX_MB_RESERVED_OFF_FIFO];
- }
- priv->tx_mb = ®s->mb[priv->tx_mb_idx];
- priv->reg_imask1_default = FLEXCAN_IFLAG_MB(priv->tx_mb_idx);
- priv->reg_imask2_default = 0;
+ priv->reg_imask1_default = 0;
+ priv->reg_imask2_default = FLEXCAN_IFLAG_MB(FLEXCAN_TX_MB);
priv->offload.mailbox_read = flexcan_mailbox_read;
#define RCAR_CAN_DRV_NAME "rcar_can"
+#define RCAR_SUPPORTED_CLOCKS (BIT(CLKR_CLKP1) | BIT(CLKR_CLKP2) | \
+ BIT(CLKR_CLKEXT))
+
/* Mailbox configuration:
* mailbox 60 - 63 - Rx FIFO mailboxes
* mailbox 56 - 59 - Tx FIFO mailboxes
goto fail_clk;
}
- if (clock_select >= ARRAY_SIZE(clock_names)) {
+ if (!(BIT(clock_select) & RCAR_SUPPORTED_CLOCKS)) {
err = -EINVAL;
dev_err(&pdev->dev, "invalid CAN clock selected\n");
goto fail_clk;
}
EXPORT_SYMBOL_GPL(can_rx_offload_irq_offload_fifo);
-int can_rx_offload_irq_queue_err_skb(struct can_rx_offload *offload, struct sk_buff *skb)
+int can_rx_offload_queue_sorted(struct can_rx_offload *offload,
+ struct sk_buff *skb, u32 timestamp)
+{
+ struct can_rx_offload_cb *cb;
+ unsigned long flags;
+
+ if (skb_queue_len(&offload->skb_queue) >
+ offload->skb_queue_len_max)
+ return -ENOMEM;
+
+ cb = can_rx_offload_get_cb(skb);
+ cb->timestamp = timestamp;
+
+ spin_lock_irqsave(&offload->skb_queue.lock, flags);
+ __skb_queue_add_sort(&offload->skb_queue, skb, can_rx_offload_compare);
+ spin_unlock_irqrestore(&offload->skb_queue.lock, flags);
+
+ can_rx_offload_schedule(offload);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(can_rx_offload_queue_sorted);
+
+unsigned int can_rx_offload_get_echo_skb(struct can_rx_offload *offload,
+ unsigned int idx, u32 timestamp)
+{
+ struct net_device *dev = offload->dev;
+ struct net_device_stats *stats = &dev->stats;
+ struct sk_buff *skb;
+ u8 len;
+ int err;
+
+ skb = __can_get_echo_skb(dev, idx, &len);
+ if (!skb)
+ return 0;
+
+ err = can_rx_offload_queue_sorted(offload, skb, timestamp);
+ if (err) {
+ stats->rx_errors++;
+ stats->tx_fifo_errors++;
+ }
+
+ return len;
+}
+EXPORT_SYMBOL_GPL(can_rx_offload_get_echo_skb);
+
+int can_rx_offload_queue_tail(struct can_rx_offload *offload,
+ struct sk_buff *skb)
{
if (skb_queue_len(&offload->skb_queue) >
offload->skb_queue_len_max)
return 0;
}
-EXPORT_SYMBOL_GPL(can_rx_offload_irq_queue_err_skb);
+EXPORT_SYMBOL_GPL(can_rx_offload_queue_tail);
static int can_rx_offload_init_queue(struct net_device *dev, struct can_rx_offload *offload, unsigned int weight)
{
{
struct hi3110_priv *priv = netdev_priv(net);
struct spi_device *spi = priv->spi;
- unsigned long flags = IRQF_ONESHOT | IRQF_TRIGGER_RISING;
+ unsigned long flags = IRQF_ONESHOT | IRQF_TRIGGER_HIGH;
int ret;
ret = open_candev(net);
context = &priv->tx_contexts[i];
context->echo_index = i;
- can_put_echo_skb(skb, netdev, context->echo_index);
++priv->active_tx_contexts;
if (priv->active_tx_contexts >= (int)dev->max_tx_urbs)
netif_stop_queue(netdev);
dev_kfree_skb(skb);
spin_lock_irqsave(&priv->tx_contexts_lock, flags);
- can_free_echo_skb(netdev, context->echo_index);
context->echo_index = dev->max_tx_urbs;
--priv->active_tx_contexts;
netif_wake_queue(netdev);
context->priv = priv;
+ can_put_echo_skb(skb, netdev, context->echo_index);
+
usb_fill_bulk_urb(urb, dev->udev,
usb_sndbulkpipe(dev->udev,
dev->bulk_out->bEndpointAddress),
new_state : CAN_STATE_ERROR_ACTIVE;
can_change_state(netdev, cf, tx_state, rx_state);
+
+ if (priv->can.restart_ms &&
+ old_state >= CAN_STATE_BUS_OFF &&
+ new_state < CAN_STATE_BUS_OFF)
+ cf->can_id |= CAN_ERR_RESTARTED;
}
if (new_state == CAN_STATE_BUS_OFF) {
can_bus_off(netdev);
}
-
- if (priv->can.restart_ms &&
- old_state >= CAN_STATE_BUS_OFF &&
- new_state < CAN_STATE_BUS_OFF)
- cf->can_id |= CAN_ERR_RESTARTED;
}
if (!skb) {
#include <linux/slab.h>
#include <linux/usb.h>
-#include <linux/can.h>
-#include <linux/can/dev.h>
-#include <linux/can/error.h>
-
#define UCAN_DRIVER_NAME "ucan"
#define UCAN_MAX_RX_URBS 8
/* the CAN controller needs a while to enable/disable the bus */
/* disconnect the device */
static void ucan_disconnect(struct usb_interface *intf)
{
- struct usb_device *udev;
struct ucan_priv *up = usb_get_intfdata(intf);
- udev = interface_to_usbdev(intf);
-
usb_set_intfdata(intf, NULL);
if (up) {
{
int i;
- mutex_init(&dev->reg_mutex);
- mutex_init(&dev->stats_mutex);
- mutex_init(&dev->alu_mutex);
- mutex_init(&dev->vlan_mutex);
-
dev->ds->ops = &ksz_switch_ops;
for (i = 0; i < ARRAY_SIZE(ksz_switch_chips); i++) {
if (dev->pdata)
dev->chip_id = dev->pdata->chip_id;
+ mutex_init(&dev->reg_mutex);
+ mutex_init(&dev->stats_mutex);
+ mutex_init(&dev->alu_mutex);
+ mutex_init(&dev->vlan_mutex);
+
if (ksz_switch_detect(dev))
return -EINVAL;
/* Reset the switch. */
REG_WRITE(REG_GLOBAL, GLOBAL_ATU_CONTROL,
GLOBAL_ATU_CONTROL_SWRESET |
- GLOBAL_ATU_CONTROL_ATUSIZE_1024 |
- GLOBAL_ATU_CONTROL_ATE_AGE_5MIN);
+ GLOBAL_ATU_CONTROL_LEARNDIS);
/* Wait up to one second for reset to complete. */
timeout = jiffies + 1 * HZ;
*/
REG_WRITE(REG_GLOBAL, GLOBAL_CONTROL, GLOBAL_CONTROL_MAX_FRAME_1536);
- /* Enable automatic address learning, set the address
- * database size to 1024 entries, and set the default aging
- * time to 5 minutes.
+ /* Disable automatic address learning.
*/
REG_WRITE(REG_GLOBAL, GLOBAL_ATU_CONTROL,
- GLOBAL_ATU_CONTROL_ATUSIZE_1024 |
- GLOBAL_ATU_CONTROL_ATE_AGE_5MIN);
+ GLOBAL_ATU_CONTROL_LEARNDIS);
return 0;
}
if (err)
return err;
+ /* Keep the histogram mode bits */
+ val &= MV88E6XXX_G1_STATS_OP_HIST_RX_TX;
val |= MV88E6XXX_G1_STATS_OP_BUSY | MV88E6XXX_G1_STATS_OP_FLUSH_ALL;
err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_STATS_OP, val);
rc = ena_com_dev_reset(adapter->ena_dev, adapter->reset_reason);
if (rc)
dev_err(&adapter->pdev->dev, "Device reset failed\n");
+ /* stop submitting admin commands on a device that was reset */
+ ena_com_set_admin_running_state(adapter->ena_dev, false);
}
ena_destroy_all_io_queues(adapter);
netif_dbg(adapter, ifdown, netdev, "%s\n", __func__);
+ if (!test_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags))
+ return 0;
+
if (test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
ena_down(adapter);
ena_down(adapter);
/* Stop the device from sending AENQ events (in case reset flag is set
- * and device is up, ena_close already reset the device
- * In case the reset flag is set and the device is up, ena_down()
- * already perform the reset, so it can be skipped.
+ * and device is up, ena_down() already reset the device.
*/
if (!(test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags) && dev_up))
ena_com_dev_reset(adapter->ena_dev, adapter->reset_reason);
ena_com_abort_admin_commands(ena_dev);
ena_com_wait_for_abort_completion(ena_dev);
ena_com_admin_destroy(ena_dev);
- ena_com_mmio_reg_read_request_destroy(ena_dev);
ena_com_dev_reset(ena_dev, ENA_REGS_RESET_DRIVER_INVALID_STATE);
+ ena_com_mmio_reg_read_request_destroy(ena_dev);
err:
clear_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags);
clear_bit(ENA_FLAG_ONGOING_RESET, &adapter->flags);
ena_com_rss_destroy(ena_dev);
err_free_msix:
ena_com_dev_reset(ena_dev, ENA_REGS_RESET_INIT_ERR);
+ /* stop submitting admin commands on a device that was reset */
+ ena_com_set_admin_running_state(ena_dev, false);
ena_free_mgmnt_irq(adapter);
ena_disable_msix(adapter);
err_worker_destroy:
cancel_work_sync(&adapter->reset_task);
- unregister_netdev(netdev);
-
- /* If the device is running then we want to make sure the device will be
- * reset to make sure no more events will be issued by the device.
- */
- if (test_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags))
- set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
-
rtnl_lock();
ena_destroy_device(adapter, true);
rtnl_unlock();
+ unregister_netdev(netdev);
+
free_netdev(netdev);
ena_com_rss_destroy(ena_dev);
#define DRV_MODULE_VER_MAJOR 2
#define DRV_MODULE_VER_MINOR 0
-#define DRV_MODULE_VER_SUBMINOR 1
+#define DRV_MODULE_VER_SUBMINOR 2
#define DRV_MODULE_NAME "ena"
#ifndef DRV_MODULE_VERSION
prop = of_get_property(nd, "tpe-link-test?", NULL);
if (!prop)
- goto no_link_test;
+ goto node_put;
if (strcmp(prop, "true")) {
printk(KERN_NOTICE "SunLance: warning: overriding option "
"to ecd@skynet.be\n");
auxio_set_lte(AUXIO_LTE_ON);
}
+node_put:
+ of_node_put(nd);
no_link_test:
lp->auto_select = 1;
lp->tpe = 0;
struct ethtool_pauseparam *pause)
{
struct aq_nic_s *aq_nic = netdev_priv(ndev);
+ u32 fc = aq_nic->aq_nic_cfg.flow_control;
pause->autoneg = 0;
- if (aq_nic->aq_hw->aq_nic_cfg->flow_control & AQ_NIC_FC_RX)
- pause->rx_pause = 1;
- if (aq_nic->aq_hw->aq_nic_cfg->flow_control & AQ_NIC_FC_TX)
- pause->tx_pause = 1;
+ pause->rx_pause = !!(fc & AQ_NIC_FC_RX);
+ pause->tx_pause = !!(fc & AQ_NIC_FC_TX);
+
}
static int aq_ethtool_set_pauseparam(struct net_device *ndev,
int (*hw_get_fw_version)(struct aq_hw_s *self, u32 *fw_version);
+ int (*hw_set_offload)(struct aq_hw_s *self,
+ struct aq_nic_cfg_s *aq_nic_cfg);
+
+ int (*hw_set_fc)(struct aq_hw_s *self, u32 fc, u32 tc);
};
struct aq_fw_ops {
int (*update_stats)(struct aq_hw_s *self);
+ u32 (*get_flow_control)(struct aq_hw_s *self, u32 *fcmode);
+
int (*set_flow_control)(struct aq_hw_s *self);
int (*set_power)(struct aq_hw_s *self, unsigned int power_state,
struct aq_nic_s *aq_nic = netdev_priv(ndev);
struct aq_nic_cfg_s *aq_cfg = aq_nic_get_cfg(aq_nic);
bool is_lro = false;
+ int err = 0;
+
+ aq_cfg->features = features;
- if (aq_cfg->hw_features & NETIF_F_LRO) {
+ if (aq_cfg->aq_hw_caps->hw_features & NETIF_F_LRO) {
is_lro = features & NETIF_F_LRO;
if (aq_cfg->is_lro != is_lro) {
}
}
}
+ if ((aq_nic->ndev->features ^ features) & NETIF_F_RXCSUM)
+ err = aq_nic->aq_hw_ops->hw_set_offload(aq_nic->aq_hw,
+ aq_cfg);
- return 0;
+ return err;
}
static int aq_ndev_set_mac_address(struct net_device *ndev, void *addr)
}
cfg->link_speed_msk &= cfg->aq_hw_caps->link_speed_msk;
- cfg->hw_features = cfg->aq_hw_caps->hw_features;
+ cfg->features = cfg->aq_hw_caps->hw_features;
}
static int aq_nic_update_link_status(struct aq_nic_s *self)
{
int err = self->aq_fw_ops->update_link_status(self->aq_hw);
+ u32 fc = 0;
if (err)
return err;
AQ_CFG_DRV_NAME, self->link_status.mbps,
self->aq_hw->aq_link_status.mbps);
aq_nic_update_interrupt_moderation_settings(self);
+
+ /* Driver has to update flow control settings on RX block
+ * on any link event.
+ * We should query FW whether it negotiated FC.
+ */
+ if (self->aq_fw_ops->get_flow_control)
+ self->aq_fw_ops->get_flow_control(self->aq_hw, &fc);
+ if (self->aq_hw_ops->hw_set_fc)
+ self->aq_hw_ops->hw_set_fc(self->aq_hw, fc, 0);
}
self->link_status = self->aq_hw->aq_link_status;
}
}
- if (i > 0 && i < AQ_HW_MULTICAST_ADDRESS_MAX) {
+ if (i > 0 && i <= AQ_HW_MULTICAST_ADDRESS_MAX) {
packet_filter |= IFF_MULTICAST;
self->mc_list.count = i;
self->aq_hw_ops->hw_multicast_list_set(self->aq_hw,
ethtool_link_ksettings_add_link_mode(cmd, advertising,
Pause);
- if (self->aq_nic_cfg.flow_control & AQ_NIC_FC_TX)
+ /* Asym is when either RX or TX, but not both */
+ if (!!(self->aq_nic_cfg.flow_control & AQ_NIC_FC_TX) ^
+ !!(self->aq_nic_cfg.flow_control & AQ_NIC_FC_RX))
ethtool_link_ksettings_add_link_mode(cmd, advertising,
Asym_Pause);
struct aq_nic_cfg_s {
const struct aq_hw_caps_s *aq_hw_caps;
- u64 hw_features;
+ u64 features;
u32 rxds; /* rx ring size, descriptors # */
u32 txds; /* tx ring size, descriptors # */
u32 vecs; /* vecs==allocated irqs */
return !!budget;
}
+static void aq_rx_checksum(struct aq_ring_s *self,
+ struct aq_ring_buff_s *buff,
+ struct sk_buff *skb)
+{
+ if (!(self->aq_nic->ndev->features & NETIF_F_RXCSUM))
+ return;
+
+ if (unlikely(buff->is_cso_err)) {
+ ++self->stats.rx.errors;
+ skb->ip_summed = CHECKSUM_NONE;
+ return;
+ }
+ if (buff->is_ip_cso) {
+ __skb_incr_checksum_unnecessary(skb);
+ if (buff->is_udp_cso || buff->is_tcp_cso)
+ __skb_incr_checksum_unnecessary(skb);
+ } else {
+ skb->ip_summed = CHECKSUM_NONE;
+ }
+}
+
#define AQ_SKB_ALIGN SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
int aq_ring_rx_clean(struct aq_ring_s *self,
struct napi_struct *napi,
}
skb->protocol = eth_type_trans(skb, ndev);
- if (unlikely(buff->is_cso_err)) {
- ++self->stats.rx.errors;
- skb->ip_summed = CHECKSUM_NONE;
- } else {
- if (buff->is_ip_cso) {
- __skb_incr_checksum_unnecessary(skb);
- if (buff->is_udp_cso || buff->is_tcp_cso)
- __skb_incr_checksum_unnecessary(skb);
- } else {
- skb->ip_summed = CHECKSUM_NONE;
- }
- }
+
+ aq_rx_checksum(self, buff, skb);
skb_set_hash(skb, buff->rss_hash,
buff->is_hash_l4 ? PKT_HASH_TYPE_L4 :
return err;
}
+static int hw_atl_b0_set_fc(struct aq_hw_s *self, u32 fc, u32 tc)
+{
+ hw_atl_rpb_rx_xoff_en_per_tc_set(self, !!(fc & AQ_NIC_FC_RX), tc);
+ return 0;
+}
+
static int hw_atl_b0_hw_qos_set(struct aq_hw_s *self)
{
u32 tc = 0U;
u32 buff_size = 0U;
unsigned int i_priority = 0U;
- bool is_rx_flow_control = false;
/* TPS Descriptor rate init */
hw_atl_tps_tx_pkt_shed_desc_rate_curr_time_res_set(self, 0x0U);
/* QoS Rx buf size per TC */
tc = 0;
- is_rx_flow_control = (AQ_NIC_FC_RX & self->aq_nic_cfg->flow_control);
buff_size = HW_ATL_B0_RXBUF_MAX;
hw_atl_rpb_rx_pkt_buff_size_per_tc_set(self, buff_size, tc);
(buff_size *
(1024U / 32U) * 50U) /
100U, tc);
- hw_atl_rpb_rx_xoff_en_per_tc_set(self, is_rx_flow_control ? 1U : 0U, tc);
+
+ hw_atl_b0_set_fc(self, self->aq_nic_cfg->flow_control, tc);
/* QoS 802.1p priority -> TC mapping */
for (i_priority = 8U; i_priority--;)
hw_atl_tpo_tcp_udp_crc_offload_en_set(self, 1);
/* RX checksums offloads*/
- hw_atl_rpo_ipv4header_crc_offload_en_set(self, 1);
- hw_atl_rpo_tcp_udp_crc_offload_en_set(self, 1);
+ hw_atl_rpo_ipv4header_crc_offload_en_set(self, !!(aq_nic_cfg->features &
+ NETIF_F_RXCSUM));
+ hw_atl_rpo_tcp_udp_crc_offload_en_set(self, !!(aq_nic_cfg->features &
+ NETIF_F_RXCSUM));
/* LSO offloads*/
hw_atl_tdm_large_send_offload_en_set(self, 0xFFFFFFFFU);
struct hw_atl_rxd_wb_s *rxd_wb = (struct hw_atl_rxd_wb_s *)
&ring->dx_ring[ring->hw_head * HW_ATL_B0_RXD_SIZE];
- unsigned int is_err = 1U;
unsigned int is_rx_check_sum_enabled = 0U;
unsigned int pkt_type = 0U;
+ u8 rx_stat = 0U;
if (!(rxd_wb->status & 0x1U)) { /* RxD is not done */
break;
buff = &ring->buff_ring[ring->hw_head];
- is_err = (0x0000003CU & rxd_wb->status);
+ rx_stat = (0x0000003CU & rxd_wb->status) >> 2;
- is_rx_check_sum_enabled = (rxd_wb->type) & (0x3U << 19);
- is_err &= ~0x20U; /* exclude validity bit */
+ is_rx_check_sum_enabled = (rxd_wb->type >> 19) & 0x3U;
pkt_type = 0xFFU & (rxd_wb->type >> 4);
- if (is_rx_check_sum_enabled) {
- if (0x0U == (pkt_type & 0x3U))
- buff->is_ip_cso = (is_err & 0x08U) ? 0U : 1U;
+ if (is_rx_check_sum_enabled & BIT(0) &&
+ (0x0U == (pkt_type & 0x3U)))
+ buff->is_ip_cso = (rx_stat & BIT(1)) ? 0U : 1U;
+ if (is_rx_check_sum_enabled & BIT(1)) {
if (0x4U == (pkt_type & 0x1CU))
- buff->is_udp_cso = buff->is_cso_err ? 0U : 1U;
+ buff->is_udp_cso = (rx_stat & BIT(2)) ? 0U :
+ !!(rx_stat & BIT(3));
else if (0x0U == (pkt_type & 0x1CU))
- buff->is_tcp_cso = buff->is_cso_err ? 0U : 1U;
-
- /* Checksum offload workaround for small packets */
- if (rxd_wb->pkt_len <= 60) {
- buff->is_ip_cso = 0U;
- buff->is_cso_err = 0U;
- }
+ buff->is_tcp_cso = (rx_stat & BIT(2)) ? 0U :
+ !!(rx_stat & BIT(3));
+ }
+ buff->is_cso_err = !!(rx_stat & 0x6);
+ /* Checksum offload workaround for small packets */
+ if (unlikely(rxd_wb->pkt_len <= 60)) {
+ buff->is_ip_cso = 0U;
+ buff->is_cso_err = 0U;
}
-
- is_err &= ~0x18U;
dma_unmap_page(ndev, buff->pa, buff->len, DMA_FROM_DEVICE);
- if (is_err || rxd_wb->type & 0x1000U) {
- /* status error or DMA error */
+ if ((rx_stat & BIT(0)) || rxd_wb->type & 0x1000U) {
+ /* MAC error or DMA error */
buff->is_error = 1U;
} else {
if (self->aq_nic_cfg->is_rss) {
static int hw_atl_b0_hw_stop(struct aq_hw_s *self)
{
hw_atl_b0_hw_irq_disable(self, HW_ATL_B0_INT_MASK);
+
+ /* Invalidate Descriptor Cache to prevent writing to the cached
+ * descriptors and to the data pointer of those descriptors
+ */
+ hw_atl_rdm_rx_dma_desc_cache_init_set(self, 1);
+
return aq_hw_err_from_flags(self);
}
.hw_get_regs = hw_atl_utils_hw_get_regs,
.hw_get_hw_stats = hw_atl_utils_get_hw_stats,
.hw_get_fw_version = hw_atl_utils_get_fw_version,
+ .hw_set_offload = hw_atl_b0_hw_offload_set,
+ .hw_set_fc = hw_atl_b0_set_fc,
};
HW_ATL_RPB_RX_FC_MODE_SHIFT, rx_flow_ctl_mode);
}
+void hw_atl_rdm_rx_dma_desc_cache_init_set(struct aq_hw_s *aq_hw, u32 init)
+{
+ aq_hw_write_reg_bit(aq_hw, HW_ATL_RDM_RX_DMA_DESC_CACHE_INIT_ADR,
+ HW_ATL_RDM_RX_DMA_DESC_CACHE_INIT_MSK,
+ HW_ATL_RDM_RX_DMA_DESC_CACHE_INIT_SHIFT,
+ init);
+}
+
void hw_atl_rpb_rx_pkt_buff_size_per_tc_set(struct aq_hw_s *aq_hw,
u32 rx_pkt_buff_size_per_tc, u32 buffer)
{
u32 rx_pkt_buff_size_per_tc,
u32 buffer);
+/* set rdm rx dma descriptor cache init */
+void hw_atl_rdm_rx_dma_desc_cache_init_set(struct aq_hw_s *aq_hw, u32 init);
+
/* set rx xoff enable (per tc) */
void hw_atl_rpb_rx_xoff_en_per_tc_set(struct aq_hw_s *aq_hw, u32 rx_xoff_en_per_tc,
u32 buffer);
/* default value of bitfield desc{d}_reset */
#define HW_ATL_RDM_DESCDRESET_DEFAULT 0x0
+/* rdm_desc_init_i bitfield definitions
+ * preprocessor definitions for the bitfield rdm_desc_init_i.
+ * port="pif_rdm_desc_init_i"
+ */
+
+/* register address for bitfield rdm_desc_init_i */
+#define HW_ATL_RDM_RX_DMA_DESC_CACHE_INIT_ADR 0x00005a00
+/* bitmask for bitfield rdm_desc_init_i */
+#define HW_ATL_RDM_RX_DMA_DESC_CACHE_INIT_MSK 0xffffffff
+/* inverted bitmask for bitfield rdm_desc_init_i */
+#define HW_ATL_RDM_RX_DMA_DESC_CACHE_INIT_MSKN 0x00000000
+/* lower bit position of bitfield rdm_desc_init_i */
+#define HW_ATL_RDM_RX_DMA_DESC_CACHE_INIT_SHIFT 0
+/* width of bitfield rdm_desc_init_i */
+#define HW_ATL_RDM_RX_DMA_DESC_CACHE_INIT_WIDTH 32
+/* default value of bitfield rdm_desc_init_i */
+#define HW_ATL_RDM_RX_DMA_DESC_CACHE_INIT_DEFAULT 0x0
+
/* rx int_desc_wrb_en bitfield definitions
* preprocessor definitions for the bitfield "int_desc_wrb_en".
* port="pif_rdm_int_desc_wrb_en_i"
#define HW_ATL_FW2X_MPI_STATE_ADDR 0x370
#define HW_ATL_FW2X_MPI_STATE2_ADDR 0x374
+#define HW_ATL_FW2X_CAP_PAUSE BIT(CAPS_HI_PAUSE)
+#define HW_ATL_FW2X_CAP_ASYM_PAUSE BIT(CAPS_HI_ASYMMETRIC_PAUSE)
#define HW_ATL_FW2X_CAP_SLEEP_PROXY BIT(CAPS_HI_SLEEP_PROXY)
#define HW_ATL_FW2X_CAP_WOL BIT(CAPS_HI_WOL)
return 0;
}
+static u32 aq_fw2x_get_flow_control(struct aq_hw_s *self, u32 *fcmode)
+{
+ u32 mpi_state = aq_hw_read_reg(self, HW_ATL_FW2X_MPI_STATE2_ADDR);
+
+ if (mpi_state & HW_ATL_FW2X_CAP_PAUSE)
+ if (mpi_state & HW_ATL_FW2X_CAP_ASYM_PAUSE)
+ *fcmode = AQ_NIC_FC_RX;
+ else
+ *fcmode = AQ_NIC_FC_RX | AQ_NIC_FC_TX;
+ else
+ if (mpi_state & HW_ATL_FW2X_CAP_ASYM_PAUSE)
+ *fcmode = AQ_NIC_FC_TX;
+ else
+ *fcmode = 0;
+
+ return 0;
+}
+
const struct aq_fw_ops aq_fw_2x_ops = {
.init = aq_fw2x_init,
.deinit = aq_fw2x_deinit,
.set_eee_rate = aq_fw2x_set_eee_rate,
.get_eee_rate = aq_fw2x_get_eee_rate,
.set_flow_control = aq_fw2x_set_flow_control,
+ .get_flow_control = aq_fw2x_get_flow_control
};
};
extern const struct ethtool_ops alx_ethtool_ops;
-extern const char alx_drv_name[];
#endif
#include "hw.h"
#include "reg.h"
-const char alx_drv_name[] = "alx";
+static const char alx_drv_name[] = "alx";
static void alx_free_txbuf(struct alx_tx_queue *txq, int entry)
{
intrl2_1_mask_clear(priv, 0xffffffff);
else
intrl2_0_mask_clear(priv, INTRL2_0_TDMA_MBDONE_MASK);
-
- /* Last call before we start the real business */
- netif_tx_start_all_queues(dev);
}
static void rbuf_init(struct bcm_sysport_priv *priv)
bcm_sysport_netif_start(dev);
+ netif_tx_start_all_queues(dev);
+
return 0;
out_clear_rx_int:
struct bcm_sysport_priv *priv = netdev_priv(dev);
/* stop all software from updating hardware */
- netif_tx_stop_all_queues(dev);
+ netif_tx_disable(dev);
napi_disable(&priv->napi);
cancel_work_sync(&priv->dim.dim.work);
phy_stop(dev->phydev);
if (!netif_running(dev))
return 0;
+ netif_device_detach(dev);
+
bcm_sysport_netif_stop(dev);
phy_suspend(dev->phydev);
- netif_device_detach(dev);
-
/* Disable UniMAC RX */
umac_enable_set(priv, CMD_RX_EN, 0);
goto out_free_rx_ring;
}
- netif_device_attach(dev);
-
/* RX pipe enable */
topctrl_writel(priv, 0, RX_FLUSH_CNTL);
bcm_sysport_netif_start(dev);
+ netif_device_attach(dev);
+
return 0;
out_free_rx_ring:
#define PMF_DMAE_C(bp) (BP_PORT(bp) * MAX_DMAE_C_PER_PORT + \
E1HVN_MAX)
+/* Following is the DMAE channel number allocation for the clients.
+ * MFW: OCBB/OCSD implementations use DMAE channels 14/15 respectively.
+ * Driver: 0-3 and 8-11 (for PF dmae operations)
+ * 4 and 12 (for stats requests)
+ */
+#define BNX2X_FW_DMAE_C 13 /* Channel for FW DMAE operations */
+
/* PCIE link and speed */
#define PCICFG_LINK_WIDTH 0x1f00000
#define PCICFG_LINK_WIDTH_SHIFT 20
rdata->sd_vlan_tag = cpu_to_le16(start_params->sd_vlan_tag);
rdata->path_id = BP_PATH(bp);
rdata->network_cos_mode = start_params->network_cos_mode;
+ rdata->dmae_cmd_id = BNX2X_FW_DMAE_C;
rdata->vxlan_dst_port = cpu_to_le16(start_params->vxlan_dst_port);
rdata->geneve_dst_port = cpu_to_le16(start_params->geneve_dst_port);
} else {
if (rxcmp1->rx_cmp_cfa_code_errors_v2 & RX_CMP_L4_CS_ERR_BITS) {
if (dev->features & NETIF_F_RXCSUM)
- cpr->rx_l4_csum_errors++;
+ bnapi->cp_ring.rx_l4_csum_errors++;
}
}
cp = le16_to_cpu(resp->alloc_cmpl_rings);
stats = le16_to_cpu(resp->alloc_stat_ctx);
cp = min_t(u16, cp, stats);
+ hw_resc->resv_irqs = cp;
if (bp->flags & BNXT_FLAG_CHIP_P5) {
int rx = hw_resc->resv_rx_rings;
int tx = hw_resc->resv_tx_rings;
hw_resc->resv_rx_rings = rx;
hw_resc->resv_tx_rings = tx;
}
- cp = le16_to_cpu(resp->alloc_msix);
+ hw_resc->resv_irqs = le16_to_cpu(resp->alloc_msix);
hw_resc->resv_hw_ring_grps = rx;
}
hw_resc->resv_cp_rings = cp;
return bnxt_hwrm_reserve_vf_rings(bp, tx, rx, grp, cp, vnic);
}
-static int bnxt_cp_rings_in_use(struct bnxt *bp)
+static int bnxt_nq_rings_in_use(struct bnxt *bp)
{
int cp = bp->cp_nr_rings;
int ulp_msix, ulp_base;
return cp;
}
+static int bnxt_cp_rings_in_use(struct bnxt *bp)
+{
+ int cp;
+
+ if (!(bp->flags & BNXT_FLAG_CHIP_P5))
+ return bnxt_nq_rings_in_use(bp);
+
+ cp = bp->tx_nr_rings + bp->rx_nr_rings;
+ return cp;
+}
+
static bool bnxt_need_reserve_rings(struct bnxt *bp)
{
struct bnxt_hw_resc *hw_resc = &bp->hw_resc;
int cp = bnxt_cp_rings_in_use(bp);
+ int nq = bnxt_nq_rings_in_use(bp);
int rx = bp->rx_nr_rings;
int vnic = 1, grp = rx;
rx <<= 1;
if (BNXT_NEW_RM(bp) &&
(hw_resc->resv_rx_rings != rx || hw_resc->resv_cp_rings != cp ||
- hw_resc->resv_vnics != vnic ||
+ hw_resc->resv_irqs < nq || hw_resc->resv_vnics != vnic ||
(hw_resc->resv_hw_ring_grps != grp &&
!(bp->flags & BNXT_FLAG_CHIP_P5))))
return true;
static int __bnxt_reserve_rings(struct bnxt *bp)
{
struct bnxt_hw_resc *hw_resc = &bp->hw_resc;
- int cp = bnxt_cp_rings_in_use(bp);
+ int cp = bnxt_nq_rings_in_use(bp);
int tx = bp->tx_nr_rings;
int rx = bp->rx_nr_rings;
int grp, rx_rings, rc;
tx = hw_resc->resv_tx_rings;
if (BNXT_NEW_RM(bp)) {
rx = hw_resc->resv_rx_rings;
- cp = hw_resc->resv_cp_rings;
+ cp = hw_resc->resv_irqs;
grp = hw_resc->resv_hw_ring_grps;
vnic = hw_resc->resv_vnics;
}
return rc;
}
+static int bnxt_hwrm_queue_qportcfg(struct bnxt *bp);
+
static int bnxt_hwrm_func_qcaps(struct bnxt *bp)
{
int rc;
rc = __bnxt_hwrm_func_qcaps(bp);
if (rc)
return rc;
+ rc = bnxt_hwrm_queue_qportcfg(bp);
+ if (rc) {
+ netdev_err(bp->dev, "hwrm query qportcfg failure rc: %d\n", rc);
+ return rc;
+ }
if (bp->hwrm_spec_code >= 0x10803) {
rc = bnxt_alloc_ctx_mem(bp);
if (rc)
unsigned int bnxt_get_max_func_cp_rings_for_en(struct bnxt *bp)
{
- return bp->hw_resc.max_cp_rings - bnxt_get_ulp_msix_num(bp);
+ unsigned int cp = bp->hw_resc.max_cp_rings;
+
+ if (!(bp->flags & BNXT_FLAG_CHIP_P5))
+ cp -= bnxt_get_ulp_msix_num(bp);
+
+ return cp;
}
static unsigned int bnxt_get_max_func_irqs(struct bnxt *bp)
int total_req = bp->cp_nr_rings + num;
int max_idx, avail_msix;
- max_idx = min_t(int, bp->total_irqs, max_cp);
+ max_idx = bp->total_irqs;
+ if (!(bp->flags & BNXT_FLAG_CHIP_P5))
+ max_idx = min_t(int, bp->total_irqs, max_cp);
avail_msix = max_idx - bp->cp_nr_rings;
if (!BNXT_NEW_RM(bp) || avail_msix >= num)
return avail_msix;
if (!BNXT_NEW_RM(bp))
return bnxt_get_max_func_irqs(bp);
- return bnxt_cp_rings_in_use(bp);
+ return bnxt_nq_rings_in_use(bp);
}
static int bnxt_init_msix(struct bnxt *bp)
rc = bnxt_hwrm_func_resc_qcaps(bp, true);
hw_resc->resv_cp_rings = 0;
+ hw_resc->resv_irqs = 0;
hw_resc->resv_tx_rings = 0;
hw_resc->resv_rx_rings = 0;
hw_resc->resv_hw_ring_grps = 0;
return rc;
}
+static int bnxt_dbg_hwrm_ring_info_get(struct bnxt *bp, u8 ring_type,
+ u32 ring_id, u32 *prod, u32 *cons)
+{
+ struct hwrm_dbg_ring_info_get_output *resp = bp->hwrm_cmd_resp_addr;
+ struct hwrm_dbg_ring_info_get_input req = {0};
+ int rc;
+
+ bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_DBG_RING_INFO_GET, -1, -1);
+ req.ring_type = ring_type;
+ req.fw_ring_id = cpu_to_le32(ring_id);
+ mutex_lock(&bp->hwrm_cmd_lock);
+ rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+ if (!rc) {
+ *prod = le32_to_cpu(resp->producer_index);
+ *cons = le32_to_cpu(resp->consumer_index);
+ }
+ mutex_unlock(&bp->hwrm_cmd_lock);
+ return rc;
+}
+
static void bnxt_dump_tx_sw_state(struct bnxt_napi *bnapi)
{
struct bnxt_tx_ring_info *txr = bnapi->tx_ring;
bnxt_queue_sp_work(bp);
}
}
+
+ if ((bp->flags & BNXT_FLAG_CHIP_P5) && netif_carrier_ok(dev)) {
+ set_bit(BNXT_RING_COAL_NOW_SP_EVENT, &bp->sp_event);
+ bnxt_queue_sp_work(bp);
+ }
bnxt_restart_timer:
mod_timer(&bp->timer, jiffies + bp->current_interval);
}
bnxt_rtnl_unlock_sp(bp);
}
+static void bnxt_chk_missed_irq(struct bnxt *bp)
+{
+ int i;
+
+ if (!(bp->flags & BNXT_FLAG_CHIP_P5))
+ return;
+
+ for (i = 0; i < bp->cp_nr_rings; i++) {
+ struct bnxt_napi *bnapi = bp->bnapi[i];
+ struct bnxt_cp_ring_info *cpr;
+ u32 fw_ring_id;
+ int j;
+
+ if (!bnapi)
+ continue;
+
+ cpr = &bnapi->cp_ring;
+ for (j = 0; j < 2; j++) {
+ struct bnxt_cp_ring_info *cpr2 = cpr->cp_ring_arr[j];
+ u32 val[2];
+
+ if (!cpr2 || cpr2->has_more_work ||
+ !bnxt_has_work(bp, cpr2))
+ continue;
+
+ if (cpr2->cp_raw_cons != cpr2->last_cp_raw_cons) {
+ cpr2->last_cp_raw_cons = cpr2->cp_raw_cons;
+ continue;
+ }
+ fw_ring_id = cpr2->cp_ring_struct.fw_ring_id;
+ bnxt_dbg_hwrm_ring_info_get(bp,
+ DBG_RING_INFO_GET_REQ_RING_TYPE_L2_CMPL,
+ fw_ring_id, &val[0], &val[1]);
+ cpr->missed_irqs++;
+ }
+ }
+}
+
static void bnxt_cfg_ntp_filters(struct bnxt *);
static void bnxt_sp_task(struct work_struct *work)
if (test_and_clear_bit(BNXT_FLOW_STATS_SP_EVENT, &bp->sp_event))
bnxt_tc_flow_stats_work(bp);
+ if (test_and_clear_bit(BNXT_RING_COAL_NOW_SP_EVENT, &bp->sp_event))
+ bnxt_chk_missed_irq(bp);
+
/* These functions below will clear BNXT_STATE_IN_SP_TASK. They
* must be the last functions to be called before exiting.
*/
int *max_cp)
{
struct bnxt_hw_resc *hw_resc = &bp->hw_resc;
- int max_ring_grps = 0;
+ int max_ring_grps = 0, max_irq;
*max_tx = hw_resc->max_tx_rings;
*max_rx = hw_resc->max_rx_rings;
- *max_cp = min_t(int, bnxt_get_max_func_cp_rings_for_en(bp),
- hw_resc->max_irqs - bnxt_get_ulp_msix_num(bp));
- *max_cp = min_t(int, *max_cp, hw_resc->max_stat_ctxs);
+ *max_cp = bnxt_get_max_func_cp_rings_for_en(bp);
+ max_irq = min_t(int, bnxt_get_max_func_irqs(bp) -
+ bnxt_get_ulp_msix_num(bp),
+ bnxt_get_max_func_stat_ctxs(bp));
+ if (!(bp->flags & BNXT_FLAG_CHIP_P5))
+ *max_cp = min_t(int, *max_cp, max_irq);
max_ring_grps = hw_resc->max_hw_ring_grps;
if (BNXT_CHIP_TYPE_NITRO_A0(bp) && BNXT_PF(bp)) {
*max_cp -= 1;
}
if (bp->flags & BNXT_FLAG_AGG_RINGS)
*max_rx >>= 1;
+ if (bp->flags & BNXT_FLAG_CHIP_P5) {
+ bnxt_trim_rings(bp, max_rx, max_tx, *max_cp, false);
+ /* On P5 chips, max_cp output param should be available NQs */
+ *max_cp = max_irq;
+ }
*max_rx = min_t(int, *max_rx, max_ring_grps);
}
}
bnxt_hwrm_func_qcfg(bp);
+ bnxt_hwrm_vnic_qcaps(bp);
bnxt_hwrm_port_led_qcaps(bp);
bnxt_ethtool_init(bp);
bnxt_dcb_init(bp);
VNIC_RSS_CFG_REQ_HASH_TYPE_UDP_IPV6;
}
- bnxt_hwrm_vnic_qcaps(bp);
if (bnxt_rfs_supported(bp)) {
dev->hw_features |= NETIF_F_NTUPLE;
if (bnxt_rfs_capable(bp)) {
u8 had_work_done:1;
u8 has_more_work:1;
+ u32 last_cp_raw_cons;
+
struct bnxt_coal rx_ring_coal;
u64 rx_packets;
u64 rx_bytes;
dma_addr_t hw_stats_map;
u32 hw_stats_ctx_id;
u64 rx_l4_csum_errors;
+ u64 missed_irqs;
struct bnxt_ring_struct cp_ring_struct;
u16 min_stat_ctxs;
u16 max_stat_ctxs;
u16 max_irqs;
+ u16 resv_irqs;
};
#if defined(CONFIG_BNXT_SRIOV)
#define BNXT_LINK_SPEED_CHNG_SP_EVENT 14
#define BNXT_FLOW_STATS_SP_EVENT 15
#define BNXT_UPDATE_PHY_SP_EVENT 16
+#define BNXT_RING_COAL_NOW_SP_EVENT 17
struct bnxt_hw_resc hw_resc;
struct bnxt_pf_info pf;
return rc;
}
-#define BNXT_NUM_STATS 21
+#define BNXT_NUM_STATS 22
#define BNXT_RX_STATS_ENTRY(counter) \
{ BNXT_RX_STATS_OFFSET(counter), __stringify(counter) }
for (k = 0; k < stat_fields; j++, k++)
buf[j] = le64_to_cpu(hw_stats[k]);
buf[j++] = cpr->rx_l4_csum_errors;
+ buf[j++] = cpr->missed_irqs;
bnxt_sw_func_stats[RX_TOTAL_DISCARDS].counter +=
le64_to_cpu(cpr->hw_stats->rx_discard_pkts);
buf += ETH_GSTRING_LEN;
sprintf(buf, "[%d]: rx_l4_csum_errors", i);
buf += ETH_GSTRING_LEN;
+ sprintf(buf, "[%d]: missed_irqs", i);
+ buf += ETH_GSTRING_LEN;
}
for (i = 0; i < BNXT_NUM_SW_FUNC_STATS; i++) {
strcpy(buf, bnxt_sw_func_stats[i].string);
record->asic_state = 0;
strlcpy(record->system_name, utsname()->nodename,
sizeof(record->system_name));
- record->year = cpu_to_le16(tm.tm_year);
- record->month = cpu_to_le16(tm.tm_mon);
+ record->year = cpu_to_le16(tm.tm_year + 1900);
+ record->month = cpu_to_le16(tm.tm_mon + 1);
record->day = cpu_to_le16(tm.tm_mday);
record->hour = cpu_to_le16(tm.tm_hour);
record->minute = cpu_to_le16(tm.tm_min);
if (ulp_id == BNXT_ROCE_ULP) {
unsigned int max_stat_ctxs;
+ if (bp->flags & BNXT_FLAG_CHIP_P5)
+ return -EOPNOTSUPP;
+
max_stat_ctxs = bnxt_get_max_func_stat_ctxs(bp);
if (max_stat_ctxs <= BNXT_MIN_ROCE_STAT_CTXS ||
bp->num_stat_ctxs == max_stat_ctxs)
if (BNXT_NEW_RM(bp)) {
struct bnxt_hw_resc *hw_resc = &bp->hw_resc;
- avail_msix = hw_resc->resv_cp_rings - bp->cp_nr_rings;
+ avail_msix = hw_resc->resv_irqs - bp->cp_nr_rings;
edev->ulp_tbl[ulp_id].msix_requested = avail_msix;
}
bnxt_fill_msix_vecs(bp, ent);
umac_enable_set(priv, CMD_TX_EN | CMD_RX_EN, true);
- netif_tx_start_all_queues(dev);
bcmgenet_enable_tx_napi(priv);
/* Monitor link interrupts now */
bcmgenet_netif_start(dev);
+ netif_tx_start_all_queues(dev);
+
return 0;
err_irq1:
struct bcmgenet_priv *priv = netdev_priv(dev);
bcmgenet_disable_tx_napi(priv);
- netif_tx_stop_all_queues(dev);
+ netif_tx_disable(dev);
/* Disable MAC receive */
umac_enable_set(priv, CMD_RX_EN, false);
if (!netif_running(dev))
return 0;
+ netif_device_detach(dev);
+
bcmgenet_netif_stop(dev);
if (!device_may_wakeup(d))
phy_suspend(dev->phydev);
- netif_device_detach(dev);
-
/* Prepare the device for Wake-on-LAN and switch to the slow clock */
if (device_may_wakeup(d) && priv->wolopts) {
ret = bcmgenet_power_down(priv, GENET_POWER_WOL_MAGIC);
/* Always enable ring 16 - descriptor ring */
bcmgenet_enable_dma(priv, dma_ctrl);
- netif_device_attach(dev);
-
if (!device_may_wakeup(d))
phy_resume(dev->phydev);
bcmgenet_netif_start(dev);
+ netif_device_attach(dev);
+
return 0;
out_clk_disable:
{
struct tg3 *tp = netdev_priv(dev);
int i, irq_sync = 0, err = 0;
+ bool reset_phy = false;
if ((ering->rx_pending > tp->rx_std_ring_mask) ||
(ering->rx_jumbo_pending > tp->rx_jmb_ring_mask) ||
if (netif_running(dev)) {
tg3_halt(tp, RESET_KIND_SHUTDOWN, 1);
- err = tg3_restart_hw(tp, false);
+ /* Reset PHY to avoid PHY lock up */
+ if (tg3_asic_rev(tp) == ASIC_REV_5717 ||
+ tg3_asic_rev(tp) == ASIC_REV_5719 ||
+ tg3_asic_rev(tp) == ASIC_REV_5720)
+ reset_phy = true;
+
+ err = tg3_restart_hw(tp, reset_phy);
if (!err)
tg3_netif_start(tp);
}
{
struct tg3 *tp = netdev_priv(dev);
int err = 0;
+ bool reset_phy = false;
if (tp->link_config.autoneg == AUTONEG_ENABLE)
tg3_warn_mgmt_link_flap(tp);
if (netif_running(dev)) {
tg3_halt(tp, RESET_KIND_SHUTDOWN, 1);
- err = tg3_restart_hw(tp, false);
+ /* Reset PHY to avoid PHY lock up */
+ if (tg3_asic_rev(tp) == ASIC_REV_5717 ||
+ tg3_asic_rev(tp) == ASIC_REV_5719 ||
+ tg3_asic_rev(tp) == ASIC_REV_5720)
+ reset_phy = true;
+
+ err = tg3_restart_hw(tp, reset_phy);
if (!err)
tg3_netif_start(tp);
}
"mac_tx_one_collision",
"mac_tx_multi_collision",
"mac_tx_max_collision_fail",
- "mac_tx_max_deferal_fail",
+ "mac_tx_max_deferral_fail",
"mac_tx_fifo_err",
"mac_tx_runts",
struct octeon_soft_command *sc = (struct octeon_soft_command *)buf;
struct sk_buff *skb = sc->ctxptr;
struct net_device *ndev = skb->dev;
+ u32 iq_no;
dma_unmap_single(&oct->pci_dev->dev, sc->dmadptr,
sc->datasize, DMA_TO_DEVICE);
dev_kfree_skb_any(skb);
+ iq_no = sc->iq_no;
octeon_free_soft_command(oct, sc);
- if (octnet_iq_is_full(oct, sc->iq_no))
+ if (octnet_iq_is_full(oct, iq_no))
return;
if (netif_queue_stopped(ndev))
{
struct nicpf *nic = pci_get_drvdata(pdev);
+ if (!nic)
+ return;
+
if (nic->flags & NIC_SRIOV_ENABLED)
pci_disable_sriov(pdev);
bool if_up = netif_running(nic->netdev);
struct bpf_prog *old_prog;
bool bpf_attached = false;
+ int ret = 0;
/* For now just support only the usual MTU sized frames */
if (prog && (dev->mtu > 1500)) {
if (nic->xdp_prog) {
/* Attach BPF program */
nic->xdp_prog = bpf_prog_add(nic->xdp_prog, nic->rx_queues - 1);
- if (!IS_ERR(nic->xdp_prog))
+ if (!IS_ERR(nic->xdp_prog)) {
bpf_attached = true;
+ } else {
+ ret = PTR_ERR(nic->xdp_prog);
+ nic->xdp_prog = NULL;
+ }
}
/* Calculate Tx queues needed for XDP and network stack */
netif_trans_update(nic->netdev);
}
- return 0;
+ return ret;
}
static int nicvf_xdp(struct net_device *netdev, struct netdev_bpf *xdp)
if (!sq->dmem.base)
return;
- if (sq->tso_hdrs)
+ if (sq->tso_hdrs) {
dma_free_coherent(&nic->pdev->dev,
sq->dmem.q_len * TSO_HEADER_SIZE,
sq->tso_hdrs, sq->tso_hdrs_phys);
+ sq->tso_hdrs = NULL;
+ }
/* Free pending skbs in the queue */
smp_rmb();
config CHELSIO_T4
tristate "Chelsio Communications T4/T5/T6 Ethernet support"
depends on PCI && (IPV6 || IPV6=n)
- depends on THERMAL || !THERMAL
select FW_LOADER
select MDIO
select ZLIB_DEFLATE
cxgb4-$(CONFIG_CHELSIO_T4_DCB) += cxgb4_dcb.o
cxgb4-$(CONFIG_CHELSIO_T4_FCOE) += cxgb4_fcoe.o
cxgb4-$(CONFIG_DEBUG_FS) += cxgb4_debugfs.o
-ifdef CONFIG_THERMAL
-cxgb4-objs += cxgb4_thermal.o
-endif
+cxgb4-$(CONFIG_THERMAL) += cxgb4_thermal.o
if (!is_t4(adapter->params.chip))
cxgb4_ptp_init(adapter);
- if (IS_ENABLED(CONFIG_THERMAL) &&
+ if (IS_REACHABLE(CONFIG_THERMAL) &&
!is_t4(adapter->params.chip) && (adapter->flags & FW_OK))
cxgb4_thermal_init(adapter);
if (!is_t4(adapter->params.chip))
cxgb4_ptp_stop(adapter);
- if (IS_ENABLED(CONFIG_THERMAL))
+ if (IS_REACHABLE(CONFIG_THERMAL))
cxgb4_thermal_remove(adapter);
/* If we allocated filters, free up state associated with any
u64_stats_update_begin(&port->tx_stats_syncp);
port->tx_frag_stats[nfrags]++;
- u64_stats_update_end(&port->ir_stats_syncp);
+ u64_stats_update_end(&port->tx_stats_syncp);
}
}
struct net_device *netdev = dev_id;
struct ftmac100 *priv = netdev_priv(netdev);
- if (likely(netif_running(netdev))) {
- /* Disable interrupts for polling */
- ftmac100_disable_all_int(priv);
+ /* Disable interrupts for polling */
+ ftmac100_disable_all_int(priv);
+ if (likely(netif_running(netdev)))
napi_schedule(&priv->napi);
- }
return IRQ_HANDLED;
}
if (!muram_node) {
dev_err(&of_dev->dev, "%s: could not find MURAM node\n",
__func__);
- goto fman_node_put;
+ goto fman_free;
}
err = of_address_to_resource(muram_node, 0,
of_node_put(muram_node);
dev_err(&of_dev->dev, "%s: of_address_to_resource() = %d\n",
__func__, err);
- goto fman_node_put;
+ goto fman_free;
}
of_node_put(muram_node);
- of_node_put(fm_node);
err = devm_request_irq(&of_dev->dev, irq, fman_irq, IRQF_SHARED,
"fman", fman);
}
ret = register_netdev(ndev);
- if (ret) {
- free_netdev(ndev);
+ if (ret)
goto alloc_fail;
- }
return 0;
int (*set_loopback)(struct hnae3_handle *handle,
enum hnae3_loop loop_mode, bool en);
- void (*set_promisc_mode)(struct hnae3_handle *handle, bool en_uc_pmc,
- bool en_mc_pmc);
+ int (*set_promisc_mode)(struct hnae3_handle *handle, bool en_uc_pmc,
+ bool en_mc_pmc);
int (*set_mtu)(struct hnae3_handle *handle, int new_mtu);
void (*get_pauseparam)(struct hnae3_handle *handle,
int vector_num,
struct hnae3_ring_chain_node *vr_chain);
- void (*reset_queue)(struct hnae3_handle *handle, u16 queue_id);
+ int (*reset_queue)(struct hnae3_handle *handle, u16 queue_id);
u32 (*get_fw_version)(struct hnae3_handle *handle);
void (*get_mdix_mode)(struct hnae3_handle *handle,
u8 *tp_mdix_ctrl, u8 *tp_mdix);
h->netdev_flags = new_flags;
}
-void hns3_update_promisc_mode(struct net_device *netdev, u8 promisc_flags)
+int hns3_update_promisc_mode(struct net_device *netdev, u8 promisc_flags)
{
struct hns3_nic_priv *priv = netdev_priv(netdev);
struct hnae3_handle *h = priv->ae_handle;
if (h->ae_algo->ops->set_promisc_mode) {
- h->ae_algo->ops->set_promisc_mode(h,
- promisc_flags & HNAE3_UPE,
- promisc_flags & HNAE3_MPE);
+ return h->ae_algo->ops->set_promisc_mode(h,
+ promisc_flags & HNAE3_UPE,
+ promisc_flags & HNAE3_MPE);
}
+
+ return 0;
}
void hns3_enable_vlan_filter(struct net_device *netdev, bool enable)
return ret;
}
-static void hns3_restore_vlan(struct net_device *netdev)
+static int hns3_restore_vlan(struct net_device *netdev)
{
struct hns3_nic_priv *priv = netdev_priv(netdev);
+ int ret = 0;
u16 vid;
- int ret;
for_each_set_bit(vid, priv->active_vlans, VLAN_N_VID) {
ret = hns3_vlan_rx_add_vid(netdev, htons(ETH_P_8021Q), vid);
- if (ret)
- netdev_warn(netdev, "Restore vlan: %d filter, ret:%d\n",
- vid, ret);
+ if (ret) {
+ netdev_err(netdev, "Restore vlan: %d filter, ret:%d\n",
+ vid, ret);
+ return ret;
+ }
}
+
+ return ret;
}
static int hns3_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan,
chain = devm_kzalloc(&pdev->dev, sizeof(*chain),
GFP_KERNEL);
if (!chain)
- return -ENOMEM;
+ goto err_free_chain;
cur_chain->next = chain;
chain->tqp_index = tx_ring->tqp->tqp_index;
while (rx_ring) {
chain = devm_kzalloc(&pdev->dev, sizeof(*chain), GFP_KERNEL);
if (!chain)
- return -ENOMEM;
+ goto err_free_chain;
cur_chain->next = chain;
chain->tqp_index = rx_ring->tqp->tqp_index;
}
return 0;
+
+err_free_chain:
+ cur_chain = head->next;
+ while (cur_chain) {
+ chain = cur_chain->next;
+ devm_kfree(&pdev->dev, chain);
+ cur_chain = chain;
+ }
+
+ return -ENOMEM;
}
static void hns3_free_vector_ring_chain(struct hns3_enet_tqp_vector *tqp_vector,
struct hnae3_handle *h = priv->ae_handle;
struct hns3_enet_tqp_vector *tqp_vector;
int ret = 0;
- u16 i;
+ int i;
hns3_nic_set_cpumask(priv);
hns3_free_vector_ring_chain(tqp_vector, &vector_ring_chain);
if (ret)
- return ret;
+ goto map_ring_fail;
netif_napi_add(priv->netdev, &tqp_vector->napi,
hns3_nic_common_poll, NAPI_POLL_WEIGHT);
}
return 0;
+
+map_ring_fail:
+ while (i--)
+ netif_napi_del(&priv->tqp_vector[i].napi);
+
+ return ret;
}
static int hns3_nic_alloc_vector_data(struct hns3_nic_priv *priv)
return ret;
ret = hns3_ring_get_cfg(tqp, priv, HNAE3_RING_TYPE_RX);
- if (ret)
+ if (ret) {
+ devm_kfree(priv->dev, priv->ring_data[tqp->tqp_index].ring);
return ret;
+ }
return 0;
}
return 0;
err:
+ while (i--) {
+ devm_kfree(priv->dev, priv->ring_data[i].ring);
+ devm_kfree(priv->dev,
+ priv->ring_data[i + h->kinfo.num_tqps].ring);
+ }
+
devm_kfree(&pdev->dev, priv->ring_data);
return ret;
}
int i;
for (i = 0; i < h->kinfo.num_tqps; i++) {
- if (h->ae_algo->ops->reset_queue)
- h->ae_algo->ops->reset_queue(h, i);
-
hns3_fini_ring(priv->ring_data[i].ring);
hns3_fini_ring(priv->ring_data[i + h->kinfo.num_tqps].ring);
}
}
/* Set mac addr if it is configured. or leave it to the AE driver */
-static void hns3_init_mac_addr(struct net_device *netdev, bool init)
+static int hns3_init_mac_addr(struct net_device *netdev, bool init)
{
struct hns3_nic_priv *priv = netdev_priv(netdev);
struct hnae3_handle *h = priv->ae_handle;
u8 mac_addr_temp[ETH_ALEN];
+ int ret = 0;
if (h->ae_algo->ops->get_mac_addr && init) {
h->ae_algo->ops->get_mac_addr(h, mac_addr_temp);
}
if (h->ae_algo->ops->set_mac_addr)
- h->ae_algo->ops->set_mac_addr(h, netdev->dev_addr, true);
+ ret = h->ae_algo->ops->set_mac_addr(h, netdev->dev_addr, true);
+ return ret;
}
static int hns3_restore_fd_rules(struct net_device *netdev)
return ret;
}
-static void hns3_recover_hw_addr(struct net_device *ndev)
+static int hns3_recover_hw_addr(struct net_device *ndev)
{
struct netdev_hw_addr_list *list;
struct netdev_hw_addr *ha, *tmp;
+ int ret = 0;
/* go through and sync uc_addr entries to the device */
list = &ndev->uc;
- list_for_each_entry_safe(ha, tmp, &list->list, list)
- hns3_nic_uc_sync(ndev, ha->addr);
+ list_for_each_entry_safe(ha, tmp, &list->list, list) {
+ ret = hns3_nic_uc_sync(ndev, ha->addr);
+ if (ret)
+ return ret;
+ }
/* go through and sync mc_addr entries to the device */
list = &ndev->mc;
- list_for_each_entry_safe(ha, tmp, &list->list, list)
- hns3_nic_mc_sync(ndev, ha->addr);
+ list_for_each_entry_safe(ha, tmp, &list->list, list) {
+ ret = hns3_nic_mc_sync(ndev, ha->addr);
+ if (ret)
+ return ret;
+ }
+
+ return ret;
}
static void hns3_remove_hw_addr(struct net_device *netdev)
int ret;
for (i = 0; i < h->kinfo.num_tqps; i++) {
- h->ae_algo->ops->reset_queue(h, i);
+ ret = h->ae_algo->ops->reset_queue(h, i);
+ if (ret)
+ return ret;
+
hns3_init_ring_hw(priv->ring_data[i].ring);
/* We need to clear tx ring here because self test will
bool vlan_filter_enable;
int ret;
- hns3_init_mac_addr(netdev, false);
- hns3_recover_hw_addr(netdev);
- hns3_update_promisc_mode(netdev, handle->netdev_flags);
+ ret = hns3_init_mac_addr(netdev, false);
+ if (ret)
+ return ret;
+
+ ret = hns3_recover_hw_addr(netdev);
+ if (ret)
+ return ret;
+
+ ret = hns3_update_promisc_mode(netdev, handle->netdev_flags);
+ if (ret)
+ return ret;
+
vlan_filter_enable = netdev->flags & IFF_PROMISC ? false : true;
hns3_enable_vlan_filter(netdev, vlan_filter_enable);
-
/* Hardware table is only clear when pf resets */
- if (!(handle->flags & HNAE3_SUPPORT_VF))
- hns3_restore_vlan(netdev);
+ if (!(handle->flags & HNAE3_SUPPORT_VF)) {
+ ret = hns3_restore_vlan(netdev);
+ if (ret)
+ return ret;
+ }
- hns3_restore_fd_rules(netdev);
+ ret = hns3_restore_fd_rules(netdev);
+ if (ret)
+ return ret;
/* Carrier off reporting is important to ethtool even BEFORE open */
netif_carrier_off(netdev);
u32 rl_value);
void hns3_enable_vlan_filter(struct net_device *netdev, bool enable);
-void hns3_update_promisc_mode(struct net_device *netdev, u8 promisc_flags);
+int hns3_update_promisc_mode(struct net_device *netdev, u8 promisc_flags);
#ifdef CONFIG_HNS3_DCB
void hns3_dcbnl_setup(struct hnae3_handle *handle);
return ring->desc_num - used - 1;
}
-static int is_valid_csq_clean_head(struct hclge_cmq_ring *ring, int h)
+static int is_valid_csq_clean_head(struct hclge_cmq_ring *ring, int head)
{
- int u = ring->next_to_use;
- int c = ring->next_to_clean;
+ int ntu = ring->next_to_use;
+ int ntc = ring->next_to_clean;
- if (unlikely(h >= ring->desc_num))
- return 0;
+ if (ntu > ntc)
+ return head >= ntc && head <= ntu;
- return u > c ? (h > c && h <= u) : (h > c || h <= u);
+ return head >= ntc || head <= ntu;
}
static int hclge_alloc_cmd_desc(struct hclge_cmq_ring *ring)
{
int ret;
+ /* Setup the lock for command queue */
+ spin_lock_init(&hdev->hw.cmq.csq.lock);
+ spin_lock_init(&hdev->hw.cmq.crq.lock);
+
/* Setup the queue entries for use cmd queue */
hdev->hw.cmq.csq.desc_num = HCLGE_NIC_CMQ_DESC_NUM;
hdev->hw.cmq.crq.desc_num = HCLGE_NIC_CMQ_DESC_NUM;
u32 version;
int ret;
+ spin_lock_bh(&hdev->hw.cmq.csq.lock);
+ spin_lock_bh(&hdev->hw.cmq.crq.lock);
+
hdev->hw.cmq.csq.next_to_clean = 0;
hdev->hw.cmq.csq.next_to_use = 0;
hdev->hw.cmq.crq.next_to_clean = 0;
hdev->hw.cmq.crq.next_to_use = 0;
- /* Setup the lock for command queue */
- spin_lock_init(&hdev->hw.cmq.csq.lock);
- spin_lock_init(&hdev->hw.cmq.crq.lock);
-
hclge_cmd_init_regs(&hdev->hw);
clear_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state);
+ spin_unlock_bh(&hdev->hw.cmq.crq.lock);
+ spin_unlock_bh(&hdev->hw.cmq.csq.lock);
+
ret = hclge_cmd_query_firmware_version(&hdev->hw, &version);
if (ret) {
dev_err(&hdev->pdev->dev,
ret = hclge_cmd_clear_error(hdev, &desc_wr, &desc_rd,
HCLGE_NCSI_INT_CLR, 0);
if (ret)
- dev_err(dev, "failed(=%d) to clear NCSI intrerrupt status\n",
+ dev_err(dev, "failed(=%d) to clear NCSI interrupt status\n",
ret);
}
}
/* clear the source of interrupt if it is not cause by reset */
- if (event_cause != HCLGE_VECTOR0_EVENT_RST) {
+ if (event_cause == HCLGE_VECTOR0_EVENT_MBX) {
hclge_clear_event_cause(hdev, event_cause, clearval);
hclge_enable_vector(&hdev->misc_vector, true);
}
handle = &hdev->vport[0].nic;
rtnl_lock();
hclge_notify_client(hdev, HNAE3_DOWN_CLIENT);
+ rtnl_unlock();
if (!hclge_reset_wait(hdev)) {
+ rtnl_lock();
hclge_notify_client(hdev, HNAE3_UNINIT_CLIENT);
hclge_reset_ae_dev(hdev->ae_dev);
hclge_notify_client(hdev, HNAE3_INIT_CLIENT);
hclge_clear_reset_cause(hdev);
} else {
+ rtnl_lock();
/* schedule again to check pending resets later */
set_bit(hdev->reset_type, &hdev->reset_pending);
hclge_reset_task_schedule(hdev);
param->vf_id = vport_id;
}
-static void hclge_set_promisc_mode(struct hnae3_handle *handle, bool en_uc_pmc,
- bool en_mc_pmc)
+static int hclge_set_promisc_mode(struct hnae3_handle *handle, bool en_uc_pmc,
+ bool en_mc_pmc)
{
struct hclge_vport *vport = hclge_get_vport(handle);
struct hclge_dev *hdev = vport->back;
hclge_promisc_param_init(¶m, en_uc_pmc, en_mc_pmc, true,
vport->vport_id);
- hclge_cmd_set_promisc_mode(hdev, ¶m);
+ return hclge_cmd_set_promisc_mode(hdev, ¶m);
}
static int hclge_get_fd_mode(struct hclge_dev *hdev, u8 *fd_mode)
return tqp->index;
}
-void hclge_reset_tqp(struct hnae3_handle *handle, u16 queue_id)
+int hclge_reset_tqp(struct hnae3_handle *handle, u16 queue_id)
{
struct hclge_vport *vport = hclge_get_vport(handle);
struct hclge_dev *hdev = vport->back;
int reset_try_times = 0;
int reset_status;
u16 queue_gid;
- int ret;
-
- if (test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state))
- return;
+ int ret = 0;
queue_gid = hclge_covert_handle_qid_global(handle, queue_id);
ret = hclge_tqp_enable(hdev, queue_id, 0, false);
if (ret) {
- dev_warn(&hdev->pdev->dev, "Disable tqp fail, ret = %d\n", ret);
- return;
+ dev_err(&hdev->pdev->dev, "Disable tqp fail, ret = %d\n", ret);
+ return ret;
}
ret = hclge_send_reset_tqp_cmd(hdev, queue_gid, true);
if (ret) {
- dev_warn(&hdev->pdev->dev,
- "Send reset tqp cmd fail, ret = %d\n", ret);
- return;
+ dev_err(&hdev->pdev->dev,
+ "Send reset tqp cmd fail, ret = %d\n", ret);
+ return ret;
}
reset_try_times = 0;
}
if (reset_try_times >= HCLGE_TQP_RESET_TRY_TIMES) {
- dev_warn(&hdev->pdev->dev, "Reset TQP fail\n");
- return;
+ dev_err(&hdev->pdev->dev, "Reset TQP fail\n");
+ return ret;
}
ret = hclge_send_reset_tqp_cmd(hdev, queue_gid, false);
- if (ret) {
- dev_warn(&hdev->pdev->dev,
- "Deassert the soft reset fail, ret = %d\n", ret);
- return;
- }
+ if (ret)
+ dev_err(&hdev->pdev->dev,
+ "Deassert the soft reset fail, ret = %d\n", ret);
+
+ return ret;
}
void hclge_reset_vf_queue(struct hclge_vport *vport, u16 queue_id)
void hclge_rss_indir_init_cfg(struct hclge_dev *hdev);
void hclge_mbx_handler(struct hclge_dev *hdev);
-void hclge_reset_tqp(struct hnae3_handle *handle, u16 queue_id);
+int hclge_reset_tqp(struct hnae3_handle *handle, u16 queue_id);
void hclge_reset_vf_queue(struct hclge_vport *vport, u16 queue_id);
int hclge_cfg_flowctrl(struct hclge_dev *hdev);
int hclge_func_reset_cmd(struct hclge_dev *hdev, int func_id);
/* handle all the mailbox requests in the queue */
while (!hclge_cmd_crq_empty(&hdev->hw)) {
+ if (test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state)) {
+ dev_warn(&hdev->pdev->dev,
+ "command queue needs re-initializing\n");
+ return;
+ }
+
desc = &crq->desc[crq->next_to_use];
req = (struct hclge_mbx_vf_to_pf_cmd *)desc->data;
struct hclge_desc desc;
int ret;
- if (test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state))
+ if (test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state))
return 0;
hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_MDIO_CONFIG, false);
struct hclge_desc desc;
int ret;
- if (test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state))
+ if (test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state))
return 0;
hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_MDIO_CONFIG, true);
*/
static int hclge_bp_setup_hw(struct hclge_dev *hdev, u8 tc)
{
- struct hclge_vport *vport = hdev->vport;
- u32 i, k, qs_bitmap;
- int ret;
+ int i;
for (i = 0; i < HCLGE_BP_GRP_NUM; i++) {
- qs_bitmap = 0;
+ u32 qs_bitmap = 0;
+ int k, ret;
for (k = 0; k < hdev->num_alloc_vport; k++) {
+ struct hclge_vport *vport = &hdev->vport[k];
u16 qs_id = vport->qs_offset + tc;
u8 grp, sub_grp;
HCLGE_BP_SUB_GRP_ID_S);
if (i == grp)
qs_bitmap |= (1 << sub_grp);
-
- vport++;
}
ret = hclge_tm_qs_bp_cfg(hdev, tc, i, qs_bitmap);
return status;
}
-static void hclgevf_set_promisc_mode(struct hnae3_handle *handle,
- bool en_uc_pmc, bool en_mc_pmc)
+static int hclgevf_set_promisc_mode(struct hnae3_handle *handle,
+ bool en_uc_pmc, bool en_mc_pmc)
{
struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
- hclgevf_cmd_set_promisc_mode(hdev, en_uc_pmc, en_mc_pmc);
+ return hclgevf_cmd_set_promisc_mode(hdev, en_uc_pmc, en_mc_pmc);
}
static int hclgevf_tqp_enable(struct hclgevf_dev *hdev, int tqp_id,
1, false, NULL, 0);
}
-static void hclgevf_reset_tqp(struct hnae3_handle *handle, u16 queue_id)
+static int hclgevf_reset_tqp(struct hnae3_handle *handle, u16 queue_id)
{
struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
u8 msg_data[2];
/* disable vf queue before send queue reset msg to PF */
ret = hclgevf_tqp_enable(hdev, queue_id, 0, false);
if (ret)
- return;
+ return ret;
- hclgevf_send_mbx_msg(hdev, HCLGE_MBX_QUEUE_RESET, 0, msg_data,
- 2, true, NULL, 0);
+ return hclgevf_send_mbx_msg(hdev, HCLGE_MBX_QUEUE_RESET, 0, msg_data,
+ 2, true, NULL, 0);
}
static int hclgevf_notify_client(struct hclgevf_dev *hdev,
/* bring down the nic to stop any ongoing TX/RX */
hclgevf_notify_client(hdev, HNAE3_DOWN_CLIENT);
+ rtnl_unlock();
+
/* check if VF could successfully fetch the hardware reset completion
* status from the hardware
*/
ret);
dev_warn(&hdev->pdev->dev, "VF reset failed, disabling VF!\n");
+ rtnl_lock();
hclgevf_notify_client(hdev, HNAE3_UNINIT_CLIENT);
rtnl_unlock();
return ret;
}
+ rtnl_lock();
+
/* now, re-initialize the nic client and ae device*/
ret = hclgevf_reset_stack(hdev);
if (ret)
}
void hinic_task_set_tunnel_l4(struct hinic_sq_task *task,
- enum hinic_l4_offload_type l4_type,
+ enum hinic_l4_tunnel_type l4_type,
u32 tunnel_len)
{
task->pkt_info2 |= HINIC_SQ_TASK_INFO2_SET(l4_type, TUNNEL_L4TYPE) |
u32 network_len);
void hinic_task_set_tunnel_l4(struct hinic_sq_task *task,
- enum hinic_l4_offload_type l4_type,
+ enum hinic_l4_tunnel_type l4_type,
u32 tunnel_len);
void hinic_set_cs_inner_l4(struct hinic_sq_task *task,
#define EMAC_STACR_PHYE 0x00004000
#define EMAC_STACR_STAC_MASK 0x00003000
#define EMAC_STACR_STAC_READ 0x00001000
-#define EMAC_STACR_STAC_WRITE 0x00000800
+#define EMAC_STACR_STAC_WRITE 0x00002000
#define EMAC_STACR_OPBC_MASK 0x00000C00
#define EMAC_STACR_OPBC_50 0x00000000
#define EMAC_STACR_OPBC_66 0x00000400
for (j = 0; j < rx_pool->size; j++) {
if (rx_pool->rx_buff[j].skb) {
- dev_kfree_skb_any(rx_pool->rx_buff[i].skb);
- rx_pool->rx_buff[i].skb = NULL;
+ dev_kfree_skb_any(rx_pool->rx_buff[j].skb);
+ rx_pool->rx_buff[j].skb = NULL;
}
}
return 0;
}
- mutex_lock(&adapter->reset_lock);
-
if (adapter->state != VNIC_CLOSED) {
rc = ibmvnic_login(netdev);
- if (rc) {
- mutex_unlock(&adapter->reset_lock);
+ if (rc)
return rc;
- }
rc = init_resources(adapter);
if (rc) {
netdev_err(netdev, "failed to initialize resources\n");
release_resources(adapter);
- mutex_unlock(&adapter->reset_lock);
return rc;
}
}
rc = __ibmvnic_open(netdev);
netif_carrier_on(netdev);
- mutex_unlock(&adapter->reset_lock);
-
return rc;
}
return 0;
}
- mutex_lock(&adapter->reset_lock);
rc = __ibmvnic_close(netdev);
ibmvnic_cleanup(netdev);
- mutex_unlock(&adapter->reset_lock);
return rc;
}
tx_crq.v1.sge_len = cpu_to_be32(skb->len);
tx_crq.v1.ioba = cpu_to_be64(data_dma_addr);
- if (adapter->vlan_header_insertion) {
+ if (adapter->vlan_header_insertion && skb_vlan_tag_present(skb)) {
tx_crq.v1.flags2 |= IBMVNIC_TX_VLAN_INSERT;
tx_crq.v1.vlan_id = cpu_to_be16(skb->vlan_tci);
}
struct ibmvnic_rwi *rwi, u32 reset_state)
{
u64 old_num_rx_queues, old_num_tx_queues;
+ u64 old_num_rx_slots, old_num_tx_slots;
struct net_device *netdev = adapter->netdev;
int i, rc;
old_num_rx_queues = adapter->req_rx_queues;
old_num_tx_queues = adapter->req_tx_queues;
+ old_num_rx_slots = adapter->req_rx_add_entries_per_subcrq;
+ old_num_tx_slots = adapter->req_tx_entries_per_subcrq;
ibmvnic_cleanup(netdev);
if (rc)
return rc;
} else if (adapter->req_rx_queues != old_num_rx_queues ||
- adapter->req_tx_queues != old_num_tx_queues) {
- adapter->map_id = 1;
+ adapter->req_tx_queues != old_num_tx_queues ||
+ adapter->req_rx_add_entries_per_subcrq !=
+ old_num_rx_slots ||
+ adapter->req_tx_entries_per_subcrq !=
+ old_num_tx_slots) {
release_rx_pools(adapter);
release_tx_pools(adapter);
- rc = init_rx_pools(netdev);
- if (rc)
- return rc;
- rc = init_tx_pools(netdev);
- if (rc)
- return rc;
-
release_napi(adapter);
- rc = init_napi(adapter);
+ release_vpd_data(adapter);
+
+ rc = init_resources(adapter);
if (rc)
return rc;
+
} else {
rc = reset_tx_pools(adapter);
if (rc)
if (adapter->reset_reason != VNIC_RESET_FAILOVER &&
adapter->reset_reason != VNIC_RESET_CHANGE_PARAM)
- netdev_notify_peers(netdev);
+ call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, netdev);
netif_carrier_on(netdev);
adapter->state = VNIC_PROBED;
return 0;
}
- /* netif_set_real_num_xx_queues needs to take rtnl lock here
- * unless wait_for_reset is set, in which case the rtnl lock
- * has already been taken before initializing the reset
- */
- if (!adapter->wait_for_reset) {
- rtnl_lock();
- rc = init_resources(adapter);
- rtnl_unlock();
- } else {
- rc = init_resources(adapter);
- }
+
+ rc = init_resources(adapter);
if (rc)
return rc;
struct ibmvnic_rwi *rwi;
struct ibmvnic_adapter *adapter;
struct net_device *netdev;
+ bool we_lock_rtnl = false;
u32 reset_state;
int rc = 0;
adapter = container_of(work, struct ibmvnic_adapter, ibmvnic_reset);
netdev = adapter->netdev;
- mutex_lock(&adapter->reset_lock);
+ /* netif_set_real_num_xx_queues needs to take rtnl lock here
+ * unless wait_for_reset is set, in which case the rtnl lock
+ * has already been taken before initializing the reset
+ */
+ if (!adapter->wait_for_reset) {
+ rtnl_lock();
+ we_lock_rtnl = true;
+ }
reset_state = adapter->state;
rwi = get_next_rwi(adapter);
if (rc) {
netdev_dbg(adapter->netdev, "Reset failed\n");
free_all_rwi(adapter);
- mutex_unlock(&adapter->reset_lock);
- return;
}
adapter->resetting = false;
- mutex_unlock(&adapter->reset_lock);
+ if (we_lock_rtnl)
+ rtnl_unlock();
}
static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
INIT_WORK(&adapter->ibmvnic_reset, __ibmvnic_reset);
INIT_LIST_HEAD(&adapter->rwi_list);
- mutex_init(&adapter->reset_lock);
mutex_init(&adapter->rwi_lock);
adapter->resetting = false;
struct ibmvnic_adapter *adapter = netdev_priv(netdev);
adapter->state = VNIC_REMOVING;
- unregister_netdev(netdev);
- mutex_lock(&adapter->reset_lock);
+ rtnl_lock();
+ unregister_netdevice(netdev);
release_resources(adapter);
release_sub_crqs(adapter, 1);
adapter->state = VNIC_REMOVED;
- mutex_unlock(&adapter->reset_lock);
+ rtnl_unlock();
device_remove_file(&dev->dev, &dev_attr_failover);
free_netdev(netdev);
dev_set_drvdata(&dev->dev, NULL);
struct tasklet_struct tasklet;
enum vnic_state state;
enum ibmvnic_reset_reason reset_reason;
- struct mutex reset_lock, rwi_lock;
+ struct mutex rwi_lock;
struct list_head rwi_list;
struct work_struct ibmvnic_reset;
bool resetting;
If unsure, say N.
+config IXGBE_IPSEC
+ bool "IPSec XFRM cryptography-offload acceleration"
+ depends on IXGBE
+ depends on XFRM_OFFLOAD
+ default y
+ select XFRM_ALGO
+ ---help---
+ Enable support for IPSec offload in ixgbe.ko
+
config IXGBEVF
tristate "Intel(R) 10GbE PCI Express Virtual Function Ethernet support"
depends on PCI_MSI
will be called ixgbevf. MSI-X interrupt support is required
for this driver to work correctly.
+config IXGBEVF_IPSEC
+ bool "IPSec XFRM cryptography-offload acceleration"
+ depends on IXGBEVF
+ depends on XFRM_OFFLOAD
+ default y
+ select XFRM_ALGO
+ ---help---
+ Enable support for IPSec offload in ixgbevf.ko
+
config I40E
tristate "Intel(R) Ethernet Controller XL710 Family support"
imply PTP_1588_CLOCK
}
/* guarantee we have free space in the SM mailbox */
- if (!hw->mbx.ops.tx_ready(&hw->mbx, FM10K_VFMBX_MSG_MTU)) {
+ if (hw->mbx.state == FM10K_STATE_OPEN &&
+ !hw->mbx.ops.tx_ready(&hw->mbx, FM10K_VFMBX_MSG_MTU)) {
/* keep track of how many times this occurs */
interface->hw_sm_mbx_full++;
}
}
+static void fm10k_mask_aer_comp_abort(struct pci_dev *pdev)
+{
+ u32 err_mask;
+ int pos;
+
+ pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
+ if (!pos)
+ return;
+
+ /* Mask the completion abort bit in the ERR_UNCOR_MASK register,
+ * preventing the device from reporting these errors to the upstream
+ * PCIe root device. This avoids bringing down platforms which upgrade
+ * non-fatal completer aborts into machine check exceptions. Completer
+ * aborts can occur whenever a VF reads a queue it doesn't own.
+ */
+ pci_read_config_dword(pdev, pos + PCI_ERR_UNCOR_MASK, &err_mask);
+ err_mask |= PCI_ERR_UNC_COMP_ABORT;
+ pci_write_config_dword(pdev, pos + PCI_ERR_UNCOR_MASK, err_mask);
+
+ mmiowb();
+}
+
int fm10k_iov_resume(struct pci_dev *pdev)
{
struct fm10k_intfc *interface = pci_get_drvdata(pdev);
if (!iov_data)
return -ENOMEM;
+ /* Lower severity of completer abort error reporting as
+ * the VFs can trigger this any time they read a queue
+ * that they don't own.
+ */
+ fm10k_mask_aer_comp_abort(pdev);
+
/* allocate hardware resources for the VFs */
hw->iov.ops.assign_resources(hw, num_vfs, num_vfs);
fm10k_iov_free_data(pdev);
}
-static void fm10k_disable_aer_comp_abort(struct pci_dev *pdev)
-{
- u32 err_sev;
- int pos;
-
- pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
- if (!pos)
- return;
-
- pci_read_config_dword(pdev, pos + PCI_ERR_UNCOR_SEVER, &err_sev);
- err_sev &= ~PCI_ERR_UNC_COMP_ABORT;
- pci_write_config_dword(pdev, pos + PCI_ERR_UNCOR_SEVER, err_sev);
-}
-
int fm10k_iov_configure(struct pci_dev *pdev, int num_vfs)
{
int current_vfs = pci_num_vf(pdev);
/* allocate VFs if not already allocated */
if (num_vfs && num_vfs != current_vfs) {
- /* Disable completer abort error reporting as
- * the VFs can trigger this any time they read a queue
- * that they don't own.
- */
- fm10k_disable_aer_comp_abort(pdev);
-
err = pci_enable_sriov(pdev, num_vfs);
if (err) {
dev_err(&pdev->dev,
#include "fm10k.h"
-#define DRV_VERSION "0.23.4-k"
+#define DRV_VERSION "0.26.1-k"
#define DRV_SUMMARY "Intel(R) Ethernet Switch Host Interface Driver"
const char fm10k_driver_version[] = DRV_VERSION;
char fm10k_driver_name[] = "fm10k";
*/
static const struct pci_device_id fm10k_pci_tbl[] = {
{ PCI_VDEVICE(INTEL, FM10K_DEV_ID_PF), fm10k_device_pf },
+ { PCI_VDEVICE(INTEL, FM10K_DEV_ID_SDI_FM10420_QDA2), fm10k_device_pf },
+ { PCI_VDEVICE(INTEL, FM10K_DEV_ID_SDI_FM10420_DA2), fm10k_device_pf },
{ PCI_VDEVICE(INTEL, FM10K_DEV_ID_VF), fm10k_device_vf },
/* required last entry */
{ 0, }
#define FM10K_DEV_ID_PF 0x15A4
#define FM10K_DEV_ID_VF 0x15A5
+#define FM10K_DEV_ID_SDI_FM10420_QDA2 0x15D0
+#define FM10K_DEV_ID_SDI_FM10420_DA2 0x15D5
#define FM10K_MAX_QUEUES 256
#define FM10K_MAX_QUEUES_PF 128
}
vsi->flags |= I40E_VSI_FLAG_FILTER_CHANGED;
- set_bit(__I40E_MACVLAN_SYNC_PENDING, vsi->state);
+ set_bit(__I40E_MACVLAN_SYNC_PENDING, vsi->back->state);
}
/**
NETIF_F_GSO_GRE |
NETIF_F_GSO_GRE_CSUM |
NETIF_F_GSO_PARTIAL |
+ NETIF_F_GSO_IPXIP4 |
+ NETIF_F_GSO_IPXIP6 |
NETIF_F_GSO_UDP_TUNNEL |
NETIF_F_GSO_UDP_TUNNEL_CSUM |
NETIF_F_SCTP_CRC |
/* record features VLANs can make use of */
netdev->vlan_features |= hw_enc_features | NETIF_F_TSO_MANGLEID;
- if (!(pf->flags & I40E_FLAG_MFP_ENABLED))
- netdev->hw_features |= NETIF_F_NTUPLE | NETIF_F_HW_TC;
-
hw_features = hw_enc_features |
NETIF_F_HW_VLAN_CTAG_TX |
NETIF_F_HW_VLAN_CTAG_RX;
+ if (!(pf->flags & I40E_FLAG_MFP_ENABLED))
+ hw_features |= NETIF_F_NTUPLE | NETIF_F_HW_TC;
+
netdev->hw_features |= hw_features;
netdev->features |= hw_features | NETIF_F_HW_VLAN_CTAG_FILTER;
dev_err(&pf->pdev->dev, "Invalid message from VF %d, opcode %d, len %d\n",
local_vf_id, v_opcode, msglen);
switch (ret) {
- case VIRTCHNL_ERR_PARAM:
+ case VIRTCHNL_STATUS_ERR_PARAM:
return -EPERM;
default:
return -EINVAL;
}
/**
- * i40e_add_xsk_umem - Store an UMEM for a certain ring/qid
+ * i40e_add_xsk_umem - Store a UMEM for a certain ring/qid
* @vsi: Current VSI
* @umem: UMEM to store
* @qid: Ring/qid to associate with the UMEM
}
/**
- * i40e_remove_xsk_umem - Remove an UMEM for a certain ring/qid
+ * i40e_remove_xsk_umem - Remove a UMEM for a certain ring/qid
* @vsi: Current VSI
* @qid: Ring/qid associated with the UMEM
**/
}
/**
- * i40e_xsk_umem_enable - Enable/associate an UMEM to a certain ring/qid
+ * i40e_xsk_umem_enable - Enable/associate a UMEM to a certain ring/qid
* @vsi: Current VSI
* @umem: UMEM
* @qid: Rx ring to associate UMEM to
}
/**
- * i40e_xsk_umem_disable - Diassociate an UMEM from a certain ring/qid
+ * i40e_xsk_umem_disable - Disassociate a UMEM from a certain ring/qid
* @vsi: Current VSI
* @qid: Rx ring to associate UMEM to
*
}
/**
- * i40e_xsk_umem_query - Queries a certain ring/qid for its UMEM
+ * i40e_xsk_umem_setup - Enable/disassociate a UMEM to/from a ring/qid
* @vsi: Current VSI
* @umem: UMEM to enable/associate to a ring, or NULL to disable
* @qid: Rx ring to (dis)associate UMEM (from)to
*
- * This function enables or disables an UMEM to a certain ring.
+ * This function enables or disables a UMEM to a certain ring.
*
* Returns 0 on success, <0 on failure
**/
* @rx_ring: Rx ring
* @xdp: xdp_buff used as input to the XDP program
*
- * This function enables or disables an UMEM to a certain ring.
+ * This function enables or disables a UMEM to a certain ring.
*
* Returns any of I40E_XDP_{PASS, CONSUMED, TX, REDIR}
**/
#define ICE_MIN_INTR_PER_VF (ICE_MIN_QS_PER_VF + 1)
#define ICE_DFLT_INTR_PER_VF (ICE_DFLT_QS_PER_VF + 1)
+#define ICE_MAX_RESET_WAIT 20
+
#define ICE_VSIQF_HKEY_ARRAY_SIZE ((VSIQF_HKEY_MAX_INDEX + 1) * 4)
#define ICE_DFLT_NETIF_M (NETIF_MSG_DRV | NETIF_MSG_PROBE | NETIF_MSG_LINK)
u64 tx_linearize;
DECLARE_BITMAP(state, __ICE_STATE_NBITS);
DECLARE_BITMAP(flags, ICE_VSI_FLAG_NBITS);
- unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
unsigned int current_netdev_flags;
u32 tx_restart;
u32 tx_busy;
int ice_get_rss(struct ice_vsi *vsi, u8 *seed, u8 *lut, u16 lut_size);
void ice_fill_rss_lut(u8 *lut, u16 rss_table_size, u16 rss_size);
void ice_print_link_msg(struct ice_vsi *vsi, bool isup);
+void ice_napi_del(struct ice_vsi *vsi);
#endif /* _ICE_H_ */
/* Attempt to disable FW logging before shutting down control queues */
ice_cfg_fw_log(hw, false);
ice_shutdown_all_ctrlq(hw);
+
+ /* Clear VSI contexts if not already cleared */
+ ice_clear_all_vsi_ctx(hw);
}
/**
}
if (!test_bit(__ICE_DOWN, pf->state)) {
- /* Give it a little more time to try to come back */
+ /* Give it a little more time to try to come back. If still
+ * down, restart autoneg link or reinitialize the interface.
+ */
msleep(75);
if (!test_bit(__ICE_DOWN, pf->state))
return ice_nway_reset(netdev);
+
+ ice_down(vsi);
+ ice_up(vsi);
}
return err;
#define GLNVM_ULD 0x000B6008
#define GLNVM_ULD_CORER_DONE_M BIT(3)
#define GLNVM_ULD_GLOBR_DONE_M BIT(4)
+#define GLPCI_CNF2 0x000BE004
+#define GLPCI_CNF2_CACHELINE_SIZE_M BIT(1)
#define PF_FUNC_RID 0x0009E880
#define PF_FUNC_RID_FUNC_NUM_S 0
#define PF_FUNC_RID_FUNC_NUM_M ICE_M(0x7, 0)
status = ice_update_vsi(&vsi->back->hw, vsi->idx, ctxt, NULL);
if (status) {
netdev_err(vsi->netdev, "%sabling VLAN pruning on VSI handle: %d, VSI HW ID: %d failed, err = %d, aq_err = %d\n",
- ena ? "Ena" : "Dis", vsi->idx, vsi->vsi_num, status,
+ ena ? "En" : "Dis", vsi->idx, vsi->vsi_num, status,
vsi->back->hw.adminq.sq_last_status);
goto err_out;
}
* on this wq
*/
if (vsi->netdev && !ice_is_reset_in_progress(pf->state)) {
+ ice_napi_del(vsi);
unregister_netdev(vsi->netdev);
free_netdev(vsi->netdev);
vsi->netdev = NULL;
* ice_napi_del - Remove NAPI handler for the VSI
* @vsi: VSI for which NAPI handler is to be removed
*/
-static void ice_napi_del(struct ice_vsi *vsi)
+void ice_napi_del(struct ice_vsi *vsi)
{
int v_idx;
{
struct ice_netdev_priv *np = netdev_priv(netdev);
struct ice_vsi *vsi = np->vsi;
- int ret;
if (vid >= VLAN_N_VID) {
netdev_err(netdev, "VLAN id requested %d is out of range %d\n",
/* Enable VLAN pruning when VLAN 0 is added */
if (unlikely(!vid)) {
- ret = ice_cfg_vlan_pruning(vsi, true);
+ int ret = ice_cfg_vlan_pruning(vsi, true);
+
if (ret)
return ret;
}
* needed to continue allowing all untagged packets since VLAN prune
* list is applied to all packets by the switch
*/
- ret = ice_vsi_add_vlan(vsi, vid);
-
- if (!ret)
- set_bit(vid, vsi->active_vlans);
-
- return ret;
+ return ice_vsi_add_vlan(vsi, vid);
}
/**
if (status)
return status;
- clear_bit(vid, vsi->active_vlans);
-
/* Disable VLAN pruning when VLAN 0 is removed */
if (unlikely(!vid))
status = ice_cfg_vlan_pruning(vsi, false);
return 0;
}
+/**
+ * ice_verify_cacheline_size - verify driver's assumption of 64 Byte cache lines
+ * @pf: pointer to the PF structure
+ *
+ * There is no error returned here because the driver should be able to handle
+ * 128 Byte cache lines, so we only print a warning in case issues are seen,
+ * specifically with Tx.
+ */
+static void ice_verify_cacheline_size(struct ice_pf *pf)
+{
+ if (rd32(&pf->hw, GLPCI_CNF2) & GLPCI_CNF2_CACHELINE_SIZE_M)
+ dev_warn(&pf->pdev->dev,
+ "%d Byte cache line assumption is invalid, driver may have Tx timeouts!\n",
+ ICE_CACHE_LINE_BYTES);
+}
+
/**
* ice_probe - Device initialization routine
* @pdev: PCI device information struct
/* since everything is good, start the service timer */
mod_timer(&pf->serv_tmr, round_jiffies(jiffies + pf->serv_tmr_period));
+ ice_verify_cacheline_size(pf);
+
return 0;
err_alloc_sw_unroll:
if (!pf)
return;
+ for (i = 0; i < ICE_MAX_RESET_WAIT; i++) {
+ if (!ice_is_reset_in_progress(pf->state))
+ break;
+ msleep(100);
+ }
+
set_bit(__ICE_DOWN, pf->state);
ice_service_task_stop(pf);
return ret;
}
-/**
- * ice_restore_vlan - Reinstate VLANs when vsi/netdev comes back up
- * @vsi: the VSI being brought back up
- */
-static int ice_restore_vlan(struct ice_vsi *vsi)
-{
- int err;
- u16 vid;
-
- if (!vsi->netdev)
- return -EINVAL;
-
- err = ice_vsi_vlan_setup(vsi);
- if (err)
- return err;
-
- for_each_set_bit(vid, vsi->active_vlans, VLAN_N_VID) {
- err = ice_vlan_rx_add_vid(vsi->netdev, htons(ETH_P_8021Q), vid);
- if (err)
- break;
- }
-
- return err;
-}
-
/**
* ice_vsi_cfg - Setup the VSI
* @vsi: the VSI being configured
if (vsi->netdev) {
ice_set_rx_mode(vsi->netdev);
- err = ice_restore_vlan(vsi);
+
+ err = ice_vsi_vlan_setup(vsi);
+
if (err)
return err;
}
struct device *dev = &pf->pdev->dev;
struct ice_hw *hw = &pf->hw;
enum ice_status ret;
- int err;
+ int err, i;
if (test_bit(__ICE_DOWN, pf->state))
goto clear_recovery;
}
ice_reset_all_vfs(pf, true);
+
+ for (i = 0; i < pf->num_alloc_vsi; i++) {
+ bool link_up;
+
+ if (!pf->vsi[i] || pf->vsi[i]->type != ICE_VSI_PF)
+ continue;
+ ice_get_link_status(pf->vsi[i]->port_info, &link_up);
+ if (link_up) {
+ netif_carrier_on(pf->vsi[i]->netdev);
+ netif_tx_wake_all_queues(pf->vsi[i]->netdev);
+ } else {
+ netif_carrier_off(pf->vsi[i]->netdev);
+ netif_tx_stop_all_queues(pf->vsi[i]->netdev);
+ }
+ }
+
/* if we get here, reset flow is successful */
clear_bit(__ICE_RESET_FAILED, pf->state);
return;
}
}
+/**
+ * ice_clear_all_vsi_ctx - clear all the VSI context entries
+ * @hw: pointer to the hw struct
+ */
+void ice_clear_all_vsi_ctx(struct ice_hw *hw)
+{
+ u16 i;
+
+ for (i = 0; i < ICE_MAX_VSI; i++)
+ ice_clear_vsi_ctx(hw, i);
+}
+
/**
* ice_add_vsi - add VSI context to the hardware and VSI handle list
* @hw: pointer to the hw struct
struct ice_sq_cd *cd);
bool ice_is_vsi_valid(struct ice_hw *hw, u16 vsi_handle);
struct ice_vsi_ctx *ice_get_vsi_ctx(struct ice_hw *hw, u16 vsi_handle);
+void ice_clear_all_vsi_ctx(struct ice_hw *hw);
+/* Switch config */
enum ice_status ice_get_initial_sw_cfg(struct ice_hw *hw);
/* Switch/bridge related commands */
/* update gso_segs and bytecount */
first->gso_segs = skb_shinfo(skb)->gso_segs;
- first->bytecount = (first->gso_segs - 1) * off->header_len;
+ first->bytecount += (first->gso_segs - 1) * off->header_len;
cd_tso_len = skb->len - off->header_len;
cd_mss = skb_shinfo(skb)->gso_size;
* magnitude greater than our largest possible GSO size.
*
* This would then be implemented as:
- * return (((size >> 12) * 85) >> 8) + 1;
+ * return (((size >> 12) * 85) >> 8) + ICE_DESCS_FOR_SKB_DATA_PTR;
*
* Since multiplication and division are commutative, we can reorder
* operations into:
- * return ((size * 85) >> 20) + 1;
+ * return ((size * 85) >> 20) + ICE_DESCS_FOR_SKB_DATA_PTR;
*/
static unsigned int ice_txd_use_count(unsigned int size)
{
- return ((size * 85) >> 20) + 1;
+ return ((size * 85) >> 20) + ICE_DESCS_FOR_SKB_DATA_PTR;
}
/**
* + 1 desc for context descriptor,
* otherwise try next time
*/
- if (ice_maybe_stop_tx(tx_ring, count + 4 + 1)) {
+ if (ice_maybe_stop_tx(tx_ring, count + ICE_DESCS_PER_CACHE_LINE +
+ ICE_DESCS_FOR_CTX_DESC)) {
tx_ring->tx_stats.tx_busy++;
return NETDEV_TX_BUSY;
}
#define ICE_RX_BUF_WRITE 16 /* Must be power of 2 */
#define ICE_MAX_TXQ_PER_TXQG 128
-/* Tx Descriptors needed, worst case */
-#define DESC_NEEDED (MAX_SKB_FRAGS + 4)
+/* We are assuming that the cache line is always 64 Bytes here for ice.
+ * In order to make sure that is a correct assumption there is a check in probe
+ * to print a warning if the read from GLPCI_CNF2 tells us that the cache line
+ * size is 128 bytes. We do it this way because we do not want to read the
+ * GLPCI_CNF2 register or a variable containing the value on every pass through
+ * the Tx path.
+ */
+#define ICE_CACHE_LINE_BYTES 64
+#define ICE_DESCS_PER_CACHE_LINE (ICE_CACHE_LINE_BYTES / \
+ sizeof(struct ice_tx_desc))
+#define ICE_DESCS_FOR_CTX_DESC 1
+#define ICE_DESCS_FOR_SKB_DATA_PTR 1
+/* Tx descriptors needed, worst case */
+#define DESC_NEEDED (MAX_SKB_FRAGS + ICE_DESCS_FOR_CTX_DESC + \
+ ICE_DESCS_PER_CACHE_LINE + ICE_DESCS_FOR_SKB_DATA_PTR)
#define ICE_DESC_UNUSED(R) \
((((R)->next_to_clean > (R)->next_to_use) ? 0 : (R)->count) + \
(R)->next_to_clean - (R)->next_to_use - 1)
u64 phy_type_low;
u16 max_frame_size;
u16 link_speed;
+ u16 req_speeds;
u8 lse_ena; /* Link Status Event notification */
u8 link_info;
u8 an_info;
u8 ext_info;
u8 pacing;
- u8 req_speeds;
/* Refer to #define from module_type[ICE_MODULE_TYPE_TOTAL_BYTE] of
* ice_aqc_get_phy_caps structure
*/
struct ice_vsi_ctx ctxt = { 0 };
enum ice_status status;
- ctxt.info.vlan_flags = ICE_AQ_VSI_VLAN_MODE_TAGGED |
+ ctxt.info.vlan_flags = ICE_AQ_VSI_VLAN_MODE_UNTAGGED |
ICE_AQ_VSI_PVLAN_INSERT_PVID |
ICE_AQ_VSI_VLAN_EMOD_STR;
ctxt.info.pvid = cpu_to_le16(vid);
if (!ice_vsi_add_vlan(vsi, vid)) {
vf->num_vlan++;
- set_bit(vid, vsi->active_vlans);
/* Enable VLAN pruning when VLAN 0 is added */
if (unlikely(!vid))
*/
if (!ice_vsi_kill_vlan(vsi, vid)) {
vf->num_vlan--;
- clear_bit(vid, vsi->active_vlans);
/* Disable VLAN pruning when removing VLAN 0 */
if (unlikely(!vid))
nvm_word = E1000_INVM_DEFAULT_AL;
tmp_nvm = nvm_word | E1000_INVM_PLL_WO_VAL;
igb_write_phy_reg_82580(hw, I347AT4_PAGE_SELECT, E1000_PHY_PLL_FREQ_PAGE);
+ phy_word = E1000_PHY_PLL_UNCONF;
for (i = 0; i < E1000_MAX_PLL_TRIES; i++) {
/* check current state directly from internal PHY */
igb_read_phy_reg_82580(hw, E1000_PHY_PLL_FREQ_REG, &phy_word);
*
* The 40 bit 82580 SYSTIM overflows every
* 2^40 * 10^-9 / 60 = 18.3 minutes.
+ *
+ * SYSTIM is converted to real time using a timecounter. As
+ * timecounter_cyc2time() allows old timestamps, the timecounter needs
+ * to be updated at least once per half of the SYSTIM interval.
+ * Scheduling of delayed work is not very accurate, and also the NIC
+ * clock can be adjusted to run up to 6% faster and the system clock
+ * up to 10% slower, so we aim for 6 minutes to be sure the actual
+ * interval in the NIC time is shorter than 9.16 minutes.
*/
-#define IGB_SYSTIM_OVERFLOW_PERIOD (HZ * 60 * 9)
+#define IGB_SYSTIM_OVERFLOW_PERIOD (HZ * 60 * 6)
#define IGB_PTP_TX_TIMEOUT (HZ * 15)
#define INCPERIOD_82576 BIT(E1000_TIMINCA_16NS_SHIFT)
#define INCVALUE_82576_MASK GENMASK(E1000_TIMINCA_16NS_SHIFT - 1, 0)
ixgbe-$(CONFIG_IXGBE_HWMON) += ixgbe_sysfs.o
ixgbe-$(CONFIG_DEBUG_FS) += ixgbe_debugfs.o
ixgbe-$(CONFIG_FCOE:m=y) += ixgbe_fcoe.o
-ixgbe-$(CONFIG_XFRM_OFFLOAD) += ixgbe_ipsec.o
+ixgbe-$(CONFIG_IXGBE_IPSEC) += ixgbe_ipsec.o
#define IXGBE_RSS_KEY_SIZE 40 /* size of RSS Hash Key in bytes */
u32 *rss_key;
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBE_IPSEC
struct ixgbe_ipsec *ipsec;
-#endif /* CONFIG_XFRM_OFFLOAD */
+#endif /* CONFIG_IXGBE_IPSEC */
/* AF_XDP zero-copy */
struct xdp_umem **xsk_umems;
void ixgbe_store_reta(struct ixgbe_adapter *adapter);
s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBE_IPSEC
void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
void ixgbe_stop_ipsec_offload(struct ixgbe_adapter *adapter);
void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
u32 *mbuf, u32 vf) { return -EACCES; }
static inline int ixgbe_ipsec_vf_del_sa(struct ixgbe_adapter *adapter,
u32 *mbuf, u32 vf) { return -EACCES; }
-#endif /* CONFIG_XFRM_OFFLOAD */
+#endif /* CONFIG_IXGBE_IPSEC */
#endif /* _IXGBE_H_ */
#endif /* IXGBE_FCOE */
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBE_IPSEC
if (skb->sp && !ixgbe_ipsec_tx(tx_ring, first, &ipsec_tx))
goto out_drop;
#endif
* the TSO, so it's the exception.
*/
if (skb->encapsulation && !(features & NETIF_F_TSO_MANGLEID)) {
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBE_IPSEC
if (!skb->sp)
#endif
features &= ~NETIF_F_TSO;
if (hw->mac.type >= ixgbe_mac_82599EB)
netdev->features |= NETIF_F_SCTP_CRC;
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBE_IPSEC
#define IXGBE_ESP_FEATURES (NETIF_F_HW_ESP | \
NETIF_F_HW_ESP_TX_CSUM | \
NETIF_F_GSO_ESP)
ixgbe_set_vmvir(adapter, vfinfo->pf_vlan,
adapter->default_up, vf);
- if (vfinfo->spoofchk_enabled)
+ if (vfinfo->spoofchk_enabled) {
hw->mac.ops.set_vlan_anti_spoofing(hw, true, vf);
+ hw->mac.ops.set_mac_anti_spoofing(hw, true, vf);
+ }
}
/* reset multicast table array for vf */
*autoneg = false;
if (hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core0 ||
- hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core1) {
+ hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core1 ||
+ hw->phy.sfp_type == ixgbe_sfp_type_1g_lx_core0 ||
+ hw->phy.sfp_type == ixgbe_sfp_type_1g_lx_core1) {
*speed = IXGBE_LINK_SPEED_1GB_FULL;
return 0;
}
mbx.o \
ethtool.o \
ixgbevf_main.o
-ixgbevf-$(CONFIG_XFRM_OFFLOAD) += ipsec.o
+ixgbevf-$(CONFIG_IXGBEVF_IPSEC) += ipsec.o
extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector);
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBEVF_IPSEC
void ixgbevf_init_ipsec_offload(struct ixgbevf_adapter *adapter);
void ixgbevf_stop_ipsec_offload(struct ixgbevf_adapter *adapter);
void ixgbevf_ipsec_restore(struct ixgbevf_adapter *adapter);
struct ixgbevf_tx_buffer *first,
struct ixgbevf_ipsec_tx_data *itd)
{ return 0; }
-#endif /* CONFIG_XFRM_OFFLOAD */
+#endif /* CONFIG_IXGBEVF_IPSEC */
void ixgbe_napi_add_all(struct ixgbevf_adapter *adapter);
void ixgbe_napi_del_all(struct ixgbevf_adapter *adapter);
first->tx_flags = tx_flags;
first->protocol = vlan_get_protocol(skb);
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBEVF_IPSEC
if (skb->sp && !ixgbevf_ipsec_tx(tx_ring, first, &ipsec_tx))
goto out_drop;
#endif
err = register_netdev(net_dev);
if (err)
goto err_unprepare_clk;
- return err;
+
+ return 0;
err_unprepare_clk:
clk_disable_unprepare(priv->clk);
err_uninit_dma:
xrx200_hw_cleanup(priv);
- return 0;
+ return err;
}
static int xrx200_remove(struct platform_device *pdev)
#if defined(__LITTLE_ENDIAN)
struct mvneta_tx_desc {
u32 command; /* Options used by HW for packet transmitting.*/
- u16 reserverd1; /* csum_l4 (for future use) */
+ u16 reserved1; /* csum_l4 (for future use) */
u16 data_size; /* Data size of transmitted packet in bytes */
u32 buf_phys_addr; /* Physical addr of transmitted buffer */
u32 reserved2; /* hw_cmd - (for future use, PMT) */
#else
struct mvneta_tx_desc {
u16 data_size; /* Data size of transmitted packet in bytes */
- u16 reserverd1; /* csum_l4 (for future use) */
+ u16 reserved1; /* csum_l4 (for future use) */
u32 command; /* Options used by HW for packet transmitting.*/
u32 reserved2; /* hw_cmd - (for future use, PMT) */
u32 buf_phys_addr; /* Physical addr of transmitted buffer */
if (state->interface != PHY_INTERFACE_MODE_NA &&
state->interface != PHY_INTERFACE_MODE_QSGMII &&
state->interface != PHY_INTERFACE_MODE_SGMII &&
- state->interface != PHY_INTERFACE_MODE_2500BASEX &&
!phy_interface_mode_is_8023z(state->interface) &&
!phy_interface_mode_is_rgmii(state->interface)) {
bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS);
/* Asymmetric pause is unsupported */
phylink_set(mask, Pause);
- /* We cannot use 1Gbps when using the 2.5G interface. */
- if (state->interface == PHY_INTERFACE_MODE_2500BASEX) {
- phylink_set(mask, 2500baseT_Full);
- phylink_set(mask, 2500baseX_Full);
- } else {
- phylink_set(mask, 1000baseT_Full);
- phylink_set(mask, 1000baseX_Full);
- }
+ /* Half-duplex at speeds higher than 100Mbit is unsupported */
+ phylink_set(mask, 1000baseT_Full);
+ phylink_set(mask, 1000baseX_Full);
if (!phy_interface_mode_is_8023z(state->interface)) {
/* 10M and 100M are only supported in non-802.3z mode */
int nrxqs;
u32 pending_cause_rx;
struct mvpp2_port *port;
+ struct cpumask *mask;
};
struct mvpp2_port {
for (i = 0; i < port->nqvecs; i++) {
struct mvpp2_queue_vector *qv = port->qvecs + i;
- if (qv->type == MVPP2_QUEUE_VECTOR_PRIVATE)
+ if (qv->type == MVPP2_QUEUE_VECTOR_PRIVATE) {
+ qv->mask = kzalloc(cpumask_size(), GFP_KERNEL);
+ if (!qv->mask) {
+ err = -ENOMEM;
+ goto err;
+ }
+
irq_set_status_flags(qv->irq, IRQ_NO_BALANCING);
+ }
err = request_irq(qv->irq, mvpp2_isr, 0, port->dev->name, qv);
if (err)
goto err;
if (qv->type == MVPP2_QUEUE_VECTOR_PRIVATE) {
- unsigned long mask = 0;
unsigned int cpu;
for_each_present_cpu(cpu) {
if (mvpp2_cpu_to_thread(port->priv, cpu) ==
qv->sw_thread_id)
- mask |= BIT(cpu);
+ cpumask_set_cpu(cpu, qv->mask);
}
- irq_set_affinity_hint(qv->irq, to_cpumask(&mask));
+ irq_set_affinity_hint(qv->irq, qv->mask);
}
}
struct mvpp2_queue_vector *qv = port->qvecs + i;
irq_set_affinity_hint(qv->irq, NULL);
+ kfree(qv->mask);
+ qv->mask = NULL;
free_irq(qv->irq, qv);
}
struct mvpp2_queue_vector *qv = port->qvecs + i;
irq_set_affinity_hint(qv->irq, NULL);
+ kfree(qv->mask);
+ qv->mask = NULL;
irq_clear_status_flags(qv->irq, IRQ_NO_BALANCING);
free_irq(qv->irq, qv);
}
unsigned long *supported,
struct phylink_link_state *state)
{
+ struct mvpp2_port *port = netdev_priv(dev);
__ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
+ /* Invalid combinations */
+ switch (state->interface) {
+ case PHY_INTERFACE_MODE_10GKR:
+ case PHY_INTERFACE_MODE_XAUI:
+ if (port->gop_id != 0)
+ goto empty_set;
+ break;
+ case PHY_INTERFACE_MODE_RGMII:
+ case PHY_INTERFACE_MODE_RGMII_ID:
+ case PHY_INTERFACE_MODE_RGMII_RXID:
+ case PHY_INTERFACE_MODE_RGMII_TXID:
+ if (port->gop_id == 0)
+ goto empty_set;
+ break;
+ default:
+ break;
+ }
+
phylink_set(mask, Autoneg);
phylink_set_port_modes(mask);
phylink_set(mask, Pause);
switch (state->interface) {
case PHY_INTERFACE_MODE_10GKR:
+ case PHY_INTERFACE_MODE_XAUI:
+ case PHY_INTERFACE_MODE_NA:
phylink_set(mask, 10000baseCR_Full);
phylink_set(mask, 10000baseSR_Full);
phylink_set(mask, 10000baseLR_Full);
phylink_set(mask, 10000baseER_Full);
phylink_set(mask, 10000baseKR_Full);
/* Fall-through */
- default:
+ case PHY_INTERFACE_MODE_RGMII:
+ case PHY_INTERFACE_MODE_RGMII_ID:
+ case PHY_INTERFACE_MODE_RGMII_RXID:
+ case PHY_INTERFACE_MODE_RGMII_TXID:
+ case PHY_INTERFACE_MODE_SGMII:
phylink_set(mask, 10baseT_Half);
phylink_set(mask, 10baseT_Full);
phylink_set(mask, 100baseT_Half);
phylink_set(mask, 1000baseT_Full);
phylink_set(mask, 1000baseX_Full);
phylink_set(mask, 2500baseX_Full);
+ break;
+ default:
+ goto empty_set;
}
bitmap_and(supported, supported, mask, __ETHTOOL_LINK_MODE_MASK_NBITS);
bitmap_and(state->advertising, state->advertising, mask,
__ETHTOOL_LINK_MODE_MASK_NBITS);
+ return;
+
+empty_set:
+ bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS);
}
static void mvpp22_xlg_link_state(struct mvpp2_port *port,
config MLX4_EN
tristate "Mellanox Technologies 1/10/40Gbit Ethernet support"
depends on MAY_USE_DEVLINK
- depends on PCI
+ depends on PCI && NETDEVICES && ETHERNET && INET
select MLX4_CORE
imply PTP_1588_CLOCK
---help---
static u32 __mlx4_alloc_from_zone(struct mlx4_zone_entry *zone, int count,
int align, u32 skip_mask, u32 *puid)
{
- u32 uid;
+ u32 uid = 0;
u32 res;
struct mlx4_zone_allocator *zone_alloc = zone->allocator;
struct mlx4_zone_entry *curr_node;
tx_pause = !!(pause->tx_pause);
rx_pause = !!(pause->rx_pause);
- rx_ppp = priv->prof->rx_ppp && !(tx_pause || rx_pause);
- tx_ppp = priv->prof->tx_ppp && !(tx_pause || rx_pause);
+ rx_ppp = (tx_pause || rx_pause) ? 0 : priv->prof->rx_ppp;
+ tx_ppp = (tx_pause || rx_pause) ? 0 : priv->prof->tx_ppp;
err = mlx4_SET_PORT_general(mdev->dev, priv->port,
priv->rx_skb_size + ETH_FCS_LEN,
dev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM;
}
- /* MTU range: 46 - hw-specific max */
- dev->min_mtu = MLX4_EN_MIN_MTU;
+ /* MTU range: 68 - hw-specific max */
+ dev->min_mtu = ETH_MIN_MTU;
dev->max_mtu = priv->max_mtu;
mdev->pndev[port] = dev;
#include <linux/vmalloc.h>
#include <linux/irq.h>
+#include <net/ip.h>
#if IS_ENABLED(CONFIG_IPV6)
#include <net/ip6_checksum.h>
#endif
ring->packets++;
}
ring->bytes += tx_info->nr_bytes;
- netdev_tx_sent_queue(ring->tx_queue, tx_info->nr_bytes);
AVG_PERF_COUNTER(priv->pstats.tx_pktsz_avg, skb->len);
if (tx_info->inl)
netif_tx_stop_queue(ring->tx_queue);
ring->queue_stopped++;
}
- send_doorbell = !skb->xmit_more || netif_xmit_stopped(ring->tx_queue);
+
+ send_doorbell = __netdev_tx_sent_queue(ring->tx_queue,
+ tx_info->nr_bytes,
+ skb->xmit_more);
real_size = (real_size / 16) & 0x3f;
struct resource_allocator {
spinlock_t alloc_lock; /* protect quotas */
union {
- int res_reserved;
- int res_port_rsvd[MLX4_MAX_PORTS];
+ unsigned int res_reserved;
+ unsigned int res_port_rsvd[MLX4_MAX_PORTS];
};
union {
int res_free;
#define MLX4_SELFTEST_LB_MIN_MTU (MLX4_LOOPBACK_TEST_PAYLOAD + NET_IP_ALIGN + \
ETH_HLEN + PREAMBLE_LEN)
-#define MLX4_EN_MIN_MTU 46
/* VLAN_HLEN is added twice,to support skb vlan tagged with multiple
* headers. (For example: ETH_P_8021Q and ETH_P_8021AD).
*/
container_of((void *)mpt_entry, struct mlx4_cmd_mailbox,
buf);
+ (*mpt_entry)->lkey = 0;
err = mlx4_SW2HW_MPT(dev, mailbox, key);
}
unsigned long state;
int ix;
+ unsigned int hw_mtu;
struct net_dim dim; /* Dynamic Interrupt Moderation */
eth_proto_oper = MLX5_GET(ptys_reg, out, eth_proto_oper);
*speed = mlx5e_port_ptys2speed(eth_proto_oper);
- if (!(*speed)) {
- mlx5_core_warn(mdev, "cannot get port speed\n");
+ if (!(*speed))
err = -EINVAL;
- }
return err;
}
case 40000:
if (!write)
*fec_policy = MLX5_GET(pplm_reg, pplm,
- fec_override_cap_10g_40g);
+ fec_override_admin_10g_40g);
else
MLX5_SET(pplm_reg, pplm,
fec_override_admin_10g_40g, *fec_policy);
case 10000:
case 40000:
*fec_cap = MLX5_GET(pplm_reg, pplm,
- fec_override_admin_10g_40g);
+ fec_override_cap_10g_40g);
break;
case 25000:
*fec_cap = MLX5_GET(pplm_reg, pplm,
int mlx5e_set_fec_mode(struct mlx5_core_dev *dev, u8 fec_policy)
{
+ u8 fec_policy_nofec = BIT(MLX5E_FEC_NOFEC);
bool fec_mode_not_supp_in_speed = false;
- u8 no_fec_policy = BIT(MLX5E_FEC_NOFEC);
u32 out[MLX5_ST_SZ_DW(pplm_reg)] = {};
u32 in[MLX5_ST_SZ_DW(pplm_reg)] = {};
int sz = MLX5_ST_SZ_BYTES(pplm_reg);
- u32 current_fec_speed;
+ u8 fec_policy_auto = 0;
u8 fec_caps = 0;
int err;
int i;
if (err)
return err;
- err = mlx5e_port_linkspeed(dev, ¤t_fec_speed);
- if (err)
- return err;
+ MLX5_SET(pplm_reg, out, local_port, 1);
- memset(in, 0, sz);
- MLX5_SET(pplm_reg, in, local_port, 1);
- for (i = 0; i < MLX5E_FEC_SUPPORTED_SPEEDS && !!fec_policy; i++) {
+ for (i = 0; i < MLX5E_FEC_SUPPORTED_SPEEDS; i++) {
mlx5e_get_fec_cap_field(out, &fec_caps, fec_supported_speeds[i]);
- /* policy supported for link speed */
- if (!!(fec_caps & fec_policy)) {
- mlx5e_fec_admin_field(in, &fec_policy, 1,
+ /* policy supported for link speed, or policy is auto */
+ if (fec_caps & fec_policy || fec_policy == fec_policy_auto) {
+ mlx5e_fec_admin_field(out, &fec_policy, 1,
fec_supported_speeds[i]);
} else {
- if (fec_supported_speeds[i] == current_fec_speed)
- return -EOPNOTSUPP;
- mlx5e_fec_admin_field(in, &no_fec_policy, 1,
- fec_supported_speeds[i]);
+ /* turn off FEC if supported. Else, leave it the same */
+ if (fec_caps & fec_policy_nofec)
+ mlx5e_fec_admin_field(out, &fec_policy_nofec, 1,
+ fec_supported_speeds[i]);
fec_mode_not_supp_in_speed = true;
}
}
"FEC policy 0x%x is not supported for some speeds",
fec_policy);
- return mlx5_core_access_reg(dev, in, sz, out, sz, MLX5_REG_PPLM, 0, 1);
+ return mlx5_core_access_reg(dev, out, sz, out, sz, MLX5_REG_PPLM, 0, 1);
}
int err;
err = mlx5e_port_linkspeed(priv->mdev, &speed);
- if (err)
+ if (err) {
+ mlx5_core_warn(priv->mdev, "cannot get port speed\n");
return 0;
+ }
xoff = (301 + 216 * priv->dcbx.cable_len / 100) * speed / 1000 + 272 * mtu / 100;
ethtool_link_ksettings_add_link_mode(link_ksettings, supported,
Autoneg);
- err = get_fec_supported_advertised(mdev, link_ksettings);
- if (err)
+ if (get_fec_supported_advertised(mdev, link_ksettings))
netdev_dbg(netdev, "%s: FEC caps query failed: %d\n",
__func__, err);
rq->channel = c;
rq->ix = c->ix;
rq->mdev = mdev;
+ rq->hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
rq->stats = &c->priv->channel_stats[c->ix].rq;
rq->xdp_prog = params->xdp_prog ? bpf_prog_inc(params->xdp_prog) : NULL;
int err;
u32 i;
+ err = mlx5_vector2eqn(mdev, param->eq_ix, &eqn_not_used, &irqn);
+ if (err)
+ return err;
+
err = mlx5_cqwq_create(mdev, ¶m->wq, param->cqc, &cq->wq,
&cq->wq_ctrl);
if (err)
return err;
- mlx5_vector2eqn(mdev, param->eq_ix, &eqn_not_used, &irqn);
-
mcq->cqe_sz = 64;
mcq->set_ci_db = cq->wq_ctrl.db.db;
mcq->arm_db = cq->wq_ctrl.db.db + 1;
int eqn;
int err;
+ err = mlx5_vector2eqn(mdev, param->eq_ix, &eqn, &irqn_not_used);
+ if (err)
+ return err;
+
inlen = MLX5_ST_SZ_BYTES(create_cq_in) +
sizeof(u64) * cq->wq_ctrl.buf.npages;
in = kvzalloc(inlen, GFP_KERNEL);
mlx5_fill_page_frag_array(&cq->wq_ctrl.buf,
(__be64 *)MLX5_ADDR_OF(create_cq_in, in, pas));
- mlx5_vector2eqn(mdev, param->eq_ix, &eqn, &irqn_not_used);
-
MLX5_SET(cqc, cqc, cq_period_mode, param->cq_period_mode);
MLX5_SET(cqc, cqc, c_eqn, eqn);
MLX5_SET(cqc, cqc, uar_page, mdev->priv.uar->index);
int err;
int eqn;
+ err = mlx5_vector2eqn(priv->mdev, ix, &eqn, &irq);
+ if (err)
+ return err;
+
c = kvzalloc_node(sizeof(*c), GFP_KERNEL, cpu_to_node(cpu));
if (!c)
return -ENOMEM;
c->xdp = !!params->xdp_prog;
c->stats = &priv->channel_stats[ix].ch;
- mlx5_vector2eqn(priv->mdev, ix, &eqn, &irq);
c->irq_desc = irq_to_desc(irq);
netif_napi_add(netdev, &c->napi, mlx5e_napi_poll, 64);
return 0;
}
+#ifdef CONFIG_MLX5_ESWITCH
static int set_feature_tc_num_filters(struct net_device *netdev, bool enable)
{
struct mlx5e_priv *priv = netdev_priv(netdev);
return 0;
}
+#endif
static int set_feature_rx_all(struct net_device *netdev, bool enable)
{
err |= MLX5E_HANDLE_FEATURE(NETIF_F_LRO, set_feature_lro);
err |= MLX5E_HANDLE_FEATURE(NETIF_F_HW_VLAN_CTAG_FILTER,
set_feature_cvlan_filter);
+#ifdef CONFIG_MLX5_ESWITCH
err |= MLX5E_HANDLE_FEATURE(NETIF_F_HW_TC, set_feature_tc_num_filters);
+#endif
err |= MLX5E_HANDLE_FEATURE(NETIF_F_RXALL, set_feature_rx_all);
err |= MLX5E_HANDLE_FEATURE(NETIF_F_RXFCS, set_feature_rx_fcs);
err |= MLX5E_HANDLE_FEATURE(NETIF_F_HW_VLAN_CTAG_RX, set_feature_rx_vlan);
}
if (params->rq_wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
+ bool is_linear = mlx5e_rx_mpwqe_is_linear_skb(priv->mdev, &new_channels.params);
u8 ppw_old = mlx5e_mpwqe_log_pkts_per_wqe(params);
u8 ppw_new = mlx5e_mpwqe_log_pkts_per_wqe(&new_channels.params);
- reset = reset && (ppw_old != ppw_new);
+ reset = reset && (is_linear || (ppw_old != ppw_new));
}
if (!reset) {
FT_CAP(modify_root) &&
FT_CAP(identified_miss_table_mode) &&
FT_CAP(flow_table_modify)) {
+#ifdef CONFIG_MLX5_ESWITCH
netdev->hw_features |= NETIF_F_HW_TC;
+#endif
#ifdef CONFIG_MLX5_EN_ARFS
netdev->hw_features |= NETIF_F_NTUPLE;
#endif
int mlx5e_attach_netdev(struct mlx5e_priv *priv)
{
const struct mlx5e_profile *profile;
+ int max_nch;
int err;
profile = priv->profile;
clear_bit(MLX5E_STATE_DESTROYING, &priv->state);
+ /* max number of channels may have changed */
+ max_nch = mlx5e_get_max_num_channels(priv->mdev);
+ if (priv->channels.params.num_channels > max_nch) {
+ mlx5_core_warn(priv->mdev, "MLX5E: Reducing number of channels to %d\n", max_nch);
+ priv->channels.params.num_channels = max_nch;
+ mlx5e_build_default_indir_rqt(priv->channels.params.indirection_rqt,
+ MLX5E_INDIR_RQT_SIZE, max_nch);
+ }
+
err = profile->init_tx(priv);
if (err)
goto out;
rq->stats->ecn_mark += !!rc;
}
-static __be32 mlx5e_get_fcs(struct sk_buff *skb)
+static u32 mlx5e_get_fcs(const struct sk_buff *skb)
{
- int last_frag_sz, bytes_in_prev, nr_frags;
- u8 *fcs_p1, *fcs_p2;
- skb_frag_t *last_frag;
- __be32 fcs_bytes;
+ const void *fcs_bytes;
+ u32 _fcs_bytes;
- if (!skb_is_nonlinear(skb))
- return *(__be32 *)(skb->data + skb->len - ETH_FCS_LEN);
+ fcs_bytes = skb_header_pointer(skb, skb->len - ETH_FCS_LEN,
+ ETH_FCS_LEN, &_fcs_bytes);
- nr_frags = skb_shinfo(skb)->nr_frags;
- last_frag = &skb_shinfo(skb)->frags[nr_frags - 1];
- last_frag_sz = skb_frag_size(last_frag);
-
- /* If all FCS data is in last frag */
- if (last_frag_sz >= ETH_FCS_LEN)
- return *(__be32 *)(skb_frag_address(last_frag) +
- last_frag_sz - ETH_FCS_LEN);
-
- fcs_p2 = (u8 *)skb_frag_address(last_frag);
- bytes_in_prev = ETH_FCS_LEN - last_frag_sz;
-
- /* Find where the other part of the FCS is - Linear or another frag */
- if (nr_frags == 1) {
- fcs_p1 = skb_tail_pointer(skb);
- } else {
- skb_frag_t *prev_frag = &skb_shinfo(skb)->frags[nr_frags - 2];
-
- fcs_p1 = skb_frag_address(prev_frag) +
- skb_frag_size(prev_frag);
- }
- fcs_p1 -= bytes_in_prev;
-
- memcpy(&fcs_bytes, fcs_p1, bytes_in_prev);
- memcpy(((u8 *)&fcs_bytes) + bytes_in_prev, fcs_p2, last_frag_sz);
-
- return fcs_bytes;
+ return __get_unaligned_cpu32(fcs_bytes);
}
-static u8 get_ip_proto(struct sk_buff *skb, __be16 proto)
+static u8 get_ip_proto(struct sk_buff *skb, int network_depth, __be16 proto)
{
- void *ip_p = skb->data + sizeof(struct ethhdr);
+ void *ip_p = skb->data + network_depth;
return (proto == htons(ETH_P_IP)) ? ((struct iphdr *)ip_p)->protocol :
((struct ipv6hdr *)ip_p)->nexthdr;
goto csum_unnecessary;
if (likely(is_last_ethertype_ip(skb, &network_depth, &proto))) {
- if (unlikely(get_ip_proto(skb, proto) == IPPROTO_SCTP))
+ if (unlikely(get_ip_proto(skb, network_depth, proto) == IPPROTO_SCTP))
goto csum_unnecessary;
skb->ip_summed = CHECKSUM_COMPLETE;
network_depth - ETH_HLEN,
skb->csum);
if (unlikely(netdev->features & NETIF_F_RXFCS))
- skb->csum = csum_add(skb->csum,
- (__force __wsum)mlx5e_get_fcs(skb));
+ skb->csum = csum_block_add(skb->csum,
+ (__force __wsum)mlx5e_get_fcs(skb),
+ skb->len - ETH_FCS_LEN);
stats->csum_complete++;
return;
}
u32 frag_size;
bool consumed;
+ /* Check packet size. Note LRO doesn't use linear SKB */
+ if (unlikely(cqe_bcnt > rq->hw_mtu)) {
+ rq->stats->oversize_pkts_sw_drop++;
+ return NULL;
+ }
+
va = page_address(di->page) + head_offset;
data = va + rx_headroom;
frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt32);
return 1;
}
-#ifdef CONFIG_INET
-/* loopback test */
-#define MLX5E_TEST_PKT_SIZE (MLX5E_RX_MAX_HEAD - NET_IP_ALIGN)
-static const char mlx5e_test_text[ETH_GSTRING_LEN] = "MLX5E SELF TEST";
-#define MLX5E_TEST_MAGIC 0x5AEED15C001ULL
-
struct mlx5ehdr {
__be32 version;
__be64 magic;
- char text[ETH_GSTRING_LEN];
};
+#ifdef CONFIG_INET
+/* loopback test */
+#define MLX5E_TEST_PKT_SIZE (sizeof(struct ethhdr) + sizeof(struct iphdr) +\
+ sizeof(struct udphdr) + sizeof(struct mlx5ehdr))
+#define MLX5E_TEST_MAGIC 0x5AEED15C001ULL
+
static struct sk_buff *mlx5e_test_get_udp_skb(struct mlx5e_priv *priv)
{
struct sk_buff *skb = NULL;
struct ethhdr *ethh;
struct udphdr *udph;
struct iphdr *iph;
- int datalen, iplen;
-
- datalen = MLX5E_TEST_PKT_SIZE -
- (sizeof(*ethh) + sizeof(*iph) + sizeof(*udph));
+ int iplen;
skb = netdev_alloc_skb(priv->netdev, MLX5E_TEST_PKT_SIZE);
if (!skb) {
/* Fill UDP header */
udph->source = htons(9);
udph->dest = htons(9); /* Discard Protocol */
- udph->len = htons(datalen + sizeof(struct udphdr));
+ udph->len = htons(sizeof(struct mlx5ehdr) + sizeof(struct udphdr));
udph->check = 0;
/* Fill IP header */
iph->ttl = 32;
iph->version = 4;
iph->protocol = IPPROTO_UDP;
- iplen = sizeof(struct iphdr) + sizeof(struct udphdr) + datalen;
+ iplen = sizeof(struct iphdr) + sizeof(struct udphdr) +
+ sizeof(struct mlx5ehdr);
iph->tot_len = htons(iplen);
iph->frag_off = 0;
iph->saddr = 0;
mlxh = skb_put(skb, sizeof(*mlxh));
mlxh->version = 0;
mlxh->magic = cpu_to_be64(MLX5E_TEST_MAGIC);
- strlcpy(mlxh->text, mlx5e_test_text, sizeof(mlxh->text));
- datalen -= sizeof(*mlxh);
- skb_put_zero(skb, datalen);
skb->csum = 0;
skb->ip_summed = CHECKSUM_PARTIAL;
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_wqe_err) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mpwqe_filler_cqes) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mpwqe_filler_strides) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_oversize_pkts_sw_drop) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_buff_alloc_err) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_blks) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_pkts) },
s->rx_wqe_err += rq_stats->wqe_err;
s->rx_mpwqe_filler_cqes += rq_stats->mpwqe_filler_cqes;
s->rx_mpwqe_filler_strides += rq_stats->mpwqe_filler_strides;
+ s->rx_oversize_pkts_sw_drop += rq_stats->oversize_pkts_sw_drop;
s->rx_buff_alloc_err += rq_stats->buff_alloc_err;
s->rx_cqe_compress_blks += rq_stats->cqe_compress_blks;
s->rx_cqe_compress_pkts += rq_stats->cqe_compress_pkts;
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, wqe_err) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, mpwqe_filler_cqes) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, mpwqe_filler_strides) },
+ { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, oversize_pkts_sw_drop) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, buff_alloc_err) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_blks) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_pkts) },
u64 rx_wqe_err;
u64 rx_mpwqe_filler_cqes;
u64 rx_mpwqe_filler_strides;
+ u64 rx_oversize_pkts_sw_drop;
u64 rx_buff_alloc_err;
u64 rx_cqe_compress_blks;
u64 rx_cqe_compress_pkts;
u64 wqe_err;
u64 mpwqe_filler_cqes;
u64 mpwqe_filler_strides;
+ u64 oversize_pkts_sw_drop;
u64 buff_alloc_err;
u64 cqe_compress_blks;
u64 cqe_compress_pkts;
inner_headers);
}
- if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
- struct flow_dissector_key_eth_addrs *key =
+ if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_BASIC)) {
+ struct flow_dissector_key_basic *key =
skb_flow_dissector_target(f->dissector,
- FLOW_DISSECTOR_KEY_ETH_ADDRS,
+ FLOW_DISSECTOR_KEY_BASIC,
f->key);
- struct flow_dissector_key_eth_addrs *mask =
+ struct flow_dissector_key_basic *mask =
skb_flow_dissector_target(f->dissector,
- FLOW_DISSECTOR_KEY_ETH_ADDRS,
+ FLOW_DISSECTOR_KEY_BASIC,
f->mask);
+ MLX5_SET(fte_match_set_lyr_2_4, headers_c, ethertype,
+ ntohs(mask->n_proto));
+ MLX5_SET(fte_match_set_lyr_2_4, headers_v, ethertype,
+ ntohs(key->n_proto));
- ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_c,
- dmac_47_16),
- mask->dst);
- ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_v,
- dmac_47_16),
- key->dst);
-
- ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_c,
- smac_47_16),
- mask->src);
- ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_v,
- smac_47_16),
- key->src);
-
- if (!is_zero_ether_addr(mask->src) || !is_zero_ether_addr(mask->dst))
+ if (mask->n_proto)
*match_level = MLX5_MATCH_L2;
}
*match_level = MLX5_MATCH_L2;
}
- } else {
+ } else if (*match_level != MLX5_MATCH_NONE) {
MLX5_SET(fte_match_set_lyr_2_4, headers_c, svlan_tag, 1);
MLX5_SET(fte_match_set_lyr_2_4, headers_c, cvlan_tag, 1);
+ *match_level = MLX5_MATCH_L2;
}
if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_CVLAN)) {
}
}
- if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_BASIC)) {
- struct flow_dissector_key_basic *key =
+ if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
+ struct flow_dissector_key_eth_addrs *key =
skb_flow_dissector_target(f->dissector,
- FLOW_DISSECTOR_KEY_BASIC,
+ FLOW_DISSECTOR_KEY_ETH_ADDRS,
f->key);
- struct flow_dissector_key_basic *mask =
+ struct flow_dissector_key_eth_addrs *mask =
skb_flow_dissector_target(f->dissector,
- FLOW_DISSECTOR_KEY_BASIC,
+ FLOW_DISSECTOR_KEY_ETH_ADDRS,
f->mask);
- MLX5_SET(fte_match_set_lyr_2_4, headers_c, ethertype,
- ntohs(mask->n_proto));
- MLX5_SET(fte_match_set_lyr_2_4, headers_v, ethertype,
- ntohs(key->n_proto));
- if (mask->n_proto)
+ ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_c,
+ dmac_47_16),
+ mask->dst);
+ ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_v,
+ dmac_47_16),
+ key->dst);
+
+ ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_c,
+ smac_47_16),
+ mask->src);
+ ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, headers_v,
+ smac_47_16),
+ key->src);
+
+ if (!is_zero_ether_addr(mask->src) || !is_zero_ether_addr(mask->dst))
*match_level = MLX5_MATCH_L2;
}
/* the HW doesn't need L3 inline to match on frag=no */
if (!(key->flags & FLOW_DIS_IS_FRAGMENT))
- *match_level = MLX5_INLINE_MODE_L2;
+ *match_level = MLX5_MATCH_L2;
/* *** L2 attributes parsing up to here *** */
else
- *match_level = MLX5_INLINE_MODE_IP;
+ *match_level = MLX5_MATCH_L3;
}
}
if (!actions_match_supported(priv, exts, parse_attr, flow, extack))
return -EOPNOTSUPP;
- if (attr->out_count > 1 && !mlx5_esw_has_fwd_fdb(priv->mdev)) {
+ if (attr->mirror_count > 0 && !mlx5_esw_has_fwd_fdb(priv->mdev)) {
NL_SET_ERR_MSG_MOD(extack,
"current firmware doesn't support split rule for port mirroring");
netdev_warn_once(priv->netdev, "current firmware doesn't support split rule for port mirroring\n");
};
static const struct rhashtable_params rhash_sa = {
- .key_len = FIELD_SIZEOF(struct mlx5_fpga_ipsec_sa_ctx, hw_sa),
- .key_offset = offsetof(struct mlx5_fpga_ipsec_sa_ctx, hw_sa),
+ /* Keep out "cmd" field from the key as it's
+ * value is not constant during the lifetime
+ * of the key object.
+ */
+ .key_len = FIELD_SIZEOF(struct mlx5_fpga_ipsec_sa_ctx, hw_sa) -
+ FIELD_SIZEOF(struct mlx5_ifc_fpga_ipsec_sa_v1, cmd),
+ .key_offset = offsetof(struct mlx5_fpga_ipsec_sa_ctx, hw_sa) +
+ FIELD_SIZEOF(struct mlx5_ifc_fpga_ipsec_sa_v1, cmd),
.head_offset = offsetof(struct mlx5_fpga_ipsec_sa_ctx, hash),
.automatic_shrinking = true,
.min_size = 1,
netif_carrier_off(epriv->netdev);
mlx5_fs_remove_rx_underlay_qpn(mdev, ipriv->qp.qpn);
- mlx5i_uninit_underlay_qp(epriv);
mlx5e_deactivate_priv_channels(epriv);
mlx5e_close_channels(&epriv->channels);
+ mlx5i_uninit_underlay_qp(epriv);
unlock:
mutex_unlock(&epriv->state_lock);
return 0;
mlxsw_core->bus,
mlxsw_core->bus_priv, true,
devlink);
- if (err)
- mlxsw_core->reload_fail = true;
+ mlxsw_core->reload_fail = !!err;
+
return err;
}
{
struct devlink *devlink = priv_to_devlink(mlxsw_core);
- if (mlxsw_core->reload_fail)
- goto reload_fail;
+ if (mlxsw_core->reload_fail) {
+ if (!reload)
+ /* Only the parts that were not de-initialized in the
+ * failed reload attempt need to be de-initialized.
+ */
+ goto reload_fail_deinit;
+ else
+ return;
+ }
if (mlxsw_core->driver->fini)
mlxsw_core->driver->fini(mlxsw_core);
if (!reload)
devlink_resources_unregister(devlink, NULL);
mlxsw_core->bus->fini(mlxsw_core->bus_priv);
- if (reload)
- return;
-reload_fail:
+
+ return;
+
+reload_fail_deinit:
+ devlink_unregister(devlink);
+ devlink_resources_unregister(devlink, NULL);
devlink_free(devlink);
}
EXPORT_SYMBOL(mlxsw_core_bus_device_unregister);
* Configures the ETS elements.
*/
#define MLXSW_REG_QEEC_ID 0x400D
-#define MLXSW_REG_QEEC_LEN 0x1C
+#define MLXSW_REG_QEEC_LEN 0x20
MLXSW_REG_DEFINE(qeec, MLXSW_REG_QEEC_ID, MLXSW_REG_QEEC_LEN);
*/
MLXSW_ITEM32(reg, qeec, next_element_index, 0x08, 0, 8);
+/* reg_qeec_mise
+ * Min shaper configuration enable. Enables configuration of the min
+ * shaper on this ETS element
+ * 0 - Disable
+ * 1 - Enable
+ * Access: RW
+ */
+MLXSW_ITEM32(reg, qeec, mise, 0x0C, 31, 1);
+
enum {
MLXSW_REG_QEEC_BYTES_MODE,
MLXSW_REG_QEEC_PACKETS_MODE,
*/
MLXSW_ITEM32(reg, qeec, pb, 0x0C, 28, 1);
+/* The smallest permitted min shaper rate. */
+#define MLXSW_REG_QEEC_MIS_MIN 200000 /* Kbps */
+
+/* reg_qeec_min_shaper_rate
+ * Min shaper information rate.
+ * For CPU port, can only be configured for port hierarchy.
+ * When in bytes mode, value is specified in units of 1000bps.
+ * Access: RW
+ */
+MLXSW_ITEM32(reg, qeec, min_shaper_rate, 0x0C, 0, 28);
+
/* reg_qeec_mase
* Max shaper configuration enable. Enables configuration of the max
* shaper on this ETS element.
return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(qeec), qeec_pl);
}
+static int mlxsw_sp_port_min_bw_set(struct mlxsw_sp_port *mlxsw_sp_port,
+ enum mlxsw_reg_qeec_hr hr, u8 index,
+ u8 next_index, u32 minrate)
+{
+ struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+ char qeec_pl[MLXSW_REG_QEEC_LEN];
+
+ mlxsw_reg_qeec_pack(qeec_pl, mlxsw_sp_port->local_port, hr, index,
+ next_index);
+ mlxsw_reg_qeec_mise_set(qeec_pl, true);
+ mlxsw_reg_qeec_min_shaper_rate_set(qeec_pl, minrate);
+
+ return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(qeec), qeec_pl);
+}
+
int mlxsw_sp_port_prio_tc_set(struct mlxsw_sp_port *mlxsw_sp_port,
u8 switch_prio, u8 tclass)
{
return err;
}
+ /* Configure the min shaper for multicast TCs. */
+ for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+ err = mlxsw_sp_port_min_bw_set(mlxsw_sp_port,
+ MLXSW_REG_QEEC_HIERARCY_TC,
+ i + 8, i,
+ MLXSW_REG_QEEC_MIS_MIN);
+ if (err)
+ return err;
+ }
+
/* Map all priorities to traffic class 0. */
for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
err = mlxsw_sp_port_prio_tc_set(mlxsw_sp_port, i, 0);
burst_size = 7;
break;
case MLXSW_REG_HTGT_TRAP_GROUP_SP_IP2ME:
- is_bytes = true;
rate = 4 * 1024;
burst_size = 4;
break;
mc_record = mlxsw_sp_nve_mc_record_find(mc_list, proto, addr,
&mc_entry);
- if (WARN_ON(!mc_record))
+ if (!mc_record)
return;
mlxsw_sp_nve_mc_record_entry_del(mc_record, mc_entry);
key.fid_index = mlxsw_sp_fid_index(fid);
mc_list = mlxsw_sp_nve_mc_list_find(mlxsw_sp, &key);
- if (WARN_ON(!mc_list))
+ if (!mc_list)
return;
mlxsw_sp_nve_fid_flood_index_clear(fid, mc_list);
{
u32 ul_tb_id = l3mdev_fib_table(ul_dev) ? : RT_TABLE_MAIN;
enum mlxsw_sp_ipip_type ipipt = ipip_entry->ipipt;
- struct net_device *ipip_ul_dev;
if (mlxsw_sp->router->ipip_ops_arr[ipipt]->ul_proto != ul_proto)
return false;
- ipip_ul_dev = __mlxsw_sp_ipip_netdev_ul_dev_get(ipip_entry->ol_dev);
return mlxsw_sp_ipip_entry_saddr_matches(mlxsw_sp, ul_proto, ul_dip,
- ul_tb_id, ipip_entry) &&
- (!ipip_ul_dev || ipip_ul_dev == ul_dev);
+ ul_tb_id, ipip_entry);
}
/* Given decap parameters, find the corresponding IPIP entry. */
mlxsw_sp_bridge_port_should_destroy(const struct mlxsw_sp_bridge_port *
bridge_port)
{
- struct mlxsw_sp *mlxsw_sp = mlxsw_sp_lower_get(bridge_port->dev);
+ struct net_device *dev = bridge_port->dev;
+ struct mlxsw_sp *mlxsw_sp;
+
+ if (is_vlan_dev(dev))
+ mlxsw_sp = mlxsw_sp_lower_get(vlan_dev_real_dev(dev));
+ else
+ mlxsw_sp = mlxsw_sp_lower_get(dev);
/* In case ports were pulled from out of a bridged LAG, then
* it's possible the reference count isn't zero, yet the bridge
vid = is_vlan_dev(dev) ? vlan_dev_vlan_id(dev) : 1;
mlxsw_sp_port_vlan = mlxsw_sp_port_vlan_find_by_vid(mlxsw_sp_port, vid);
- if (WARN_ON(!mlxsw_sp_port_vlan))
+ if (!mlxsw_sp_port_vlan)
return;
mlxsw_sp_port_vlan_bridge_leave(mlxsw_sp_port_vlan);
if (!fid)
return -EINVAL;
- if (mlxsw_sp_fid_vni_is_set(fid))
- return -EINVAL;
+ if (mlxsw_sp_fid_vni_is_set(fid)) {
+ err = -EINVAL;
+ goto err_vni_exists;
+ }
err = mlxsw_sp_nve_fid_enable(mlxsw_sp, fid, ¶ms, extack);
if (err)
return 0;
err_nve_fid_enable:
+err_vni_exists:
mlxsw_sp_fid_put(fid);
return err;
}
break;
case SWITCHDEV_FDB_DEL_TO_DEVICE:
fdb_info = &switchdev_work->fdb_info;
- if (!fdb_info->added_by_user)
- break;
mlxsw_sp_port_fdb_set(mlxsw_sp_port, fdb_info, false);
break;
case SWITCHDEV_FDB_ADD_TO_BRIDGE: /* fall through */
netif_wake_queue(adapter->netdev);
}
- if (!napi_complete_done(napi, weight))
+ if (!napi_complete(napi))
goto done;
/* enable isr */
lan743x_csr_read(adapter, INT_STS);
done:
- return weight;
+ return 0;
}
static void lan743x_tx_ring_cleanup(struct lan743x_tx *tx)
tx->vector_flags = lan743x_intr_get_vector_flags(adapter,
INT_BIT_DMA_TX_
(tx->channel_number));
- netif_napi_add(adapter->netdev,
- &tx->napi, lan743x_tx_napi_poll,
- tx->ring_size - 1);
+ netif_tx_napi_add(adapter->netdev,
+ &tx->napi, lan743x_tx_napi_poll,
+ tx->ring_size - 1);
napi_enable(&tx->napi);
data = 0;
static const struct pci_device_id lan743x_pcidev_tbl[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_SMSC, PCI_DEVICE_ID_SMSC_LAN7430) },
+ { PCI_DEVICE(PCI_VENDOR_ID_SMSC, PCI_DEVICE_ID_SMSC_LAN7431) },
{ 0, }
};
/* SMSC acquired EFAR late 1990's, MCHP acquired SMSC 2012 */
#define PCI_VENDOR_ID_SMSC PCI_VENDOR_ID_EFAR
#define PCI_DEVICE_ID_SMSC_LAN7430 (0x7430)
+#define PCI_DEVICE_ID_SMSC_LAN7431 (0x7431)
#define PCI_CONFIG_LENGTH (0x1000)
if (err)
goto err_destroy_flow;
- err = nfp_flower_xmit_flow(netdev, flow_pay,
- NFP_FLOWER_CMSG_TYPE_FLOW_ADD);
- if (err)
- goto err_destroy_flow;
-
flow_pay->tc_flower_cookie = flow->cookie;
err = rhashtable_insert_fast(&priv->flow_table, &flow_pay->fl_node,
nfp_flower_table_params);
if (err)
- goto err_destroy_flow;
+ goto err_release_metadata;
+
+ err = nfp_flower_xmit_flow(netdev, flow_pay,
+ NFP_FLOWER_CMSG_TYPE_FLOW_ADD);
+ if (err)
+ goto err_remove_rhash;
port->tc_offload_cnt++;
return 0;
+err_remove_rhash:
+ WARN_ON_ONCE(rhashtable_remove_fast(&priv->flow_table,
+ &flow_pay->fl_node,
+ nfp_flower_table_params));
+err_release_metadata:
+ nfp_modify_flow_metadata(app, flow_pay);
err_destroy_flow:
kfree(flow_pay->action_data);
kfree(flow_pay->mask_data);
static void
qed_dcbx_set_params(struct qed_dcbx_results *p_data,
struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt,
- bool enable, u8 prio, u8 tc,
+ bool app_tlv, bool enable, u8 prio, u8 tc,
enum dcbx_protocol_type type,
enum qed_pci_personality personality)
{
p_data->arr[type].dont_add_vlan0 = true;
/* QM reconf data */
- if (p_hwfn->hw_info.personality == personality)
+ if (app_tlv && p_hwfn->hw_info.personality == personality)
qed_hw_info_set_offload_tc(&p_hwfn->hw_info, tc);
/* Configure dcbx vlan priority in doorbell block for roce EDPM */
static void
qed_dcbx_update_app_info(struct qed_dcbx_results *p_data,
struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt,
- bool enable, u8 prio, u8 tc,
+ bool app_tlv, bool enable, u8 prio, u8 tc,
enum dcbx_protocol_type type)
{
enum qed_pci_personality personality;
personality = qed_dcbx_app_update[i].personality;
- qed_dcbx_set_params(p_data, p_hwfn, p_ptt, enable,
+ qed_dcbx_set_params(p_data, p_hwfn, p_ptt, app_tlv, enable,
prio, tc, type, personality);
}
}
enable = true;
}
- qed_dcbx_update_app_info(p_data, p_hwfn, p_ptt, enable,
- priority, tc, type);
+ qed_dcbx_update_app_info(p_data, p_hwfn, p_ptt, true,
+ enable, priority, tc, type);
}
}
continue;
enable = (type == DCBX_PROTOCOL_ETH) ? false : !!dcbx_version;
- qed_dcbx_update_app_info(p_data, p_hwfn, p_ptt, enable,
+ qed_dcbx_update_app_info(p_data, p_hwfn, p_ptt, false, enable,
priority, tc, type);
}
"no error",
"length error",
"function disabled",
- "VF sent command to attnetion address",
+ "VF sent command to attention address",
"host sent prod update command",
"read of during interrupt register while in MIMD mode",
"access to PXP BAR reserved address",
qed_iscsi_free(p_hwfn);
qed_ooo_free(p_hwfn);
}
+
+ if (QED_IS_RDMA_PERSONALITY(p_hwfn))
+ qed_rdma_info_free(p_hwfn);
+
qed_iov_free(p_hwfn);
qed_l2_free(p_hwfn);
qed_dmae_info_free(p_hwfn);
struct qed_qm_info *qm_info = &p_hwfn->qm_info;
/* Can't have multiple flags set here */
- if (bitmap_weight((unsigned long *)&pq_flags, sizeof(pq_flags)) > 1)
+ if (bitmap_weight((unsigned long *)&pq_flags,
+ sizeof(pq_flags) * BITS_PER_BYTE) > 1) {
+ DP_ERR(p_hwfn, "requested multiple pq flags 0x%x\n", pq_flags);
+ goto err;
+ }
+
+ if (!(qed_get_pq_flags(p_hwfn) & pq_flags)) {
+ DP_ERR(p_hwfn, "pq flag 0x%x is not set\n", pq_flags);
goto err;
+ }
switch (pq_flags) {
case PQ_FLAGS_RLS:
}
err:
- DP_ERR(p_hwfn, "BAD pq flags %d\n", pq_flags);
- return NULL;
+ return &qm_info->start_pq;
}
/* save pq index in qm info */
{
u8 max_tc = qed_init_qm_get_num_tcs(p_hwfn);
+ if (max_tc == 0) {
+ DP_ERR(p_hwfn, "pq with flag 0x%lx do not exist\n",
+ PQ_FLAGS_MCOS);
+ return p_hwfn->qm_info.start_pq;
+ }
+
if (tc > max_tc)
DP_ERR(p_hwfn, "tc %d must be smaller than %d\n", tc, max_tc);
- return qed_get_cm_pq_idx(p_hwfn, PQ_FLAGS_MCOS) + tc;
+ return qed_get_cm_pq_idx(p_hwfn, PQ_FLAGS_MCOS) + (tc % max_tc);
}
u16 qed_get_cm_pq_idx_vf(struct qed_hwfn *p_hwfn, u16 vf)
{
u16 max_vf = qed_init_qm_get_num_vfs(p_hwfn);
+ if (max_vf == 0) {
+ DP_ERR(p_hwfn, "pq with flag 0x%lx do not exist\n",
+ PQ_FLAGS_VFS);
+ return p_hwfn->qm_info.start_pq;
+ }
+
if (vf > max_vf)
DP_ERR(p_hwfn, "vf %d must be smaller than %d\n", vf, max_vf);
- return qed_get_cm_pq_idx(p_hwfn, PQ_FLAGS_VFS) + vf;
+ return qed_get_cm_pq_idx(p_hwfn, PQ_FLAGS_VFS) + (vf % max_vf);
}
u16 qed_get_cm_pq_idx_ofld_mtc(struct qed_hwfn *p_hwfn, u8 tc)
goto alloc_err;
}
+ if (QED_IS_RDMA_PERSONALITY(p_hwfn)) {
+ rc = qed_rdma_info_alloc(p_hwfn);
+ if (rc)
+ goto alloc_err;
+ }
+
/* DMA info initialization */
rc = qed_dmae_info_alloc(p_hwfn);
if (rc)
if (!p_ptt)
return -EAGAIN;
- /* If roce info is allocated it means roce is initialized and should
- * be enabled in searcher.
- */
if (p_hwfn->p_rdma_info &&
- p_hwfn->b_rdma_enabled_in_prs)
+ p_hwfn->p_rdma_info->active && p_hwfn->b_rdma_enabled_in_prs)
qed_wr(p_hwfn, p_ptt, p_hwfn->rdma_prs_search_reg, 0x1);
/* Re-open incoming traffic */
"Cannot satisfy CQ amount. CQs requested %d, CQs available %d. Aborting function start\n",
fcoe_pf_params->num_cqs,
p_hwfn->hw_info.feat_num[QED_FCOE_CQ]);
- return -EINVAL;
+ rc = -EINVAL;
+ goto err;
}
p_data->mtu = cpu_to_le16(fcoe_pf_params->mtu);
rc = qed_cxt_acquire_cid(p_hwfn, PROTOCOLID_FCOE, &dummy_cid);
if (rc)
- return rc;
+ goto err;
cxt_info.iid = dummy_cid;
rc = qed_cxt_get_cid_info(p_hwfn, &cxt_info);
if (rc) {
DP_NOTICE(p_hwfn, "Cannot find context info for dummy cid=%d\n",
dummy_cid);
- return rc;
+ goto err;
}
p_cxt = cxt_info.p_cxt;
SET_FIELD(p_cxt->tstorm_ag_context.flags3,
rc = qed_spq_post(p_hwfn, p_ent, NULL);
return rc;
+
+err:
+ qed_sp_destroy_request(p_hwfn, p_ent);
+ return rc;
}
static int
*/
do {
index = p_sb_attn->sb_index;
+ /* finish reading index before the loop condition */
+ dma_rmb();
attn_bits = le32_to_cpu(p_sb_attn->atten_bits);
attn_acks = le32_to_cpu(p_sb_attn->atten_ack);
} while (index != p_sb_attn->sb_index);
"Cannot satisfy CQ amount. Queues requested %d, CQs available %d. Aborting function start\n",
p_params->num_queues,
p_hwfn->hw_info.feat_num[QED_ISCSI_CQ]);
+ qed_sp_destroy_request(p_hwfn, p_ent);
return -EINVAL;
}
rc = qed_sp_vport_update_rss(p_hwfn, p_ramrod, p_rss_params);
if (rc) {
- /* Return spq entry which is taken in qed_sp_init_request()*/
- qed_spq_return_entry(p_hwfn, p_ent);
+ qed_sp_destroy_request(p_hwfn, p_ent);
return rc;
}
DP_NOTICE(p_hwfn,
"%d is not supported yet\n",
p_filter_cmd->opcode);
+ qed_sp_destroy_request(p_hwfn, *pp_ent);
return -EINVAL;
}
} else {
rc = qed_fw_vport(p_hwfn, p_params->vport_id, &abs_vport_id);
if (rc)
- return rc;
+ goto err;
if (p_params->qid != QED_RFS_NTUPLE_QID_RSS) {
rc = qed_fw_l2_queue(p_hwfn, p_params->qid,
&abs_rx_q_id);
if (rc)
- return rc;
+ goto err;
p_ramrod->rx_qid_valid = 1;
p_ramrod->rx_qid = cpu_to_le16(abs_rx_q_id);
(u64)p_params->addr, p_params->length);
return qed_spq_post(p_hwfn, p_ent, NULL);
+
+err:
+ qed_sp_destroy_request(p_hwfn, p_ent);
+ return rc;
}
int qed_get_rxq_coalesce(struct qed_hwfn *p_hwfn,
return -EBUSY;
}
rc = qed_mcp_drain(hwfn, ptt);
+ qed_ptt_release(hwfn, ptt);
if (rc)
return rc;
- qed_ptt_release(hwfn, ptt);
}
return 0;
struct qed_ptt *p_ptt, u32 *p_speed_mask)
{
u32 transceiver_type, transceiver_state;
+ int ret;
- qed_mcp_get_transceiver_data(p_hwfn, p_ptt, &transceiver_state,
- &transceiver_type);
+ ret = qed_mcp_get_transceiver_data(p_hwfn, p_ptt, &transceiver_state,
+ &transceiver_type);
+ if (ret)
+ return ret;
if (qed_is_transceiver_ready(transceiver_state, transceiver_type) ==
false)
return FEAT_NUM((struct qed_hwfn *)p_hwfn, QED_PF_L2_QUE) + rel_sb_id;
}
-static int qed_rdma_alloc(struct qed_hwfn *p_hwfn,
- struct qed_ptt *p_ptt,
- struct qed_rdma_start_in_params *params)
+int qed_rdma_info_alloc(struct qed_hwfn *p_hwfn)
{
struct qed_rdma_info *p_rdma_info;
- u32 num_cons, num_tasks;
- int rc = -ENOMEM;
- DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "Allocating RDMA\n");
-
- /* Allocate a struct with current pf rdma info */
p_rdma_info = kzalloc(sizeof(*p_rdma_info), GFP_KERNEL);
if (!p_rdma_info)
- return rc;
+ return -ENOMEM;
+
+ spin_lock_init(&p_rdma_info->lock);
p_hwfn->p_rdma_info = p_rdma_info;
+ return 0;
+}
+
+void qed_rdma_info_free(struct qed_hwfn *p_hwfn)
+{
+ kfree(p_hwfn->p_rdma_info);
+ p_hwfn->p_rdma_info = NULL;
+}
+
+static int qed_rdma_alloc(struct qed_hwfn *p_hwfn)
+{
+ struct qed_rdma_info *p_rdma_info = p_hwfn->p_rdma_info;
+ u32 num_cons, num_tasks;
+ int rc = -ENOMEM;
+
+ DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "Allocating RDMA\n");
+
if (QED_IS_IWARP_PERSONALITY(p_hwfn))
p_rdma_info->proto = PROTOCOLID_IWARP;
else
/* Allocate a struct with device params and fill it */
p_rdma_info->dev = kzalloc(sizeof(*p_rdma_info->dev), GFP_KERNEL);
if (!p_rdma_info->dev)
- goto free_rdma_info;
+ return rc;
/* Allocate a struct with port params and fill it */
p_rdma_info->port = kzalloc(sizeof(*p_rdma_info->port), GFP_KERNEL);
kfree(p_rdma_info->port);
free_rdma_dev:
kfree(p_rdma_info->dev);
-free_rdma_info:
- kfree(p_rdma_info);
return rc;
}
kfree(p_rdma_info->port);
kfree(p_rdma_info->dev);
-
- kfree(p_rdma_info);
}
static void qed_rdma_free_tid(void *rdma_cxt, u32 itid)
DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "RDMA setup\n");
- spin_lock_init(&p_hwfn->p_rdma_info->lock);
-
qed_rdma_init_devinfo(p_hwfn, params);
qed_rdma_init_port(p_hwfn);
qed_rdma_init_events(p_hwfn, params);
/* Disable RoCE search */
qed_wr(p_hwfn, p_ptt, p_hwfn->rdma_prs_search_reg, 0);
p_hwfn->b_rdma_enabled_in_prs = false;
-
+ p_hwfn->p_rdma_info->active = 0;
qed_wr(p_hwfn, p_ptt, PRS_REG_ROCE_DEST_QP_MAX_PF, 0);
ll2_ethertype_en = qed_rd(p_hwfn, p_ptt, PRS_REG_LIGHT_L2_ETHERTYPE_EN);
u8 max_stats_queues;
int rc;
- if (!rdma_cxt || !in_params || !out_params || !p_hwfn->p_rdma_info) {
+ if (!rdma_cxt || !in_params || !out_params ||
+ !p_hwfn->p_rdma_info->active) {
DP_ERR(p_hwfn->cdev,
"qed roce create qp failed due to NULL entry (rdma_cxt=%p, in=%p, out=%p, roce_info=?\n",
rdma_cxt, in_params, out_params);
default:
rc = -EINVAL;
DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "rc = %d\n", rc);
+ qed_sp_destroy_request(p_hwfn, p_ent);
return rc;
}
SET_FIELD(p_ramrod->flags1,
{
bool result;
- /* if rdma info has not been allocated, naturally there are no qps */
- if (!p_hwfn->p_rdma_info)
+ /* if rdma wasn't activated yet, naturally there are no qps */
+ if (!p_hwfn->p_rdma_info->active)
return false;
spin_lock_bh(&p_hwfn->p_rdma_info->lock);
if (!p_ptt)
goto err;
- rc = qed_rdma_alloc(p_hwfn, p_ptt, params);
+ rc = qed_rdma_alloc(p_hwfn);
if (rc)
goto err1;
goto err2;
qed_ptt_release(p_hwfn, p_ptt);
+ p_hwfn->p_rdma_info->active = 1;
return rc;
u16 max_queue_zones;
enum protocol_type proto;
struct qed_iwarp_info iwarp;
+ u8 active:1;
};
struct qed_rdma_qp {
#if IS_ENABLED(CONFIG_QED_RDMA)
void qed_rdma_dpm_bar(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt);
void qed_rdma_dpm_conf(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt);
+int qed_rdma_info_alloc(struct qed_hwfn *p_hwfn);
+void qed_rdma_info_free(struct qed_hwfn *p_hwfn);
#else
static inline void qed_rdma_dpm_conf(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt) {}
static inline void qed_rdma_dpm_bar(struct qed_hwfn *p_hwfn,
struct qed_ptt *p_ptt) {}
+static inline int qed_rdma_info_alloc(struct qed_hwfn *p_hwfn) {return -EINVAL;}
+static inline void qed_rdma_info_free(struct qed_hwfn *p_hwfn) {}
#endif
int
DP_NOTICE(p_hwfn,
"qed destroy responder failed: cannot allocate memory (ramrod). rc = %d\n",
rc);
+ qed_sp_destroy_request(p_hwfn, p_ent);
return rc;
}
enum spq_mode comp_mode;
struct qed_spq_comp_cb comp_cb;
struct qed_spq_comp_done comp_done; /* SPQ_MODE_EBLOCK */
+
+ /* Posted entry for unlimited list entry in EBLOCK mode */
+ struct qed_spq_entry *post_ent;
};
struct qed_eq {
struct qed_spq_comp_cb *p_comp_data;
};
+/**
+ * @brief Returns a SPQ entry to the pool / frees the entry if allocated.
+ * Should be called on in error flows after initializing the SPQ entry
+ * and before posting it.
+ *
+ * @param p_hwfn
+ * @param p_ent
+ */
+void qed_sp_destroy_request(struct qed_hwfn *p_hwfn,
+ struct qed_spq_entry *p_ent);
+
int qed_sp_init_request(struct qed_hwfn *p_hwfn,
struct qed_spq_entry **pp_ent,
u8 cmd,
#include "qed_sp.h"
#include "qed_sriov.h"
+void qed_sp_destroy_request(struct qed_hwfn *p_hwfn,
+ struct qed_spq_entry *p_ent)
+{
+ /* qed_spq_get_entry() can either get an entry from the free_pool,
+ * or, if no entries are left, allocate a new entry and add it to
+ * the unlimited_pending list.
+ */
+ if (p_ent->queue == &p_hwfn->p_spq->unlimited_pending)
+ kfree(p_ent);
+ else
+ qed_spq_return_entry(p_hwfn, p_ent);
+}
+
int qed_sp_init_request(struct qed_hwfn *p_hwfn,
struct qed_spq_entry **pp_ent,
u8 cmd, u8 protocol, struct qed_sp_init_data *p_data)
case QED_SPQ_MODE_BLOCK:
if (!p_data->p_comp_data)
- return -EINVAL;
+ goto err;
p_ent->comp_cb.cookie = p_data->p_comp_data->cookie;
break;
default:
DP_NOTICE(p_hwfn, "Unknown SPQE completion mode %d\n",
p_ent->comp_mode);
- return -EINVAL;
+ goto err;
}
DP_VERBOSE(p_hwfn, QED_MSG_SPQ,
memset(&p_ent->ramrod, 0, sizeof(p_ent->ramrod));
return 0;
+
+err:
+ qed_sp_destroy_request(p_hwfn, p_ent);
+
+ return -EINVAL;
}
static enum tunnel_clss qed_tunn_clss_to_fw_clss(u8 type)
DP_INFO(p_hwfn, "Ramrod is stuck, requesting MCP drain\n");
rc = qed_mcp_drain(p_hwfn, p_ptt);
+ qed_ptt_release(p_hwfn, p_ptt);
if (rc) {
DP_NOTICE(p_hwfn, "MCP drain failed\n");
goto err;
/* Retry after drain */
rc = __qed_spq_block(p_hwfn, p_ent, p_fw_ret, true);
if (!rc)
- goto out;
+ return 0;
comp_done = (struct qed_spq_comp_done *)p_ent->comp_cb.cookie;
- if (comp_done->done == 1)
+ if (comp_done->done == 1) {
if (p_fw_ret)
*p_fw_ret = comp_done->fw_return_code;
-out:
- qed_ptt_release(p_hwfn, p_ptt);
- return 0;
-
+ return 0;
+ }
err:
- qed_ptt_release(p_hwfn, p_ptt);
DP_NOTICE(p_hwfn,
"Ramrod is stuck [CID %08x cmd %02x protocol %02x echo %04x]\n",
le32_to_cpu(p_ent->elem.hdr.cid),
/* EBLOCK responsible to free the allocated p_ent */
if (p_ent->comp_mode != QED_SPQ_MODE_EBLOCK)
kfree(p_ent);
+ else
+ p_ent->post_ent = p_en2;
p_ent = p_en2;
}
SPQ_HIGH_PRI_RESERVE_DEFAULT);
}
+/* Avoid overriding of SPQ entries when getting out-of-order completions, by
+ * marking the completions in a bitmap and increasing the chain consumer only
+ * for the first successive completed entries.
+ */
+static void qed_spq_comp_bmap_update(struct qed_hwfn *p_hwfn, __le16 echo)
+{
+ u16 pos = le16_to_cpu(echo) % SPQ_RING_SIZE;
+ struct qed_spq *p_spq = p_hwfn->p_spq;
+
+ __set_bit(pos, p_spq->p_comp_bitmap);
+ while (test_bit(p_spq->comp_bitmap_idx,
+ p_spq->p_comp_bitmap)) {
+ __clear_bit(p_spq->comp_bitmap_idx,
+ p_spq->p_comp_bitmap);
+ p_spq->comp_bitmap_idx++;
+ qed_chain_return_produced(&p_spq->chain);
+ }
+}
+
int qed_spq_post(struct qed_hwfn *p_hwfn,
struct qed_spq_entry *p_ent, u8 *fw_return_code)
{
p_ent->queue == &p_spq->unlimited_pending);
if (p_ent->queue == &p_spq->unlimited_pending) {
- /* This is an allocated p_ent which does not need to
- * return to pool.
- */
+ struct qed_spq_entry *p_post_ent = p_ent->post_ent;
+
kfree(p_ent);
- return rc;
+
+ /* Return the entry which was actually posted */
+ p_ent = p_post_ent;
}
if (rc)
spq_post_fail2:
spin_lock_bh(&p_spq->lock);
list_del(&p_ent->list);
- qed_chain_return_produced(&p_spq->chain);
+ qed_spq_comp_bmap_update(p_hwfn, p_ent->elem.hdr.echo);
spq_post_fail:
/* return to the free pool */
spin_lock_bh(&p_spq->lock);
list_for_each_entry_safe(p_ent, tmp, &p_spq->completion_pending, list) {
if (p_ent->elem.hdr.echo == echo) {
- u16 pos = le16_to_cpu(echo) % SPQ_RING_SIZE;
-
list_del(&p_ent->list);
-
- /* Avoid overriding of SPQ entries when getting
- * out-of-order completions, by marking the completions
- * in a bitmap and increasing the chain consumer only
- * for the first successive completed entries.
- */
- __set_bit(pos, p_spq->p_comp_bitmap);
-
- while (test_bit(p_spq->comp_bitmap_idx,
- p_spq->p_comp_bitmap)) {
- __clear_bit(p_spq->comp_bitmap_idx,
- p_spq->p_comp_bitmap);
- p_spq->comp_bitmap_idx++;
- qed_chain_return_produced(&p_spq->chain);
- }
-
+ qed_spq_comp_bmap_update(p_hwfn, echo);
p_spq->comp_count++;
found = p_ent;
break;
QED_MSG_SPQ,
"Got a completion without a callback function\n");
- if ((found->comp_mode != QED_SPQ_MODE_EBLOCK) ||
- (found->queue == &p_spq->unlimited_pending))
+ if (found->comp_mode != QED_SPQ_MODE_EBLOCK)
/* EBLOCK is responsible for returning its own entry into the
- * free list, unless it originally added the entry into the
- * unlimited pending list.
+ * free list.
*/
qed_spq_return_entry(p_hwfn, found);
default:
DP_NOTICE(p_hwfn, "Unknown VF personality %d\n",
p_hwfn->hw_info.personality);
+ qed_sp_destroy_request(p_hwfn, p_ent);
return -EINVAL;
}
struct cmd_desc_type0 *first_desc, struct sk_buff *skb,
struct qlcnic_host_tx_ring *tx_ring)
{
- u8 l4proto, opcode = 0, hdr_len = 0;
+ u8 l4proto, opcode = 0, hdr_len = 0, tag_vlan = 0;
u16 flags = 0, vlan_tci = 0;
int copied, offset, copy_len, size;
struct cmd_desc_type0 *hwdesc;
flags = QLCNIC_FLAGS_VLAN_TAGGED;
vlan_tci = ntohs(vh->h_vlan_TCI);
protocol = ntohs(vh->h_vlan_encapsulated_proto);
+ tag_vlan = 1;
} else if (skb_vlan_tag_present(skb)) {
flags = QLCNIC_FLAGS_VLAN_OOB;
vlan_tci = skb_vlan_tag_get(skb);
+ tag_vlan = 1;
}
if (unlikely(adapter->tx_pvid)) {
- if (vlan_tci && !(adapter->flags & QLCNIC_TAGGING_ENABLED))
+ if (tag_vlan && !(adapter->flags & QLCNIC_TAGGING_ENABLED))
return -EIO;
- if (vlan_tci && (adapter->flags & QLCNIC_TAGGING_ENABLED))
+ if (tag_vlan && (adapter->flags & QLCNIC_TAGGING_ENABLED))
goto set_flags;
flags = QLCNIC_FLAGS_VLAN_OOB;
struct net_device *real_dev,
struct rmnet_endpoint *ep)
{
- struct rmnet_priv *priv;
+ struct rmnet_priv *priv = netdev_priv(rmnet_dev);
int rc;
if (ep->egress_dev)
rmnet_dev->hw_features |= NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
rmnet_dev->hw_features |= NETIF_F_SG;
+ priv->real_dev = real_dev;
+
rc = register_netdevice(rmnet_dev);
if (!rc) {
ep->egress_dev = rmnet_dev;
rmnet_dev->rtnl_link_ops = &rmnet_link_ops;
- priv = netdev_priv(rmnet_dev);
priv->mux_id = id;
- priv->real_dev = real_dev;
netdev_dbg(rmnet_dev, "rmnet dev created\n");
}
struct cp_private *cp;
int handled = 0;
u16 status;
+ u16 mask;
if (unlikely(dev == NULL))
return IRQ_NONE;
spin_lock(&cp->lock);
+ mask = cpr16(IntrMask);
+ if (!mask)
+ goto out_unlock;
+
status = cpr16(IntrStatus);
if (!status || (status == 0xFFFF))
goto out_unlock;
NETIF_MSG_TX_ERR)
/* Parameter for descriptor */
-#define AVE_NR_TXDESC 32 /* Tx descriptor */
-#define AVE_NR_RXDESC 64 /* Rx descriptor */
+#define AVE_NR_TXDESC 64 /* Tx descriptor */
+#define AVE_NR_RXDESC 256 /* Rx descriptor */
#define AVE_DESC_OFS_CMDSTS 0
#define AVE_DESC_OFS_ADDRL 4
/* Parameter for ethernet frame */
#define AVE_MAX_ETHFRAME 1518
+#define AVE_FRAME_HEADROOM 2
/* Parameter for interrupt */
#define AVE_INTM_COUNT 20
skb = priv->rx.desc[entry].skbs;
if (!skb) {
- skb = netdev_alloc_skb_ip_align(ndev,
- AVE_MAX_ETHFRAME);
+ skb = netdev_alloc_skb(ndev, AVE_MAX_ETHFRAME);
if (!skb) {
netdev_err(ndev, "can't allocate skb for Rx\n");
return -ENOMEM;
}
+ skb->data += AVE_FRAME_HEADROOM;
+ skb->tail += AVE_FRAME_HEADROOM;
}
/* set disable to cmdsts */
* - Rx buffer begins with 2 byte headroom, and data will be put from
* (buffer + 2).
* To satisfy this, specify the address to put back the buffer
- * pointer advanced by NET_IP_ALIGN by netdev_alloc_skb_ip_align(),
- * and expand the map size by NET_IP_ALIGN.
+ * pointer advanced by AVE_FRAME_HEADROOM, and expand the map size
+ * by AVE_FRAME_HEADROOM.
*/
ret = ave_dma_map(ndev, &priv->rx.desc[entry],
- skb->data - NET_IP_ALIGN,
- AVE_MAX_ETHFRAME + NET_IP_ALIGN,
+ skb->data - AVE_FRAME_HEADROOM,
+ AVE_MAX_ETHFRAME + AVE_FRAME_HEADROOM,
DMA_FROM_DEVICE, &paddr);
if (ret) {
netdev_err(ndev, "can't map skb for Rx\n");
pdev->name, pdev->id);
/* Register as a NAPI supported driver */
- netif_napi_add(ndev, &priv->napi_rx, ave_napi_poll_rx, priv->rx.ndesc);
+ netif_napi_add(ndev, &priv->napi_rx, ave_napi_poll_rx,
+ NAPI_POLL_WEIGHT);
netif_tx_napi_add(ndev, &priv->napi_tx, ave_napi_poll_tx,
- priv->tx.ndesc);
+ NAPI_POLL_WEIGHT);
platform_set_drvdata(pdev, ndev);
};
module_platform_driver(ave_driver);
+MODULE_AUTHOR("Kunihiko Hayashi <hayashi.kunihiko@socionext.com>");
MODULE_DESCRIPTION("Socionext UniPhier AVE ethernet driver");
MODULE_LICENSE("GPL v2");
/* GMAC TX FIFO is 8K, Rx FIFO is 16K */
#define BUF_SIZE_16KiB 16384
-#define BUF_SIZE_8KiB 8192
+/* RX Buffer size must be < 8191 and multiple of 4/8/16 bytes */
+#define BUF_SIZE_8KiB 8188
#define BUF_SIZE_4KiB 4096
#define BUF_SIZE_2KiB 2048
/* Enhanced descriptors */
static inline void ehn_desc_rx_set_on_ring(struct dma_desc *p, int end)
{
- p->des1 |= cpu_to_le32(((BUF_SIZE_8KiB - 1)
+ p->des1 |= cpu_to_le32((BUF_SIZE_8KiB
<< ERDES1_BUFFER2_SIZE_SHIFT)
& ERDES1_BUFFER2_SIZE_MASK);
int mode, int end)
{
p->des0 |= cpu_to_le32(RDES0_OWN);
- p->des1 |= cpu_to_le32((BUF_SIZE_8KiB - 1) & ERDES1_BUFFER1_SIZE_MASK);
+ p->des1 |= cpu_to_le32(BUF_SIZE_8KiB & ERDES1_BUFFER1_SIZE_MASK);
if (mode == STMMAC_CHAIN_MODE)
ehn_desc_rx_set_on_chain(p);
static int set_16kib_bfsize(int mtu)
{
int ret = 0;
- if (unlikely(mtu >= BUF_SIZE_8KiB))
+ if (unlikely(mtu > BUF_SIZE_8KiB))
ret = BUF_SIZE_16KiB;
return ret;
}
netdev_warn(priv->dev, "PTP init failed\n");
}
-#ifdef CONFIG_DEBUG_FS
- ret = stmmac_init_fs(dev);
- if (ret < 0)
- netdev_warn(priv->dev, "%s: failed debugFS registration\n",
- __func__);
-#endif
priv->tx_lpi_timer = STMMAC_DEFAULT_TWT_LS;
if (priv->use_riwt) {
netif_carrier_off(dev);
-#ifdef CONFIG_DEBUG_FS
- stmmac_exit_fs(dev);
-#endif
-
stmmac_release_ptp(priv);
return 0;
u32 tx_count = priv->plat->tx_queues_to_use;
u32 queue;
+ if ((dev->flags & IFF_UP) == 0)
+ return 0;
+
for (queue = 0; queue < rx_count; queue++) {
struct stmmac_rx_queue *rx_q = &priv->rx_queue[queue];
goto error_netdev_register;
}
+#ifdef CONFIG_DEBUG_FS
+ ret = stmmac_init_fs(ndev);
+ if (ret < 0)
+ netdev_warn(priv->dev, "%s: failed debugFS registration\n",
+ __func__);
+#endif
+
return ret;
error_netdev_register:
netdev_info(priv->dev, "%s: removing driver", __func__);
+#ifdef CONFIG_DEBUG_FS
+ stmmac_exit_fs(ndev);
+#endif
stmmac_stop_all_dma(priv);
stmmac_mac_set(priv, priv->ioaddr, false);
*/
int stmmac_mdio_reset(struct mii_bus *bus)
{
-#if defined(CONFIG_STMMAC_PLATFORM)
+#if IS_ENABLED(CONFIG_STMMAC_PLATFORM)
struct net_device *ndev = bus->priv;
struct stmmac_priv *priv = netdev_priv(ndev);
unsigned int mii_address = priv->hw->mii.addr;
"tx_jumbo",
"rx_mac_control_frames",
"tx_mac_control_frames",
- "rx_frame_alignement_errors",
+ "rx_frame_alignment_errors",
"rx_long_ok",
"rx_long_err",
"tx_sqe_errors",
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: GPL-2.0+
/* FDDI network adapter driver for DEC FDDIcontroller 700/700-C devices.
*
* Copyright (c) 2018 Maciej W. Rozycki
#define DRV_VERSION "v.1.1.4"
#define DRV_RELDATE "Oct 6 2018"
-static char version[] =
+static const char version[] =
DRV_NAME ": " DRV_VERSION " " DRV_RELDATE " Maciej W. Rozycki\n";
MODULE_AUTHOR("Maciej W. Rozycki <macro@linux-mips.org>");
static void fza_tx_smt(struct net_device *dev)
{
struct fza_private *fp = netdev_priv(dev);
- struct fza_buffer_tx __iomem *smt_tx_ptr, *skb_data_ptr;
+ struct fza_buffer_tx __iomem *smt_tx_ptr;
int i, len;
u32 own;
if (!netif_queue_stopped(dev)) {
if (dev_nit_active(dev)) {
+ struct fza_buffer_tx *skb_data_ptr;
struct sk_buff *skb;
/* Length must be a multiple of 4 as only word
-/* SPDX-License-Identifier: GPL-2.0 */
+/* SPDX-License-Identifier: GPL-2.0+ */
/* FDDI network adapter driver for DEC FDDIcontroller 700/700-C devices.
*
* Copyright (c) 2018 Maciej W. Rozycki
#define FZA_RING_CMD 0x200400 /* command ring address */
#define FZA_RING_CMD_SIZE 0x40 /* command descriptor ring
* size
+ */
/* Command constants. */
#define FZA_RING_CMD_MASK 0x7fffffff
#define FZA_RING_CMD_NOP 0x00000000 /* nop */
goto hash_add;
}
- err = -EBUSY;
+ err = -EADDRINUSE;
if (macvlan_addr_busy(vlan->port, dev->dev_addr))
goto out;
} else {
/* Rehash and update the device filters */
if (macvlan_addr_busy(vlan->port, addr))
- return -EBUSY;
+ return -EADDRINUSE;
if (!macvlan_passthru(port)) {
err = dev_uc_add(lowerdev, addr);
return dev_set_mac_address(vlan->lowerdev, addr);
}
+ if (macvlan_addr_busy(vlan->port, addr->sa_data))
+ return -EADDRINUSE;
+
return macvlan_sync_address(dev, addr->sa_data);
}
static unsigned int tx_stop = 5;
struct ntb_netdev {
- struct list_head list;
struct pci_dev *pdev;
struct net_device *ndev;
struct ntb_transport_qp *qp;
#define NTB_TX_TIMEOUT_MS 1000
#define NTB_RXQ_SIZE 100
-static LIST_HEAD(dev_list);
-
static void ntb_netdev_event_handler(void *data, int link_is_up)
{
struct net_device *ndev = data;
struct net_device *ndev = dev->ndev;
if (ntb_transport_tx_free_entry(dev->qp) < tx_stop) {
- mod_timer(&dev->tx_timer, jiffies + msecs_to_jiffies(tx_time));
+ mod_timer(&dev->tx_timer, jiffies + usecs_to_jiffies(tx_time));
} else {
/* Make sure anybody stopping the queue after this sees the new
* value of ntb_transport_tx_free_entry()
if (rc)
goto err1;
- list_add(&dev->list, &dev_list);
+ dev_set_drvdata(client_dev, ndev);
dev_info(&pdev->dev, "%s created\n", ndev->name);
return 0;
static void ntb_netdev_remove(struct device *client_dev)
{
- struct ntb_dev *ntb;
- struct net_device *ndev;
- struct pci_dev *pdev;
- struct ntb_netdev *dev;
- bool found = false;
-
- ntb = dev_ntb(client_dev->parent);
- pdev = ntb->pdev;
-
- list_for_each_entry(dev, &dev_list, list) {
- if (dev->pdev == pdev) {
- found = true;
- break;
- }
- }
- if (!found)
- return;
-
- list_del(&dev->list);
-
- ndev = dev->ndev;
+ struct net_device *ndev = dev_get_drvdata(client_dev);
+ struct ntb_netdev *dev = netdev_priv(ndev);
unregister_netdev(ndev);
ntb_transport_free_queue(dev->qp);
return 0;
}
-static int bcm5481x_config(struct phy_device *phydev)
+static int bcm54xx_config_clock_delay(struct phy_device *phydev)
{
int rc, val;
ret = genphy_config_aneg(phydev);
/* Then we can set up the delay. */
- bcm5481x_config(phydev);
+ bcm54xx_config_clock_delay(phydev);
if (of_property_read_bool(np, "enet-phy-lane-swap")) {
/* Lane Swap - Undocumented register...magic! */
return ret;
}
+static int bcm54616s_config_aneg(struct phy_device *phydev)
+{
+ int ret;
+
+ /* Aneg firsly. */
+ ret = genphy_config_aneg(phydev);
+
+ /* Then we can set up the delay. */
+ bcm54xx_config_clock_delay(phydev);
+
+ return ret;
+}
+
static int brcm_phy_setbits(struct phy_device *phydev, int reg, int set)
{
int val;
.features = PHY_GBIT_FEATURES,
.flags = PHY_HAS_INTERRUPT,
.config_init = bcm54xx_config_init,
+ .config_aneg = bcm54616s_config_aneg,
.ack_interrupt = bcm_phy_ack_intr,
.config_intr = bcm_phy_config_intr,
}, {
* assume the pin serves as pull-up. If direction is
* output, the default value is high.
*/
- gpiod_set_value(bitbang->mdo, 1);
+ gpiod_set_value_cansleep(bitbang->mdo, 1);
return;
}
struct mdio_gpio_info *bitbang =
container_of(ctrl, struct mdio_gpio_info, ctrl);
- return gpiod_get_value(bitbang->mdio);
+ return gpiod_get_value_cansleep(bitbang->mdio);
}
static void mdio_set(struct mdiobb_ctrl *ctrl, int what)
container_of(ctrl, struct mdio_gpio_info, ctrl);
if (bitbang->mdo)
- gpiod_set_value(bitbang->mdo, what);
+ gpiod_set_value_cansleep(bitbang->mdo, what);
else
- gpiod_set_value(bitbang->mdio, what);
+ gpiod_set_value_cansleep(bitbang->mdio, what);
}
static void mdc_set(struct mdiobb_ctrl *ctrl, int what)
struct mdio_gpio_info *bitbang =
container_of(ctrl, struct mdio_gpio_info, ctrl);
- gpiod_set_value(bitbang->mdc, what);
+ gpiod_set_value_cansleep(bitbang->mdc, what);
}
static const struct mdiobb_ops mdio_gpio_ops = {
phydev->mdix_ctrl = ETH_TP_MDI_AUTO;
mutex_lock(&phydev->lock);
- rc = phy_select_page(phydev, MSCC_PHY_PAGE_EXTENDED_2);
- if (rc < 0)
- goto out_unlock;
- reg_val = phy_read(phydev, MSCC_PHY_RGMII_CNTL);
- reg_val &= ~(RGMII_RX_CLK_DELAY_MASK);
- reg_val |= (RGMII_RX_CLK_DELAY_1_1_NS << RGMII_RX_CLK_DELAY_POS);
- phy_write(phydev, MSCC_PHY_RGMII_CNTL, reg_val);
+ reg_val = RGMII_RX_CLK_DELAY_1_1_NS << RGMII_RX_CLK_DELAY_POS;
+
+ rc = phy_modify_paged(phydev, MSCC_PHY_PAGE_EXTENDED_2,
+ MSCC_PHY_RGMII_CNTL, RGMII_RX_CLK_DELAY_MASK,
+ reg_val);
-out_unlock:
- rc = phy_restore_page(phydev, rc, rc > 0 ? 0 : rc);
mutex_unlock(&phydev->lock);
return rc;
static int __set_phy_supported(struct phy_device *phydev, u32 max_speed)
{
- phydev->supported &= ~(PHY_1000BT_FEATURES | PHY_100BT_FEATURES |
- PHY_10BT_FEATURES);
-
switch (max_speed) {
- default:
- return -ENOTSUPP;
- case SPEED_1000:
- phydev->supported |= PHY_1000BT_FEATURES;
+ case SPEED_10:
+ phydev->supported &= ~PHY_100BT_FEATURES;
/* fall through */
case SPEED_100:
- phydev->supported |= PHY_100BT_FEATURES;
- /* fall through */
- case SPEED_10:
- phydev->supported |= PHY_10BT_FEATURES;
+ phydev->supported &= ~PHY_1000BT_FEATURES;
+ break;
+ case SPEED_1000:
+ break;
+ default:
+ return -ENOTSUPP;
}
return 0;
new_driver->mdiodrv.driver.remove = phy_remove;
new_driver->mdiodrv.driver.owner = owner;
+ /* The following works around an issue where the PHY driver doesn't bind
+ * to the device, resulting in the genphy driver being used instead of
+ * the dedicated driver. The root cause of the issue isn't known yet
+ * and seems to be in the base driver core. Once this is fixed we may
+ * remove this workaround.
+ */
+ new_driver->mdiodrv.driver.probe_type = PROBE_FORCE_SYNCHRONOUS;
+
retval = driver_register(&new_driver->mdiodrv.driver);
if (retval) {
pr_err("%s: Error %d in registering driver\n",
.flags = PHY_HAS_INTERRUPT,
}, {
.phy_id = 0x001cc816,
- .name = "RTL8201F 10/100Mbps Ethernet",
+ .name = "RTL8201F Fast Ethernet",
.phy_id_mask = 0x001fffff,
.features = PHY_BASIC_FEATURES,
.flags = PHY_HAS_INTERRUPT,
/* 1000Base-PX or 1000Base-BX10 */
if ((id->base.e_base_px || id->base.e_base_bx10) &&
br_min <= 1300 && br_max >= 1200)
- phylink_set(support, 1000baseX_Full);
+ phylink_set(modes, 1000baseX_Full);
/* For active or passive cables, select the link modes
* based on the bit rates and the cable compliance bytes.
* it just report sending a packet to the target
* (without actual packet transfer).
*/
- dev_kfree_skb_any(skb);
ndev->stats.tx_packets++;
ndev->stats.tx_bytes += skb->len;
+ dev_kfree_skb_any(skb);
}
}
team->en_port_count--;
team_queue_override_port_del(team, port);
team_adjust_ops(team);
- team_notify_peers(team);
- team_mcast_rejoin(team);
team_lower_state_changed(port);
}
if (!rx_batched || (!more && skb_queue_empty(queue))) {
local_bh_disable();
+ skb_record_rx_queue(skb, tfile->queue_index);
netif_receive_skb(skb);
local_bh_enable();
return;
struct sk_buff *nskb;
local_bh_disable();
- while ((nskb = __skb_dequeue(&process_queue)))
+ while ((nskb = __skb_dequeue(&process_queue))) {
+ skb_record_rx_queue(nskb, tfile->queue_index);
netif_receive_skb(nskb);
+ }
+ skb_record_rx_queue(skb, tfile->queue_index);
netif_receive_skb(skb);
local_bh_enable();
}
static int tun_validate(struct nlattr *tb[], struct nlattr *data[],
struct netlink_ext_ack *extack)
{
- if (!data)
- return 0;
- return -EINVAL;
+ NL_SET_ERR_MSG(extack,
+ "tun/tap creation via rtnetlink is not supported.");
+ return -EOPNOTSUPP;
}
static size_t tun_get_size(const struct net_device *dev)
struct tun_file *tfile,
struct xdp_buff *xdp, int *flush)
{
+ unsigned int datasize = xdp->data_end - xdp->data;
struct tun_xdp_hdr *hdr = xdp->data_hard_start;
struct virtio_net_hdr *gso = &hdr->gso;
struct tun_pcpu_stats *stats;
if (!rcu_dereference(tun->steering_prog))
rxhash = __skb_get_hash_symmetric(skb);
+ skb_record_rx_queue(skb, tfile->queue_index);
netif_receive_skb(skb);
stats = get_cpu_ptr(tun->pcpu_stats);
u64_stats_update_begin(&stats->syncp);
stats->rx_packets++;
- stats->rx_bytes += skb->len;
+ stats->rx_bytes += datasize;
u64_stats_update_end(&stats->syncp);
put_cpu_ptr(stats);
struct usb_device *udev;
struct usb_interface *intf;
struct net_device *net;
- struct sk_buff *tx_skb;
struct urb *tx_urb;
struct urb *rx_urb;
unsigned char *tx_buf;
case -ENOENT:
case -ECONNRESET:
case -ESHUTDOWN:
+ case -EPROTO:
return;
case 0:
break;
dev_err(&dev->intf->dev, "%s: urb status: %d\n",
__func__, status);
- dev_kfree_skb_irq(dev->tx_skb);
if (status == 0)
netif_wake_queue(dev->net);
else
if (skb->len > IPHETH_BUF_SIZE) {
WARN(1, "%s: skb too large: %d bytes\n", __func__, skb->len);
dev->net->stats.tx_dropped++;
- dev_kfree_skb_irq(skb);
+ dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
}
dev_err(&dev->intf->dev, "%s: usb_submit_urb: %d\n",
__func__, retval);
dev->net->stats.tx_errors++;
- dev_kfree_skb_irq(skb);
+ dev_kfree_skb_any(skb);
} else {
- dev->tx_skb = skb;
-
dev->net->stats.tx_packets++;
dev->net->stats.tx_bytes += skb->len;
+ dev_consume_skb_any(skb);
netif_stop_queue(net);
}
dev->net->ethtool_ops = &smsc95xx_ethtool_ops;
dev->net->flags |= IFF_MULTICAST;
dev->net->hard_header_len += SMSC95XX_TX_OVERHEAD_CSUM;
+ dev->net->min_mtu = ETH_MIN_MTU;
+ dev->net->max_mtu = ETH_DATA_LEN;
dev->hard_mtu = dev->net->mtu + dev->net->hard_header_len;
pdata->dev = dev;
return ret;
}
+ cancel_delayed_work_sync(&pdata->carrier_check);
+
if (pdata->suspend_flags) {
netdev_warn(dev->net, "error during last resume\n");
pdata->suspend_flags = 0;
*/
if (ret && PMSG_IS_AUTO(message))
usbnet_resume(intf);
+
+ if (ret)
+ schedule_delayed_work(&pdata->carrier_check,
+ CARRIER_CHECK_DELAY);
+
return ret;
}
VIRTIO_NET_F_GUEST_TSO4,
VIRTIO_NET_F_GUEST_TSO6,
VIRTIO_NET_F_GUEST_ECN,
- VIRTIO_NET_F_GUEST_UFO
+ VIRTIO_NET_F_GUEST_UFO,
+ VIRTIO_NET_F_GUEST_CSUM
};
struct virtnet_stat_desc {
static struct sk_buff *page_to_skb(struct virtnet_info *vi,
struct receive_queue *rq,
struct page *page, unsigned int offset,
- unsigned int len, unsigned int truesize)
+ unsigned int len, unsigned int truesize,
+ bool hdr_valid)
{
struct sk_buff *skb;
struct virtio_net_hdr_mrg_rxbuf *hdr;
else
hdr_padded_len = sizeof(struct padded_vnet_hdr);
- memcpy(hdr, p, hdr_len);
+ if (hdr_valid)
+ memcpy(hdr, p, hdr_len);
len -= hdr_len;
offset += hdr_padded_len;
struct virtnet_rq_stats *stats)
{
struct page *page = buf;
- struct sk_buff *skb = page_to_skb(vi, rq, page, 0, len, PAGE_SIZE);
+ struct sk_buff *skb = page_to_skb(vi, rq, page, 0, len,
+ PAGE_SIZE, true);
stats->bytes += len - vi->hdr_len;
if (unlikely(!skb))
rcu_read_unlock();
put_page(page);
head_skb = page_to_skb(vi, rq, xdp_page,
- offset, len, PAGE_SIZE);
+ offset, len,
+ PAGE_SIZE, false);
return head_skb;
}
break;
goto err_skb;
}
- head_skb = page_to_skb(vi, rq, page, offset, len, truesize);
+ head_skb = page_to_skb(vi, rq, page, offset, len, truesize, !xdp_prog);
curr_skb = head_skb;
if (unlikely(!curr_skb))
if (!vi->guest_offloads)
return 0;
- if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))
- offloads = 1ULL << VIRTIO_NET_F_GUEST_CSUM;
-
return virtnet_set_guest_offloads(vi, offloads);
}
if (!vi->guest_offloads)
return 0;
- if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))
- offloads |= 1ULL << VIRTIO_NET_F_GUEST_CSUM;
return virtnet_set_guest_offloads(vi, offloads);
}
&& (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||
virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6) ||
virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ECN) ||
- virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO))) {
- NL_SET_ERR_MSG_MOD(extack, "Can't set XDP while host is implementing LRO, disable LRO first");
+ virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO) ||
+ virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))) {
+ NL_SET_ERR_MSG_MOD(extack, "Can't set XDP while host is implementing LRO/CSUM, disable LRO/CSUM first");
return -EOPNOTSUPP;
}
u32 bitmap;
if (drop) {
- if (vif->type == NL80211_IFTYPE_STATION) {
+ if (vif && vif->type == NL80211_IFTYPE_STATION) {
bitmap = ~(1 << WMI_MGMT_TID);
list_for_each_entry(arvif, &ar->arvifs, list) {
if (arvif->vdev_type == WMI_VDEV_TYPE_STA)
struct ath_vif *avp = (void *)vif->drv_priv;
struct ath_node *an = &avp->mcast_node;
+ mutex_lock(&sc->mutex);
if (IS_ENABLED(CONFIG_ATH9K_TX99)) {
if (sc->cur_chan->nvifs >= 1) {
mutex_unlock(&sc->mutex);
sc->tx99_vif = vif;
}
- mutex_lock(&sc->mutex);
-
ath_dbg(common, CONFIG, "Attach a VIF of type: %d\n", vif->type);
sc->cur_chan->nvifs++;
* for subsequent chanspecs.
*/
channel->flags = IEEE80211_CHAN_NO_HT40 |
- IEEE80211_CHAN_NO_80MHZ;
+ IEEE80211_CHAN_NO_80MHZ |
+ IEEE80211_CHAN_NO_160MHZ;
ch.bw = BRCMU_CHAN_BW_20;
cfg->d11inf.encchspec(&ch);
chaninfo = ch.chspec;
}
break;
case BRCMU_CHSPEC_D11AC_BW_160:
+ ch->bw = BRCMU_CHAN_BW_160;
+ ch->sb = brcmu_maskget16(ch->chspec, BRCMU_CHSPEC_D11AC_SB_MASK,
+ BRCMU_CHSPEC_D11AC_SB_SHIFT);
switch (ch->sb) {
case BRCMU_CHAN_SB_LLL:
ch->control_ch_num -= CH_70MHZ_APART;
* GPL LICENSE SUMMARY
*
* Copyright(c) 2017 Intel Deutschland GmbH
+ * Copyright(c) 2018 Intel Corporation
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* BSD LICENSE
*
* Copyright(c) 2017 Intel Deutschland GmbH
+ * Copyright(c) 2018 Intel Corporation
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
#define ACPI_WRDS_WIFI_DATA_SIZE (ACPI_SAR_TABLE_SIZE + 2)
#define ACPI_EWRD_WIFI_DATA_SIZE ((ACPI_SAR_PROFILE_NUM - 1) * \
ACPI_SAR_TABLE_SIZE + 3)
-#define ACPI_WGDS_WIFI_DATA_SIZE 18
+#define ACPI_WGDS_WIFI_DATA_SIZE 19
#define ACPI_WRDD_WIFI_DATA_SIZE 2
#define ACPI_SPLC_WIFI_DATA_SIZE 2
const struct iwl_fw_runtime_ops *ops, void *ops_ctx,
struct dentry *dbgfs_dir);
-void iwl_fw_runtime_exit(struct iwl_fw_runtime *fwrt);
+static inline void iwl_fw_runtime_free(struct iwl_fw_runtime *fwrt)
+{
+ kfree(fwrt->dump.d3_debug_data);
+ fwrt->dump.d3_debug_data = NULL;
+}
void iwl_fw_runtime_suspend(struct iwl_fw_runtime *fwrt);
IWL_DEBUG_RADIO(mvm, "Sending GEO_TX_POWER_LIMIT\n");
BUILD_BUG_ON(ACPI_NUM_GEO_PROFILES * ACPI_WGDS_NUM_BANDS *
- ACPI_WGDS_TABLE_SIZE != ACPI_WGDS_WIFI_DATA_SIZE);
+ ACPI_WGDS_TABLE_SIZE + 1 != ACPI_WGDS_WIFI_DATA_SIZE);
BUILD_BUG_ON(ACPI_NUM_GEO_PROFILES > IWL_NUM_GEO_PROFILES);
return -ENOENT;
}
+static int iwl_mvm_sar_get_wgds_table(struct iwl_mvm *mvm)
+{
+ return -ENOENT;
+}
+
static int iwl_mvm_sar_geo_init(struct iwl_mvm *mvm)
{
return 0;
IWL_DEBUG_RADIO(mvm,
"WRDS SAR BIOS table invalid or unavailable. (%d)\n",
ret);
- /* if not available, don't fail and don't bother with EWRD */
- return 0;
+ /*
+ * If not available, don't fail and don't bother with EWRD.
+ * Return 1 to tell that we can't use WGDS either.
+ */
+ return 1;
}
ret = iwl_mvm_sar_get_ewrd_table(mvm);
/* choose profile 1 (WRDS) as default for both chains */
ret = iwl_mvm_sar_select_profile(mvm, 1, 1);
- /* if we don't have profile 0 from BIOS, just skip it */
+ /*
+ * If we don't have profile 0 from BIOS, just skip it. This
+ * means that SAR Geo will not be enabled either, even if we
+ * have other valid profiles.
+ */
if (ret == -ENOENT)
- return 0;
+ return 1;
return ret;
}
iwl_mvm_unref(mvm, IWL_MVM_REF_UCODE_DOWN);
ret = iwl_mvm_sar_init(mvm);
- if (ret)
- goto error;
+ if (ret == 0) {
+ ret = iwl_mvm_sar_geo_init(mvm);
+ } else if (ret > 0 && !iwl_mvm_sar_get_wgds_table(mvm)) {
+ /*
+ * If basic SAR is not available, we check for WGDS,
+ * which should *not* be available either. If it is
+ * available, issue an error, because we can't use SAR
+ * Geo without basic SAR.
+ */
+ IWL_ERR(mvm, "BIOS contains WGDS but no WRDS\n");
+ }
- ret = iwl_mvm_sar_geo_init(mvm);
- if (ret)
+ if (ret < 0)
goto error;
iwl_mvm_leds_sync(mvm);
goto out;
}
- if (changed)
- *changed = (resp->status == MCC_RESP_NEW_CHAN_PROFILE);
+ if (changed) {
+ u32 status = le32_to_cpu(resp->status);
+
+ *changed = (status == MCC_RESP_NEW_CHAN_PROFILE ||
+ status == MCC_RESP_ILLEGAL);
+ }
regd = iwl_parse_nvm_mcc_info(mvm->trans->dev, mvm->cfg,
__le32_to_cpu(resp->n_channels),
sinfo->filled |= BIT_ULL(NL80211_STA_INFO_SIGNAL_AVG);
}
- if (!fw_has_capa(&mvm->fw->ucode_capa,
- IWL_UCODE_TLV_CAPA_RADIO_BEACON_STATS))
- return;
-
/* if beacon filtering isn't on mac80211 does it anyway */
if (!(vif->driver_flags & IEEE80211_VIF_BEACON_FILTER))
return;
}
IWL_DEBUG_LAR(mvm,
- "MCC response status: 0x%x. new MCC: 0x%x ('%c%c') change: %d n_chans: %d\n",
- status, mcc, mcc >> 8, mcc & 0xff,
- !!(status == MCC_RESP_NEW_CHAN_PROFILE), n_channels);
+ "MCC response status: 0x%x. new MCC: 0x%x ('%c%c') n_chans: %d\n",
+ status, mcc, mcc >> 8, mcc & 0xff, n_channels);
exit:
iwl_free_resp(&cmd);
iwl_mvm_thermal_exit(mvm);
out_free:
iwl_fw_flush_dump(&mvm->fwrt);
+ iwl_fw_runtime_free(&mvm->fwrt);
if (iwlmvm_mod_params.init_dbg)
return op_mode;
iwl_mvm_tof_clean(mvm);
+ iwl_fw_runtime_free(&mvm->fwrt);
mutex_destroy(&mvm->mutex);
mutex_destroy(&mvm->d0i3_suspend_mutex);
wiphy_ext_feature_set(hw->wiphy, NL80211_EXT_FEATURE_CQM_RSSI_LIST);
+ tasklet_hrtimer_init(&data->beacon_timer,
+ mac80211_hwsim_beacon,
+ CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+
err = ieee80211_register_hw(hw);
if (err < 0) {
pr_debug("mac80211_hwsim: ieee80211_register_hw failed (%d)\n",
data->debugfs,
data, &hwsim_simulate_radar);
- tasklet_hrtimer_init(&data->beacon_timer,
- mac80211_hwsim_beacon,
- CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
-
spin_lock_bh(&hwsim_radio_lock);
err = rhashtable_insert_fast(&hwsim_radios_rht, &data->rht,
hwsim_rht_params);
if (err)
goto out_unregister_pernet;
+ err = hwsim_init_netlink();
+ if (err)
+ goto out_unregister_driver;
+
hwsim_class = class_create(THIS_MODULE, "mac80211_hwsim");
if (IS_ERR(hwsim_class)) {
err = PTR_ERR(hwsim_class);
- goto out_unregister_driver;
+ goto out_exit_netlink;
}
- err = hwsim_init_netlink();
- if (err < 0)
- goto out_unregister_driver;
-
for (i = 0; i < radios; i++) {
struct hwsim_new_radio_params param = { 0 };
free_netdev(hwsim_mon);
out_free_radios:
mac80211_hwsim_free();
+out_exit_netlink:
+ hwsim_exit_netlink();
out_unregister_driver:
platform_driver_unregister(&mac80211_hwsim_driver);
out_unregister_pernet:
config MT76_CORE
tristate
+config MT76_LEDS
+ bool
+ depends on MT76_CORE
+ depends on LEDS_CLASS=y || MT76_CORE=LEDS_CLASS
+ default y
+
config MT76_USB
tristate
depends on MT76_CORE
mt76_check_sband(dev, NL80211_BAND_2GHZ);
mt76_check_sband(dev, NL80211_BAND_5GHZ);
- ret = mt76_led_init(dev);
- if (ret)
- return ret;
+ if (IS_ENABLED(CONFIG_MT76_LEDS)) {
+ ret = mt76_led_init(dev);
+ if (ret)
+ return ret;
+ }
return ieee80211_register_hw(hw);
}
struct mac_address macaddr_list[8];
struct mutex phy_mutex;
- struct mutex mutex;
u8 txdone_seq;
DECLARE_KFIFO_PTR(txstatus_fifo, struct mt76x02_tx_status);
mt76x2_dfs_init_detector(dev);
/* init led callbacks */
- dev->mt76.led_cdev.brightness_set = mt76x2_led_set_brightness;
- dev->mt76.led_cdev.blink_set = mt76x2_led_set_blink;
+ if (IS_ENABLED(CONFIG_MT76_LEDS)) {
+ dev->mt76.led_cdev.brightness_set = mt76x2_led_set_brightness;
+ dev->mt76.led_cdev.blink_set = mt76x2_led_set_blink;
+ }
ret = mt76_register_device(&dev->mt76, true, mt76x02_rates,
ARRAY_SIZE(mt76x02_rates));
if (val != ~0 && val > 0xffff)
return -EINVAL;
- mutex_lock(&dev->mutex);
+ mutex_lock(&dev->mt76.mutex);
mt76x2_mac_set_tx_protection(dev, val);
- mutex_unlock(&dev->mutex);
+ mutex_unlock(&dev->mt76.mutex);
return 0;
}
struct resource res[2];
mmc_pm_flag_t mmcflags;
int ret = -ENOMEM;
- int irq, wakeirq;
+ int irq, wakeirq, num_irqs;
const char *chip_family;
/* We are only able to handle the wlan function */
irqd_get_trigger_type(irq_get_irq_data(irq));
res[0].name = "irq";
- res[1].start = wakeirq;
- res[1].flags = IORESOURCE_IRQ |
- irqd_get_trigger_type(irq_get_irq_data(wakeirq));
- res[1].name = "wakeirq";
- ret = platform_device_add_resources(glue->core, res, ARRAY_SIZE(res));
+ if (wakeirq > 0) {
+ res[1].start = wakeirq;
+ res[1].flags = IORESOURCE_IRQ |
+ irqd_get_trigger_type(irq_get_irq_data(wakeirq));
+ res[1].name = "wakeirq";
+ num_irqs = 2;
+ } else {
+ num_irqs = 1;
+ }
+ ret = platform_device_add_resources(glue->core, res, num_irqs);
if (ret) {
dev_err(glue->dev, "can't add resources\n");
goto out_dev_put;
config NTB_IDT
tristate "IDT PCIe-switch Non-Transparent Bridge support"
depends on PCI
+ select HWMON
help
This driver supports NTB of cappable IDT PCIe-switches.
BAR settings of peer NT-functions, the BAR setups can't be done over
kernel PCI fixups. That's why the alternative pre-initialization
techniques like BIOS using SMBus interface or EEPROM should be
- utilized. Additionally if one needs to have temperature sensor
- information printed to system log, the corresponding registers must
- be initialized within BIOS/EEPROM as well.
+ utilized.
If unsure, say N.
*
* GPL LICENSE SUMMARY
*
- * Copyright (C) 2016 T-Platforms All Rights Reserved.
+ * Copyright (C) 2016-2018 T-Platforms JSC All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/spinlock.h>
+#include <linux/mutex.h>
#include <linux/pci.h>
#include <linux/aer.h>
#include <linux/slab.h>
#include <linux/list.h>
#include <linux/debugfs.h>
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
#include <linux/ntb.h>
#include "ntb_hw_idt.h"
}
/* Allocate memory for memory window descriptors */
- ret_mws = devm_kcalloc(&ndev->ntb.pdev->dev, *mw_cnt,
- sizeof(*ret_mws), GFP_KERNEL);
- if (IS_ERR_OR_NULL(ret_mws))
+ ret_mws = devm_kcalloc(&ndev->ntb.pdev->dev, *mw_cnt, sizeof(*ret_mws),
+ GFP_KERNEL);
+ if (!ret_mws)
return ERR_PTR(-ENOMEM);
/* Copy the info of detected memory windows */
idt_nt_write(ndev, bar->ltbase, (u32)addr);
idt_nt_write(ndev, bar->utbase, (u32)(addr >> 32));
/* Set the custom BAR aperture limit */
- limit = pci_resource_start(ntb->pdev, mw_cfg->bar) + size;
+ limit = pci_bus_address(ntb->pdev, mw_cfg->bar) + size;
idt_nt_write(ndev, bar->limit, (u32)limit);
if (IS_FLD_SET(BARSETUP_TYPE, data, 64))
idt_nt_write(ndev, (bar + 1)->limit, (limit >> 32));
* 7. Temperature sensor operations
*
* IDT PCIe-switch has an embedded temperature sensor, which can be used to
- * warn a user-space of possible chip overheating. Since workload temperature
- * can be different on different platforms, temperature thresholds as well as
- * general sensor settings must be setup in the framework of BIOS/EEPROM
- * initializations. It includes the actual sensor enabling as well.
+ * check current chip core temperature. Since a workload environment can be
+ * different on different platforms, an offset and ADC/filter settings can be
+ * specified. Although the offset configuration is only exposed to the sysfs
+ * hwmon interface at the moment. The rest of the settings can be adjusted
+ * for instance by the BIOS/EEPROM firmware.
*=============================================================================
*/
+/*
+ * idt_get_deg() - convert millidegree Celsius value to just degree
+ * @mdegC: IN - millidegree Celsius value
+ *
+ * Return: Degree corresponding to the passed millidegree value
+ */
+static inline s8 idt_get_deg(long mdegC)
+{
+ return mdegC / 1000;
+}
+
+/*
+ * idt_get_frac() - retrieve 0/0.5 fraction of the millidegree Celsius value
+ * @mdegC: IN - millidegree Celsius value
+ *
+ * Return: 0/0.5 degree fraction of the passed millidegree value
+ */
+static inline u8 idt_get_deg_frac(long mdegC)
+{
+ return (mdegC % 1000) >= 500 ? 5 : 0;
+}
+
+/*
+ * idt_get_temp_fmt() - convert millidegree Celsius value to 0:7:1 format
+ * @mdegC: IN - millidegree Celsius value
+ *
+ * Return: 0:7:1 format acceptable by the IDT temperature sensor
+ */
+static inline u8 idt_temp_get_fmt(long mdegC)
+{
+ return (idt_get_deg(mdegC) << 1) | (idt_get_deg_frac(mdegC) ? 1 : 0);
+}
+
+/*
+ * idt_get_temp_sval() - convert temp sample to signed millidegree Celsius
+ * @data: IN - shifted to LSB 8-bits temperature sample
+ *
+ * Return: signed millidegree Celsius
+ */
+static inline long idt_get_temp_sval(u32 data)
+{
+ return ((s8)data / 2) * 1000 + (data & 0x1 ? 500 : 0);
+}
+
+/*
+ * idt_get_temp_sval() - convert temp sample to unsigned millidegree Celsius
+ * @data: IN - shifted to LSB 8-bits temperature sample
+ *
+ * Return: unsigned millidegree Celsius
+ */
+static inline long idt_get_temp_uval(u32 data)
+{
+ return (data / 2) * 1000 + (data & 0x1 ? 500 : 0);
+}
+
/*
* idt_read_temp() - read temperature from chip sensor
* @ntb: NTB device context.
- * @val: OUT - integer value of temperature
- * @frac: OUT - fraction
+ * @type: IN - type of the temperature value to read
+ * @val: OUT - integer value of temperature in millidegree Celsius
*/
-static void idt_read_temp(struct idt_ntb_dev *ndev, unsigned char *val,
- unsigned char *frac)
+static void idt_read_temp(struct idt_ntb_dev *ndev,
+ const enum idt_temp_val type, long *val)
{
u32 data;
- /* Read the data from TEMP field of the TMPSTS register */
- data = idt_sw_read(ndev, IDT_SW_TMPSTS);
- data = GET_FIELD(TMPSTS_TEMP, data);
- /* TEMP field has one fractional bit and seven integer bits */
- *val = data >> 1;
- *frac = ((data & 0x1) ? 5 : 0);
+ /* Alter the temperature field in accordance with the passed type */
+ switch (type) {
+ case IDT_TEMP_CUR:
+ data = GET_FIELD(TMPSTS_TEMP,
+ idt_sw_read(ndev, IDT_SW_TMPSTS));
+ break;
+ case IDT_TEMP_LOW:
+ data = GET_FIELD(TMPSTS_LTEMP,
+ idt_sw_read(ndev, IDT_SW_TMPSTS));
+ break;
+ case IDT_TEMP_HIGH:
+ data = GET_FIELD(TMPSTS_HTEMP,
+ idt_sw_read(ndev, IDT_SW_TMPSTS));
+ break;
+ case IDT_TEMP_OFFSET:
+ /* This is the only field with signed 0:7:1 format */
+ data = GET_FIELD(TMPADJ_OFFSET,
+ idt_sw_read(ndev, IDT_SW_TMPADJ));
+ *val = idt_get_temp_sval(data);
+ return;
+ default:
+ data = GET_FIELD(TMPSTS_TEMP,
+ idt_sw_read(ndev, IDT_SW_TMPSTS));
+ break;
+ }
+
+ /* The rest of the fields accept unsigned 0:7:1 format */
+ *val = idt_get_temp_uval(data);
}
/*
- * idt_temp_isr() - temperature sensor alarm events ISR
- * @ndev: IDT NTB hardware driver descriptor
- * @ntint_sts: NT-function interrupt status
+ * idt_write_temp() - write temperature to the chip sensor register
+ * @ntb: NTB device context.
+ * @type: IN - type of the temperature value to change
+ * @val: IN - integer value of temperature in millidegree Celsius
+ */
+static void idt_write_temp(struct idt_ntb_dev *ndev,
+ const enum idt_temp_val type, const long val)
+{
+ unsigned int reg;
+ u32 data;
+ u8 fmt;
+
+ /* Retrieve the properly formatted temperature value */
+ fmt = idt_temp_get_fmt(val);
+
+ mutex_lock(&ndev->hwmon_mtx);
+ switch (type) {
+ case IDT_TEMP_LOW:
+ reg = IDT_SW_TMPALARM;
+ data = SET_FIELD(TMPALARM_LTEMP, idt_sw_read(ndev, reg), fmt) &
+ ~IDT_TMPALARM_IRQ_MASK;
+ break;
+ case IDT_TEMP_HIGH:
+ reg = IDT_SW_TMPALARM;
+ data = SET_FIELD(TMPALARM_HTEMP, idt_sw_read(ndev, reg), fmt) &
+ ~IDT_TMPALARM_IRQ_MASK;
+ break;
+ case IDT_TEMP_OFFSET:
+ reg = IDT_SW_TMPADJ;
+ data = SET_FIELD(TMPADJ_OFFSET, idt_sw_read(ndev, reg), fmt);
+ break;
+ default:
+ goto inval_spin_unlock;
+ }
+
+ idt_sw_write(ndev, reg, data);
+
+inval_spin_unlock:
+ mutex_unlock(&ndev->hwmon_mtx);
+}
+
+/*
+ * idt_sysfs_show_temp() - printout corresponding temperature value
+ * @dev: Pointer to the NTB device structure
+ * @da: Sensor device attribute structure
+ * @buf: Buffer to print temperature out
*
- * It handles events of temperature crossing alarm thresholds. Since reading
- * of TMPALARM register clears it up, the function doesn't analyze the
- * read value, instead the current temperature value just warningly printed to
- * log.
- * The method is called from PCIe ISR bottom-half routine.
+ * Return: Number of written symbols or negative error
*/
-static void idt_temp_isr(struct idt_ntb_dev *ndev, u32 ntint_sts)
+static ssize_t idt_sysfs_show_temp(struct device *dev,
+ struct device_attribute *da, char *buf)
{
- unsigned char val, frac;
+ struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+ struct idt_ntb_dev *ndev = dev_get_drvdata(dev);
+ enum idt_temp_val type = attr->index;
+ long mdeg;
- /* Read the current temperature value */
- idt_read_temp(ndev, &val, &frac);
+ idt_read_temp(ndev, type, &mdeg);
+ return sprintf(buf, "%ld\n", mdeg);
+}
- /* Read the temperature alarm to clean the alarm status out */
- /*(void)idt_sw_read(ndev, IDT_SW_TMPALARM);*/
+/*
+ * idt_sysfs_set_temp() - set corresponding temperature value
+ * @dev: Pointer to the NTB device structure
+ * @da: Sensor device attribute structure
+ * @buf: Buffer to print temperature out
+ * @count: Size of the passed buffer
+ *
+ * Return: Number of written symbols or negative error
+ */
+static ssize_t idt_sysfs_set_temp(struct device *dev,
+ struct device_attribute *da, const char *buf,
+ size_t count)
+{
+ struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+ struct idt_ntb_dev *ndev = dev_get_drvdata(dev);
+ enum idt_temp_val type = attr->index;
+ long mdeg;
+ int ret;
- /* Clean the corresponding interrupt bit */
- idt_nt_write(ndev, IDT_NT_NTINTSTS, IDT_NTINTSTS_TMPSENSOR);
+ ret = kstrtol(buf, 10, &mdeg);
+ if (ret)
+ return ret;
+
+ /* Clamp the passed value in accordance with the type */
+ if (type == IDT_TEMP_OFFSET)
+ mdeg = clamp_val(mdeg, IDT_TEMP_MIN_OFFSET,
+ IDT_TEMP_MAX_OFFSET);
+ else
+ mdeg = clamp_val(mdeg, IDT_TEMP_MIN_MDEG, IDT_TEMP_MAX_MDEG);
+
+ idt_write_temp(ndev, type, mdeg);
+
+ return count;
+}
+
+/*
+ * idt_sysfs_reset_hist() - reset temperature history
+ * @dev: Pointer to the NTB device structure
+ * @da: Sensor device attribute structure
+ * @buf: Buffer to print temperature out
+ * @count: Size of the passed buffer
+ *
+ * Return: Number of written symbols or negative error
+ */
+static ssize_t idt_sysfs_reset_hist(struct device *dev,
+ struct device_attribute *da,
+ const char *buf, size_t count)
+{
+ struct idt_ntb_dev *ndev = dev_get_drvdata(dev);
+
+ /* Just set the maximal value to the lowest temperature field and
+ * minimal value to the highest temperature field
+ */
+ idt_write_temp(ndev, IDT_TEMP_LOW, IDT_TEMP_MAX_MDEG);
+ idt_write_temp(ndev, IDT_TEMP_HIGH, IDT_TEMP_MIN_MDEG);
- dev_dbg(&ndev->ntb.pdev->dev,
- "Temp sensor IRQ detected %#08x", ntint_sts);
+ return count;
+}
+
+/*
+ * Hwmon IDT sysfs attributes
+ */
+static SENSOR_DEVICE_ATTR(temp1_input, 0444, idt_sysfs_show_temp, NULL,
+ IDT_TEMP_CUR);
+static SENSOR_DEVICE_ATTR(temp1_lowest, 0444, idt_sysfs_show_temp, NULL,
+ IDT_TEMP_LOW);
+static SENSOR_DEVICE_ATTR(temp1_highest, 0444, idt_sysfs_show_temp, NULL,
+ IDT_TEMP_HIGH);
+static SENSOR_DEVICE_ATTR(temp1_offset, 0644, idt_sysfs_show_temp,
+ idt_sysfs_set_temp, IDT_TEMP_OFFSET);
+static DEVICE_ATTR(temp1_reset_history, 0200, NULL, idt_sysfs_reset_hist);
- /* Print temperature value to log */
- dev_warn(&ndev->ntb.pdev->dev, "Temperature %hhu.%hhu", val, frac);
+/*
+ * Hwmon IDT sysfs attributes group
+ */
+static struct attribute *idt_temp_attrs[] = {
+ &sensor_dev_attr_temp1_input.dev_attr.attr,
+ &sensor_dev_attr_temp1_lowest.dev_attr.attr,
+ &sensor_dev_attr_temp1_highest.dev_attr.attr,
+ &sensor_dev_attr_temp1_offset.dev_attr.attr,
+ &dev_attr_temp1_reset_history.attr,
+ NULL
+};
+ATTRIBUTE_GROUPS(idt_temp);
+
+/*
+ * idt_init_temp() - initialize temperature sensor interface
+ * @ndev: IDT NTB hardware driver descriptor
+ *
+ * Simple sensor initializarion method is responsible for device switching
+ * on and resource management based hwmon interface registration. Note, that
+ * since the device is shared we won't disable it on remove, but leave it
+ * working until the system is powered off.
+ */
+static void idt_init_temp(struct idt_ntb_dev *ndev)
+{
+ struct device *hwmon;
+
+ /* Enable sensor if it hasn't been already */
+ idt_sw_write(ndev, IDT_SW_TMPCTL, 0x0);
+
+ /* Initialize hwmon interface fields */
+ mutex_init(&ndev->hwmon_mtx);
+
+ hwmon = devm_hwmon_device_register_with_groups(&ndev->ntb.pdev->dev,
+ ndev->swcfg->name, ndev, idt_temp_groups);
+ if (IS_ERR(hwmon)) {
+ dev_err(&ndev->ntb.pdev->dev, "Couldn't create hwmon device");
+ return;
+ }
+
+ dev_dbg(&ndev->ntb.pdev->dev, "Temperature HWmon interface registered");
}
/*=============================================================================
goto err_free_vectors;
}
- /* Unmask Message/Doorbell/SE/Temperature interrupts */
+ /* Unmask Message/Doorbell/SE interrupts */
ntint_mask = idt_nt_read(ndev, IDT_NT_NTINTMSK) & ~IDT_NTINTMSK_ALL;
idt_nt_write(ndev, IDT_NT_NTINTMSK, ntint_mask);
return ret;
}
-
/*
* idt_deinit_ist() - deinitialize PCIe interrupt handler
* @ndev: IDT NTB hardware driver descriptor
handled = true;
}
- /* Handle temperature sensor interrupt */
- if (ntint_sts & IDT_NTINTSTS_TMPSENSOR) {
- idt_temp_isr(ndev, ntint_sts);
- handled = true;
- }
-
dev_dbg(&ndev->ntb.pdev->dev, "IDT IRQs 0x%08x handled", ntint_sts);
return handled ? IRQ_HANDLED : IRQ_NONE;
size_t count, loff_t *offp)
{
struct idt_ntb_dev *ndev = filp->private_data;
- unsigned char temp, frac, idx, pidx, cnt;
+ unsigned char idx, pidx, cnt;
+ unsigned long irqflags, mdeg;
ssize_t ret = 0, off = 0;
- unsigned long irqflags;
enum ntb_speed speed;
enum ntb_width width;
char *strbuf;
off += scnprintf(strbuf + off, size - off, "\n");
/* Current temperature */
- idt_read_temp(ndev, &temp, &frac);
+ idt_read_temp(ndev, IDT_TEMP_CUR, &mdeg);
off += scnprintf(strbuf + off, size - off,
- "Switch temperature\t\t- %hhu.%hhuC\n", temp, frac);
+ "Switch temperature\t\t- %hhd.%hhuC\n",
+ idt_get_deg(mdeg), idt_get_deg_frac(mdeg));
/* Copy the buffer to the User Space */
ret = simple_read_from_buffer(ubuf, count, offp, strbuf, off);
/* Allocate memory for the IDT PCIe-device descriptor */
ndev = devm_kzalloc(&pdev->dev, sizeof(*ndev), GFP_KERNEL);
- if (IS_ERR_OR_NULL(ndev)) {
+ if (!ndev) {
dev_err(&pdev->dev, "Memory allocation failed for descriptor");
return ERR_PTR(-ENOMEM);
}
/* Initialize Messaging subsystem */
idt_init_msg(ndev);
+ /* Initialize hwmon interface */
+ idt_init_temp(ndev);
+
/* Initialize IDT interrupts handler */
ret = idt_init_isr(ndev);
if (ret != 0)
*
* GPL LICENSE SUMMARY
*
- * Copyright (C) 2016 T-Platforms All Rights Reserved.
+ * Copyright (C) 2016-2018 T-Platforms JSC All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
#include <linux/pci_ids.h>
#include <linux/interrupt.h>
#include <linux/spinlock.h>
+#include <linux/mutex.h>
#include <linux/ntb.h>
-
/*
* Macro is used to create the struct pci_device_id that matches
* the supported IDT PCIe-switches
* @IDT_NTINTMSK_DBELL: Doorbell interrupt mask bit
* @IDT_NTINTMSK_SEVENT: Switch Event interrupt mask bit
* @IDT_NTINTMSK_TMPSENSOR: Temperature sensor interrupt mask bit
- * @IDT_NTINTMSK_ALL: All the useful interrupts mask
+ * @IDT_NTINTMSK_ALL: NTB-related interrupts mask
*/
#define IDT_NTINTMSK_MSG 0x00000001U
#define IDT_NTINTMSK_DBELL 0x00000002U
#define IDT_NTINTMSK_SEVENT 0x00000008U
#define IDT_NTINTMSK_TMPSENSOR 0x00000080U
#define IDT_NTINTMSK_ALL \
- (IDT_NTINTMSK_MSG | IDT_NTINTMSK_DBELL | \
- IDT_NTINTMSK_SEVENT | IDT_NTINTMSK_TMPSENSOR)
+ (IDT_NTINTMSK_MSG | IDT_NTINTMSK_DBELL | IDT_NTINTMSK_SEVENT)
/*
* NTGSIGNAL register fields related constants
#define IDT_SWPxMSGCTL_PART_MASK 0x00000070U
#define IDT_SWPxMSGCTL_PART_FLD 4
+/*
+ * TMPCTL register fields related constants
+ * @IDT_TMPCTL_LTH_MASK: Low temperature threshold field mask
+ * @IDT_TMPCTL_LTH_FLD: Low temperature threshold field offset
+ * @IDT_TMPCTL_MTH_MASK: Middle temperature threshold field mask
+ * @IDT_TMPCTL_MTH_FLD: Middle temperature threshold field offset
+ * @IDT_TMPCTL_HTH_MASK: High temperature threshold field mask
+ * @IDT_TMPCTL_HTH_FLD: High temperature threshold field offset
+ * @IDT_TMPCTL_PDOWN: Temperature sensor power down
+ */
+#define IDT_TMPCTL_LTH_MASK 0x000000FFU
+#define IDT_TMPCTL_LTH_FLD 0
+#define IDT_TMPCTL_MTH_MASK 0x0000FF00U
+#define IDT_TMPCTL_MTH_FLD 8
+#define IDT_TMPCTL_HTH_MASK 0x00FF0000U
+#define IDT_TMPCTL_HTH_FLD 16
+#define IDT_TMPCTL_PDOWN 0x80000000U
+
/*
* TMPSTS register fields related constants
* @IDT_TMPSTS_TEMP_MASK: Current temperature field mask
* @IDT_TMPSTS_TEMP_FLD: Current temperature field offset
+ * @IDT_TMPSTS_LTEMP_MASK: Lowest temperature field mask
+ * @IDT_TMPSTS_LTEMP_FLD: Lowest temperature field offset
+ * @IDT_TMPSTS_HTEMP_MASK: Highest temperature field mask
+ * @IDT_TMPSTS_HTEMP_FLD: Highest temperature field offset
*/
#define IDT_TMPSTS_TEMP_MASK 0x000000FFU
#define IDT_TMPSTS_TEMP_FLD 0
+#define IDT_TMPSTS_LTEMP_MASK 0x0000FF00U
+#define IDT_TMPSTS_LTEMP_FLD 8
+#define IDT_TMPSTS_HTEMP_MASK 0x00FF0000U
+#define IDT_TMPSTS_HTEMP_FLD 16
+
+/*
+ * TMPALARM register fields related constants
+ * @IDT_TMPALARM_LTEMP_MASK: Lowest temperature field mask
+ * @IDT_TMPALARM_LTEMP_FLD: Lowest temperature field offset
+ * @IDT_TMPALARM_HTEMP_MASK: Highest temperature field mask
+ * @IDT_TMPALARM_HTEMP_FLD: Highest temperature field offset
+ * @IDT_TMPALARM_IRQ_MASK: Alarm IRQ status mask
+ */
+#define IDT_TMPALARM_LTEMP_MASK 0x0000FF00U
+#define IDT_TMPALARM_LTEMP_FLD 8
+#define IDT_TMPALARM_HTEMP_MASK 0x00FF0000U
+#define IDT_TMPALARM_HTEMP_FLD 16
+#define IDT_TMPALARM_IRQ_MASK 0x3F000000U
+
+/*
+ * TMPADJ register fields related constants
+ * @IDT_TMPADJ_OFFSET_MASK: Temperature value offset field mask
+ * @IDT_TMPADJ_OFFSET_FLD: Temperature value offset field offset
+ */
+#define IDT_TMPADJ_OFFSET_MASK 0x000000FFU
+#define IDT_TMPADJ_OFFSET_FLD 0
/*
* Helper macro to get/set the corresponding field value
#define IDT_TRANS_ALIGN 4
#define IDT_DIR_SIZE_ALIGN 1
+/*
+ * IDT PCIe-switch temperature sensor value limits
+ * @IDT_TEMP_MIN_MDEG: Minimal integer value of temperature
+ * @IDT_TEMP_MAX_MDEG: Maximal integer value of temperature
+ * @IDT_TEMP_MIN_OFFSET:Minimal integer value of temperature offset
+ * @IDT_TEMP_MAX_OFFSET:Maximal integer value of temperature offset
+ */
+#define IDT_TEMP_MIN_MDEG 0
+#define IDT_TEMP_MAX_MDEG 127500
+#define IDT_TEMP_MIN_OFFSET -64000
+#define IDT_TEMP_MAX_OFFSET 63500
+
+/*
+ * Temperature sensor values enumeration
+ * @IDT_TEMP_CUR: Current temperature
+ * @IDT_TEMP_LOW: Lowest historical temperature
+ * @IDT_TEMP_HIGH: Highest historical temperature
+ * @IDT_TEMP_OFFSET: Current temperature offset
+ */
+enum idt_temp_val {
+ IDT_TEMP_CUR,
+ IDT_TEMP_LOW,
+ IDT_TEMP_HIGH,
+ IDT_TEMP_OFFSET
+};
+
/*
* IDT Memory Windows type. Depending on the device settings, IDT supports
* Direct Address Translation MW registers and Lookup Table registers
* @msg_mask_lock: Message mask register lock
* @gasa_lock: GASA registers access lock
*
+ * @hwmon_mtx: Temperature sensor interface update mutex
+ *
* @dbgfs_info: DebugFS info node
*/
struct idt_ntb_dev {
spinlock_t msg_mask_lock;
spinlock_t gasa_lock;
+ struct mutex hwmon_mtx;
+
struct dentry *dbgfs_info;
};
#define to_ndev_ntb(__ntb) container_of(__ntb, struct idt_ntb_dev, ntb)
return 0;
}
-static inline int ndev_vec_mask(struct intel_ntb_dev *ndev, int db_vector)
+static inline u64 ndev_vec_mask(struct intel_ntb_dev *ndev, int db_vector)
{
u64 shift, mask;
void __iomem *vbase;
size_t xlat_size;
size_t buff_size;
+ size_t alloc_size;
+ void *alloc_addr;
void *virt_addr;
dma_addr_t dma_addr;
};
return;
ntb_mw_clear_trans(nt->ndev, PIDX, num_mw);
- dma_free_coherent(&pdev->dev, mw->buff_size,
- mw->virt_addr, mw->dma_addr);
+ dma_free_coherent(&pdev->dev, mw->alloc_size,
+ mw->alloc_addr, mw->dma_addr);
mw->xlat_size = 0;
mw->buff_size = 0;
+ mw->alloc_size = 0;
+ mw->alloc_addr = NULL;
mw->virt_addr = NULL;
}
+static int ntb_alloc_mw_buffer(struct ntb_transport_mw *mw,
+ struct device *dma_dev, size_t align)
+{
+ dma_addr_t dma_addr;
+ void *alloc_addr, *virt_addr;
+ int rc;
+
+ alloc_addr = dma_alloc_coherent(dma_dev, mw->alloc_size,
+ &dma_addr, GFP_KERNEL);
+ if (!alloc_addr) {
+ dev_err(dma_dev, "Unable to alloc MW buff of size %zu\n",
+ mw->alloc_size);
+ return -ENOMEM;
+ }
+ virt_addr = alloc_addr;
+
+ /*
+ * we must ensure that the memory address allocated is BAR size
+ * aligned in order for the XLAT register to take the value. This
+ * is a requirement of the hardware. It is recommended to setup CMA
+ * for BAR sizes equal or greater than 4MB.
+ */
+ if (!IS_ALIGNED(dma_addr, align)) {
+ if (mw->alloc_size > mw->buff_size) {
+ virt_addr = PTR_ALIGN(alloc_addr, align);
+ dma_addr = ALIGN(dma_addr, align);
+ } else {
+ rc = -ENOMEM;
+ goto err;
+ }
+ }
+
+ mw->alloc_addr = alloc_addr;
+ mw->virt_addr = virt_addr;
+ mw->dma_addr = dma_addr;
+
+ return 0;
+
+err:
+ dma_free_coherent(dma_dev, mw->alloc_size, alloc_addr, dma_addr);
+
+ return rc;
+}
+
static int ntb_set_mw(struct ntb_transport_ctx *nt, int num_mw,
resource_size_t size)
{
/* Alloc memory for receiving data. Must be aligned */
mw->xlat_size = xlat_size;
mw->buff_size = buff_size;
+ mw->alloc_size = buff_size;
- mw->virt_addr = dma_alloc_coherent(&pdev->dev, buff_size,
- &mw->dma_addr, GFP_KERNEL);
- if (!mw->virt_addr) {
- mw->xlat_size = 0;
- mw->buff_size = 0;
- dev_err(&pdev->dev, "Unable to alloc MW buff of size %zu\n",
- buff_size);
- return -ENOMEM;
- }
-
- /*
- * we must ensure that the memory address allocated is BAR size
- * aligned in order for the XLAT register to take the value. This
- * is a requirement of the hardware. It is recommended to setup CMA
- * for BAR sizes equal or greater than 4MB.
- */
- if (!IS_ALIGNED(mw->dma_addr, xlat_align)) {
- dev_err(&pdev->dev, "DMA memory %pad is not aligned\n",
- &mw->dma_addr);
- ntb_free_mw(nt, num_mw);
- return -ENOMEM;
+ rc = ntb_alloc_mw_buffer(mw, &pdev->dev, xlat_align);
+ if (rc) {
+ mw->alloc_size *= 2;
+ rc = ntb_alloc_mw_buffer(mw, &pdev->dev, xlat_align);
+ if (rc) {
+ dev_err(&pdev->dev,
+ "Unable to alloc aligned MW buff\n");
+ mw->xlat_size = 0;
+ mw->buff_size = 0;
+ mw->alloc_size = 0;
+ return rc;
+ }
}
/* Notify HW the memory location of the receive buffer */
case DMA_TRANS_READ_FAILED:
case DMA_TRANS_WRITE_FAILED:
entry->errors++;
+ /* fall through */
case DMA_TRANS_ABORTED:
{
struct ntb_transport_qp *qp = entry->qp;
case DMA_TRANS_READ_FAILED:
case DMA_TRANS_WRITE_FAILED:
entry->errors++;
+ /* fall through */
case DMA_TRANS_ABORTED:
{
void __iomem *offset =
struct nd_mapping *nd_mapping, resource_size_t *overlap);
resource_size_t nd_blk_available_dpa(struct nd_region *nd_region);
resource_size_t nd_region_available_dpa(struct nd_region *nd_region);
+int nd_region_conflict(struct nd_region *nd_region, resource_size_t start,
+ resource_size_t size);
resource_size_t nvdimm_allocated_dpa(struct nvdimm_drvdata *ndd,
struct nd_label_id *label_id);
int alias_dpa_busy(struct device *dev, void *data);
ALIGN_DOWN(phys, nd_pfn->align));
}
+/*
+ * Check if pmem collides with 'System RAM', or other regions when
+ * section aligned. Trim it accordingly.
+ */
+static void trim_pfn_device(struct nd_pfn *nd_pfn, u32 *start_pad, u32 *end_trunc)
+{
+ struct nd_namespace_common *ndns = nd_pfn->ndns;
+ struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev);
+ struct nd_region *nd_region = to_nd_region(nd_pfn->dev.parent);
+ const resource_size_t start = nsio->res.start;
+ const resource_size_t end = start + resource_size(&nsio->res);
+ resource_size_t adjust, size;
+
+ *start_pad = 0;
+ *end_trunc = 0;
+
+ adjust = start - PHYS_SECTION_ALIGN_DOWN(start);
+ size = resource_size(&nsio->res) + adjust;
+ if (region_intersects(start - adjust, size, IORESOURCE_SYSTEM_RAM,
+ IORES_DESC_NONE) == REGION_MIXED
+ || nd_region_conflict(nd_region, start - adjust, size))
+ *start_pad = PHYS_SECTION_ALIGN_UP(start) - start;
+
+ /* Now check that end of the range does not collide. */
+ adjust = PHYS_SECTION_ALIGN_UP(end) - end;
+ size = resource_size(&nsio->res) + adjust;
+ if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM,
+ IORES_DESC_NONE) == REGION_MIXED
+ || !IS_ALIGNED(end, nd_pfn->align)
+ || nd_region_conflict(nd_region, start, size + adjust))
+ *end_trunc = end - phys_pmem_align_down(nd_pfn, end);
+}
+
static int nd_pfn_init(struct nd_pfn *nd_pfn)
{
u32 dax_label_reserve = is_nd_dax(&nd_pfn->dev) ? SZ_128K : 0;
struct nd_namespace_common *ndns = nd_pfn->ndns;
- u32 start_pad = 0, end_trunc = 0;
+ struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev);
resource_size_t start, size;
- struct nd_namespace_io *nsio;
struct nd_region *nd_region;
+ u32 start_pad, end_trunc;
struct nd_pfn_sb *pfn_sb;
unsigned long npfns;
phys_addr_t offset;
memset(pfn_sb, 0, sizeof(*pfn_sb));
- /*
- * Check if pmem collides with 'System RAM' when section aligned and
- * trim it accordingly
- */
- nsio = to_nd_namespace_io(&ndns->dev);
- start = PHYS_SECTION_ALIGN_DOWN(nsio->res.start);
- size = resource_size(&nsio->res);
- if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM,
- IORES_DESC_NONE) == REGION_MIXED) {
- start = nsio->res.start;
- start_pad = PHYS_SECTION_ALIGN_UP(start) - start;
- }
-
- start = nsio->res.start;
- size = PHYS_SECTION_ALIGN_UP(start + size) - start;
- if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM,
- IORES_DESC_NONE) == REGION_MIXED
- || !IS_ALIGNED(start + resource_size(&nsio->res),
- nd_pfn->align)) {
- size = resource_size(&nsio->res);
- end_trunc = start + size - phys_pmem_align_down(nd_pfn,
- start + size);
- }
-
+ trim_pfn_device(nd_pfn, &start_pad, &end_trunc);
if (start_pad + end_trunc)
dev_info(&nd_pfn->dev, "%s alignment collision, truncate %d bytes\n",
dev_name(&ndns->dev), start_pad + end_trunc);
* implementation will limit the pfns advertised through
* ->direct_access() to those that are included in the memmap.
*/
- start += start_pad;
+ start = nsio->res.start + start_pad;
size = resource_size(&nsio->res);
npfns = PFN_SECTION_ALIGN_UP((size - start_pad - end_trunc - SZ_8K)
/ PAGE_SIZE);
}
EXPORT_SYMBOL_GPL(nvdimm_has_cache);
+struct conflict_context {
+ struct nd_region *nd_region;
+ resource_size_t start, size;
+};
+
+static int region_conflict(struct device *dev, void *data)
+{
+ struct nd_region *nd_region;
+ struct conflict_context *ctx = data;
+ resource_size_t res_end, region_end, region_start;
+
+ if (!is_memory(dev))
+ return 0;
+
+ nd_region = to_nd_region(dev);
+ if (nd_region == ctx->nd_region)
+ return 0;
+
+ res_end = ctx->start + ctx->size;
+ region_start = nd_region->ndr_start;
+ region_end = region_start + nd_region->ndr_size;
+ if (ctx->start >= region_start && ctx->start < region_end)
+ return -EBUSY;
+ if (res_end > region_start && res_end <= region_end)
+ return -EBUSY;
+ return 0;
+}
+
+int nd_region_conflict(struct nd_region *nd_region, resource_size_t start,
+ resource_size_t size)
+{
+ struct nvdimm_bus *nvdimm_bus = walk_to_nvdimm_bus(&nd_region->dev);
+ struct conflict_context ctx = {
+ .nd_region = nd_region,
+ .start = start,
+ .size = size,
+ };
+
+ return device_for_each_child(&nvdimm_bus->dev, &ctx, region_conflict);
+}
+
void __exit nd_region_devs_exit(void)
{
ida_destroy(®ion_ida);
static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
{
struct nvme_ctrl *ctrl = rq->end_io_data;
+ unsigned long flags;
+ bool startka = false;
blk_mq_free_request(rq);
return;
}
- schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
+ spin_lock_irqsave(&ctrl->lock, flags);
+ if (ctrl->state == NVME_CTRL_LIVE ||
+ ctrl->state == NVME_CTRL_CONNECTING)
+ startka = true;
+ spin_unlock_irqrestore(&ctrl->lock, flags);
+ if (startka)
+ schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
}
static int nvme_keep_alive(struct nvme_ctrl *ctrl)
if (ns->ndev)
nvme_nvm_update_nvm_info(ns);
#ifdef CONFIG_NVME_MULTIPATH
- if (ns->head->disk)
+ if (ns->head->disk) {
nvme_update_disk_info(ns->head->disk, ns, id);
+ blk_queue_stack_limits(ns->head->disk->queue, ns->queue);
+ }
#endif
}
struct nvme_ns *ns, *next;
LIST_HEAD(ns_list);
+ /* prevent racing with ns scanning */
+ flush_work(&ctrl->scan_work);
+
/*
* The dead states indicates the controller was not gracefully
* disconnected. In that case, we won't be able to flush any data while
nvme_mpath_stop(ctrl);
nvme_stop_keep_alive(ctrl);
flush_work(&ctrl->async_event_work);
- flush_work(&ctrl->scan_work);
cancel_work_sync(&ctrl->fw_act_work);
if (ctrl->ops->stop_ctrl)
ctrl->ops->stop_ctrl(ctrl);
return 0;
out_free_name:
- kfree_const(dev->kobj.name);
+ kfree_const(ctrl->device->kobj.name);
out_release_instance:
ida_simple_remove(&nvme_instance_ida, ctrl->instance);
out:
down_read(&ctrl->namespaces_rwsem);
/* Forcibly unquiesce queues to avoid blocking dispatch */
- if (ctrl->admin_q)
+ if (ctrl->admin_q && !blk_queue_dying(ctrl->admin_q))
blk_mq_unquiesce_queue(ctrl->admin_q);
list_for_each_entry(ns, &ctrl->namespaces, list)
bool ioq_live;
bool assoc_active;
+ atomic_t err_work_active;
u64 association_id;
struct list_head ctrl_list; /* rport->ctrl_list */
struct blk_mq_tag_set tag_set;
struct delayed_work connect_work;
+ struct work_struct err_work;
struct kref ref;
u32 flags;
struct nvme_fc_fcp_op *aen_op = ctrl->aen_ops;
int i;
+ /* ensure we've initialized the ops once */
+ if (!(aen_op->flags & FCOP_FLAGS_AEN))
+ return;
+
for (i = 0; i < NVME_NR_AEN_COMMANDS; i++, aen_op++)
__nvme_fc_abort_op(ctrl, aen_op);
}
op->fcp_req.rspaddr = &op->rsp_iu;
op->fcp_req.rsplen = sizeof(op->rsp_iu);
op->fcp_req.done = nvme_fc_fcpio_done;
- op->fcp_req.private = &op->fcp_req.first_sgl[SG_CHUNK_SIZE];
op->ctrl = ctrl;
op->queue = queue;
op->rq = rq;
struct nvme_fc_queue *queue = &ctrl->queues[queue_idx];
int res;
- nvme_req(rq)->ctrl = &ctrl->ctrl;
res = __nvme_fc_init_request(ctrl, queue, &op->op, rq, queue->rqcnt++);
if (res)
return res;
op->op.fcp_req.first_sgl = &op->sgl[0];
+ op->op.fcp_req.private = &op->priv[0];
+ nvme_req(rq)->ctrl = &ctrl->ctrl;
return res;
}
static void
nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg)
{
- /* only proceed if in LIVE state - e.g. on first error */
+ int active;
+
+ /*
+ * if an error (io timeout, etc) while (re)connecting,
+ * it's an error on creating the new association.
+ * Start the error recovery thread if it hasn't already
+ * been started. It is expected there could be multiple
+ * ios hitting this path before things are cleaned up.
+ */
+ if (ctrl->ctrl.state == NVME_CTRL_CONNECTING) {
+ active = atomic_xchg(&ctrl->err_work_active, 1);
+ if (!active && !schedule_work(&ctrl->err_work)) {
+ atomic_set(&ctrl->err_work_active, 0);
+ WARN_ON(1);
+ }
+ return;
+ }
+
+ /* Otherwise, only proceed if in LIVE state - e.g. on first error */
if (ctrl->ctrl.state != NVME_CTRL_LIVE)
return;
{
struct nvme_fc_ctrl *ctrl = to_fc_ctrl(nctrl);
+ cancel_work_sync(&ctrl->err_work);
cancel_delayed_work_sync(&ctrl->connect_work);
/*
* kill the association on the link side. this will block
}
static void
-nvme_fc_reset_ctrl_work(struct work_struct *work)
+__nvme_fc_terminate_io(struct nvme_fc_ctrl *ctrl)
{
- struct nvme_fc_ctrl *ctrl =
- container_of(work, struct nvme_fc_ctrl, ctrl.reset_work);
- int ret;
-
- nvme_stop_ctrl(&ctrl->ctrl);
+ nvme_stop_keep_alive(&ctrl->ctrl);
/* will block will waiting for io to terminate */
nvme_fc_delete_association(ctrl);
- if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
+ if (ctrl->ctrl.state != NVME_CTRL_CONNECTING &&
+ !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING))
dev_err(ctrl->ctrl.device,
"NVME-FC{%d}: error_recovery: Couldn't change state "
"to CONNECTING\n", ctrl->cnum);
- return;
- }
+}
+
+static void
+nvme_fc_reset_ctrl_work(struct work_struct *work)
+{
+ struct nvme_fc_ctrl *ctrl =
+ container_of(work, struct nvme_fc_ctrl, ctrl.reset_work);
+ int ret;
+
+ __nvme_fc_terminate_io(ctrl);
+
+ nvme_stop_ctrl(&ctrl->ctrl);
if (ctrl->rport->remoteport.port_state == FC_OBJSTATE_ONLINE)
ret = nvme_fc_create_association(ctrl);
ctrl->cnum);
}
+static void
+nvme_fc_connect_err_work(struct work_struct *work)
+{
+ struct nvme_fc_ctrl *ctrl =
+ container_of(work, struct nvme_fc_ctrl, err_work);
+
+ __nvme_fc_terminate_io(ctrl);
+
+ atomic_set(&ctrl->err_work_active, 0);
+
+ /*
+ * Rescheduling the connection after recovering
+ * from the io error is left to the reconnect work
+ * item, which is what should have stalled waiting on
+ * the io that had the error that scheduled this work.
+ */
+}
+
static const struct nvme_ctrl_ops nvme_fc_ctrl_ops = {
.name = "fc",
.module = THIS_MODULE,
ctrl->cnum = idx;
ctrl->ioq_live = false;
ctrl->assoc_active = false;
+ atomic_set(&ctrl->err_work_active, 0);
init_waitqueue_head(&ctrl->ioabort_wait);
get_device(ctrl->dev);
INIT_WORK(&ctrl->ctrl.reset_work, nvme_fc_reset_ctrl_work);
INIT_DELAYED_WORK(&ctrl->connect_work, nvme_fc_connect_ctrl_work);
+ INIT_WORK(&ctrl->err_work, nvme_fc_connect_err_work);
spin_lock_init(&ctrl->lock);
/* io queue count */
fail_ctrl:
nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING);
cancel_work_sync(&ctrl->ctrl.reset_work);
+ cancel_work_sync(&ctrl->err_work);
cancel_delayed_work_sync(&ctrl->connect_work);
ctrl->ctrl.opts = NULL;
blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
/* set to a default value for 512 until disk is validated */
blk_queue_logical_block_size(q, 512);
+ blk_set_stacking_limits(&q->limits);
/* we need to propagate up the VMC settings */
if (ctrl->vwc & NVME_CTRL_VWC_PRESENT)
static inline int nvme_mpath_init(struct nvme_ctrl *ctrl,
struct nvme_id_ctrl *id)
{
+ if (ctrl->subsys->cmic & (1 << 3))
+ dev_warn(ctrl->device,
+"Please enable CONFIG_NVME_MULTIPATH for full support of multi-port devices.\n");
return 0;
}
static inline void nvme_mpath_uninit(struct nvme_ctrl *ctrl)
struct pci_dev *pdev = to_pci_dev(dev->dev);
int bar;
+ if (dev->cmb_size)
+ return;
+
dev->cmbsz = readl(dev->bar + NVME_REG_CMBSZ);
if (!dev->cmbsz)
return;
{
struct pci_dev *pdev = to_pci_dev(dev->dev);
- nvme_release_cmb(dev);
pci_free_irq_vectors(pdev);
if (pci_is_enabled(pdev)) {
nvme_stop_ctrl(&dev->ctrl);
nvme_remove_namespaces(&dev->ctrl);
nvme_dev_disable(dev, true);
+ nvme_release_cmb(dev);
nvme_free_host_mem(dev);
nvme_dev_remove_admin(dev);
nvme_free_queues(dev, 0);
qe->dma = ib_dma_map_single(ibdev, qe->data, capsule_size, dir);
if (ib_dma_mapping_error(ibdev, qe->dma)) {
kfree(qe->data);
+ qe->data = NULL;
return -ENOMEM;
}
out_free_async_qe:
nvme_rdma_free_qe(ctrl->device->dev, &ctrl->async_event_sqe,
sizeof(struct nvme_command), DMA_TO_DEVICE);
+ ctrl->async_event_sqe.data = NULL;
out_free_queue:
nvme_rdma_free_queue(&ctrl->queues[0]);
return error;
struct pci_dev *p2p_dev;
int ret;
- if (!ctrl->p2p_client)
+ if (!ctrl->p2p_client || !ns->use_p2pmem)
return;
if (ns->p2p_dev) {
rw = READ;
}
- iov_iter_bvec(&iter, ITER_BVEC | rw, req->f.bvec, nr_segs, count);
+ iov_iter_bvec(&iter, rw, req->f.bvec, nr_segs, count);
iocb->ki_pos = pos;
iocb->ki_filp = req->ns->file;
int inline_page_count;
};
-static struct workqueue_struct *nvmet_rdma_delete_wq;
static bool nvmet_rdma_use_srq;
module_param_named(use_srq, nvmet_rdma_use_srq, bool, 0444);
MODULE_PARM_DESC(use_srq, "Use shared receive queue.");
{
struct nvmet_rdma_rsp *rsp =
container_of(wc->wr_cqe, struct nvmet_rdma_rsp, send_cqe);
+ struct nvmet_rdma_queue *queue = cq->cq_context;
nvmet_rdma_release_rsp(rsp);
wc->status != IB_WC_WR_FLUSH_ERR)) {
pr_err("SEND for CQE 0x%p failed with status %s (%d).\n",
wc->wr_cqe, ib_wc_status_msg(wc->status), wc->status);
- nvmet_rdma_error_comp(rsp->queue);
+ nvmet_rdma_error_comp(queue);
}
}
if (queue->host_qid == 0) {
/* Let inflight controller teardown complete */
- flush_workqueue(nvmet_rdma_delete_wq);
+ flush_scheduled_work();
}
ret = nvmet_rdma_cm_accept(cm_id, queue, &event->param.conn);
if (ret) {
- queue_work(nvmet_rdma_delete_wq, &queue->release_work);
+ schedule_work(&queue->release_work);
/* Destroying rdma_cm id is not needed here */
return 0;
}
if (disconnect) {
rdma_disconnect(queue->cm_id);
- queue_work(nvmet_rdma_delete_wq, &queue->release_work);
+ schedule_work(&queue->release_work);
}
}
mutex_unlock(&nvmet_rdma_queue_mutex);
pr_err("failed to connect queue %d\n", queue->idx);
- queue_work(nvmet_rdma_delete_wq, &queue->release_work);
+ schedule_work(&queue->release_work);
}
/**
if (ret)
goto err_ib_client;
- nvmet_rdma_delete_wq = alloc_workqueue("nvmet-rdma-delete-wq",
- WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_SYSFS, 0);
- if (!nvmet_rdma_delete_wq) {
- ret = -ENOMEM;
- goto err_unreg_transport;
- }
-
return 0;
-err_unreg_transport:
- nvmet_unregister_transport(&nvmet_rdma_ops);
err_ib_client:
ib_unregister_client(&nvmet_rdma_ib_client);
return ret;
static void __exit nvmet_rdma_exit(void)
{
- destroy_workqueue(nvmet_rdma_delete_wq);
nvmet_unregister_transport(&nvmet_rdma_ops);
ib_unregister_client(&nvmet_rdma_ib_client);
WARN_ON_ONCE(!list_empty(&nvmet_rdma_queue_list));
int bytes;
int bit_offset;
int nbits;
+ struct device_node *np;
struct nvmem_device *nvmem;
struct list_head node;
};
mutex_lock(&nvmem_mutex);
list_del(&cell->node);
mutex_unlock(&nvmem_mutex);
+ of_node_put(cell->np);
kfree(cell->name);
kfree(cell);
}
return -ENOMEM;
cell->nvmem = nvmem;
+ cell->np = of_node_get(child);
cell->offset = be32_to_cpup(addr++);
cell->bytes = be32_to_cpup(addr);
cell->name = kasprintf(GFP_KERNEL, "%pOFn", child);
#if IS_ENABLED(CONFIG_OF)
static struct nvmem_cell *
-nvmem_find_cell_by_index(struct nvmem_device *nvmem, int index)
+nvmem_find_cell_by_node(struct nvmem_device *nvmem, struct device_node *np)
{
struct nvmem_cell *cell = NULL;
- int i = 0;
mutex_lock(&nvmem_mutex);
list_for_each_entry(cell, &nvmem->cells, node) {
- if (index == i++)
+ if (np == cell->np)
break;
}
mutex_unlock(&nvmem_mutex);
if (IS_ERR(nvmem))
return ERR_CAST(nvmem);
- cell = nvmem_find_cell_by_index(nvmem, index);
+ cell = nvmem_find_cell_by_node(nvmem, cell_np);
if (!cell) {
__nvmem_device_put(nvmem);
return ERR_PTR(-ENOENT);
if (!(of_node_name_eq(next, "cpu") ||
(next->type && !of_node_cmp(next->type, "cpu"))))
continue;
- if (!__of_device_is_available(next))
- continue;
if (of_node_get(next))
break;
}
* set by the driver.
*/
mask = DMA_BIT_MASK(ilog2(dma_addr + size - 1) + 1);
- dev->bus_dma_mask = mask;
dev->coherent_dma_mask &= mask;
*dev->dma_mask &= mask;
+ /* ...but only set bus mask if we found valid dma-ranges earlier */
+ if (!ret)
+ dev->bus_dma_mask = mask;
coherent = of_dma_is_coherent(np);
dev_dbg(dev, "device is%sdma coherent\n",
distance = of_read_number(matrix, 1);
matrix++;
+ if ((nodea == nodeb && distance != LOCAL_DISTANCE) ||
+ (nodea != nodeb && distance <= LOCAL_DISTANCE)) {
+ pr_err("Invalid distance[node%d -> node%d] = %d\n",
+ nodea, nodeb, distance);
+ return -EINVAL;
+ }
+
numa_set_distance(nodea, nodeb, distance);
- pr_debug("distance[node%d -> node%d] = %d\n",
- nodea, nodeb, distance);
/* Set default distance of node B->A same as A->B */
if (nodeb > nodea)
*/
count = of_count_phandle_with_args(dev->of_node,
"operating-points-v2", NULL);
- if (count != 1)
- return -ENODEV;
-
- index = 0;
+ if (count == 1)
+ index = 0;
}
opp_table = dev_pm_opp_get_opp_table_indexed(dev, index);
int ret;
vdd_uv = _get_optimal_vdd_voltage(dev, &opp_data,
- new_supply_vbb->u_volt);
+ new_supply_vdd->u_volt);
+
+ if (new_supply_vdd->u_volt_min < vdd_uv)
+ new_supply_vdd->u_volt_min = vdd_uv;
/* Scaling up? Scale voltage before frequency */
if (freq > old_freq) {
.probe = ti_opp_supply_probe,
.driver = {
.name = "ti_opp_supply",
- .owner = THIS_MODULE,
.of_match_table = of_match_ptr(ti_opp_supply_of_match),
},
};
#define PCIE_PL_PFLR_FORCE_LINK (1 << 15)
#define PCIE_PHY_DEBUG_R0 (PL_OFFSET + 0x28)
#define PCIE_PHY_DEBUG_R1 (PL_OFFSET + 0x2c)
-#define PCIE_PHY_DEBUG_R1_XMLH_LINK_IN_TRAINING (1 << 29)
-#define PCIE_PHY_DEBUG_R1_XMLH_LINK_UP (1 << 4)
#define PCIE_PHY_CTRL (PL_OFFSET + 0x114)
#define PCIE_PHY_CTRL_DATA_LOC 0
return 0;
}
-static int imx6_pcie_link_up(struct dw_pcie *pci)
-{
- return dw_pcie_readl_dbi(pci, PCIE_PHY_DEBUG_R1) &
- PCIE_PHY_DEBUG_R1_XMLH_LINK_UP;
-}
-
static const struct dw_pcie_host_ops imx6_pcie_host_ops = {
.host_init = imx6_pcie_host_init,
};
}
static const struct dw_pcie_ops dw_pcie_ops = {
- .link_up = imx6_pcie_link_up,
+ /* No special ops needed, but pcie-designware still expects this struct */
};
#ifdef CONFIG_PM_SLEEP
int i;
for (i = 0; i < PCIE_IATU_NUM; i++)
- dw_pcie_disable_atu(pcie->pci, DW_PCIE_REGION_OUTBOUND, i);
+ dw_pcie_disable_atu(pcie->pci, i, DW_PCIE_REGION_OUTBOUND);
}
static int ls1021_pcie_link_up(struct dw_pcie *pci)
tbl_offset = dw_pcie_readl_dbi(pci, reg);
bir = (tbl_offset & PCI_MSIX_TABLE_BIR);
tbl_offset &= PCI_MSIX_TABLE_OFFSET;
- tbl_offset >>= 3;
reg = PCI_BASE_ADDRESS_0 + (4 * bir);
bar_addr_upper = 0;
{
struct pci_dev *pci_dev = to_pci_dev(dev);
struct acpi_device *adev = ACPI_COMPANION(dev);
- int node;
if (!adev)
return;
- node = acpi_get_node(adev->handle);
- if (node != NUMA_NO_NODE)
- set_dev_node(dev, node);
-
pci_acpi_optimize_delay(pci_dev, adev->handle);
pci_acpi_add_pm_notifier(adev, pci_dev);
u32 lnkcap2, lnkcap;
/*
- * PCIe r4.0 sec 7.5.3.18 recommends using the Supported Link
- * Speeds Vector in Link Capabilities 2 when supported, falling
- * back to Max Link Speed in Link Capabilities otherwise.
+ * Link Capabilities 2 was added in PCIe r3.0, sec 7.8.18. The
+ * implementation note there recommends using the Supported Link
+ * Speeds Vector in Link Capabilities 2 when supported.
+ *
+ * Without Link Capabilities 2, i.e., prior to PCIe r3.0, software
+ * should use the Supported Link Speeds field in Link Capabilities,
+ * where only 2.5 GT/s and 5.0 GT/s speeds were defined.
*/
pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2, &lnkcap2);
if (lnkcap2) { /* PCIe r3.0-compliant */
}
pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
- if (lnkcap) {
- if (lnkcap & PCI_EXP_LNKCAP_SLS_16_0GB)
- return PCIE_SPEED_16_0GT;
- else if (lnkcap & PCI_EXP_LNKCAP_SLS_8_0GB)
- return PCIE_SPEED_8_0GT;
- else if (lnkcap & PCI_EXP_LNKCAP_SLS_5_0GB)
- return PCIE_SPEED_5_0GT;
- else if (lnkcap & PCI_EXP_LNKCAP_SLS_2_5GB)
- return PCIE_SPEED_2_5GT;
- }
+ if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_5_0GB)
+ return PCIE_SPEED_5_0GT;
+ else if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_2_5GB)
+ return PCIE_SPEED_2_5GT;
return PCI_SPEED_UNKNOWN;
}
struct pcie_link_state *link;
int blacklist = !!pcie_aspm_sanity_check(pdev);
- if (!aspm_support_enabled || aspm_disabled)
+ if (!aspm_support_enabled)
return;
if (pdev->link_state)
.mask_core_ready = CORE_READY_STATUS,
.has_pll_override = true,
.autoresume_en = BIT(0),
+ .update_tune1_with_efuse = true,
};
static const char * const qusb2_phy_vreg_names[] = {
/*
* Read efuse register having TUNE2/1 parameter's high nibble.
- * If efuse register shows value as 0x0, or if we fail to find
- * a valid efuse register settings, then use default value
- * as 0xB for high nibble that we have already set while
- * configuring phy.
+ * If efuse register shows value as 0x0 (indicating value is not
+ * fused), or if we fail to find a valid efuse register setting,
+ * then use default value for high nibble that we have already
+ * set while configuring the phy.
*/
val = nvmem_cell_read(qphy->cell, NULL);
if (IS_ERR(val) || !val[0]) {
/* Fused TUNE1/2 value is the higher nibble only */
if (cfg->update_tune1_with_efuse)
- qusb2_setbits(qphy->base, cfg->regs[QUSB2PHY_PORT_TUNE1],
- val[0] << 0x4);
+ qusb2_write_mask(qphy->base, cfg->regs[QUSB2PHY_PORT_TUNE1],
+ val[0] << HSTX_TRIM_SHIFT,
+ HSTX_TRIM_MASK);
else
- qusb2_setbits(qphy->base, cfg->regs[QUSB2PHY_PORT_TUNE2],
- val[0] << 0x4);
-
+ qusb2_write_mask(qphy->base, cfg->regs[QUSB2PHY_PORT_TUNE2],
+ val[0] << HSTX_TRIM_SHIFT,
+ HSTX_TRIM_MASK);
}
static int qusb2_phy_set_mode(struct phy *phy, enum phy_mode mode)
config PHY_UNIPHIER_PCIE
tristate "Uniphier PHY driver for PCIe controller"
- depends on (ARCH_UNIPHIER || COMPILE_TEST) && OF
+ depends on ARCH_UNIPHIER || COMPILE_TEST
+ depends on OF && HAS_IOMEM
default PCIE_UNIPHIER
select GENERIC_PHY
help
static struct meson_bank meson_gxbb_aobus_banks[] = {
/* name first last irq pullen pull dir out in */
- BANK("AO", GPIOAO_0, GPIOAO_13, 0, 13, 0, 0, 0, 16, 0, 0, 0, 16, 1, 0),
+ BANK("AO", GPIOAO_0, GPIOAO_13, 0, 13, 0, 16, 0, 0, 0, 0, 0, 16, 1, 0),
};
static struct meson_pinctrl_data meson_gxbb_periphs_pinctrl_data = {
static struct meson_bank meson_gxl_aobus_banks[] = {
/* name first last irq pullen pull dir out in */
- BANK("AO", GPIOAO_0, GPIOAO_9, 0, 9, 0, 0, 0, 16, 0, 0, 0, 16, 1, 0),
+ BANK("AO", GPIOAO_0, GPIOAO_9, 0, 9, 0, 16, 0, 0, 0, 0, 0, 16, 1, 0),
};
static struct meson_pinctrl_data meson_gxl_periphs_pinctrl_data = {
dev_dbg(pc->dev, "pin %u: disable bias\n", pin);
meson_calc_reg_and_bit(bank, pin, REG_PULL, ®, &bit);
- ret = regmap_update_bits(pc->reg_pull, reg,
+ ret = regmap_update_bits(pc->reg_pullen, reg,
BIT(bit), 0);
if (ret)
return ret;
static struct meson_bank meson8_aobus_banks[] = {
/* name first last irq pullen pull dir out in */
- BANK("AO", GPIOAO_0, GPIO_TEST_N, 0, 13, 0, 0, 0, 16, 0, 0, 0, 16, 1, 0),
+ BANK("AO", GPIOAO_0, GPIO_TEST_N, 0, 13, 0, 16, 0, 0, 0, 0, 0, 16, 1, 0),
};
static struct meson_pinctrl_data meson8_cbus_pinctrl_data = {
static struct meson_bank meson8b_aobus_banks[] = {
/* name first lastc irq pullen pull dir out in */
- BANK("AO", GPIOAO_0, GPIO_TEST_N, 0, 13, 0, 0, 0, 16, 0, 0, 0, 16, 1, 0),
+ BANK("AO", GPIOAO_0, GPIO_TEST_N, 0, 13, 0, 16, 0, 0, 0, 0, 0, 16, 1, 0),
};
static struct meson_pinctrl_data meson8b_cbus_pinctrl_data = {
config PWM_TIECAP
tristate "ECAP PWM support"
- depends on ARCH_OMAP2PLUS || ARCH_DAVINCI_DA8XX || ARCH_KEYSTONE
+ depends on ARCH_OMAP2PLUS || ARCH_DAVINCI_DA8XX || ARCH_KEYSTONE || ARCH_K3
help
- PWM driver support for the ECAP APWM controller found on AM33XX
- TI SOC
+ PWM driver support for the ECAP APWM controller found on TI SOCs
To compile this driver as a module, choose M here: the module
will be called pwm-tiecap.
.clk_rate = 19200000,
.npwm = 1,
.base_unit_bits = 16,
+ .other_devices_aml_touches_pwm_regs = true,
};
/* Broxton */
platform_set_drvdata(pdev, lpwm);
+ dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_SMART_PREPARE);
pm_runtime_set_active(&pdev->dev);
pm_runtime_enable(&pdev->dev);
return pwm_lpss_remove(lpwm);
}
-static SIMPLE_DEV_PM_OPS(pwm_lpss_platform_pm_ops,
- pwm_lpss_suspend,
- pwm_lpss_resume);
+static int pwm_lpss_prepare(struct device *dev)
+{
+ struct pwm_lpss_chip *lpwm = dev_get_drvdata(dev);
+
+ /*
+ * If other device's AML code touches the PWM regs on suspend/resume
+ * force runtime-resume the PWM controller to allow this.
+ */
+ if (lpwm->info->other_devices_aml_touches_pwm_regs)
+ return 0; /* Force runtime-resume */
+
+ return 1; /* If runtime-suspended leave as is */
+}
+
+static const struct dev_pm_ops pwm_lpss_platform_pm_ops = {
+ .prepare = pwm_lpss_prepare,
+ SET_SYSTEM_SLEEP_PM_OPS(pwm_lpss_suspend, pwm_lpss_resume)
+};
static const struct acpi_device_id pwm_lpss_acpi_match[] = {
{ "80860F09", (unsigned long)&pwm_lpss_byt_info },
{ "80862288", (unsigned long)&pwm_lpss_bsw_info },
+ { "80862289", (unsigned long)&pwm_lpss_bsw_info },
{ "80865AC8", (unsigned long)&pwm_lpss_bxt_info },
{ },
};
/* Size of each PWM register space if multiple */
#define PWM_SIZE 0x400
-#define MAX_PWMS 4
-
-struct pwm_lpss_chip {
- struct pwm_chip chip;
- void __iomem *regs;
- const struct pwm_lpss_boardinfo *info;
- u32 saved_ctrl[MAX_PWMS];
-};
-
static inline struct pwm_lpss_chip *to_lpwm(struct pwm_chip *chip)
{
return container_of(chip, struct pwm_lpss_chip, chip);
unsigned long long on_time_div;
unsigned long c = lpwm->info->clk_rate, base_unit_range;
unsigned long long base_unit, freq = NSEC_PER_SEC;
- u32 ctrl;
+ u32 orig_ctrl, ctrl;
do_div(freq, period_ns);
do_div(on_time_div, period_ns);
on_time_div = 255ULL - on_time_div;
- ctrl = pwm_lpss_read(pwm);
+ orig_ctrl = ctrl = pwm_lpss_read(pwm);
ctrl &= ~PWM_ON_TIME_DIV_MASK;
ctrl &= ~(base_unit_range << PWM_BASE_UNIT_SHIFT);
base_unit &= base_unit_range;
ctrl |= (u32) base_unit << PWM_BASE_UNIT_SHIFT;
ctrl |= on_time_div;
- pwm_lpss_write(pwm, ctrl);
+
+ if (orig_ctrl != ctrl) {
+ pwm_lpss_write(pwm, ctrl);
+ pwm_lpss_write(pwm, ctrl | PWM_SW_UPDATE);
+ }
}
static inline void pwm_lpss_cond_enable(struct pwm_device *pwm, bool cond)
return ret;
}
pwm_lpss_prepare(lpwm, pwm, state->duty_cycle, state->period);
- pwm_lpss_write(pwm, pwm_lpss_read(pwm) | PWM_SW_UPDATE);
pwm_lpss_cond_enable(pwm, lpwm->info->bypass == false);
ret = pwm_lpss_wait_for_update(pwm);
if (ret) {
if (ret)
return ret;
pwm_lpss_prepare(lpwm, pwm, state->duty_cycle, state->period);
- pwm_lpss_write(pwm, pwm_lpss_read(pwm) | PWM_SW_UPDATE);
return pwm_lpss_wait_for_update(pwm);
}
} else if (pwm_is_enabled(pwm)) {
return 0;
}
+/* This function gets called once from pwmchip_add to get the initial state */
+static void pwm_lpss_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
+ struct pwm_state *state)
+{
+ struct pwm_lpss_chip *lpwm = to_lpwm(chip);
+ unsigned long base_unit_range;
+ unsigned long long base_unit, freq, on_time_div;
+ u32 ctrl;
+
+ base_unit_range = BIT(lpwm->info->base_unit_bits);
+
+ ctrl = pwm_lpss_read(pwm);
+ on_time_div = 255 - (ctrl & PWM_ON_TIME_DIV_MASK);
+ base_unit = (ctrl >> PWM_BASE_UNIT_SHIFT) & (base_unit_range - 1);
+
+ freq = base_unit * lpwm->info->clk_rate;
+ do_div(freq, base_unit_range);
+ if (freq == 0)
+ state->period = NSEC_PER_SEC;
+ else
+ state->period = NSEC_PER_SEC / (unsigned long)freq;
+
+ on_time_div *= state->period;
+ do_div(on_time_div, 255);
+ state->duty_cycle = on_time_div;
+
+ state->polarity = PWM_POLARITY_NORMAL;
+ state->enabled = !!(ctrl & PWM_ENABLE);
+
+ if (state->enabled)
+ pm_runtime_get(chip->dev);
+}
+
static const struct pwm_ops pwm_lpss_ops = {
.apply = pwm_lpss_apply,
+ .get_state = pwm_lpss_get_state,
.owner = THIS_MODULE,
};
int pwm_lpss_remove(struct pwm_lpss_chip *lpwm)
{
+ int i;
+
+ for (i = 0; i < lpwm->info->npwm; i++) {
+ if (pwm_is_enabled(&lpwm->chip.pwms[i]))
+ pm_runtime_put(lpwm->chip.dev);
+ }
return pwmchip_remove(&lpwm->chip);
}
EXPORT_SYMBOL_GPL(pwm_lpss_remove);
#include <linux/device.h>
#include <linux/pwm.h>
-struct pwm_lpss_chip;
+#define MAX_PWMS 4
+
+struct pwm_lpss_chip {
+ struct pwm_chip chip;
+ void __iomem *regs;
+ const struct pwm_lpss_boardinfo *info;
+ u32 saved_ctrl[MAX_PWMS];
+};
struct pwm_lpss_boardinfo {
unsigned long clk_rate;
unsigned int npwm;
unsigned long base_unit_bits;
bool bypass;
+ /*
+ * On some devices the _PS0/_PS3 AML code of the GPU (GFX0) device
+ * messes with the PWM0 controllers state,
+ */
+ bool other_devices_aml_touches_pwm_regs;
};
struct pwm_lpss_chip *pwm_lpss_probe(struct device *dev, struct resource *r,
+// SPDX-License-Identifier: GPL-2.0
/*
* R-Car PWM Timer driver
*
* Copyright (C) 2015 Renesas Electronics Corporation
- *
- * This is free software; you can redistribute it and/or modify
- * it under the terms of version 2 of the GNU General Public License as
- * published by the Free Software Foundation.
*/
#include <linux/clk.h>
+// SPDX-License-Identifier: GPL-2.0
/*
* R-Mobile TPU PWM driver
*
* Copyright (C) 2012 Renesas Solutions Corp.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
*/
#include <linux/clk.h>
{ .compatible = "nvidia,tegra186-pwm", .data = &tegra186_pwm_soc },
{ }
};
-
MODULE_DEVICE_TABLE(of, tegra_pwm_of_match);
static const struct dev_pm_ops tegra_pwm_pm_ops = {
static int pwm_export_child(struct device *parent, struct pwm_device *pwm)
{
struct pwm_export *export;
+ char *pwm_prop[2];
int ret;
if (test_and_set_bit(PWMF_EXPORTED, &pwm->flags))
export->pwm = pwm;
mutex_init(&export->lock);
- export->child.class = parent->class;
export->child.release = pwm_export_release;
export->child.parent = parent;
export->child.devt = MKDEV(0, 0);
export = NULL;
return ret;
}
+ pwm_prop[0] = kasprintf(GFP_KERNEL, "EXPORT=pwm%u", pwm->hwpwm);
+ pwm_prop[1] = NULL;
+ kobject_uevent_env(&parent->kobj, KOBJ_CHANGE, pwm_prop);
+ kfree(pwm_prop[0]);
return 0;
}
static int pwm_unexport_child(struct device *parent, struct pwm_device *pwm)
{
struct device *child;
+ char *pwm_prop[2];
if (!test_and_clear_bit(PWMF_EXPORTED, &pwm->flags))
return -ENODEV;
if (!child)
return -ENODEV;
+ pwm_prop[0] = kasprintf(GFP_KERNEL, "UNEXPORT=pwm%u", pwm->hwpwm);
+ pwm_prop[1] = NULL;
+ kobject_uevent_env(&parent->kobj, KOBJ_CHANGE, pwm_prop);
+ kfree(pwm_prop[0]);
+
/* for device_find_child() */
put_device(child);
device_unregister(child);
tv64.tv_sec = rtc_tm_to_time64(&tm);
#if BITS_PER_LONG == 32
- if (tv64.tv_sec > INT_MAX)
+ if (tv64.tv_sec > INT_MAX) {
+ err = -ERANGE;
goto err_read;
+ }
#endif
err = do_settimeofday64(&tv64);
struct cmos_rtc *cmos = dev_get_drvdata(dev);
unsigned char rtc_control;
+ /* This not only a rtc_op, but also called directly */
if (!is_valid_irq(cmos->irq))
return -EIO;
unsigned char mon, mday, hrs, min, sec, rtc_control;
int ret;
+ /* This not only a rtc_op, but also called directly */
if (!is_valid_irq(cmos->irq))
return -EIO;
struct cmos_rtc *cmos = dev_get_drvdata(dev);
unsigned long flags;
- if (!is_valid_irq(cmos->irq))
- return -EINVAL;
-
spin_lock_irqsave(&rtc_lock, flags);
if (enabled)
.alarm_irq_enable = cmos_alarm_irq_enable,
};
+static const struct rtc_class_ops cmos_rtc_ops_no_alarm = {
+ .read_time = cmos_read_time,
+ .set_time = cmos_set_time,
+ .proc = cmos_procfs,
+};
+
/*----------------------------------------------------------------*/
/*
dev_dbg(dev, "IRQ %d is already in use\n", rtc_irq);
goto cleanup1;
}
+
+ cmos_rtc.rtc->ops = &cmos_rtc_ops;
+ } else {
+ cmos_rtc.rtc->ops = &cmos_rtc_ops_no_alarm;
}
- cmos_rtc.rtc->ops = &cmos_rtc_ops;
cmos_rtc.rtc->nvram_old_abi = true;
retval = rtc_register_device(cmos_rtc.rtc);
if (retval)
/* get a report with all values through requesting one value */
sensor_hub_input_attr_get_raw_value(time_state->common_attributes.hsdev,
HID_USAGE_SENSOR_TIME, hid_time_addresses[0],
- time_state->info[0].report_id, SENSOR_HUB_SYNC);
+ time_state->info[0].report_id, SENSOR_HUB_SYNC, false);
/* wait for all values (event) */
ret = wait_for_completion_killable_timeout(
&time_state->comp_last_time, HZ*6);
memcpy(buf + 1, val, val_size);
ret = i2c_master_send(client, buf, val_size + 1);
+
+ kfree(buf);
+
if (ret != val_size + 1)
return ret < 0 ? ret : -EIO;
* orb specified one of the unsupported formats, we defer
* checking for IDAWs in unsupported formats to here.
*/
- if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw))
+ if ((!cp->orb.cmd.c64 || cp->orb.cmd.i2k) && ccw_is_idal(ccw)) {
+ kfree(p);
return -EOPNOTSUPP;
+ }
if ((!ccw_is_chain(ccw)) && (!ccw_is_tic(ccw)))
break;
ret = pfn_array_alloc_pin(pat->pat_pa, cp->mdev, ccw->cda, ccw->count);
if (ret < 0)
- goto out_init;
+ goto out_unpin;
/* Translate this direct ccw to a idal ccw. */
idaws = kcalloc(ret, sizeof(*idaws), GFP_DMA | GFP_KERNEL);
#include "vfio_ccw_private.h"
struct workqueue_struct *vfio_ccw_work_q;
-struct kmem_cache *vfio_ccw_io_region;
+static struct kmem_cache *vfio_ccw_io_region;
/*
* Helpers
if (ret)
goto out_free;
- ret = vfio_ccw_mdev_reg(sch);
- if (ret)
- goto out_disable;
-
INIT_WORK(&private->io_work, vfio_ccw_sch_io_todo);
atomic_set(&private->avail, 1);
private->state = VFIO_CCW_STATE_STANDBY;
+ ret = vfio_ccw_mdev_reg(sch);
+ if (ret)
+ goto out_disable;
+
return 0;
out_disable:
drvres = ap_drv->flags & AP_DRIVER_FLAG_DEFAULT;
if (!!devres != !!drvres)
return -ENODEV;
+ /* (re-)init queue's state machine */
+ ap_queue_reinit_state(to_ap_queue(dev));
}
/* Add queue/card to list of active queues/cards */
struct ap_device *ap_dev = to_ap_dev(dev);
struct ap_driver *ap_drv = ap_dev->drv;
+ if (is_queue_dev(dev))
+ ap_queue_remove(to_ap_queue(dev));
if (ap_drv->remove)
ap_drv->remove(ap_dev);
aq->ap_dev.device.parent = &ac->ap_dev.device;
dev_set_name(&aq->ap_dev.device,
"%02x.%04x", id, dom);
- /* Start with a device reset */
- spin_lock_bh(&aq->lock);
- ap_wait(ap_sm_event(aq, AP_EVENT_POLL));
- spin_unlock_bh(&aq->lock);
/* Register device */
rc = device_register(&aq->ap_dev.device);
if (rc) {
void ap_queue_remove(struct ap_queue *aq);
void ap_queue_suspend(struct ap_device *ap_dev);
void ap_queue_resume(struct ap_device *ap_dev);
+void ap_queue_reinit_state(struct ap_queue *aq);
struct ap_card *ap_card_create(int id, int queue_depth, int raw_device_type,
int comp_device_type, unsigned int functions);
{
ap_flush_queue(aq);
del_timer_sync(&aq->timeout);
+
+ /* reset with zero, also clears irq registration */
+ spin_lock_bh(&aq->lock);
+ ap_zapq(aq->qid);
+ aq->state = AP_STATE_BORKED;
+ spin_unlock_bh(&aq->lock);
}
EXPORT_SYMBOL(ap_queue_remove);
+
+void ap_queue_reinit_state(struct ap_queue *aq)
+{
+ spin_lock_bh(&aq->lock);
+ aq->state = AP_STATE_RESET_START;
+ ap_wait(ap_sm_event(aq, AP_EVENT_POLL));
+ spin_unlock_bh(&aq->lock);
+}
+EXPORT_SYMBOL(ap_queue_reinit_state);
struct ap_queue *aq = to_ap_queue(&ap_dev->device);
struct zcrypt_queue *zq = aq->private;
- ap_queue_remove(aq);
if (zq)
zcrypt_queue_unregister(zq);
}
struct ap_queue *aq = to_ap_queue(&ap_dev->device);
struct zcrypt_queue *zq = aq->private;
- ap_queue_remove(aq);
if (zq)
zcrypt_queue_unregister(zq);
}
struct ap_queue *aq = to_ap_queue(&ap_dev->device);
struct zcrypt_queue *zq = aq->private;
- ap_queue_remove(aq);
if (zq)
zcrypt_queue_unregister(zq);
}
break;
clear_bit_inv(bit, bv);
+ ism->sba->dmbe_mask[bit + ISM_DMB_BIT_OFFSET] = 0;
barrier();
smcd_handle_irq(ism->smcd, bit + ISM_DMB_BIT_OFFSET);
- ism->sba->dmbe_mask[bit + ISM_DMB_BIT_OFFSET] = 0;
}
if (ism->sba->e) {
#define SENSE_RESETTING_EVENT_BYTE 1
#define SENSE_RESETTING_EVENT_FLAG 0x80
+static inline u32 qeth_get_device_id(struct ccw_device *cdev)
+{
+ struct ccw_dev_id dev_id;
+ u32 id;
+
+ ccw_device_get_id(cdev, &dev_id);
+ id = dev_id.devno;
+ id |= (u32) (dev_id.ssid << 16);
+
+ return id;
+}
+
/*
* Common IO related definitions
*/
#define CARD_RDEV_ID(card) dev_name(&card->read.ccwdev->dev)
#define CARD_WDEV_ID(card) dev_name(&card->write.ccwdev->dev)
#define CARD_DDEV_ID(card) dev_name(&card->data.ccwdev->dev)
-#define CHANNEL_ID(channel) dev_name(&channel->ccwdev->dev)
+#define CCW_DEVID(cdev) (qeth_get_device_id(cdev))
+#define CARD_DEVID(card) (CCW_DEVID(CARD_RDEV(card)))
/**
* card stuff
/*some helper functions*/
#define QETH_CARD_IFNAME(card) (((card)->dev)? (card)->dev->name : "")
+static inline bool qeth_netdev_is_registered(struct net_device *dev)
+{
+ return dev->netdev_ops != NULL;
+}
+
static inline void qeth_scrub_qdio_buffer(struct qdio_buffer *buf,
unsigned int elements)
{
int qeth_do_run_thread(struct qeth_card *, unsigned long);
void qeth_clear_thread_start_bit(struct qeth_card *, unsigned long);
void qeth_clear_thread_running_bit(struct qeth_card *, unsigned long);
-int qeth_core_hardsetup_card(struct qeth_card *);
+int qeth_core_hardsetup_card(struct qeth_card *card, bool *carrier_ok);
void qeth_print_status_message(struct qeth_card *);
int qeth_init_qdio_queues(struct qeth_card *);
int qeth_send_ipa_cmd(struct qeth_card *, struct qeth_cmd_buffer *,
int qeth_hw_trap(struct qeth_card *, enum qeth_diags_trap_action);
void qeth_trace_features(struct qeth_card *);
void qeth_close_dev(struct qeth_card *);
-int qeth_send_setassparms(struct qeth_card *, struct qeth_cmd_buffer *, __u16,
- long,
- int (*reply_cb)(struct qeth_card *,
- struct qeth_reply *, unsigned long),
- void *);
int qeth_setassparms_cb(struct qeth_card *, struct qeth_reply *, unsigned long);
struct qeth_cmd_buffer *qeth_get_setassparms_cmd(struct qeth_card *,
enum qeth_ipa_funcs,
return "OSD_1000";
case QETH_LINK_TYPE_10GBIT_ETH:
return "OSD_10GIG";
+ case QETH_LINK_TYPE_25GBIT_ETH:
+ return "OSD_25GIG";
case QETH_LINK_TYPE_LANE_ETH100:
return "OSD_FE_LANE";
case QETH_LINK_TYPE_LANE_TR:
if (!iob) {
dev_warn(&card->gdev->dev, "The qeth device driver "
"failed to recover an error on the device\n");
- QETH_DBF_MESSAGE(2, "%s issue_next_read failed: no iob "
- "available\n", dev_name(&card->gdev->dev));
+ QETH_DBF_MESSAGE(2, "issue_next_read on device %x failed: no iob available\n",
+ CARD_DEVID(card));
return -ENOMEM;
}
qeth_setup_ccw(channel->ccw, CCW_CMD_READ, QETH_BUFSIZE, iob->data);
rc = ccw_device_start(channel->ccwdev, channel->ccw,
(addr_t) iob, 0, 0);
if (rc) {
- QETH_DBF_MESSAGE(2, "%s error in starting next read ccw! "
- "rc=%i\n", dev_name(&card->gdev->dev), rc);
+ QETH_DBF_MESSAGE(2, "error %i on device %x when starting next read ccw!\n",
+ rc, CARD_DEVID(card));
atomic_set(&channel->irq_pending, 0);
card->read_or_write_problem = 1;
qeth_schedule_recovery(card);
const char *ipa_name;
int com = cmd->hdr.command;
ipa_name = qeth_get_ipa_cmd_name(com);
+
if (rc)
- QETH_DBF_MESSAGE(2, "IPA: %s(x%X) for %s/%s returned "
- "x%X \"%s\"\n",
- ipa_name, com, dev_name(&card->gdev->dev),
- QETH_CARD_IFNAME(card), rc,
- qeth_get_ipa_msg(rc));
+ QETH_DBF_MESSAGE(2, "IPA: %s(%#x) for device %x returned %#x \"%s\"\n",
+ ipa_name, com, CARD_DEVID(card), rc,
+ qeth_get_ipa_msg(rc));
else
- QETH_DBF_MESSAGE(5, "IPA: %s(x%X) for %s/%s succeeded\n",
- ipa_name, com, dev_name(&card->gdev->dev),
- QETH_CARD_IFNAME(card));
+ QETH_DBF_MESSAGE(5, "IPA: %s(%#x) for device %x succeeded\n",
+ ipa_name, com, CARD_DEVID(card));
}
static struct qeth_ipa_cmd *qeth_check_ipa_data(struct qeth_card *card,
QETH_DBF_HEX(CTRL, 2, buffer, QETH_DBF_CTRL_LEN);
if ((buffer[2] & 0xc0) == 0xc0) {
- QETH_DBF_MESSAGE(2, "received an IDX TERMINATE with cause code %#02x\n",
+ QETH_DBF_MESSAGE(2, "received an IDX TERMINATE with cause code %#04x\n",
buffer[4]);
QETH_CARD_TEXT(card, 2, "ckidxres");
QETH_CARD_TEXT(card, 2, " idxterm");
QETH_CARD_TEXT(card, 2, "CGENCHK");
dev_warn(&cdev->dev, "The qeth device driver "
"failed to recover an error on the device\n");
- QETH_DBF_MESSAGE(2, "%s check on device dstat=x%x, cstat=x%x\n",
- dev_name(&cdev->dev), dstat, cstat);
+ QETH_DBF_MESSAGE(2, "check on channel %x with dstat=%#x, cstat=%#x\n",
+ CCW_DEVID(cdev), dstat, cstat);
print_hex_dump(KERN_WARNING, "qeth: irb ", DUMP_PREFIX_OFFSET,
16, 1, irb, 64, 1);
return 1;
switch (PTR_ERR(irb)) {
case -EIO:
- QETH_DBF_MESSAGE(2, "%s i/o-error on device\n",
- dev_name(&cdev->dev));
+ QETH_DBF_MESSAGE(2, "i/o-error on channel %x\n",
+ CCW_DEVID(cdev));
QETH_CARD_TEXT(card, 2, "ckirberr");
QETH_CARD_TEXT_(card, 2, " rc%d", -EIO);
break;
}
break;
default:
- QETH_DBF_MESSAGE(2, "%s unknown error %ld on device\n",
- dev_name(&cdev->dev), PTR_ERR(irb));
+ QETH_DBF_MESSAGE(2, "unknown error %ld on channel %x\n",
+ PTR_ERR(irb), CCW_DEVID(cdev));
QETH_CARD_TEXT(card, 2, "ckirberr");
QETH_CARD_TEXT(card, 2, " rc???");
}
dev_warn(&channel->ccwdev->dev,
"The qeth device driver failed to recover "
"an error on the device\n");
- QETH_DBF_MESSAGE(2, "%s sense data available. cstat "
- "0x%X dstat 0x%X\n",
- dev_name(&channel->ccwdev->dev), cstat, dstat);
+ QETH_DBF_MESSAGE(2, "sense data available on channel %x: cstat %#X dstat %#X\n",
+ CCW_DEVID(channel->ccwdev), cstat,
+ dstat);
print_hex_dump(KERN_WARNING, "qeth: irb ",
DUMP_PREFIX_OFFSET, 16, 1, irb, 32, 1);
print_hex_dump(KERN_WARNING, "qeth: sense data ",
if (channel->state != CH_STATE_ACTIVATING) {
dev_warn(&channel->ccwdev->dev, "The qeth device driver"
" failed to recover an error on the device\n");
- QETH_DBF_MESSAGE(2, "%s IDX activate timed out\n",
- dev_name(&channel->ccwdev->dev));
+ QETH_DBF_MESSAGE(2, "IDX activate timed out on channel %x\n",
+ CCW_DEVID(channel->ccwdev));
QETH_DBF_TEXT_(SETUP, 2, "2err%d", -ETIME);
return -ETIME;
}
"The adapter is used exclusively by another "
"host\n");
else
- QETH_DBF_MESSAGE(2, "%s IDX_ACTIVATE on write channel:"
- " negative reply\n",
- dev_name(&channel->ccwdev->dev));
+ QETH_DBF_MESSAGE(2, "IDX_ACTIVATE on channel %x: negative reply\n",
+ CCW_DEVID(channel->ccwdev));
goto out;
}
memcpy(&temp, QETH_IDX_ACT_FUNC_LEVEL(iob->data), 2);
if ((temp & ~0x0100) != qeth_peer_func_level(card->info.func_level)) {
- QETH_DBF_MESSAGE(2, "%s IDX_ACTIVATE on write channel: "
- "function level mismatch (sent: 0x%x, received: "
- "0x%x)\n", dev_name(&channel->ccwdev->dev),
- card->info.func_level, temp);
+ QETH_DBF_MESSAGE(2, "IDX_ACTIVATE on channel %x: function level mismatch (sent: %#x, received: %#x)\n",
+ CCW_DEVID(channel->ccwdev),
+ card->info.func_level, temp);
goto out;
}
channel->state = CH_STATE_UP;
"insufficient authorization\n");
break;
default:
- QETH_DBF_MESSAGE(2, "%s IDX_ACTIVATE on read channel:"
- " negative reply\n",
- dev_name(&channel->ccwdev->dev));
+ QETH_DBF_MESSAGE(2, "IDX_ACTIVATE on channel %x: negative reply\n",
+ CCW_DEVID(channel->ccwdev));
}
QETH_CARD_TEXT_(card, 2, "idxread%c",
QETH_IDX_ACT_CAUSE_CODE(iob->data));
memcpy(&temp, QETH_IDX_ACT_FUNC_LEVEL(iob->data), 2);
if (temp != qeth_peer_func_level(card->info.func_level)) {
- QETH_DBF_MESSAGE(2, "%s IDX_ACTIVATE on read channel: function "
- "level mismatch (sent: 0x%x, received: 0x%x)\n",
- dev_name(&channel->ccwdev->dev),
- card->info.func_level, temp);
+ QETH_DBF_MESSAGE(2, "IDX_ACTIVATE on channel %x: function level mismatch (sent: %#x, received: %#x)\n",
+ CCW_DEVID(channel->ccwdev),
+ card->info.func_level, temp);
goto out;
}
memcpy(&card->token.issuer_rm_r,
(addr_t) iob, 0, 0, event_timeout);
spin_unlock_irq(get_ccwdev_lock(channel->ccwdev));
if (rc) {
- QETH_DBF_MESSAGE(2, "%s qeth_send_control_data: "
- "ccw_device_start rc = %i\n",
- dev_name(&channel->ccwdev->dev), rc);
+ QETH_DBF_MESSAGE(2, "qeth_send_control_data on device %x: ccw_device_start rc = %i\n",
+ CARD_DEVID(card), rc);
QETH_CARD_TEXT_(card, 2, " err%d", rc);
spin_lock_irq(&card->lock);
list_del_init(&reply->list);
} else {
dev_warn(&card->gdev->dev,
"The qeth driver ran out of channel command buffers\n");
- QETH_DBF_MESSAGE(1, "%s The qeth driver ran out of channel command buffers",
- dev_name(&card->gdev->dev));
+ QETH_DBF_MESSAGE(1, "device %x ran out of channel command buffers",
+ CARD_DEVID(card));
}
return iob;
return 0;
default:
if (cmd->hdr.return_code) {
- QETH_DBF_MESSAGE(1, "%s IPA_CMD_QIPASSIST: Unhandled "
- "rc=%d\n",
- dev_name(&card->gdev->dev),
- cmd->hdr.return_code);
+ QETH_DBF_MESSAGE(1, "IPA_CMD_QIPASSIST on device %x: Unhandled rc=%#x\n",
+ CARD_DEVID(card),
+ cmd->hdr.return_code);
return 0;
}
}
card->options.ipa6.supported_funcs = cmd->hdr.ipa_supported;
card->options.ipa6.enabled_funcs = cmd->hdr.ipa_enabled;
} else
- QETH_DBF_MESSAGE(1, "%s IPA_CMD_QIPASSIST: Flawed LIC detected"
- "\n", dev_name(&card->gdev->dev));
+ QETH_DBF_MESSAGE(1, "IPA_CMD_QIPASSIST on device %x: Flawed LIC detected\n",
+ CARD_DEVID(card));
return 0;
}
cmd->data.setadapterparms.hdr.return_code);
if (cmd->data.setadapterparms.hdr.return_code !=
SET_ACCESS_CTRL_RC_SUCCESS)
- QETH_DBF_MESSAGE(3, "ERR:SET_ACCESS_CTRL(%s,%d)==%d\n",
- card->gdev->dev.kobj.name,
- access_ctrl_req->subcmd_code,
- cmd->data.setadapterparms.hdr.return_code);
+ QETH_DBF_MESSAGE(3, "ERR:SET_ACCESS_CTRL(%#x) on device %x: %#x\n",
+ access_ctrl_req->subcmd_code, CARD_DEVID(card),
+ cmd->data.setadapterparms.hdr.return_code);
switch (cmd->data.setadapterparms.hdr.return_code) {
case SET_ACCESS_CTRL_RC_SUCCESS:
if (card->options.isolation == ISOLATION_MODE_NONE) {
}
break;
case SET_ACCESS_CTRL_RC_ALREADY_NOT_ISOLATED:
- QETH_DBF_MESSAGE(2, "%s QDIO data connection isolation already "
- "deactivated\n", dev_name(&card->gdev->dev));
+ QETH_DBF_MESSAGE(2, "QDIO data connection isolation on device %x already deactivated\n",
+ CARD_DEVID(card));
if (fallback)
card->options.isolation = card->options.prev_isolation;
break;
case SET_ACCESS_CTRL_RC_ALREADY_ISOLATED:
- QETH_DBF_MESSAGE(2, "%s QDIO data connection isolation already"
- " activated\n", dev_name(&card->gdev->dev));
+ QETH_DBF_MESSAGE(2, "QDIO data connection isolation on device %x already activated\n",
+ CARD_DEVID(card));
if (fallback)
card->options.isolation = card->options.prev_isolation;
break;
rc = qeth_setadpparms_set_access_ctrl(card,
card->options.isolation, fallback);
if (rc) {
- QETH_DBF_MESSAGE(3,
- "IPA(SET_ACCESS_CTRL,%s,%d) sent failed\n",
- card->gdev->dev.kobj.name,
- rc);
+ QETH_DBF_MESSAGE(3, "IPA(SET_ACCESS_CTRL(%d) on device %x: sent failed\n",
+ rc, CARD_DEVID(card));
rc = -EOPNOTSUPP;
}
} else if (card->options.isolation != ISOLATION_MODE_NONE) {
rc = BMCR_FULLDPLX;
if ((card->info.link_type != QETH_LINK_TYPE_GBIT_ETH) &&
(card->info.link_type != QETH_LINK_TYPE_OSN) &&
- (card->info.link_type != QETH_LINK_TYPE_10GBIT_ETH))
+ (card->info.link_type != QETH_LINK_TYPE_10GBIT_ETH) &&
+ (card->info.link_type != QETH_LINK_TYPE_25GBIT_ETH))
rc |= BMCR_SPEED100;
break;
case MII_BMSR: /* Basic mode status register */
{
struct qeth_ipa_cmd *cmd;
struct qeth_arp_query_info *qinfo;
- struct qeth_snmp_cmd *snmp;
unsigned char *data;
+ void *snmp_data;
__u16 data_len;
QETH_CARD_TEXT(card, 3, "snpcmdcb");
cmd = (struct qeth_ipa_cmd *) sdata;
data = (unsigned char *)((char *)cmd - reply->offset);
qinfo = (struct qeth_arp_query_info *) reply->param;
- snmp = &cmd->data.setadapterparms.data.snmp;
if (cmd->hdr.return_code) {
QETH_CARD_TEXT_(card, 4, "scer1%x", cmd->hdr.return_code);
return 0;
}
data_len = *((__u16 *)QETH_IPA_PDU_LEN_PDU1(data));
- if (cmd->data.setadapterparms.hdr.seq_no == 1)
- data_len -= (__u16)((char *)&snmp->data - (char *)cmd);
- else
- data_len -= (__u16)((char *)&snmp->request - (char *)cmd);
+ if (cmd->data.setadapterparms.hdr.seq_no == 1) {
+ snmp_data = &cmd->data.setadapterparms.data.snmp;
+ data_len -= offsetof(struct qeth_ipa_cmd,
+ data.setadapterparms.data.snmp);
+ } else {
+ snmp_data = &cmd->data.setadapterparms.data.snmp.request;
+ data_len -= offsetof(struct qeth_ipa_cmd,
+ data.setadapterparms.data.snmp.request);
+ }
/* check if there is enough room in userspace */
if ((qinfo->udata_len - qinfo->udata_offset) < data_len) {
QETH_CARD_TEXT_(card, 4, "sseqn%i",
cmd->data.setadapterparms.hdr.seq_no);
/*copy entries to user buffer*/
- if (cmd->data.setadapterparms.hdr.seq_no == 1) {
- memcpy(qinfo->udata + qinfo->udata_offset,
- (char *)snmp,
- data_len + offsetof(struct qeth_snmp_cmd, data));
- qinfo->udata_offset += offsetof(struct qeth_snmp_cmd, data);
- } else {
- memcpy(qinfo->udata + qinfo->udata_offset,
- (char *)&snmp->request, data_len);
- }
+ memcpy(qinfo->udata + qinfo->udata_offset, snmp_data, data_len);
qinfo->udata_offset += data_len;
+
/* check if all replies received ... */
QETH_CARD_TEXT_(card, 4, "srtot%i",
cmd->data.setadapterparms.hdr.used_total);
rc = qeth_send_ipa_snmp_cmd(card, iob, QETH_SETADP_BASE_LEN + req_len,
qeth_snmp_command_cb, (void *)&qinfo);
if (rc)
- QETH_DBF_MESSAGE(2, "SNMP command failed on %s: (0x%x)\n",
- QETH_CARD_IFNAME(card), rc);
+ QETH_DBF_MESSAGE(2, "SNMP command failed on device %x: (%#x)\n",
+ CARD_DEVID(card), rc);
else {
if (copy_to_user(udata, qinfo.udata, qinfo.udata_len))
rc = -EFAULT;
rc = qeth_read_conf_data(card, (void **) &prcd, &length);
if (rc) {
- QETH_DBF_MESSAGE(2, "%s qeth_read_conf_data returned %i\n",
- dev_name(&card->gdev->dev), rc);
+ QETH_DBF_MESSAGE(2, "qeth_read_conf_data on device %x returned %i\n",
+ CARD_DEVID(card), rc);
QETH_DBF_TEXT_(SETUP, 2, "5err%d", rc);
goto out_offline;
}
.remove = ccwgroup_remove_ccwdev,
};
-int qeth_core_hardsetup_card(struct qeth_card *card)
+int qeth_core_hardsetup_card(struct qeth_card *card, bool *carrier_ok)
{
int retries = 3;
int rc;
qeth_update_from_chp_desc(card);
retry:
if (retries < 3)
- QETH_DBF_MESSAGE(2, "%s Retrying to do IDX activates.\n",
- dev_name(&card->gdev->dev));
+ QETH_DBF_MESSAGE(2, "Retrying to do IDX activates on device %x.\n",
+ CARD_DEVID(card));
rc = qeth_qdio_clear_card(card, card->info.type != QETH_CARD_TYPE_IQD);
ccw_device_set_offline(CARD_DDEV(card));
ccw_device_set_offline(CARD_WDEV(card));
if (rc == IPA_RC_LAN_OFFLINE) {
dev_warn(&card->gdev->dev,
"The LAN is offline\n");
- netif_carrier_off(card->dev);
+ *carrier_ok = false;
} else {
rc = -ENODEV;
goto out;
}
} else {
- netif_carrier_on(card->dev);
+ *carrier_ok = true;
+ }
+
+ if (qeth_netdev_is_registered(card->dev)) {
+ if (*carrier_ok)
+ netif_carrier_on(card->dev);
+ else
+ netif_carrier_off(card->dev);
}
card->options.ipa4.supported_funcs = 0;
out:
dev_warn(&card->gdev->dev, "The qeth device driver failed to recover "
"an error on the device\n");
- QETH_DBF_MESSAGE(2, "%s Initialization in hardsetup failed! rc=%d\n",
- dev_name(&card->gdev->dev), rc);
+ QETH_DBF_MESSAGE(2, "Initialization for device %x failed in hardsetup! rc=%d\n",
+ CARD_DEVID(card), rc);
return rc;
}
EXPORT_SYMBOL_GPL(qeth_core_hardsetup_card);
}
EXPORT_SYMBOL_GPL(qeth_get_setassparms_cmd);
-int qeth_send_setassparms(struct qeth_card *card,
- struct qeth_cmd_buffer *iob, __u16 len, long data,
- int (*reply_cb)(struct qeth_card *,
- struct qeth_reply *, unsigned long),
- void *reply_param)
+static int qeth_send_setassparms(struct qeth_card *card,
+ struct qeth_cmd_buffer *iob, u16 len,
+ long data, int (*reply_cb)(struct qeth_card *,
+ struct qeth_reply *,
+ unsigned long),
+ void *reply_param)
{
int rc;
struct qeth_ipa_cmd *cmd;
rc = qeth_send_ipa_cmd(card, iob, reply_cb, reply_param);
return rc;
}
-EXPORT_SYMBOL_GPL(qeth_send_setassparms);
int qeth_send_simple_setassparms_prot(struct qeth_card *card,
enum qeth_ipa_funcs ipa_func,
WARN_ON_ONCE(1);
}
- /* fallthrough from high to low, to select all legal speeds: */
+ /* partially does fall through, to also select lower speeds */
switch (maxspeed) {
+ case SPEED_25000:
+ ethtool_link_ksettings_add_link_mode(cmd, supported,
+ 25000baseSR_Full);
+ ethtool_link_ksettings_add_link_mode(cmd, advertising,
+ 25000baseSR_Full);
+ break;
case SPEED_10000:
ethtool_link_ksettings_add_link_mode(cmd, supported,
10000baseT_Full);
cmd->base.speed = SPEED_10000;
cmd->base.port = PORT_FIBRE;
break;
+ case QETH_LINK_TYPE_25GBIT_ETH:
+ cmd->base.speed = SPEED_25000;
+ cmd->base.port = PORT_FIBRE;
+ break;
default:
cmd->base.speed = SPEED_10;
cmd->base.port = PORT_TP;
case CARD_INFO_PORTS_10G:
cmd->base.speed = SPEED_10000;
break;
+ case CARD_INFO_PORTS_25G:
+ cmd->base.speed = SPEED_25000;
+ break;
}
return 0;
QETH_LINK_TYPE_GBIT_ETH = 0x03,
QETH_LINK_TYPE_OSN = 0x04,
QETH_LINK_TYPE_10GBIT_ETH = 0x10,
+ QETH_LINK_TYPE_25GBIT_ETH = 0x12,
QETH_LINK_TYPE_LANE_ETH100 = 0x81,
QETH_LINK_TYPE_LANE_TR = 0x82,
QETH_LINK_TYPE_LANE_ETH1000 = 0x83,
CARD_INFO_PORTS_100M = 0x00000006,
CARD_INFO_PORTS_1G = 0x00000007,
CARD_INFO_PORTS_10G = 0x00000008,
+ CARD_INFO_PORTS_25G = 0x0000000A,
};
/* (SET)DELIP(M) IPA stuff ***************************************************/
__u32 flags_32bit;
struct qeth_ipa_caps caps;
struct qeth_checksum_cmd chksum;
- struct qeth_arp_cache_entry add_arp_entry;
+ struct qeth_arp_cache_entry arp_entry;
struct qeth_arp_query_data query_arp;
struct qeth_tso_start_data tso;
__u8 ip[16];
QETH_CARD_TEXT(card, 2, "L2Wmac");
rc = qeth_l2_send_setdelmac(card, mac, cmd);
if (rc == -EEXIST)
- QETH_DBF_MESSAGE(2, "MAC %pM already registered on %s\n",
- mac, QETH_CARD_IFNAME(card));
+ QETH_DBF_MESSAGE(2, "MAC already registered on device %x\n",
+ CARD_DEVID(card));
else if (rc)
- QETH_DBF_MESSAGE(2, "Failed to register MAC %pM on %s: %d\n",
- mac, QETH_CARD_IFNAME(card), rc);
+ QETH_DBF_MESSAGE(2, "Failed to register MAC on device %x: %d\n",
+ CARD_DEVID(card), rc);
return rc;
}
QETH_CARD_TEXT(card, 2, "L2Rmac");
rc = qeth_l2_send_setdelmac(card, mac, cmd);
if (rc)
- QETH_DBF_MESSAGE(2, "Failed to delete MAC %pM on %s: %d\n",
- mac, QETH_CARD_IFNAME(card), rc);
+ QETH_DBF_MESSAGE(2, "Failed to delete MAC on device %u: %d\n",
+ CARD_DEVID(card), rc);
return rc;
}
QETH_CARD_TEXT(card, 2, "L2sdvcb");
if (cmd->hdr.return_code) {
- QETH_DBF_MESSAGE(2, "Error in processing VLAN %i on %s: 0x%x.\n",
+ QETH_DBF_MESSAGE(2, "Error in processing VLAN %u on device %x: %#x.\n",
cmd->data.setdelvlan.vlan_id,
- QETH_CARD_IFNAME(card), cmd->hdr.return_code);
+ CARD_DEVID(card), cmd->hdr.return_code);
QETH_CARD_TEXT_(card, 2, "L2VL%4x", cmd->hdr.command);
QETH_CARD_TEXT_(card, 2, "err%d", cmd->hdr.return_code);
}
rc = qeth_vm_request_mac(card);
if (!rc)
goto out;
- QETH_DBF_MESSAGE(2, "z/VM MAC Service failed on device %s: x%x\n",
- CARD_BUS_ID(card), rc);
+ QETH_DBF_MESSAGE(2, "z/VM MAC Service failed on device %x: %#x\n",
+ CARD_DEVID(card), rc);
QETH_DBF_TEXT_(SETUP, 2, "err%04x", rc);
/* fall back to alternative mechanism: */
}
rc = qeth_setadpparms_change_macaddr(card);
if (!rc)
goto out;
- QETH_DBF_MESSAGE(2, "READ_MAC Assist failed on device %s: x%x\n",
- CARD_BUS_ID(card), rc);
+ QETH_DBF_MESSAGE(2, "READ_MAC Assist failed on device %x: %#x\n",
+ CARD_DEVID(card), rc);
QETH_DBF_TEXT_(SETUP, 2, "1err%04x", rc);
/* fall back once more: */
}
if (cgdev->state == CCWGROUP_ONLINE)
qeth_l2_set_offline(cgdev);
- unregister_netdev(card->dev);
+ if (qeth_netdev_is_registered(card->dev))
+ unregister_netdev(card->dev);
}
static const struct ethtool_ops qeth_l2_ethtool_ops = {
.ndo_set_features = qeth_set_features
};
-static int qeth_l2_setup_netdev(struct qeth_card *card)
+static int qeth_l2_setup_netdev(struct qeth_card *card, bool carrier_ok)
{
int rc;
- if (card->dev->netdev_ops)
+ if (qeth_netdev_is_registered(card->dev))
return 0;
card->dev->priv_flags |= IFF_UNICAST_FLT;
qeth_l2_request_initial_mac(card);
netif_napi_add(card->dev, &card->napi, qeth_poll, QETH_NAPI_WEIGHT);
rc = register_netdev(card->dev);
+ if (!rc && carrier_ok)
+ netif_carrier_on(card->dev);
+
if (rc)
card->dev->netdev_ops = NULL;
return rc;
struct qeth_card *card = dev_get_drvdata(&gdev->dev);
int rc = 0;
enum qeth_card_states recover_flag;
+ bool carrier_ok;
mutex_lock(&card->discipline_mutex);
mutex_lock(&card->conf_mutex);
QETH_DBF_HEX(SETUP, 2, &card, sizeof(void *));
recover_flag = card->state;
- rc = qeth_core_hardsetup_card(card);
+ rc = qeth_core_hardsetup_card(card, &carrier_ok);
if (rc) {
QETH_DBF_TEXT_(SETUP, 2, "2err%04x", rc);
rc = -ENODEV;
dev_info(&card->gdev->dev,
"The device represents a Bridge Capable Port\n");
- rc = qeth_l2_setup_netdev(card);
+ rc = qeth_l2_setup_netdev(card, carrier_ok);
if (rc)
goto out_remove;
QETH_CARD_TEXT(card, 4, "clearip");
- if (recover && card->options.sniffer)
- return;
-
spin_lock_bh(&card->ip_lock);
hash_for_each_safe(card->ip_htable, i, tmp, addr, hnode) {
QETH_PROT_IPV4);
if (rc) {
card->options.route4.type = NO_ROUTER;
- QETH_DBF_MESSAGE(2, "Error (0x%04x) while setting routing type"
- " on %s. Type set to 'no router'.\n", rc,
- QETH_CARD_IFNAME(card));
+ QETH_DBF_MESSAGE(2, "Error (%#06x) while setting routing type on device %x. Type set to 'no router'.\n",
+ rc, CARD_DEVID(card));
}
return rc;
}
QETH_PROT_IPV6);
if (rc) {
card->options.route6.type = NO_ROUTER;
- QETH_DBF_MESSAGE(2, "Error (0x%04x) while setting routing type"
- " on %s. Type set to 'no router'.\n", rc,
- QETH_CARD_IFNAME(card));
+ QETH_DBF_MESSAGE(2, "Error (%#06x) while setting routing type on device %x. Type set to 'no router'.\n",
+ rc, CARD_DEVID(card));
}
return rc;
}
int rc = 0;
int cnt = 3;
+ if (card->options.sniffer)
+ return 0;
if (addr->proto == QETH_PROT_IPV4) {
QETH_CARD_TEXT(card, 2, "setaddr4");
{
int rc = 0;
+ if (card->options.sniffer)
+ return 0;
+
if (addr->proto == QETH_PROT_IPV4) {
QETH_CARD_TEXT(card, 2, "deladdr4");
QETH_CARD_HEX(card, 3, &addr->u.a4.addr, sizeof(int));
}
break;
default:
- QETH_DBF_MESSAGE(2, "Unknown sniffer action (0x%04x) on %s\n",
- cmd->data.diagass.action, QETH_CARD_IFNAME(card));
+ QETH_DBF_MESSAGE(2, "Unknown sniffer action (%#06x) on device %x\n",
+ cmd->data.diagass.action, CARD_DEVID(card));
}
return 0;
qeth_l3_handle_promisc_mode(card);
}
-static const char *qeth_l3_arp_get_error_cause(int *rc)
+static int qeth_l3_arp_makerc(int rc)
{
- switch (*rc) {
- case QETH_IPA_ARP_RC_FAILED:
- *rc = -EIO;
- return "operation failed";
+ switch (rc) {
+ case IPA_RC_SUCCESS:
+ return 0;
case QETH_IPA_ARP_RC_NOTSUPP:
- *rc = -EOPNOTSUPP;
- return "operation not supported";
- case QETH_IPA_ARP_RC_OUT_OF_RANGE:
- *rc = -EINVAL;
- return "argument out of range";
case QETH_IPA_ARP_RC_Q_NOTSUPP:
- *rc = -EOPNOTSUPP;
- return "query operation not supported";
+ return -EOPNOTSUPP;
+ case QETH_IPA_ARP_RC_OUT_OF_RANGE:
+ return -EINVAL;
case QETH_IPA_ARP_RC_Q_NO_DATA:
- *rc = -ENOENT;
- return "no query data available";
+ return -ENOENT;
default:
- return "unknown error";
+ return -EIO;
}
}
static int qeth_l3_arp_set_no_entries(struct qeth_card *card, int no_entries)
{
- int tmp;
int rc;
QETH_CARD_TEXT(card, 3, "arpstnoe");
rc = qeth_send_simple_setassparms(card, IPA_ARP_PROCESSING,
IPA_CMD_ASS_ARP_SET_NO_ENTRIES,
no_entries);
- if (rc) {
- tmp = rc;
- QETH_DBF_MESSAGE(2, "Could not set number of ARP entries on "
- "%s: %s (0x%x/%d)\n", QETH_CARD_IFNAME(card),
- qeth_l3_arp_get_error_cause(&rc), tmp, tmp);
- }
- return rc;
+ if (rc)
+ QETH_DBF_MESSAGE(2, "Could not set number of ARP entries on device %x: %#x\n",
+ CARD_DEVID(card), rc);
+ return qeth_l3_arp_makerc(rc);
}
static __u32 get_arp_entry_size(struct qeth_card *card,
{
struct qeth_cmd_buffer *iob;
struct qeth_ipa_cmd *cmd;
- int tmp;
int rc;
QETH_CARD_TEXT_(card, 3, "qarpipv%i", prot);
rc = qeth_l3_send_ipa_arp_cmd(card, iob,
QETH_SETASS_BASE_LEN+QETH_ARP_CMD_LEN,
qeth_l3_arp_query_cb, (void *)qinfo);
- if (rc) {
- tmp = rc;
- QETH_DBF_MESSAGE(2,
- "Error while querying ARP cache on %s: %s "
- "(0x%x/%d)\n", QETH_CARD_IFNAME(card),
- qeth_l3_arp_get_error_cause(&rc), tmp, tmp);
- }
-
- return rc;
+ if (rc)
+ QETH_DBF_MESSAGE(2, "Error while querying ARP cache on device %x: %#x\n",
+ CARD_DEVID(card), rc);
+ return qeth_l3_arp_makerc(rc);
}
static int qeth_l3_arp_query(struct qeth_card *card, char __user *udata)
return rc;
}
-static int qeth_l3_arp_add_entry(struct qeth_card *card,
- struct qeth_arp_cache_entry *entry)
+static int qeth_l3_arp_modify_entry(struct qeth_card *card,
+ struct qeth_arp_cache_entry *entry,
+ enum qeth_arp_process_subcmds arp_cmd)
{
+ struct qeth_arp_cache_entry *cmd_entry;
struct qeth_cmd_buffer *iob;
- char buf[16];
- int tmp;
int rc;
- QETH_CARD_TEXT(card, 3, "arpadent");
+ if (arp_cmd == IPA_CMD_ASS_ARP_ADD_ENTRY)
+ QETH_CARD_TEXT(card, 3, "arpadd");
+ else
+ QETH_CARD_TEXT(card, 3, "arpdel");
/*
* currently GuestLAN only supports the ARP assist function
return -EOPNOTSUPP;
}
- iob = qeth_get_setassparms_cmd(card, IPA_ARP_PROCESSING,
- IPA_CMD_ASS_ARP_ADD_ENTRY,
- sizeof(struct qeth_arp_cache_entry),
- QETH_PROT_IPV4);
+ iob = qeth_get_setassparms_cmd(card, IPA_ARP_PROCESSING, arp_cmd,
+ sizeof(*cmd_entry), QETH_PROT_IPV4);
if (!iob)
return -ENOMEM;
- rc = qeth_send_setassparms(card, iob,
- sizeof(struct qeth_arp_cache_entry),
- (unsigned long) entry,
- qeth_setassparms_cb, NULL);
- if (rc) {
- tmp = rc;
- qeth_l3_ipaddr4_to_string((u8 *)entry->ipaddr, buf);
- QETH_DBF_MESSAGE(2, "Could not add ARP entry for address %s "
- "on %s: %s (0x%x/%d)\n", buf, QETH_CARD_IFNAME(card),
- qeth_l3_arp_get_error_cause(&rc), tmp, tmp);
- }
- return rc;
-}
-
-static int qeth_l3_arp_remove_entry(struct qeth_card *card,
- struct qeth_arp_cache_entry *entry)
-{
- struct qeth_cmd_buffer *iob;
- char buf[16] = {0, };
- int tmp;
- int rc;
- QETH_CARD_TEXT(card, 3, "arprment");
+ cmd_entry = &__ipa_cmd(iob)->data.setassparms.data.arp_entry;
+ ether_addr_copy(cmd_entry->macaddr, entry->macaddr);
+ memcpy(cmd_entry->ipaddr, entry->ipaddr, 4);
+ rc = qeth_send_ipa_cmd(card, iob, qeth_setassparms_cb, NULL);
+ if (rc)
+ QETH_DBF_MESSAGE(2, "Could not modify (cmd: %#x) ARP entry on device %x: %#x\n",
+ arp_cmd, CARD_DEVID(card), rc);
- /*
- * currently GuestLAN only supports the ARP assist function
- * IPA_CMD_ASS_ARP_QUERY_INFO, but not IPA_CMD_ASS_ARP_REMOVE_ENTRY;
- * thus we say EOPNOTSUPP for this ARP function
- */
- if (card->info.guestlan)
- return -EOPNOTSUPP;
- if (!qeth_is_supported(card, IPA_ARP_PROCESSING)) {
- return -EOPNOTSUPP;
- }
- memcpy(buf, entry, 12);
- iob = qeth_get_setassparms_cmd(card, IPA_ARP_PROCESSING,
- IPA_CMD_ASS_ARP_REMOVE_ENTRY,
- 12,
- QETH_PROT_IPV4);
- if (!iob)
- return -ENOMEM;
- rc = qeth_send_setassparms(card, iob,
- 12, (unsigned long)buf,
- qeth_setassparms_cb, NULL);
- if (rc) {
- tmp = rc;
- memset(buf, 0, 16);
- qeth_l3_ipaddr4_to_string((u8 *)entry->ipaddr, buf);
- QETH_DBF_MESSAGE(2, "Could not delete ARP entry for address %s"
- " on %s: %s (0x%x/%d)\n", buf, QETH_CARD_IFNAME(card),
- qeth_l3_arp_get_error_cause(&rc), tmp, tmp);
- }
- return rc;
+ return qeth_l3_arp_makerc(rc);
}
static int qeth_l3_arp_flush_cache(struct qeth_card *card)
{
int rc;
- int tmp;
QETH_CARD_TEXT(card, 3, "arpflush");
}
rc = qeth_send_simple_setassparms(card, IPA_ARP_PROCESSING,
IPA_CMD_ASS_ARP_FLUSH_CACHE, 0);
- if (rc) {
- tmp = rc;
- QETH_DBF_MESSAGE(2, "Could not flush ARP cache on %s: %s "
- "(0x%x/%d)\n", QETH_CARD_IFNAME(card),
- qeth_l3_arp_get_error_cause(&rc), tmp, tmp);
- }
- return rc;
+ if (rc)
+ QETH_DBF_MESSAGE(2, "Could not flush ARP cache on device %x: %#x\n",
+ CARD_DEVID(card), rc);
+ return qeth_l3_arp_makerc(rc);
}
static int qeth_l3_do_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
{
struct qeth_card *card = dev->ml_priv;
struct qeth_arp_cache_entry arp_entry;
+ enum qeth_arp_process_subcmds arp_cmd;
int rc = 0;
switch (cmd) {
rc = qeth_l3_arp_query(card, rq->ifr_ifru.ifru_data);
break;
case SIOC_QETH_ARP_ADD_ENTRY:
- if (!capable(CAP_NET_ADMIN)) {
- rc = -EPERM;
- break;
- }
- if (copy_from_user(&arp_entry, rq->ifr_ifru.ifru_data,
- sizeof(struct qeth_arp_cache_entry)))
- rc = -EFAULT;
- else
- rc = qeth_l3_arp_add_entry(card, &arp_entry);
- break;
case SIOC_QETH_ARP_REMOVE_ENTRY:
- if (!capable(CAP_NET_ADMIN)) {
- rc = -EPERM;
- break;
- }
- if (copy_from_user(&arp_entry, rq->ifr_ifru.ifru_data,
- sizeof(struct qeth_arp_cache_entry)))
- rc = -EFAULT;
- else
- rc = qeth_l3_arp_remove_entry(card, &arp_entry);
- break;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+ if (copy_from_user(&arp_entry, rq->ifr_data, sizeof(arp_entry)))
+ return -EFAULT;
+
+ arp_cmd = (cmd == SIOC_QETH_ARP_ADD_ENTRY) ?
+ IPA_CMD_ASS_ARP_ADD_ENTRY :
+ IPA_CMD_ASS_ARP_REMOVE_ENTRY;
+ return qeth_l3_arp_modify_entry(card, &arp_entry, arp_cmd);
case SIOC_QETH_ARP_FLUSH_CACHE:
if (!capable(CAP_NET_ADMIN)) {
rc = -EPERM;
.ndo_neigh_setup = qeth_l3_neigh_setup,
};
-static int qeth_l3_setup_netdev(struct qeth_card *card)
+static int qeth_l3_setup_netdev(struct qeth_card *card, bool carrier_ok)
{
unsigned int headroom;
int rc;
- if (card->dev->netdev_ops)
+ if (qeth_netdev_is_registered(card->dev))
return 0;
if (card->info.type == QETH_CARD_TYPE_OSD ||
netif_napi_add(card->dev, &card->napi, qeth_poll, QETH_NAPI_WEIGHT);
rc = register_netdev(card->dev);
+ if (!rc && carrier_ok)
+ netif_carrier_on(card->dev);
+
out:
if (rc)
card->dev->netdev_ops = NULL;
if (cgdev->state == CCWGROUP_ONLINE)
qeth_l3_set_offline(cgdev);
- unregister_netdev(card->dev);
+ if (qeth_netdev_is_registered(card->dev))
+ unregister_netdev(card->dev);
qeth_l3_clear_ip_htable(card, 0);
qeth_l3_clear_ipato_list(card);
}
struct qeth_card *card = dev_get_drvdata(&gdev->dev);
int rc = 0;
enum qeth_card_states recover_flag;
+ bool carrier_ok;
mutex_lock(&card->discipline_mutex);
mutex_lock(&card->conf_mutex);
QETH_DBF_HEX(SETUP, 2, &card, sizeof(void *));
recover_flag = card->state;
- rc = qeth_core_hardsetup_card(card);
+ rc = qeth_core_hardsetup_card(card, &carrier_ok);
if (rc) {
QETH_DBF_TEXT_(SETUP, 2, "2err%04x", rc);
rc = -ENODEV;
goto out_remove;
}
- rc = qeth_l3_setup_netdev(card);
+ rc = qeth_l3_setup_netdev(card, carrier_ok);
if (rc)
goto out_remove;
unsigned int revision; /* Transport revision */
wait_queue_head_t wait_q;
spinlock_t lock;
+ struct mutex io_lock; /* Serializes I/O requests */
struct list_head virtqueues;
unsigned long indicators;
unsigned long indicators2;
unsigned long flags;
int flag = intparm & VIRTIO_CCW_INTPARM_MASK;
+ mutex_lock(&vcdev->io_lock);
do {
spin_lock_irqsave(get_ccwdev_lock(vcdev->cdev), flags);
ret = ccw_device_start(vcdev->cdev, ccw, intparm, 0, 0);
cpu_relax();
} while (ret == -EBUSY);
wait_event(vcdev->wait_q, doing_io(vcdev, flag) == 0);
- return ret ? ret : vcdev->err;
+ ret = ret ? ret : vcdev->err;
+ mutex_unlock(&vcdev->io_lock);
+ return ret;
}
static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
int ret;
struct ccw1 *ccw;
void *config_area;
+ unsigned long flags;
ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
if (!ccw)
if (ret)
goto out_free;
+ spin_lock_irqsave(&vcdev->lock, flags);
memcpy(vcdev->config, config_area, offset + len);
- if (buf)
- memcpy(buf, &vcdev->config[offset], len);
if (vcdev->config_ready < offset + len)
vcdev->config_ready = offset + len;
+ spin_unlock_irqrestore(&vcdev->lock, flags);
+ if (buf)
+ memcpy(buf, config_area + offset, len);
out_free:
kfree(config_area);
struct virtio_ccw_device *vcdev = to_vc_device(vdev);
struct ccw1 *ccw;
void *config_area;
+ unsigned long flags;
ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
if (!ccw)
/* Make sure we don't overwrite fields. */
if (vcdev->config_ready < offset)
virtio_ccw_get_config(vdev, 0, NULL, offset);
+ spin_lock_irqsave(&vcdev->lock, flags);
memcpy(&vcdev->config[offset], buf, len);
/* Write the config area to the host. */
memcpy(config_area, vcdev->config, sizeof(vcdev->config));
+ spin_unlock_irqrestore(&vcdev->lock, flags);
ccw->cmd_code = CCW_CMD_WRITE_CONF;
ccw->flags = 0;
ccw->count = offset + len;
init_waitqueue_head(&vcdev->wait_q);
INIT_LIST_HEAD(&vcdev->virtqueues);
spin_lock_init(&vcdev->lock);
+ mutex_init(&vcdev->io_lock);
spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
dev_set_drvdata(&cdev->dev, vcdev);
dev_set_drvdata(&op->dev, p);
d7s_device = p;
err = 0;
+ of_node_put(opts);
out:
return err;
for (len = 0; len < PCF8584_MAX_CHANNELS; ++len) {
pchild->mon_type[len] = ENVCTRL_NOMON;
}
+ of_node_put(root_node);
return;
}
+ of_node_put(root_node);
}
/* Get the monitor channels. */
static int twa_post_command_packet(TW_Device_Extension *tw_dev, int request_id, char internal);
static int twa_reset_device_extension(TW_Device_Extension *tw_dev);
static int twa_reset_sequence(TW_Device_Extension *tw_dev, int soft_reset);
-static int twa_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id, char *cdb, int use_sg, TW_SG_Entry *sglistarg);
+static int twa_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id,
+ unsigned char *cdb, int use_sg,
+ TW_SG_Entry *sglistarg);
static void twa_scsiop_execute_scsi_complete(TW_Device_Extension *tw_dev, int request_id);
static char *twa_string_lookup(twa_message_type *table, unsigned int aen_code);
static int twa_aen_drain_queue(TW_Device_Extension *tw_dev, int no_check_reset)
{
int request_id = 0;
- char cdb[TW_MAX_CDB_LEN];
+ unsigned char cdb[TW_MAX_CDB_LEN];
TW_SG_Entry sglist[1];
int finished = 0, count = 0;
TW_Command_Full *full_command_packet;
/* This function will read the aen queue from the isr */
static int twa_aen_read_queue(TW_Device_Extension *tw_dev, int request_id)
{
- char cdb[TW_MAX_CDB_LEN];
+ unsigned char cdb[TW_MAX_CDB_LEN];
TW_SG_Entry sglist[1];
TW_Command_Full *full_command_packet;
int retval = 1;
static DEF_SCSI_QCMD(twa_scsi_queue)
/* This function hands scsi cdb's to the firmware */
-static int twa_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id, char *cdb, int use_sg, TW_SG_Entry *sglistarg)
+static int twa_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id,
+ unsigned char *cdb, int use_sg,
+ TW_SG_Entry *sglistarg)
{
TW_Command_Full *full_command_packet;
TW_Command_Apache *command_packet;
} /* End twl_post_command_packet() */
/* This function hands scsi cdb's to the firmware */
-static int twl_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id, char *cdb, int use_sg, TW_SG_Entry_ISO *sglistarg)
+static int twl_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id,
+ unsigned char *cdb, int use_sg,
+ TW_SG_Entry_ISO *sglistarg)
{
TW_Command_Full *full_command_packet;
TW_Command_Apache *command_packet;
/* This function will read the aen queue from the isr */
static int twl_aen_read_queue(TW_Device_Extension *tw_dev, int request_id)
{
- char cdb[TW_MAX_CDB_LEN];
+ unsigned char cdb[TW_MAX_CDB_LEN];
TW_SG_Entry_ISO sglist[1];
TW_Command_Full *full_command_packet;
int retval = 1;
static int twl_aen_drain_queue(TW_Device_Extension *tw_dev, int no_check_reset)
{
int request_id = 0;
- char cdb[TW_MAX_CDB_LEN];
+ unsigned char cdb[TW_MAX_CDB_LEN];
TW_SG_Entry_ISO sglist[1];
int finished = 0, count = 0;
TW_Command_Full *full_command_packet;
config SCSI_BUSLOGIC
tristate "BusLogic SCSI support"
- depends on (PCI || ISA || MCA) && SCSI && ISA_DMA_API && VIRT_TO_BUS
+ depends on (PCI || ISA) && SCSI && ISA_DMA_API && VIRT_TO_BUS
---help---
This is support for BusLogic MultiMaster and FlashPoint SCSI Host
Adapters. Consult the SCSI-HOWTO, available from
config SCSI_MYRS
tristate "Mylex DAC960/DAC1100 PCI RAID Controller (SCSI Interface)"
depends on PCI
+ depends on !CPU_BIG_ENDIAN || COMPILE_TEST
select RAID_ATTRS
help
This driver adds support for the Mylex DAC960, AcceleRAID, and
config SCSI_SIM710
tristate "Simple 53c710 SCSI support (Compaq, NCR machines)"
- depends on (EISA || MCA) && SCSI
+ depends on EISA && SCSI
select SCSI_SPI_ATTRS
---help---
This driver is for NCR53c710 based SCSI host adapters.
- It currently supports Compaq EISA cards and NCR MCA cards
+ It currently supports Compaq EISA cards.
config SCSI_DC395x
tristate "Tekram DC395(U/UW/F) and DC315(U) SCSI support"
out:
if (!hostdata->selecting)
- return NULL;
+ return false;
hostdata->selecting = NULL;
return ret;
}
/* DEFINES */
/* For PCMCIA cards, always use AUTOCONF */
-#if defined(PCMCIA) || defined(MODULE)
+#if defined(AHA152X_PCMCIA) || defined(MODULE)
#if !defined(AUTOCONF)
#define AUTOCONF
#endif
#define DELAY_DEFAULT 1000
-#if defined(PCMCIA)
+#if defined(AHA152X_PCMCIA)
#define IRQ_MIN 0
#define IRQ_MAX 16
#else
MODULE_DESCRIPTION(AHA152X_REVID);
MODULE_LICENSE("GPL");
-#if !defined(PCMCIA)
+#if !defined(AHA152X_PCMCIA)
#if defined(MODULE)
static int io[] = {0, 0};
module_param_hw_array(io, int, ioport, NULL, 0);
MODULE_DEVICE_TABLE(isapnp, id_table);
#endif /* ISAPNP */
-#endif /* !PCMCIA */
+#endif /* !AHA152X_PCMCIA */
static struct scsi_host_template aha152x_driver_template;
if (shpnt->irq)
free_irq(shpnt->irq, shpnt);
-#if !defined(PCMCIA)
+#if !defined(AHA152X_PCMCIA)
if (shpnt->io_port)
release_region(shpnt->io_port, IO_RANGE);
#endif
.slave_alloc = aha152x_adjust_queue,
};
-#if !defined(PCMCIA)
+#if !defined(AHA152X_PCMCIA)
static int setup_count;
static struct aha152x_setup setup[2];
__setup("aha152x=", aha152x_setup);
#endif
-#endif /* !PCMCIA */
+#endif /* !AHA152X_PCMCIA */
{
struct hisi_hba *hisi_hba = dq->hisi_hba;
struct hisi_sas_slot *s, *s1, *s2 = NULL;
- struct list_head *dq_list;
int dlvry_queue = dq->id;
int wp;
- dq_list = &dq->list;
list_for_each_entry_safe(s, s1, &dq->list, delivery) {
if (!s->ready)
break;
{
struct hisi_hba *hisi_hba = dq->hisi_hba;
struct hisi_sas_slot *s, *s1, *s2 = NULL;
- struct list_head *dq_list;
int dlvry_queue = dq->id;
int wp;
- dq_list = &dq->list;
list_for_each_entry_safe(s, s1, &dq->list, delivery) {
if (!s->ready)
break;
{
struct hisi_hba *hisi_hba = dq->hisi_hba;
struct hisi_sas_slot *s, *s1, *s2 = NULL;
- struct list_head *dq_list;
int dlvry_queue = dq->id;
int wp;
- dq_list = &dq->list;
list_for_each_entry_safe(s, s1, &dq->list, delivery) {
if (!s->ready)
break;
failed:
ISCSI_DBG_EH(session,
"failing session reset: Could not log back into "
- "%s, %s [age %d]\n", session->targetname,
- conn->persistent_address, session->age);
+ "%s [age %d]\n", session->targetname,
+ session->age);
spin_unlock_bh(&session->frwd_lock);
mutex_unlock(&session->eh_mutex);
return FAILED;
rport = lpfc_ndlp_get_nrport(ndlp);
if (rport)
nrport = rport->remoteport;
+ else
+ nrport = NULL;
spin_unlock(&phba->hbalock);
if (!nrport)
continue;
sizeof(phba->wwpn));
}
- phba->sli3_options = 0x0;
+ /*
+ * Clear all option bits except LPFC_SLI3_BG_ENABLED,
+ * which was already set in lpfc_get_cfgparam()
+ */
+ phba->sli3_options &= (uint32_t)LPFC_SLI3_BG_ENABLED;
/* Setup and issue mailbox READ REV command */
lpfc_read_rev(phba, pmb);
phba->sli3_options &= ~(LPFC_SLI3_NPIV_ENABLED |
LPFC_SLI3_HBQ_ENABLED |
LPFC_SLI3_CRP_ENABLED |
- LPFC_SLI3_BG_ENABLED |
LPFC_SLI3_DSS_ENABLED);
if (rc != MBX_SUCCESS) {
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
slot->n_elem = n_elem;
slot->slot_tag = tag;
- slot->buf = dma_pool_alloc(mvi->dma_pool, GFP_ATOMIC, &slot->buf_dma);
+ slot->buf = dma_pool_zalloc(mvi->dma_pool, GFP_ATOMIC, &slot->buf_dma);
if (!slot->buf) {
rc = -ENOMEM;
goto err_out_tag;
}
- memset(slot->buf, 0, MVS_SLOT_BUF_SZ);
tei.task = task;
tei.hdr = &mvi->slot[tag];
if (phy->phy_event & PHY_PLUG_OUT) {
u32 tmp;
- struct sas_identify_frame *id;
- id = (struct sas_identify_frame *)phy->frame_rcvd;
+
tmp = MVS_CHIP_DISP->read_phy_ctl(mvi, phy_no);
phy->phy_event &= ~PHY_PLUG_OUT;
if (!(tmp & PHY_READY_MASK)) {
enquiry2->fw.firmware_type = '0';
enquiry2->fw.turn_id = 0;
}
- sprintf(cb->fw_version, "%d.%02d-%c-%02d",
+ snprintf(cb->fw_version, sizeof(cb->fw_version),
+ "%d.%02d-%c-%02d",
enquiry2->fw.major_version,
enquiry2->fw.minor_version,
enquiry2->fw.firmware_type,
dma_addr_t ctlr_info_addr;
union myrs_sgl *sgl;
unsigned char status;
- struct myrs_ctlr_info old;
+ unsigned short ldev_present, ldev_critical, ldev_offline;
+
+ ldev_present = cs->ctlr_info->ldev_present;
+ ldev_critical = cs->ctlr_info->ldev_critical;
+ ldev_offline = cs->ctlr_info->ldev_offline;
- memcpy(&old, cs->ctlr_info, sizeof(struct myrs_ctlr_info));
ctlr_info_addr = dma_map_single(&cs->pdev->dev, cs->ctlr_info,
sizeof(struct myrs_ctlr_info),
DMA_FROM_DEVICE);
cs->ctlr_info->rbld_active +
cs->ctlr_info->exp_active != 0)
cs->needs_update = true;
- if (cs->ctlr_info->ldev_present != old.ldev_present ||
- cs->ctlr_info->ldev_critical != old.ldev_critical ||
- cs->ctlr_info->ldev_offline != old.ldev_offline)
+ if (cs->ctlr_info->ldev_present != ldev_present ||
+ cs->ctlr_info->ldev_critical != ldev_critical ||
+ cs->ctlr_info->ldev_offline != ldev_offline)
shost_printk(KERN_INFO, cs->host,
"Logical drive count changes (%d/%d/%d)\n",
cs->ctlr_info->ldev_critical,
-#define PCMCIA 1
+#define AHA152X_PCMCIA 1
#define AHA152X_STAT 1
#include "aha152x.c"
mutex_lock(&ha->optrom_mutex);
if (qla2x00_chip_is_down(vha)) {
- mutex_unlock(&vha->hw->optrom_mutex);
+ mutex_unlock(&ha->optrom_mutex);
return -EAGAIN;
}
__qla24xx_handle_gpdb_event(vha, ea);
}
-int qla_post_els_plogi_work(struct scsi_qla_host *vha, fc_port_t *fcport)
+static int qla_post_els_plogi_work(struct scsi_qla_host *vha, fc_port_t *fcport)
{
struct qla_work_evt *e;
fcport);
break;
}
- /* drop through */
+ /* fall through */
default:
if (fcport_is_smaller(fcport)) {
/* local adapter is bigger */
}
-void qla_handle_els_plogi_done(scsi_qla_host_t *vha, struct event_arg *ea)
+static void qla_handle_els_plogi_done(scsi_qla_host_t *vha,
+ struct event_arg *ea)
{
ql_dbg(ql_dbg_disc, vha, 0x2118,
"%s %d %8phC post PRLI\n",
fcport->loop_id = FC_NO_LOOP_ID;
qla2x00_set_fcport_state(fcport, FCS_UNCONFIGURED);
fcport->supported_classes = FC_COS_UNSPECIFIED;
+ fcport->fp_speed = PORT_SPEED_UNKNOWN;
fcport->ct_desc.ct_sns = dma_alloc_coherent(&vha->hw->pdev->dev,
sizeof(struct ct_sns_pkt), &fcport->ct_desc.ct_sns_dma,
* @sp: SRB command to process
* @cmd_pkt: Command type 3 IOCB
* @tot_dsds: Total number of segments to transfer
- * @tot_prot_dsds:
- * @fw_prot_opts:
+ * @tot_prot_dsds: Total number of segments with protection information
+ * @fw_prot_opts: Protection options to be passed to firmware
*/
inline int
qla24xx_build_scsi_crc_2_iocbs(srb_t *sp, struct cmd_type_crc_2 *cmd_pkt,
/**
* qla2100_intr_handler() - Process interrupts for the ISP2100 and ISP2200.
- * @irq:
+ * @irq: interrupt number
* @dev_id: SCSI driver HA context
*
* Called by system whenever the host adapter generates an interrupt.
/**
* qla2300_intr_handler() - Process interrupts for the ISP23xx and ISP63xx.
- * @irq:
+ * @irq: interrupt number
* @dev_id: SCSI driver HA context
*
* Called by system whenever the host adapter generates an interrupt.
/**
* qla24xx_intr_handler() - Process interrupts for the ISP23xx and ISP24xx.
- * @irq:
+ * @irq: interrupt number
* @dev_id: SCSI driver HA context
*
* Called by system whenever the host adapter generates an interrupt.
/**
* qla2x00_set_serdes_params() -
* @vha: HA context
- * @sw_em_1g:
- * @sw_em_2g:
- * @sw_em_4g:
+ * @sw_em_1g: serial link options
+ * @sw_em_2g: serial link options
+ * @sw_em_4g: serial link options
*
* Returns
*/
struct bsg_job *bsg_job;
struct fc_bsg_reply *bsg_reply;
struct srb_iocb *iocb_job;
- int res;
+ int res = 0;
struct qla_mt_iocb_rsp_fx00 fstatus;
uint8_t *fw_sts_ptr;
* qlafx00_multistatus_entry() - Process Multi response queue entries.
* @vha: SCSI driver HA context
* @rsp: response queue
- * @pkt:
+ * @pkt: received packet
*/
static void
qlafx00_multistatus_entry(struct scsi_qla_host *vha,
* @vha: SCSI driver HA context
* @rsp: response queue
* @pkt: Entry pointer
- * @estatus:
- * @etype:
*/
static void
qlafx00_error_entry(scsi_qla_host_t *vha, struct rsp_que *rsp,
- struct sts_entry_fx00 *pkt, uint8_t estatus, uint8_t etype)
+ struct sts_entry_fx00 *pkt)
{
srb_t *sp;
struct qla_hw_data *ha = vha->hw;
struct req_que *req = NULL;
int res = DID_ERROR << 16;
- ql_dbg(ql_dbg_async, vha, 0x507f,
- "type of error status in response: 0x%x\n", estatus);
-
req = ha->req_q_map[que];
sp = qla2x00_get_sp_from_handle(vha, func, req, pkt);
if (pkt->entry_status != 0 &&
pkt->entry_type != IOCTL_IOSB_TYPE_FX00) {
+ ql_dbg(ql_dbg_async, vha, 0x507f,
+ "type of error status in response: 0x%x\n",
+ pkt->entry_status);
qlafx00_error_entry(vha, rsp,
- (struct sts_entry_fx00 *)pkt, pkt->entry_status,
- pkt->entry_type);
+ (struct sts_entry_fx00 *)pkt);
continue;
}
/**
* qlafx00x_mbx_completion() - Process mailbox command completions.
* @vha: SCSI driver HA context
- * @mb0:
+ * @mb0: value to be written into mailbox register 0
*/
static void
qlafx00_mbx_completion(scsi_qla_host_t *vha, uint32_t mb0)
/**
* qlafx00_intr_handler() - Process interrupts for the ISPFX00.
- * @irq:
+ * @irq: interrupt number
* @dev_id: SCSI driver HA context
*
* Called by system whenever the host adapter generates an interrupt.
/**
* qla82xx_intr_handler() - Process interrupts for the ISP23xx and ISP63xx.
- * @irq:
+ * @irq: interrupt number
* @dev_id: SCSI driver HA context
*
* Called by system whenever the host adapter generates an interrupt.
#define PF_BITS_MASK (0xF << 16)
/**
* qla8044_intr_handler() - Process interrupts for the ISP8044
- * @irq:
+ * @irq: interrupt number
* @dev_id: SCSI driver HA context
*
* Called by system whenever the host adapter generates an interrupt.
MODULE_PARM_DESC(ql2xplogiabsentdevice,
"Option to enable PLOGI to devices that are not present after "
"a Fabric scan. This is needed for several broken switches. "
- "Default is 0 - no PLOGI. 1 - perfom PLOGI.");
+ "Default is 0 - no PLOGI. 1 - perform PLOGI.");
int ql2xloginretrycount = 0;
module_param(ql2xloginretrycount, int, S_IRUGO);
spin_unlock_irqrestore
(qp->qp_lock_ptr, flags);
status = qla2xxx_eh_abort(
- GET_CMD_SP(sp));
+ GET_CMD_SP(sp));
spin_lock_irqsave
(qp->qp_lock_ptr, flags);
+ /*
+ * Get rid of extra reference caused
+ * by early exit from qla2xxx_eh_abort
+ */
+ if (status == FAST_IO_FAIL)
+ atomic_dec(&sp->ref_count);
}
}
sp->done(sp, res);
/**
* qla2x00_get_flash_manufacturer() - Read manufacturer ID from flash chip.
- * @ha:
+ * @ha: host adapter
* @man_id: Flash manufacturer ID
* @flash_id: Flash ID
*/
case QLA_TGT_CLEAR_TS:
case QLA_TGT_ABORT_TS:
abort_cmds_for_lun(vha, lun, a->u.isp24.fcp_hdr.s_id);
- /* drop through */
+ /* fall through */
case QLA_TGT_CLEAR_ACA:
h = qlt_find_qphint(vha, mcmd->unpacked_lun);
mcmd->qpair = h->qpair;
* qla_tgt_lport_register - register lport with external module
*
* @target_lport_ptr: pointer for tcm_qla2xxx specific lport data
- * @phys_wwpn:
- * @npiv_wwpn:
- * @npiv_wwnn:
+ * @phys_wwpn: physical port WWPN
+ * @npiv_wwpn: NPIV WWPN
+ * @npiv_wwnn: NPIV WWNN
* @callback: lport initialization callback for tcm_qla2xxx code
*/
int qlt_lport_register(void *target_lport_ptr, u64 phys_wwpn,
*/
scsi_mq_uninit_cmd(cmd);
+ /*
+ * queue is still alive, so grab the ref for preventing it
+ * from being cleaned up during running queue.
+ */
+ percpu_ref_get(&q->q_usage_counter);
+
__blk_mq_end_request(req, error);
if (scsi_target(sdev)->single_lun ||
kblockd_schedule_work(&sdev->requeue_work);
else
blk_mq_run_hw_queues(q, true);
+
+ percpu_ref_put(&q->q_usage_counter);
} else {
unsigned long flags;
bool destroy;
bool drain_notify;
- bool open_sub_channel;
atomic_t num_outstanding_req;
struct Scsi_Host *host;
static void handle_sc_creation(struct vmbus_channel *new_sc)
{
struct hv_device *device = new_sc->primary_channel->device_obj;
+ struct device *dev = &device->device;
struct storvsc_device *stor_device;
struct vmstorage_channel_properties props;
+ int ret;
stor_device = get_out_stor_device(device);
if (!stor_device)
return;
- if (stor_device->open_sub_channel == false)
- return;
-
memset(&props, 0, sizeof(struct vmstorage_channel_properties));
- vmbus_open(new_sc,
- storvsc_ringbuffer_size,
- storvsc_ringbuffer_size,
- (void *)&props,
- sizeof(struct vmstorage_channel_properties),
- storvsc_on_channel_callback, new_sc);
+ ret = vmbus_open(new_sc,
+ storvsc_ringbuffer_size,
+ storvsc_ringbuffer_size,
+ (void *)&props,
+ sizeof(struct vmstorage_channel_properties),
+ storvsc_on_channel_callback, new_sc);
- if (new_sc->state == CHANNEL_OPENED_STATE) {
- stor_device->stor_chns[new_sc->target_cpu] = new_sc;
- cpumask_set_cpu(new_sc->target_cpu, &stor_device->alloced_cpus);
+ /* In case vmbus_open() fails, we don't use the sub-channel. */
+ if (ret != 0) {
+ dev_err(dev, "Failed to open sub-channel: err=%d\n", ret);
+ return;
}
+
+ /* Add the sub-channel to the array of available channels. */
+ stor_device->stor_chns[new_sc->target_cpu] = new_sc;
+ cpumask_set_cpu(new_sc->target_cpu, &stor_device->alloced_cpus);
}
static void handle_multichannel_storage(struct hv_device *device, int max_chns)
{
+ struct device *dev = &device->device;
struct storvsc_device *stor_device;
int num_cpus = num_online_cpus();
int num_sc;
request = &stor_device->init_request;
vstor_packet = &request->vstor_packet;
- stor_device->open_sub_channel = true;
/*
* Establish a handler for dealing with subchannels.
*/
vmbus_set_sc_create_callback(device->channel, handle_sc_creation);
- /*
- * Check to see if sub-channels have already been created. This
- * can happen when this driver is re-loaded after unloading.
- */
-
- if (vmbus_are_subchannels_present(device->channel))
- return;
-
- stor_device->open_sub_channel = false;
/*
* Request the host to create sub-channels.
*/
VM_PKT_DATA_INBAND,
VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
- if (ret != 0)
+ if (ret != 0) {
+ dev_err(dev, "Failed to create sub-channel: err=%d\n", ret);
return;
+ }
t = wait_for_completion_timeout(&request->wait_event, 10*HZ);
- if (t == 0)
+ if (t == 0) {
+ dev_err(dev, "Failed to create sub-channel: timed out\n");
return;
+ }
if (vstor_packet->operation != VSTOR_OPERATION_COMPLETE_IO ||
- vstor_packet->status != 0)
+ vstor_packet->status != 0) {
+ dev_err(dev, "Failed to create sub-channel: op=%d, sts=%d\n",
+ vstor_packet->operation, vstor_packet->status);
return;
+ }
/*
- * Now that we created the sub-channels, invoke the check; this
- * may trigger the callback.
+ * We need to do nothing here, because vmbus_process_offer()
+ * invokes channel->sc_creation_callback, which will open and use
+ * the sub-channel(s).
*/
- stor_device->open_sub_channel = true;
- vmbus_are_subchannels_present(device->channel);
}
static void cache_wwn(struct storvsc_device *stor_device,
}
stor_device->destroy = false;
- stor_device->open_sub_channel = false;
init_waitqueue_head(&stor_device->waiting_to_drain);
stor_device->device = device;
stor_device->host = host;
#include "unipro.h"
#include "ufs-hisi.h"
#include "ufshci.h"
+#include "ufs_quirks.h"
static int ufs_hisi_check_hibern8(struct ufs_hba *hba)
{
static void ufs_hisi_pwr_change_pre_change(struct ufs_hba *hba)
{
+ if (hba->dev_quirks & UFS_DEVICE_QUIRK_HOST_VS_DEBUGSAVECONFIGTIME) {
+ pr_info("ufs flash device must set VS_DebugSaveConfigTime 0x10\n");
+ /* VS_DebugSaveConfigTime */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0xD0A0), 0x10);
+ /* sync length */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x1556), 0x48);
+ }
+
/* update */
ufshcd_dme_set(hba, UIC_ARG_MIB(0x15A8), 0x1);
/* PA_TxSkip */
*/
#define UFS_DEVICE_QUIRK_HOST_PA_SAVECONFIGTIME (1 << 8)
+/*
+ * Some UFS devices require VS_DebugSaveConfigTime is 0x10,
+ * enabling this quirk ensure this.
+ */
+#define UFS_DEVICE_QUIRK_HOST_VS_DEBUGSAVECONFIGTIME (1 << 9)
+
#endif /* UFS_QUIRKS_H_ */
UFS_FIX(UFS_VENDOR_SKHYNIX, UFS_ANY_MODEL, UFS_DEVICE_NO_VCCQ),
UFS_FIX(UFS_VENDOR_SKHYNIX, UFS_ANY_MODEL,
UFS_DEVICE_QUIRK_HOST_PA_SAVECONFIGTIME),
+ UFS_FIX(UFS_VENDOR_SKHYNIX, "hB8aL1" /*H28U62301AMR*/,
+ UFS_DEVICE_QUIRK_HOST_VS_DEBUGSAVECONFIGTIME),
END_FIX
};
err = -ENOMEM;
goto out_error;
}
-
- /*
- * Do not use blk-mq at this time because blk-mq does not support
- * runtime pm.
- */
- host->use_blk_mq = false;
-
hba = shost_priv(host);
hba->host = host;
hba->dev = dev;
static void pvscsi_release_resources(struct pvscsi_adapter *adapter)
{
- pvscsi_shutdown_intr(adapter);
-
if (adapter->workqueue)
destroy_workqueue(adapter->workqueue);
out_reset_adapter:
ll_adapter_reset(adapter);
out_release_resources:
+ pvscsi_shutdown_intr(adapter);
pvscsi_release_resources(adapter);
scsi_host_put(host);
out_disable_device:
return error;
out_release_resources_and_disable:
+ pvscsi_shutdown_intr(adapter);
pvscsi_release_resources(adapter);
goto out_disable_device;
}
u8 la = txn->la;
bool usr_msg = false;
- if (txn->mc & SLIM_MSG_CLK_PAUSE_SEQ_FLG)
- return -EPROTONOSUPPORT;
-
if (txn->mt == SLIM_MSG_MT_CORE &&
(txn->mc >= SLIM_MSG_MC_BEGIN_RECONFIGURATION &&
txn->mc <= SLIM_MSG_MC_RECONFIGURE_NOW))
#define SLIM_MSG_MC_NEXT_REMOVE_CHANNEL 0x58
#define SLIM_MSG_MC_RECONFIGURE_NOW 0x5F
-/*
- * Clock pause flag to indicate that the reconfig message
- * corresponds to clock pause sequence
- */
-#define SLIM_MSG_CLK_PAUSE_SEQ_FLG (1U << 8)
-
/* Clock pause values per SLIMbus spec */
#define SLIM_CLK_FAST 0
#define SLIM_CLK_CONST_PHASE 1
};
struct knav_irq_info {
- int irq;
- u32 cpu_map;
+ int irq;
+ struct cpumask *cpu_mask;
};
struct knav_range_info {
{
struct knav_device *kdev = range->kdev;
struct knav_acc_channel *acc;
- unsigned long cpu_map;
+ struct cpumask *cpu_mask;
int ret = 0, irq;
u32 old, new;
if (range->flags & RANGE_MULTI_QUEUE) {
acc = range->acc;
irq = range->irqs[0].irq;
- cpu_map = range->irqs[0].cpu_map;
+ cpu_mask = range->irqs[0].cpu_mask;
} else {
acc = range->acc + queue;
irq = range->irqs[queue].irq;
- cpu_map = range->irqs[queue].cpu_map;
+ cpu_mask = range->irqs[queue].cpu_mask;
}
old = acc->open_mask;
acc->name, acc->name);
ret = request_irq(irq, knav_acc_int_handler, 0, acc->name,
range);
- if (!ret && cpu_map) {
- ret = irq_set_affinity_hint(irq, to_cpumask(&cpu_map));
+ if (!ret && cpu_mask) {
+ ret = irq_set_affinity_hint(irq, cpu_mask);
if (ret) {
dev_warn(range->kdev->dev,
"Failed to set IRQ affinity\n");
struct knav_queue_inst *inst)
{
unsigned queue = inst->id - range->queue_base;
- unsigned long cpu_map;
int ret = 0, irq;
if (range->flags & RANGE_HAS_IRQ) {
irq = range->irqs[queue].irq;
- cpu_map = range->irqs[queue].cpu_map;
ret = request_irq(irq, knav_queue_int_handler, 0,
inst->irq_name, inst);
if (ret)
return ret;
disable_irq(irq);
- if (cpu_map) {
- ret = irq_set_affinity_hint(irq, to_cpumask(&cpu_map));
+ if (range->irqs[queue].cpu_mask) {
+ ret = irq_set_affinity_hint(irq, range->irqs[queue].cpu_mask);
if (ret) {
dev_warn(range->kdev->dev,
"Failed to set IRQ affinity\n");
range->num_irqs++;
- if (IS_ENABLED(CONFIG_SMP) && oirq.args_count == 3)
- range->irqs[i].cpu_map =
- (oirq.args[2] & 0x0000ff00) >> 8;
+ if (IS_ENABLED(CONFIG_SMP) && oirq.args_count == 3) {
+ unsigned long mask;
+ int bit;
+
+ range->irqs[i].cpu_mask = devm_kzalloc(dev,
+ cpumask_size(), GFP_KERNEL);
+ if (!range->irqs[i].cpu_mask)
+ return -ENOMEM;
+
+ mask = (oirq.args[2] & 0x0000ff00) >> 8;
+ for_each_set_bit(bit, &mask, BITS_PER_LONG)
+ cpumask_set_cpu(bit, range->irqs[i].cpu_mask);
+ }
}
range->num_irqs = min(range->num_irqs, range->num_queues);
mdata->xfer_len = min(MTK_SPI_MAX_FIFO_SIZE, len);
mtk_spi_setup_packet(master);
- cnt = len / 4;
+ cnt = mdata->xfer_len / 4;
iowrite32_rep(mdata->base + SPI_TX_DATA_REG,
trans->tx_buf + mdata->num_xfered, cnt);
- remainder = len % 4;
+ remainder = mdata->xfer_len % 4;
if (remainder > 0) {
reg_val = 0;
memcpy(®_val,
/* work with hotplug and coldplug */
MODULE_ALIAS("platform:omap2_mcspi");
-#ifdef CONFIG_SUSPEND
-static int omap2_mcspi_suspend_noirq(struct device *dev)
+static int __maybe_unused omap2_mcspi_suspend(struct device *dev)
{
- return pinctrl_pm_select_sleep_state(dev);
+ struct spi_master *master = dev_get_drvdata(dev);
+ struct omap2_mcspi *mcspi = spi_master_get_devdata(master);
+ int error;
+
+ error = pinctrl_pm_select_sleep_state(dev);
+ if (error)
+ dev_warn(mcspi->dev, "%s: failed to set pins: %i\n",
+ __func__, error);
+
+ error = spi_master_suspend(master);
+ if (error)
+ dev_warn(mcspi->dev, "%s: master suspend failed: %i\n",
+ __func__, error);
+
+ return pm_runtime_force_suspend(dev);
}
-static int omap2_mcspi_resume_noirq(struct device *dev)
+static int __maybe_unused omap2_mcspi_resume(struct device *dev)
{
struct spi_master *master = dev_get_drvdata(dev);
struct omap2_mcspi *mcspi = spi_master_get_devdata(master);
dev_warn(mcspi->dev, "%s: failed to set pins: %i\n",
__func__, error);
- return 0;
-}
+ error = spi_master_resume(master);
+ if (error)
+ dev_warn(mcspi->dev, "%s: master resume failed: %i\n",
+ __func__, error);
-#else
-#define omap2_mcspi_suspend_noirq NULL
-#define omap2_mcspi_resume_noirq NULL
-#endif
+ return pm_runtime_force_resume(dev);
+}
static const struct dev_pm_ops omap2_mcspi_pm_ops = {
- .suspend_noirq = omap2_mcspi_suspend_noirq,
- .resume_noirq = omap2_mcspi_resume_noirq,
+ SET_SYSTEM_SLEEP_PM_OPS(omap2_mcspi_suspend,
+ omap2_mcspi_resume)
.runtime_resume = omap_mcspi_runtime_resume,
};
* and INSN_DEVICE_CONFIG_GET_ROUTES.
*/
#define NI_NAMES_BASE 0x8000u
+
+#define _TERM_N(base, n, x) ((base) + ((x) & ((n) - 1)))
+
/*
* not necessarily all allowed 64 PFIs are valid--certainly not for all devices
*/
-#define NI_PFI(x) (NI_NAMES_BASE + ((x) & 0x3f))
+#define NI_PFI(x) _TERM_N(NI_NAMES_BASE, 64, x)
/* 8 trigger lines by standard, Some devices cannot talk to all eight. */
-#define TRIGGER_LINE(x) (NI_PFI(-1) + 1 + ((x) & 0x7))
+#define TRIGGER_LINE(x) _TERM_N(NI_PFI(-1) + 1, 8, x)
/* 4 RTSI shared MUXes to route signals to/from TRIGGER_LINES on NI hardware */
-#define NI_RTSI_BRD(x) (TRIGGER_LINE(-1) + 1 + ((x) & 0x3))
+#define NI_RTSI_BRD(x) _TERM_N(TRIGGER_LINE(-1) + 1, 4, x)
/* *** Counter/timer names : 8 counters max *** */
-#define NI_COUNTER_NAMES_BASE (NI_RTSI_BRD(-1) + 1)
-#define NI_MAX_COUNTERS 7
-#define NI_CtrSource(x) (NI_COUNTER_NAMES_BASE + ((x) & NI_MAX_COUNTERS))
+#define NI_MAX_COUNTERS 8
+#define NI_COUNTER_NAMES_BASE (NI_RTSI_BRD(-1) + 1)
+#define NI_CtrSource(x) _TERM_N(NI_COUNTER_NAMES_BASE, NI_MAX_COUNTERS, x)
/* Gate, Aux, A,B,Z are all treated, at times as gates */
-#define NI_GATES_NAMES_BASE (NI_CtrSource(-1) + 1)
-#define NI_CtrGate(x) (NI_GATES_NAMES_BASE + ((x) & NI_MAX_COUNTERS))
-#define NI_CtrAux(x) (NI_CtrGate(-1) + 1 + ((x) & NI_MAX_COUNTERS))
-#define NI_CtrA(x) (NI_CtrAux(-1) + 1 + ((x) & NI_MAX_COUNTERS))
-#define NI_CtrB(x) (NI_CtrA(-1) + 1 + ((x) & NI_MAX_COUNTERS))
-#define NI_CtrZ(x) (NI_CtrB(-1) + 1 + ((x) & NI_MAX_COUNTERS))
-#define NI_GATES_NAMES_MAX NI_CtrZ(-1)
-#define NI_CtrArmStartTrigger(x) (NI_CtrZ(-1) + 1 + ((x) & NI_MAX_COUNTERS))
+#define NI_GATES_NAMES_BASE (NI_CtrSource(-1) + 1)
+#define NI_CtrGate(x) _TERM_N(NI_GATES_NAMES_BASE, NI_MAX_COUNTERS, x)
+#define NI_CtrAux(x) _TERM_N(NI_CtrGate(-1) + 1, NI_MAX_COUNTERS, x)
+#define NI_CtrA(x) _TERM_N(NI_CtrAux(-1) + 1, NI_MAX_COUNTERS, x)
+#define NI_CtrB(x) _TERM_N(NI_CtrA(-1) + 1, NI_MAX_COUNTERS, x)
+#define NI_CtrZ(x) _TERM_N(NI_CtrB(-1) + 1, NI_MAX_COUNTERS, x)
+#define NI_GATES_NAMES_MAX NI_CtrZ(-1)
+#define NI_CtrArmStartTrigger(x) _TERM_N(NI_CtrZ(-1) + 1, NI_MAX_COUNTERS, x)
#define NI_CtrInternalOutput(x) \
- (NI_CtrArmStartTrigger(-1) + 1 + ((x) & NI_MAX_COUNTERS))
+ _TERM_N(NI_CtrArmStartTrigger(-1) + 1, NI_MAX_COUNTERS, x)
/** external pin(s) labeled conveniently as Ctr<i>Out. */
-#define NI_CtrOut(x) (NI_CtrInternalOutput(-1) + 1 + ((x) & NI_MAX_COUNTERS))
+#define NI_CtrOut(x) _TERM_N(NI_CtrInternalOutput(-1) + 1, NI_MAX_COUNTERS, x)
/** For Buffered sampling of ctr -- x series capability. */
-#define NI_CtrSampleClock(x) (NI_CtrOut(-1) + 1 + ((x) & NI_MAX_COUNTERS))
-#define NI_COUNTER_NAMES_MAX NI_CtrSampleClock(-1)
+#define NI_CtrSampleClock(x) _TERM_N(NI_CtrOut(-1) + 1, NI_MAX_COUNTERS, x)
+#define NI_COUNTER_NAMES_MAX NI_CtrSampleClock(-1)
enum ni_common_signal_names {
/* PXI_Star: this is a non-NI-specific signal */
return ni_ao_arm(dev, s);
case INSN_CONFIG_GET_CMD_TIMING_CONSTRAINTS:
/* we don't care about actual channels */
- data[1] = board->ao_speed;
+ /* data[3] : chanlist_len */
+ data[1] = board->ao_speed * data[3];
data[2] = 0;
return 0;
default:
ipipeif_write(val, ipipeif_base_addr, IPIPEIF_CFG2);
break;
}
+ /* fall through */
case IPIPEIF_SDRAM_YUV:
/* Set clock divider */
* Userspace support for the Request API needs to be reviewed;
* Another stateless decoder driver should be submitted;
* At least one stateless encoder driver should be submitted.
+* When queueing a request containing references to I frames, the
+ refcount of the memory for those I frames needs to be incremented
+ and decremented when the request is completed. This will likely
+ require some help from vb2. The driver should fail the request
+ if the memory/buffer is gone.
unsigned int count;
unsigned int i;
- count = vb2_request_buffer_cnt(req);
- if (!count) {
- v4l2_info(&ctx->dev->v4l2_dev,
- "No buffer was provided with the request\n");
- return -ENOENT;
- } else if (count > 1) {
- v4l2_info(&ctx->dev->v4l2_dev,
- "More than one buffer was provided with the request\n");
- return -EINVAL;
- }
-
list_for_each_entry(obj, &req->objects, list) {
struct vb2_buffer *vb;
if (!ctx)
return -ENOENT;
+ count = vb2_request_buffer_cnt(req);
+ if (!count) {
+ v4l2_info(&ctx->dev->v4l2_dev,
+ "No buffer was provided with the request\n");
+ return -ENOENT;
+ } else if (count > 1) {
+ v4l2_info(&ctx->dev->v4l2_dev,
+ "More than one buffer was provided with the request\n");
+ return -EINVAL;
+ }
+
parent_hdl = &ctx->hdl;
hdl = v4l2_ctrl_request_hdl_find(req, parent_hdl);
static const struct media_device_ops cedrus_m2m_media_ops = {
.req_validate = cedrus_request_validate,
- .req_queue = vb2_m2m_request_queue,
+ .req_queue = v4l2_m2m_request_queue,
};
static int cedrus_probe(struct platform_device *pdev)
for (i = 0; i < ARRAY_SIZE(ch_data_type); i++) {
if (c->cfg.data_type & ch_data_type[i].most_ch_data_type)
- return snprintf(buf, PAGE_SIZE, ch_data_type[i].name);
+ return snprintf(buf, PAGE_SIZE, "%s", ch_data_type[i].name);
}
return snprintf(buf, PAGE_SIZE, "unconfigured\n");
}
/* tx desc */
src = sg->src_addr;
for (i = 0; i < chan->desc->num_sgs; i++) {
+ tx_desc = &chan->tx_ring[chan->tx_idx];
+
if (len > HSDMA_MAX_PLEN)
tlen = HSDMA_MAX_PLEN;
else
tx_desc->addr1 = src;
tx_desc->flags |= HSDMA_DESC_PLEN1(tlen);
} else {
- tx_desc = &chan->tx_ring[chan->tx_idx];
tx_desc->addr0 = src;
tx_desc->flags = HSDMA_DESC_PLEN0(tlen);
struct property *prop;
const char *function_name, *group_name;
int ret;
- int ngroups;
+ int ngroups = 0;
unsigned int reserved_maps = 0;
for_each_node_with_property(np_config, "group")
p = buff;
p += sprintf(p, "ASSOCINFO(ReqIEs=");
len = sec_ie[1] + 2;
- len = (len < IW_CUSTOM_MAX) ? len : IW_CUSTOM_MAX - 1;
+ len = (len < IW_CUSTOM_MAX) ? len : IW_CUSTOM_MAX;
for (i = 0; i < len; i++)
p += sprintf(p, "%02x", sec_ie[i]);
p += sprintf(p, ")");
u8 *out_ie, uint in_len)
{
u8 authmode = 0, match;
- u8 sec_ie[255], uncst_oui[4], bkup_ie[255];
+ u8 sec_ie[IW_CUSTOM_MAX], uncst_oui[4], bkup_ie[255];
u8 wpa_oui[4] = {0x0, 0x50, 0xf2, 0x01};
uint ielength, cnt, remove_cnt;
int iEntry;
if (pstat->aid > 0) {
DBG_871X(" old AID %d\n", pstat->aid);
} else {
- for (pstat->aid = 1; pstat->aid < NUM_STA; pstat->aid++)
+ for (pstat->aid = 1; pstat->aid <= NUM_STA; pstat->aid++)
if (pstapriv->sta_aid[pstat->aid - 1] == NULL)
break;
rx_bssid = get_hdr_bssid(wlanhdr);
pkt_info.bssid_match = ((!IsFrameTypeCtrl(wlanhdr)) &&
!pattrib->icv_err && !pattrib->crc_err &&
- !ether_addr_equal(rx_bssid, my_bssid));
+ ether_addr_equal(rx_bssid, my_bssid));
rx_ra = get_ra(wlanhdr);
my_hwaddr = myid(&padapter->eeprompriv);
pkt_info.to_self = pkt_info.bssid_match &&
- !ether_addr_equal(rx_ra, my_hwaddr);
+ ether_addr_equal(rx_ra, my_hwaddr);
pkt_info.is_beacon = pkt_info.bssid_match &&
sinfo->filled |= BIT_ULL(NL80211_STA_INFO_TX_PACKETS);
sinfo->tx_packets = psta->sta_stats.tx_pkts;
-
+ sinfo->filled |= BIT_ULL(NL80211_STA_INFO_TX_FAILED);
}
/* for Ad-Hoc/AP mode */
exit:
kfree(ptmp);
- return 0;
+ return ret;
}
static int rtw_wx_write32(struct net_device *dev,
struct vchiq_await_completion32 args32;
struct vchiq_completion_data32 completion32;
unsigned int *msgbufcount32;
+ unsigned int msgbufcount_native;
compat_uptr_t msgbuf32;
void *msgbuf;
void **msgbufptr;
sizeof(completion32)))
return -EFAULT;
- args32.msgbufcount--;
+ if (get_user(msgbufcount_native, &args->msgbufcount))
+ return -EFAULT;
+
+ if (!msgbufcount_native)
+ args32.msgbufcount--;
msgbufcount32 =
&((struct vchiq_await_completion32 __user *)arg)->msgbufcount;
return -1;
memset(&msg, 0, sizeof(struct msghdr));
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC,
- count->iov, count->iov_count, data);
+ iov_iter_kvec(&msg.msg_iter, READ, count->iov, count->iov_count, data);
while (msg_data_left(&msg)) {
rx_loop = sock_recvmsg(conn->sock, &msg, MSG_WAITALL);
memset(&msg, 0, sizeof(struct msghdr));
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC,
- iov, iov_count, data);
+ iov_iter_kvec(&msg.msg_iter, WRITE, iov, iov_count, data);
while (msg_data_left(&msg)) {
int tx_loop = sock_sendmsg(conn->sock, &msg);
}
transport_kunmap_data_sg(cmd);
- target_complete_cmd(cmd, GOOD);
+ target_complete_cmd_with_length(cmd, GOOD, rd_len + 4);
return 0;
}
len += sg->length;
}
- iov_iter_bvec(&iter, ITER_BVEC | is_write, bvec, sgl_nents, len);
+ iov_iter_bvec(&iter, is_write, bvec, sgl_nents, len);
aio_cmd->cmd = cmd;
aio_cmd->len = len;
len += sg->length;
}
- iov_iter_bvec(&iter, ITER_BVEC, bvec, sgl_nents, len);
+ iov_iter_bvec(&iter, READ, bvec, sgl_nents, len);
if (is_write)
ret = vfs_iter_write(fd, &iter, &pos, 0);
else
len += se_dev->dev_attrib.block_size;
}
- iov_iter_bvec(&iter, ITER_BVEC, bvec, nolb, len);
+ iov_iter_bvec(&iter, READ, bvec, nolb, len);
ret = vfs_iter_write(fd_dev->fd_file, &iter, &pos, 0);
kfree(bvec);
if (sub_api_initialized)
return;
- ret = request_module("target_core_iblock");
+ ret = IS_ENABLED(CONFIG_TCM_IBLOCK) && request_module("target_core_iblock");
if (ret != 0)
pr_err("Unable to load target_core_iblock\n");
- ret = request_module("target_core_file");
+ ret = IS_ENABLED(CONFIG_TCM_FILEIO) && request_module("target_core_file");
if (ret != 0)
pr_err("Unable to load target_core_file\n");
- ret = request_module("target_core_pscsi");
+ ret = IS_ENABLED(CONFIG_TCM_PSCSI) && request_module("target_core_pscsi");
if (ret != 0)
pr_err("Unable to load target_core_pscsi\n");
- ret = request_module("target_core_user");
+ ret = IS_ENABLED(CONFIG_TCM_USER2) && request_module("target_core_user");
if (ret != 0)
pr_err("Unable to load target_core_user\n");
void transport_generic_request_failure(struct se_cmd *cmd,
sense_reason_t sense_reason)
{
- int ret = 0;
+ int ret = 0, post_ret;
pr_debug("-----[ Storage Engine Exception; sense_reason %d\n",
sense_reason);
transport_complete_task_attr(cmd);
if (cmd->transport_complete_callback)
- cmd->transport_complete_callback(cmd, false, NULL);
+ cmd->transport_complete_callback(cmd, false, &post_ret);
if (transport_check_aborted_status(cmd, 1))
return;
int ret;
/* Valid check */
- if (armada_is_valid(priv)) {
+ if (!armada_is_valid(priv)) {
dev_err(priv->dev,
"Temperature sensor reading not valid\n");
return -EIO;
return ret;
}
-static struct thermal_zone_of_device_ops of_ops = {
+static const struct thermal_zone_of_device_ops of_ops = {
.get_temp = armada_get_temp,
};
/* First memory region points towards the status register */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
- if (!res)
- return -EIO;
-
- /*
- * Edit the resource start address and length to map over all the
- * registers, instead of pointing at them one by one.
- */
- res->start -= data->syscon_status_off;
- res->end = res->start + max(data->syscon_status_off,
- max(data->syscon_control0_off,
- data->syscon_control1_off)) +
- sizeof(unsigned int) - 1;
-
base = devm_ioremap_resource(&pdev->dev, res);
if (IS_ERR(base))
return PTR_ERR(base);
+ /*
+ * Fix up from the old individual DT register specification to
+ * cover all the registers. We do this by adjusting the ioremap()
+ * result, which should be fine as ioremap() deals with pages.
+ * However, validate that we do not cross a page boundary while
+ * making this adjustment.
+ */
+ if (((unsigned long)base & ~PAGE_MASK) < data->syscon_status_off)
+ return -EINVAL;
+ base -= data->syscon_status_off;
+
priv->syscon = devm_regmap_init_mmio(&pdev->dev, base,
&armada_thermal_regmap_config);
if (IS_ERR(priv->syscon))
+// SPDX-License-Identifier: GPL-2.0+
/*
* Driver for Broadcom BCM2835 SoC temperature sensor
*
* Copyright (C) 2016 Martin Sperl
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
*/
#include <linux/clk.h>
return 0;
}
-static struct thermal_zone_of_device_ops of_ops = {
+static const struct thermal_zone_of_device_ops of_ops = {
.get_temp = brcmstb_get_temp,
.set_trips = brcmstb_set_trips,
};
}
static DEVICE_ATTR(key, 0600, key_show, key_store);
+static void nvm_authenticate_start(struct tb_switch *sw)
+{
+ struct pci_dev *root_port;
+
+ /*
+ * During host router NVM upgrade we should not allow root port to
+ * go into D3cold because some root ports cannot trigger PME
+ * itself. To be on the safe side keep the root port in D0 during
+ * the whole upgrade process.
+ */
+ root_port = pci_find_pcie_root_port(sw->tb->nhi->pdev);
+ if (root_port)
+ pm_runtime_get_noresume(&root_port->dev);
+}
+
+static void nvm_authenticate_complete(struct tb_switch *sw)
+{
+ struct pci_dev *root_port;
+
+ root_port = pci_find_pcie_root_port(sw->tb->nhi->pdev);
+ if (root_port)
+ pm_runtime_put(&root_port->dev);
+}
+
static ssize_t nvm_authenticate_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
sw->nvm->authenticating = true;
- if (!tb_route(sw))
+ if (!tb_route(sw)) {
+ /*
+ * Keep root port from suspending as long as the
+ * NVM upgrade process is running.
+ */
+ nvm_authenticate_start(sw);
ret = nvm_authenticate_host(sw);
- else
+ if (ret)
+ nvm_authenticate_complete(sw);
+ } else {
ret = nvm_authenticate_device(sw);
+ }
pm_runtime_mark_last_busy(&sw->dev);
pm_runtime_put_autosuspend(&sw->dev);
}
if (ret <= 0)
return ret;
+ /* Now we can allow root port to suspend again */
+ if (!tb_route(sw))
+ nvm_authenticate_complete(sw);
+
if (status) {
tb_sw_info(sw, "switch flash authentication failed\n");
tb_switch_set_uuid(sw);
platform_set_drvdata(pdev, data);
- pm_runtime_enable(&pdev->dev);
- if (!pm_runtime_enabled(&pdev->dev)) {
- err = mtk8250_runtime_resume(&pdev->dev);
- if (err)
- return err;
- }
+ err = mtk8250_runtime_resume(&pdev->dev);
+ if (err)
+ return err;
data->line = serial8250_register_8250_port(&uart);
if (data->line < 0)
return data->line;
+ pm_runtime_set_active(&pdev->dev);
+ pm_runtime_enable(&pdev->dev);
+
return 0;
}
pm_runtime_get_sync(&pdev->dev);
serial8250_unregister_port(data->line);
+ mtk8250_runtime_suspend(&pdev->dev);
pm_runtime_disable(&pdev->dev);
pm_runtime_put_noidle(&pdev->dev);
- if (!pm_runtime_status_suspended(&pdev->dev))
- mtk8250_runtime_suspend(&pdev->dev);
-
return 0;
}
static int param_set_kgdboc_var(const char *kmessage,
const struct kernel_param *kp)
{
- int len = strlen(kmessage);
+ size_t len = strlen(kmessage);
if (len >= MAX_CONFIG_LEN) {
pr_err("config string too long\n");
strcpy(config, kmessage);
/* Chop out \n char as a result of echo */
- if (config[len - 1] == '\n')
+ if (len && config[len - 1] == '\n')
config[len - 1] = '\0';
if (configured == 1)
hrtimer_init(&s->rx_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
s->rx_timer.function = rx_timer_fn;
+ s->chan_rx_saved = s->chan_rx = chan;
+
if (port->type == PORT_SCIFA || port->type == PORT_SCIFB)
sci_submit_rx(s);
-
- s->chan_rx_saved = s->chan_rx = chan;
}
}
static int sci_remove(struct platform_device *dev)
{
struct sci_port *port = platform_get_drvdata(dev);
+ unsigned int type = port->port.type; /* uart_remove_... clears it */
sci_ports_in_use &= ~BIT(port->port.line);
uart_remove_one_port(&sci_uart_driver, &port->port);
sysfs_remove_file(&dev->dev.kobj,
&dev_attr_rx_fifo_trigger.attr);
}
- if (port->port.type == PORT_SCIFA || port->port.type == PORT_SCIFB ||
- port->port.type == PORT_HSCIF) {
+ if (type == PORT_SCIFA || type == PORT_SCIFB || type == PORT_HSCIF) {
sysfs_remove_file(&dev->dev.kobj,
&dev_attr_rx_fifo_timeout.attr);
}
mode = of_get_property(dp, mode_prop, NULL);
if (!mode)
mode = "9600,8,n,1,-";
+ of_node_put(dp);
}
cflag = CREAD | HUPCL | CLOCAL;
else
cbaud += 15;
}
- return baud_table[cbaud];
+ return cbaud >= n_baud_table ? 0 : baud_table[cbaud];
}
EXPORT_SYMBOL(tty_termios_baud_rate);
else
cbaud += 15;
}
- return baud_table[cbaud];
+ return cbaud >= n_baud_table ? 0 : baud_table[cbaud];
#else /* IBSHIFT */
return tty_termios_baud_rate(termios);
#endif /* IBSHIFT */
return ERR_PTR(retval);
}
-static void tty_free_termios(struct tty_struct *tty)
+/**
+ * tty_save_termios() - save tty termios data in driver table
+ * @tty: tty whose termios data to save
+ *
+ * Locking: Caller guarantees serialisation with tty_init_termios().
+ */
+void tty_save_termios(struct tty_struct *tty)
{
struct ktermios *tp;
int idx = tty->index;
}
*tp = tty->termios;
}
+EXPORT_SYMBOL_GPL(tty_save_termios);
/**
* tty_flush_works - flush all works of a tty/pty pair
WARN_ON(!mutex_is_locked(&tty_mutex));
if (tty->ops->shutdown)
tty->ops->shutdown(tty);
- tty_free_termios(tty);
+ tty_save_termios(tty);
tty_driver_remove_tty(tty->driver, tty);
tty->port->itty = NULL;
if (tty->link)
if (tty_port_close_start(port, tty, filp) == 0)
return;
tty_port_shutdown(port, tty);
- set_bit(TTY_IO_ERROR, &tty->flags);
+ if (!port->console)
+ set_bit(TTY_IO_ERROR, &tty->flags);
tty_port_close_end(port, tty);
tty_port_tty_set(port, NULL);
}
scr_memsetw(start + offset, vc->vc_video_erase_char, 2 * count);
vc->vc_need_wrap = 0;
if (con_should_update(vc))
- do_update_region(vc, (unsigned long) start, count);
+ do_update_region(vc, (unsigned long)(start + offset), count);
}
static void csi_X(struct vc_data *vc, int vpar) /* erase the following vpar positions */
if (ret)
goto err_uio_dev_add_attributes;
+ info->uio_dev = idev;
+
if (info->irq && (info->irq != UIO_IRQ_CUSTOM)) {
/*
* Note that we deliberately don't use devm_request_irq
*/
ret = request_irq(info->irq, uio_interrupt,
info->irq_flags, info->name, idev);
- if (ret)
+ if (ret) {
+ info->uio_dev = NULL;
goto err_request_irq;
+ }
}
- info->uio_dev = idev;
return 0;
err_request_irq:
{ USB_DEVICE(0x0572, 0x1328), /* Shiro / Aztech USB MODEM UM-3100 */
.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
},
+ { USB_DEVICE(0x0572, 0x1349), /* Hiro (Conexant) USB MODEM H50228 */
+ .driver_info = NO_UNION_NORMAL, /* has no union descriptor */
+ },
{ USB_DEVICE(0x20df, 0x0001), /* Simtec Electronics Entropy Key */
.driver_info = QUIRK_CONTROL_LINE_STATE, },
{ USB_DEVICE(0x2184, 0x001c) }, /* GW Instek AFG-2225 */
/* descriptor may appear anywhere in config */
err = __usb_get_extra_descriptor(udev->rawdescriptors[0],
le16_to_cpu(udev->config[0].desc.wTotalLength),
- USB_DT_OTG, (void **) &desc);
+ USB_DT_OTG, (void **) &desc, sizeof(*desc));
if (err || !(desc->bmAttributes & USB_OTG_HNP))
return 0;
int i, status;
u16 portchange, portstatus;
struct usb_port *port_dev = hub->ports[port1 - 1];
+ int reset_recovery_time;
if (!hub_is_superspeed(hub->hdev)) {
if (warm) {
USB_PORT_FEAT_C_BH_PORT_RESET);
usb_clear_port_feature(hub->hdev, port1,
USB_PORT_FEAT_C_PORT_LINK_STATE);
- usb_clear_port_feature(hub->hdev, port1,
+
+ if (udev)
+ usb_clear_port_feature(hub->hdev, port1,
USB_PORT_FEAT_C_CONNECTION);
/*
done:
if (status == 0) {
- /* TRSTRCY = 10 ms; plus some extra */
if (port_dev->quirks & USB_PORT_QUIRK_FAST_ENUM)
usleep_range(10000, 12000);
- else
- msleep(10 + 40);
+ else {
+ /* TRSTRCY = 10 ms; plus some extra */
+ reset_recovery_time = 10 + 40;
+
+ /* Hub needs extra delay after resetting its port. */
+ if (hub->hdev->quirks & USB_QUIRK_HUB_SLOW_RESET)
+ reset_recovery_time += 100;
+
+ msleep(reset_recovery_time);
+ }
if (udev) {
struct usb_hcd *hcd = bus_to_hcd(udev->bus);
/* Handle notifying userspace about hub over-current events */
static void port_over_current_notify(struct usb_port *port_dev)
{
- static char *envp[] = { NULL, NULL, NULL };
+ char *envp[3];
struct device *hub_dev;
char *port_dev_path;
if (!envp[1])
goto exit;
+ envp[2] = NULL;
kobject_uevent_env(&hub_dev->kobj, KOBJ_CHANGE, envp);
kfree(envp[1]);
case 'n':
flags |= USB_QUIRK_DELAY_CTRL_MSG;
break;
+ case 'o':
+ flags |= USB_QUIRK_HUB_SLOW_RESET;
+ break;
/* Ignore unrecognized flag characters */
}
}
/* Microsoft LifeCam-VX700 v2.0 */
{ USB_DEVICE(0x045e, 0x0770), .driver_info = USB_QUIRK_RESET_RESUME },
+ /* Cherry Stream G230 2.0 (G85-231) and 3.0 (G85-232) */
+ { USB_DEVICE(0x046a, 0x0023), .driver_info = USB_QUIRK_RESET_RESUME },
+
/* Logitech HD Pro Webcams C920, C920-C, C925e and C930e */
{ USB_DEVICE(0x046d, 0x082d), .driver_info = USB_QUIRK_DELAY_INIT },
{ USB_DEVICE(0x046d, 0x0841), .driver_info = USB_QUIRK_DELAY_INIT },
/* Midiman M-Audio Keystation 88es */
{ USB_DEVICE(0x0763, 0x0192), .driver_info = USB_QUIRK_RESET_RESUME },
+ /* SanDisk Ultra Fit and Ultra Flair */
+ { USB_DEVICE(0x0781, 0x5583), .driver_info = USB_QUIRK_NO_LPM },
+ { USB_DEVICE(0x0781, 0x5591), .driver_info = USB_QUIRK_NO_LPM },
+
/* M-Systems Flash Disk Pioneers */
{ USB_DEVICE(0x08ec, 0x1000), .driver_info = USB_QUIRK_RESET_RESUME },
{ USB_DEVICE(0x1a0a, 0x0200), .driver_info =
USB_QUIRK_LINEAR_UFRAME_INTR_BINTERVAL },
+ /* Terminus Technology Inc. Hub */
+ { USB_DEVICE(0x1a40, 0x0101), .driver_info = USB_QUIRK_HUB_SLOW_RESET },
+
/* Corsair K70 RGB */
{ USB_DEVICE(0x1b1c, 0x1b13), .driver_info = USB_QUIRK_DELAY_INIT },
{ USB_DEVICE(0x1b1c, 0x1b20), .driver_info = USB_QUIRK_DELAY_INIT |
USB_QUIRK_DELAY_CTRL_MSG },
+ /* Corsair K70 LUX RGB */
+ { USB_DEVICE(0x1b1c, 0x1b33), .driver_info = USB_QUIRK_DELAY_INIT },
+
/* Corsair K70 LUX */
{ USB_DEVICE(0x1b1c, 0x1b36), .driver_info = USB_QUIRK_DELAY_INIT },
{ USB_DEVICE(0x2040, 0x7200), .driver_info =
USB_QUIRK_CONFIG_INTF_STRINGS },
+ /* Raydium Touchscreen */
+ { USB_DEVICE(0x2386, 0x3114), .driver_info = USB_QUIRK_NO_LPM },
+
+ { USB_DEVICE(0x2386, 0x3119), .driver_info = USB_QUIRK_NO_LPM },
+
/* DJI CineSSD */
{ USB_DEVICE(0x2ca3, 0x0031), .driver_info = USB_QUIRK_NO_LPM },
*/
int __usb_get_extra_descriptor(char *buffer, unsigned size,
- unsigned char type, void **ptr)
+ unsigned char type, void **ptr, size_t minsize)
{
struct usb_descriptor_header *header;
while (size >= sizeof(struct usb_descriptor_header)) {
header = (struct usb_descriptor_header *)buffer;
- if (header->bLength < 2) {
+ if (header->bLength < 2 || header->bLength > size) {
printk(KERN_ERR
"%s: bogus descriptor, type %d length %d\n",
usbcore_name,
return -1;
}
- if (header->bDescriptorType == type) {
+ if (header->bDescriptorType == type && header->bLength >= minsize) {
*ptr = header;
return 0;
}
dwc2 = platform_device_alloc("dwc2", PLATFORM_DEVID_AUTO);
if (!dwc2) {
dev_err(dev, "couldn't allocate dwc2 device\n");
+ ret = -ENOMEM;
goto err;
}
err5:
dwc3_event_buffers_cleanup(dwc);
+ dwc3_ulpi_exit(dwc);
err4:
dwc3_free_scratch_buffers(dwc);
static void dwc3_pci_remove(struct pci_dev *pci)
{
struct dwc3_pci *dwc = pci_get_drvdata(pci);
+ struct pci_dev *pdev = dwc->pci;
- gpiod_remove_lookup_table(&platform_bytcr_gpios);
+ if (pdev->device == PCI_DEVICE_ID_INTEL_BYT)
+ gpiod_remove_lookup_table(&platform_bytcr_gpios);
#ifdef CONFIG_PM
cancel_work_sync(&dwc->wakeup_work);
#endif
/* Now prepare one extra TRB to align transfer size */
trb = &dep->trb_pool[dep->trb_enqueue];
__dwc3_prepare_one_trb(dep, trb, dwc->bounce_addr,
- maxp - rem, false, 0,
+ maxp - rem, false, 1,
req->request.stream_id,
req->request.short_not_ok,
req->request.no_interrupt);
/* Now prepare one extra TRB to align transfer size */
trb = &dep->trb_pool[dep->trb_enqueue];
__dwc3_prepare_one_trb(dep, trb, dwc->bounce_addr, maxp - rem,
- false, 0, req->request.stream_id,
+ false, 1, req->request.stream_id,
req->request.short_not_ok,
req->request.no_interrupt);
} else if (req->request.zero && req->request.length &&
/* Now prepare one extra TRB to handle ZLP */
trb = &dep->trb_pool[dep->trb_enqueue];
__dwc3_prepare_one_trb(dep, trb, dwc->bounce_addr, 0,
- false, 0, req->request.stream_id,
+ false, 1, req->request.stream_id,
req->request.short_not_ok,
req->request.no_interrupt);
} else {
unsigned transfer_in_flight;
unsigned started;
- if (dep->flags & DWC3_EP_STALL)
- return 0;
-
if (dep->number > 1)
trb = dwc3_ep_prev_trb(dep, dep->trb_enqueue);
else
else
dep->flags |= DWC3_EP_STALL;
} else {
- if (!(dep->flags & DWC3_EP_STALL))
- return 0;
ret = dwc3_send_clear_stall_ep_cmd(dep);
if (ret)
* with one TRB pending in the ring. We need to manually clear HWO bit
* from that TRB.
*/
- if ((req->zero || req->unaligned) && (trb->ctrl & DWC3_TRB_CTRL_HWO)) {
+ if ((req->zero || req->unaligned) && !(trb->ctrl & DWC3_TRB_CTRL_CHN)) {
trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
return 1;
}
struct mm_struct *mm;
struct work_struct work;
- struct work_struct cancellation_work;
struct usb_ep *ep;
struct usb_request *req;
return 0;
}
-static void ffs_aio_cancel_worker(struct work_struct *work)
-{
- struct ffs_io_data *io_data = container_of(work, struct ffs_io_data,
- cancellation_work);
-
- ENTER();
-
- usb_ep_dequeue(io_data->ep, io_data->req);
-}
-
static int ffs_aio_cancel(struct kiocb *kiocb)
{
struct ffs_io_data *io_data = kiocb->private;
- struct ffs_data *ffs = io_data->ffs;
+ struct ffs_epfile *epfile = kiocb->ki_filp->private_data;
int value;
ENTER();
- if (likely(io_data && io_data->ep && io_data->req)) {
- INIT_WORK(&io_data->cancellation_work, ffs_aio_cancel_worker);
- queue_work(ffs->io_completion_wq, &io_data->cancellation_work);
- value = -EINPROGRESS;
- } else {
+ spin_lock_irq(&epfile->ffs->eps_lock);
+
+ if (likely(io_data && io_data->ep && io_data->req))
+ value = usb_ep_dequeue(io_data->ep, io_data->req);
+ else
value = -EINVAL;
- }
+
+ spin_unlock_irq(&epfile->ffs->eps_lock);
return value;
}
static void rx_fill(struct eth_dev *dev, gfp_t gfp_flags)
{
struct usb_request *req;
- struct usb_request *tmp;
unsigned long flags;
/* fill unused rxq slots with some skb */
spin_lock_irqsave(&dev->req_lock, flags);
- list_for_each_entry_safe(req, tmp, &dev->rx_reqs, list) {
+ while (!list_empty(&dev->rx_reqs)) {
+ req = list_first_entry(&dev->rx_reqs, struct usb_request, list);
list_del_init(&req->list);
spin_unlock_irqrestore(&dev->req_lock, flags);
{
struct eth_dev *dev = link->ioport;
struct usb_request *req;
- struct usb_request *tmp;
WARN_ON(!dev);
if (!dev)
*/
usb_ep_disable(link->in_ep);
spin_lock(&dev->req_lock);
- list_for_each_entry_safe(req, tmp, &dev->tx_reqs, list) {
+ while (!list_empty(&dev->tx_reqs)) {
+ req = list_first_entry(&dev->tx_reqs, struct usb_request, list);
list_del(&req->list);
spin_unlock(&dev->req_lock);
usb_ep_disable(link->out_ep);
spin_lock(&dev->req_lock);
- list_for_each_entry_safe(req, tmp, &dev->rx_reqs, list) {
+ while (!list_empty(&dev->rx_reqs)) {
+ req = list_first_entry(&dev->rx_reqs, struct usb_request, list);
list_del(&req->list);
spin_unlock(&dev->req_lock);
{
return machine_is_omap_innovator()
|| machine_is_omap_osk()
+ || machine_is_omap_palmte()
|| machine_is_sx1()
/* No known omap7xx boards with vbus sense */
|| cpu_is_omap7xx();
static int omap_udc_start(struct usb_gadget *g,
struct usb_gadget_driver *driver)
{
- int status = -ENODEV;
+ int status;
struct omap_ep *ep;
unsigned long flags;
goto done;
}
} else {
+ status = 0;
if (can_pullup(udc))
pullup_enable(udc);
else
static void omap_udc_release(struct device *dev)
{
- complete(udc->done);
+ pullup_disable(udc);
+ if (!IS_ERR_OR_NULL(udc->transceiver)) {
+ usb_put_phy(udc->transceiver);
+ udc->transceiver = NULL;
+ }
+ omap_writew(0, UDC_SYSCON1);
+ remove_proc_file();
+ if (udc->dc_clk) {
+ if (udc->clk_requested)
+ omap_udc_enable_clock(0);
+ clk_put(udc->hhc_clk);
+ clk_put(udc->dc_clk);
+ }
+ if (udc->done)
+ complete(udc->done);
kfree(udc);
- udc = NULL;
}
static int
udc->gadget.speed = USB_SPEED_UNKNOWN;
udc->gadget.max_speed = USB_SPEED_FULL;
udc->gadget.name = driver_name;
+ udc->gadget.quirk_ep_out_aligned_size = 1;
udc->transceiver = xceiv;
/* ep0 is special; put it right after the SETUP buffer */
udc->clr_halt = UDC_RESET_EP;
/* USB general purpose IRQ: ep0, state changes, dma, etc */
- status = request_irq(pdev->resource[1].start, omap_udc_irq,
- 0, driver_name, udc);
+ status = devm_request_irq(&pdev->dev, pdev->resource[1].start,
+ omap_udc_irq, 0, driver_name, udc);
if (status != 0) {
ERR("can't get irq %d, err %d\n",
(int) pdev->resource[1].start, status);
}
/* USB "non-iso" IRQ (PIO for all but ep0) */
- status = request_irq(pdev->resource[2].start, omap_udc_pio_irq,
- 0, "omap_udc pio", udc);
+ status = devm_request_irq(&pdev->dev, pdev->resource[2].start,
+ omap_udc_pio_irq, 0, "omap_udc pio", udc);
if (status != 0) {
ERR("can't get irq %d, err %d\n",
(int) pdev->resource[2].start, status);
- goto cleanup2;
+ goto cleanup1;
}
#ifdef USE_ISO
- status = request_irq(pdev->resource[3].start, omap_udc_iso_irq,
- 0, "omap_udc iso", udc);
+ status = devm_request_irq(&pdev->dev, pdev->resource[3].start,
+ omap_udc_iso_irq, 0, "omap_udc iso", udc);
if (status != 0) {
ERR("can't get irq %d, err %d\n",
(int) pdev->resource[3].start, status);
- goto cleanup3;
+ goto cleanup1;
}
#endif
if (cpu_is_omap16xx() || cpu_is_omap7xx()) {
}
create_proc_file();
- status = usb_add_gadget_udc_release(&pdev->dev, &udc->gadget,
- omap_udc_release);
- if (status)
- goto cleanup4;
-
- return 0;
-
-cleanup4:
- remove_proc_file();
-
-#ifdef USE_ISO
-cleanup3:
- free_irq(pdev->resource[2].start, udc);
-#endif
-
-cleanup2:
- free_irq(pdev->resource[1].start, udc);
+ return usb_add_gadget_udc_release(&pdev->dev, &udc->gadget,
+ omap_udc_release);
cleanup1:
kfree(udc);
{
DECLARE_COMPLETION_ONSTACK(done);
- if (!udc)
- return -ENODEV;
-
- usb_del_gadget_udc(&udc->gadget);
- if (udc->driver)
- return -EBUSY;
-
udc->done = &done;
- pullup_disable(udc);
- if (!IS_ERR_OR_NULL(udc->transceiver)) {
- usb_put_phy(udc->transceiver);
- udc->transceiver = NULL;
- }
- omap_writew(0, UDC_SYSCON1);
-
- remove_proc_file();
-
-#ifdef USE_ISO
- free_irq(pdev->resource[3].start, udc);
-#endif
- free_irq(pdev->resource[2].start, udc);
- free_irq(pdev->resource[1].start, udc);
+ usb_del_gadget_udc(&udc->gadget);
- if (udc->dc_clk) {
- if (udc->clk_requested)
- omap_udc_enable_clock(0);
- clk_put(udc->hhc_clk);
- clk_put(udc->dc_clk);
- }
+ wait_for_completion(&done);
release_mem_region(pdev->resource[0].start,
pdev->resource[0].end - pdev->resource[0].start + 1);
- wait_for_completion(&done);
-
return 0;
}
top = itr + itr_size;
result = __usb_get_extra_descriptor(usb_dev->rawdescriptors[index],
le16_to_cpu(usb_dev->actconfig->desc.wTotalLength),
- USB_DT_SECURITY, (void **) &secd);
+ USB_DT_SECURITY, (void **) &secd, sizeof(*secd));
if (result == -1) {
dev_warn(dev, "BUG? WUSB host has no security descriptors\n");
return 0;
struct xhci_hcd_histb *histb = platform_get_drvdata(dev);
struct usb_hcd *hcd = histb->hcd;
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
+ struct usb_hcd *shared_hcd = xhci->shared_hcd;
xhci->xhc_state |= XHCI_STATE_REMOVING;
- usb_remove_hcd(xhci->shared_hcd);
+ usb_remove_hcd(shared_hcd);
+ xhci->shared_hcd = NULL;
device_wakeup_disable(&dev->dev);
usb_remove_hcd(hcd);
- usb_put_hcd(xhci->shared_hcd);
+ usb_put_hcd(shared_hcd);
xhci_histb_host_disable(histb);
usb_put_hcd(hcd);
status |= USB_PORT_STAT_SUSPEND;
}
if ((raw_port_status & PORT_PLS_MASK) == XDEV_RESUME &&
- !DEV_SUPERSPEED_ANY(raw_port_status)) {
+ !DEV_SUPERSPEED_ANY(raw_port_status) && hcd->speed < HCD_USB3) {
if ((raw_port_status & PORT_RESET) ||
!(raw_port_status & PORT_PE))
return 0xffffffff;
time_left = wait_for_completion_timeout(
&bus_state->rexit_done[wIndex],
msecs_to_jiffies(
- XHCI_MAX_REXIT_TIMEOUT));
+ XHCI_MAX_REXIT_TIMEOUT_MS));
spin_lock_irqsave(&xhci->lock, flags);
if (time_left) {
} else {
int port_status = readl(port->addr);
xhci_warn(xhci, "Port resume took longer than %i msec, port status = 0x%x\n",
- XHCI_MAX_REXIT_TIMEOUT,
+ XHCI_MAX_REXIT_TIMEOUT_MS,
port_status);
status |= USB_PORT_STAT_SUSPEND;
clear_bit(wIndex, &bus_state->rexit_ports);
unsigned long flags;
struct xhci_hub *rhub;
struct xhci_port **ports;
+ u32 portsc_buf[USB_MAXCHILDREN];
+ bool wake_enabled;
rhub = xhci_get_rhub(hcd);
ports = rhub->ports;
max_ports = rhub->num_ports;
bus_state = &xhci->bus_state[hcd_index(hcd)];
+ wake_enabled = hcd->self.root_hub->do_remote_wakeup;
spin_lock_irqsave(&xhci->lock, flags);
- if (hcd->self.root_hub->do_remote_wakeup) {
+ if (wake_enabled) {
if (bus_state->resuming_ports || /* USB2 */
bus_state->port_remote_wakeup) { /* USB3 */
spin_unlock_irqrestore(&xhci->lock, flags);
return -EBUSY;
}
}
-
- port_index = max_ports;
+ /*
+ * Prepare ports for suspend, but don't write anything before all ports
+ * are checked and we know bus suspend can proceed
+ */
bus_state->bus_suspended = 0;
+ port_index = max_ports;
while (port_index--) {
- /* suspend the port if the port is not suspended */
u32 t1, t2;
- int slot_id;
t1 = readl(ports[port_index]->addr);
t2 = xhci_port_state_to_neutral(t1);
+ portsc_buf[port_index] = 0;
- if ((t1 & PORT_PE) && !(t1 & PORT_PLS_MASK)) {
- xhci_dbg(xhci, "port %d not suspended\n", port_index);
- slot_id = xhci_find_slot_id_by_port(hcd, xhci,
- port_index + 1);
- if (slot_id) {
+ /* Bail out if a USB3 port has a new device in link training */
+ if ((t1 & PORT_PLS_MASK) == XDEV_POLLING) {
+ bus_state->bus_suspended = 0;
+ spin_unlock_irqrestore(&xhci->lock, flags);
+ xhci_dbg(xhci, "Bus suspend bailout, port in polling\n");
+ return -EBUSY;
+ }
+
+ /* suspend ports in U0, or bail out for new connect changes */
+ if ((t1 & PORT_PE) && (t1 & PORT_PLS_MASK) == XDEV_U0) {
+ if ((t1 & PORT_CSC) && wake_enabled) {
+ bus_state->bus_suspended = 0;
spin_unlock_irqrestore(&xhci->lock, flags);
- xhci_stop_device(xhci, slot_id, 1);
- spin_lock_irqsave(&xhci->lock, flags);
+ xhci_dbg(xhci, "Bus suspend bailout, port connect change\n");
+ return -EBUSY;
}
+ xhci_dbg(xhci, "port %d not suspended\n", port_index);
t2 &= ~PORT_PLS_MASK;
t2 |= PORT_LINK_STROBE | XDEV_U3;
set_bit(port_index, &bus_state->bus_suspended);
* including the USB 3.0 roothub, but only if CONFIG_PM
* is enabled, so also enable remote wake here.
*/
- if (hcd->self.root_hub->do_remote_wakeup) {
+ if (wake_enabled) {
if (t1 & PORT_CONNECT) {
t2 |= PORT_WKOC_E | PORT_WKDISC_E;
t2 &= ~PORT_WKCONN_E;
t1 = xhci_port_state_to_neutral(t1);
if (t1 != t2)
- writel(t2, ports[port_index]->addr);
+ portsc_buf[port_index] = t2;
+ }
+
+ /* write port settings, stopping and suspending ports if needed */
+ port_index = max_ports;
+ while (port_index--) {
+ if (!portsc_buf[port_index])
+ continue;
+ if (test_bit(port_index, &bus_state->bus_suspended)) {
+ int slot_id;
+
+ slot_id = xhci_find_slot_id_by_port(hcd, xhci,
+ port_index + 1);
+ if (slot_id) {
+ spin_unlock_irqrestore(&xhci->lock, flags);
+ xhci_stop_device(xhci, slot_id, 1);
+ spin_lock_irqsave(&xhci->lock, flags);
+ }
+ }
+ writel(portsc_buf[port_index], ports[port_index]->addr);
}
hcd->state = HC_STATE_SUSPENDED;
bus_state->next_statechange = jiffies + msecs_to_jiffies(10);
struct xhci_hcd_mtk *mtk = platform_get_drvdata(dev);
struct usb_hcd *hcd = mtk->hcd;
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
+ struct usb_hcd *shared_hcd = xhci->shared_hcd;
- usb_remove_hcd(xhci->shared_hcd);
+ usb_remove_hcd(shared_hcd);
+ xhci->shared_hcd = NULL;
device_init_wakeup(&dev->dev, false);
usb_remove_hcd(hcd);
- usb_put_hcd(xhci->shared_hcd);
+ usb_put_hcd(shared_hcd);
usb_put_hcd(hcd);
xhci_mtk_sch_exit(mtk);
xhci_mtk_clks_disable(mtk);
pdev->device == 0x43bb))
xhci->quirks |= XHCI_SUSPEND_DELAY;
+ if (pdev->vendor == PCI_VENDOR_ID_AMD &&
+ (pdev->device == 0x15e0 || pdev->device == 0x15e1))
+ xhci->quirks |= XHCI_SNPS_BROKEN_SUSPEND;
+
if (pdev->vendor == PCI_VENDOR_ID_AMD)
xhci->quirks |= XHCI_TRUST_TX_LENGTH;
if (pdev->vendor == PCI_VENDOR_ID_TI && pdev->device == 0x8241)
xhci->quirks |= XHCI_LIMIT_ENDPOINT_INTERVAL_7;
+ if ((pdev->vendor == PCI_VENDOR_ID_BROADCOM ||
+ pdev->vendor == PCI_VENDOR_ID_CAVIUM) &&
+ pdev->device == 0x9026)
+ xhci->quirks |= XHCI_RESET_PLL_ON_DISCONNECT;
+
if (xhci->quirks & XHCI_RESET_ON_RESUME)
xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,
"QUIRK: Resetting on resume");
if (xhci->shared_hcd) {
usb_remove_hcd(xhci->shared_hcd);
usb_put_hcd(xhci->shared_hcd);
+ xhci->shared_hcd = NULL;
}
/* Workaround for spurious wakeups at shutdown with HSW */
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
struct clk *clk = xhci->clk;
struct clk *reg_clk = xhci->reg_clk;
+ struct usb_hcd *shared_hcd = xhci->shared_hcd;
xhci->xhc_state |= XHCI_STATE_REMOVING;
- usb_remove_hcd(xhci->shared_hcd);
+ usb_remove_hcd(shared_hcd);
+ xhci->shared_hcd = NULL;
usb_phy_shutdown(hcd->usb_phy);
usb_remove_hcd(hcd);
- usb_put_hcd(xhci->shared_hcd);
+ usb_put_hcd(shared_hcd);
clk_disable_unprepare(clk);
clk_disable_unprepare(reg_clk);
usb_wakeup_notification(udev->parent, udev->portnum);
}
+/*
+ * Quirk hanlder for errata seen on Cavium ThunderX2 processor XHCI
+ * Controller.
+ * As per ThunderX2errata-129 USB 2 device may come up as USB 1
+ * If a connection to a USB 1 device is followed by another connection
+ * to a USB 2 device.
+ *
+ * Reset the PHY after the USB device is disconnected if device speed
+ * is less than HCD_USB3.
+ * Retry the reset sequence max of 4 times checking the PLL lock status.
+ *
+ */
+static void xhci_cavium_reset_phy_quirk(struct xhci_hcd *xhci)
+{
+ struct usb_hcd *hcd = xhci_to_hcd(xhci);
+ u32 pll_lock_check;
+ u32 retry_count = 4;
+
+ do {
+ /* Assert PHY reset */
+ writel(0x6F, hcd->regs + 0x1048);
+ udelay(10);
+ /* De-assert the PHY reset */
+ writel(0x7F, hcd->regs + 0x1048);
+ udelay(200);
+ pll_lock_check = readl(hcd->regs + 0x1070);
+ } while (!(pll_lock_check & 0x1) && --retry_count);
+}
+
static void handle_port_status(struct xhci_hcd *xhci,
union xhci_trb *event)
{
goto cleanup;
}
+ /* We might get interrupts after shared_hcd is removed */
+ if (port->rhub == &xhci->usb3_rhub && xhci->shared_hcd == NULL) {
+ xhci_dbg(xhci, "ignore port event for removed USB3 hcd\n");
+ bogus_port_status = true;
+ goto cleanup;
+ }
+
hcd = port->rhub->hcd;
bus_state = &xhci->bus_state[hcd_index(hcd)];
hcd_portnum = port->hcd_portnum;
* RExit to a disconnect state). If so, let the the driver know it's
* out of the RExit state.
*/
- if (!DEV_SUPERSPEED_ANY(portsc) &&
+ if (!DEV_SUPERSPEED_ANY(portsc) && hcd->speed < HCD_USB3 &&
test_and_clear_bit(hcd_portnum,
&bus_state->rexit_ports)) {
complete(&bus_state->rexit_done[hcd_portnum]);
goto cleanup;
}
- if (hcd->speed < HCD_USB3)
+ if (hcd->speed < HCD_USB3) {
xhci_test_and_clear_bit(xhci, port, PORT_PLC);
+ if ((xhci->quirks & XHCI_RESET_PLL_ON_DISCONNECT) &&
+ (portsc & PORT_CSC) && !(portsc & PORT_CONNECT))
+ xhci_cavium_reset_phy_quirk(xhci);
+ }
cleanup:
/* Update event ring dequeue pointer before dropping the lock */
goto cleanup;
case COMP_RING_UNDERRUN:
case COMP_RING_OVERRUN:
+ case COMP_STOPPED_LENGTH_INVALID:
goto cleanup;
default:
xhci_err(xhci, "ERROR Transfer event for unknown stream ring slot %u ep %u\n",
usb_remove_hcd(xhci->shared_hcd);
usb_put_hcd(xhci->shared_hcd);
+ xhci->shared_hcd = NULL;
usb_remove_hcd(tegra->hcd);
usb_put_hcd(tegra->hcd);
/* Only halt host and free memory after both hcds are removed */
if (!usb_hcd_is_primary_hcd(hcd)) {
- /* usb core will free this hcd shortly, unset pointer */
- xhci->shared_hcd = NULL;
mutex_unlock(&xhci->mutex);
return;
}
unsigned int delay = XHCI_MAX_HALT_USEC;
struct usb_hcd *hcd = xhci_to_hcd(xhci);
u32 command;
+ u32 res;
if (!hcd->state)
return 0;
command = readl(&xhci->op_regs->command);
command |= CMD_CSS;
writel(command, &xhci->op_regs->command);
+ xhci->broken_suspend = 0;
if (xhci_handshake(&xhci->op_regs->status,
STS_SAVE, 0, 10 * 1000)) {
- xhci_warn(xhci, "WARN: xHC save state timeout\n");
- spin_unlock_irq(&xhci->lock);
- return -ETIMEDOUT;
+ /*
+ * AMD SNPS xHC 3.0 occasionally does not clear the
+ * SSS bit of USBSTS and when driver tries to poll
+ * to see if the xHC clears BIT(8) which never happens
+ * and driver assumes that controller is not responding
+ * and times out. To workaround this, its good to check
+ * if SRE and HCE bits are not set (as per xhci
+ * Section 5.4.2) and bypass the timeout.
+ */
+ res = readl(&xhci->op_regs->status);
+ if ((xhci->quirks & XHCI_SNPS_BROKEN_SUSPEND) &&
+ (((res & STS_SRE) == 0) &&
+ ((res & STS_HCE) == 0))) {
+ xhci->broken_suspend = 1;
+ } else {
+ xhci_warn(xhci, "WARN: xHC save state timeout\n");
+ spin_unlock_irq(&xhci->lock);
+ return -ETIMEDOUT;
+ }
}
spin_unlock_irq(&xhci->lock);
set_bit(HCD_FLAG_HW_ACCESSIBLE, &xhci->shared_hcd->flags);
spin_lock_irq(&xhci->lock);
- if (xhci->quirks & XHCI_RESET_ON_RESUME)
+ if ((xhci->quirks & XHCI_RESET_ON_RESUME) || xhci->broken_suspend)
hibernated = true;
if (!hibernated) {
{
unsigned long long timeout_ns;
+ /* Prevent U1 if service interval is shorter than U1 exit latency */
+ if (usb_endpoint_xfer_int(desc) || usb_endpoint_xfer_isoc(desc)) {
+ if (xhci_service_interval_to_ns(desc) <= udev->u1_params.mel) {
+ dev_dbg(&udev->dev, "Disable U1, ESIT shorter than exit latency\n");
+ return USB3_LPM_DISABLED;
+ }
+ }
+
if (xhci->quirks & XHCI_INTEL_HOST)
timeout_ns = xhci_calculate_intel_u1_timeout(udev, desc);
else
{
unsigned long long timeout_ns;
+ /* Prevent U2 if service interval is shorter than U2 exit latency */
+ if (usb_endpoint_xfer_int(desc) || usb_endpoint_xfer_isoc(desc)) {
+ if (xhci_service_interval_to_ns(desc) <= udev->u2_params.mel) {
+ dev_dbg(&udev->dev, "Disable U2, ESIT shorter than exit latency\n");
+ return USB3_LPM_DISABLED;
+ }
+ }
+
if (xhci->quirks & XHCI_INTEL_HOST)
timeout_ns = xhci_calculate_intel_u2_timeout(udev, desc);
else
* It can take up to 20 ms to transition from RExit to U0 on the
* Intel Lynx Point LP xHCI host.
*/
-#define XHCI_MAX_REXIT_TIMEOUT (20 * 1000)
+#define XHCI_MAX_REXIT_TIMEOUT_MS 20
static inline unsigned int hcd_index(struct usb_hcd *hcd)
{
#define XHCI_INTEL_USB_ROLE_SW BIT_ULL(31)
#define XHCI_ZERO_64B_REGS BIT_ULL(32)
#define XHCI_DEFAULT_PM_RUNTIME_ALLOW BIT_ULL(33)
+#define XHCI_RESET_PLL_ON_DISCONNECT BIT_ULL(34)
+#define XHCI_SNPS_BROKEN_SUSPEND BIT_ULL(35)
unsigned int num_active_eps;
unsigned int limit_active_eps;
void *dbc;
/* platform-specific data -- must come last */
unsigned long priv[0] __aligned(sizeof(s64));
+ /* Broken Suspend flag for SNPS Suspend resume issue */
+ u8 broken_suspend;
};
/* Platform specific overrides to generic XHCI hc_driver ops */
{ APPLEDISPLAY_DEVICE(0x9219) },
{ APPLEDISPLAY_DEVICE(0x921c) },
{ APPLEDISPLAY_DEVICE(0x921d) },
+ { APPLEDISPLAY_DEVICE(0x9222) },
+ { APPLEDISPLAY_DEVICE(0x9226) },
{ APPLEDISPLAY_DEVICE(0x9236) },
/* Terminating entry */
cflag |= PARENB;
break;
}
- co->cflag = cflag;
/*
* no need to check the index here: if the index is wrong, console
serial->type->set_termios(tty, port, &dummy);
tty_port_tty_set(&port->port, NULL);
+ tty_save_termios(tty);
tty_kref_put(tty);
}
tty_port_set_initialized(&port->port, 1);
"USB Card Reader",
USB_SC_DEVICE, USB_PR_DEVICE, init_realtek_cr, 0),
+UNUSUAL_DEV(0x0bda, 0x0177, 0x0000, 0x9999,
+ "Realtek",
+ "USB Card Reader",
+ USB_SC_DEVICE, USB_PR_DEVICE, init_realtek_cr, 0),
+
+UNUSUAL_DEV(0x0bda, 0x0184, 0x0000, 0x9999,
+ "Realtek",
+ "USB Card Reader",
+ USB_SC_DEVICE, USB_PR_DEVICE, init_realtek_cr, 0),
+
#endif /* defined(CONFIG_USB_STORAGE_REALTEK) || ... */
if TYPEC_UCSI
+config UCSI_CCG
+ tristate "UCSI Interface Driver for Cypress CCGx"
+ depends on I2C
+ help
+ This driver enables UCSI support on platforms that expose a
+ Cypress CCGx Type-C controller over I2C interface.
+
+ To compile the driver as a module, choose M here: the module will be
+ called ucsi_ccg.
+
config UCSI_ACPI
tristate "UCSI ACPI Interface Driver"
depends on ACPI
typec_ucsi-$(CONFIG_TRACING) += trace.o
obj-$(CONFIG_UCSI_ACPI) += ucsi_acpi.o
+
+obj-$(CONFIG_UCSI_CCG) += ucsi_ccg.o
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * UCSI driver for Cypress CCGx Type-C controller
+ *
+ * Copyright (C) 2017-2018 NVIDIA Corporation. All rights reserved.
+ * Author: Ajay Gupta <ajayg@nvidia.com>
+ *
+ * Some code borrowed from drivers/usb/typec/ucsi/ucsi_acpi.c
+ */
+#include <linux/acpi.h>
+#include <linux/delay.h>
+#include <linux/i2c.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+
+#include <asm/unaligned.h>
+#include "ucsi.h"
+
+struct ucsi_ccg {
+ struct device *dev;
+ struct ucsi *ucsi;
+ struct ucsi_ppm ppm;
+ struct i2c_client *client;
+};
+
+#define CCGX_RAB_INTR_REG 0x06
+#define CCGX_RAB_UCSI_CONTROL 0x39
+#define CCGX_RAB_UCSI_CONTROL_START BIT(0)
+#define CCGX_RAB_UCSI_CONTROL_STOP BIT(1)
+#define CCGX_RAB_UCSI_DATA_BLOCK(offset) (0xf000 | ((offset) & 0xff))
+
+static int ccg_read(struct ucsi_ccg *uc, u16 rab, u8 *data, u32 len)
+{
+ struct i2c_client *client = uc->client;
+ const struct i2c_adapter_quirks *quirks = client->adapter->quirks;
+ unsigned char buf[2];
+ struct i2c_msg msgs[] = {
+ {
+ .addr = client->addr,
+ .flags = 0x0,
+ .len = sizeof(buf),
+ .buf = buf,
+ },
+ {
+ .addr = client->addr,
+ .flags = I2C_M_RD,
+ .buf = data,
+ },
+ };
+ u32 rlen, rem_len = len, max_read_len = len;
+ int status;
+
+ /* check any max_read_len limitation on i2c adapter */
+ if (quirks && quirks->max_read_len)
+ max_read_len = quirks->max_read_len;
+
+ while (rem_len > 0) {
+ msgs[1].buf = &data[len - rem_len];
+ rlen = min_t(u16, rem_len, max_read_len);
+ msgs[1].len = rlen;
+ put_unaligned_le16(rab, buf);
+ status = i2c_transfer(client->adapter, msgs, ARRAY_SIZE(msgs));
+ if (status < 0) {
+ dev_err(uc->dev, "i2c_transfer failed %d\n", status);
+ return status;
+ }
+ rab += rlen;
+ rem_len -= rlen;
+ }
+
+ return 0;
+}
+
+static int ccg_write(struct ucsi_ccg *uc, u16 rab, u8 *data, u32 len)
+{
+ struct i2c_client *client = uc->client;
+ unsigned char *buf;
+ struct i2c_msg msgs[] = {
+ {
+ .addr = client->addr,
+ .flags = 0x0,
+ }
+ };
+ int status;
+
+ buf = kzalloc(len + sizeof(rab), GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ put_unaligned_le16(rab, buf);
+ memcpy(buf + sizeof(rab), data, len);
+
+ msgs[0].len = len + sizeof(rab);
+ msgs[0].buf = buf;
+
+ status = i2c_transfer(client->adapter, msgs, ARRAY_SIZE(msgs));
+ if (status < 0) {
+ dev_err(uc->dev, "i2c_transfer failed %d\n", status);
+ kfree(buf);
+ return status;
+ }
+
+ kfree(buf);
+ return 0;
+}
+
+static int ucsi_ccg_init(struct ucsi_ccg *uc)
+{
+ unsigned int count = 10;
+ u8 data;
+ int status;
+
+ data = CCGX_RAB_UCSI_CONTROL_STOP;
+ status = ccg_write(uc, CCGX_RAB_UCSI_CONTROL, &data, sizeof(data));
+ if (status < 0)
+ return status;
+
+ data = CCGX_RAB_UCSI_CONTROL_START;
+ status = ccg_write(uc, CCGX_RAB_UCSI_CONTROL, &data, sizeof(data));
+ if (status < 0)
+ return status;
+
+ /*
+ * Flush CCGx RESPONSE queue by acking interrupts. Above ucsi control
+ * register write will push response which must be cleared.
+ */
+ do {
+ status = ccg_read(uc, CCGX_RAB_INTR_REG, &data, sizeof(data));
+ if (status < 0)
+ return status;
+
+ if (!data)
+ return 0;
+
+ status = ccg_write(uc, CCGX_RAB_INTR_REG, &data, sizeof(data));
+ if (status < 0)
+ return status;
+
+ usleep_range(10000, 11000);
+ } while (--count);
+
+ return -ETIMEDOUT;
+}
+
+static int ucsi_ccg_send_data(struct ucsi_ccg *uc)
+{
+ u8 *ppm = (u8 *)uc->ppm.data;
+ int status;
+ u16 rab;
+
+ rab = CCGX_RAB_UCSI_DATA_BLOCK(offsetof(struct ucsi_data, message_out));
+ status = ccg_write(uc, rab, ppm +
+ offsetof(struct ucsi_data, message_out),
+ sizeof(uc->ppm.data->message_out));
+ if (status < 0)
+ return status;
+
+ rab = CCGX_RAB_UCSI_DATA_BLOCK(offsetof(struct ucsi_data, ctrl));
+ return ccg_write(uc, rab, ppm + offsetof(struct ucsi_data, ctrl),
+ sizeof(uc->ppm.data->ctrl));
+}
+
+static int ucsi_ccg_recv_data(struct ucsi_ccg *uc)
+{
+ u8 *ppm = (u8 *)uc->ppm.data;
+ int status;
+ u16 rab;
+
+ rab = CCGX_RAB_UCSI_DATA_BLOCK(offsetof(struct ucsi_data, cci));
+ status = ccg_read(uc, rab, ppm + offsetof(struct ucsi_data, cci),
+ sizeof(uc->ppm.data->cci));
+ if (status < 0)
+ return status;
+
+ rab = CCGX_RAB_UCSI_DATA_BLOCK(offsetof(struct ucsi_data, message_in));
+ return ccg_read(uc, rab, ppm + offsetof(struct ucsi_data, message_in),
+ sizeof(uc->ppm.data->message_in));
+}
+
+static int ucsi_ccg_ack_interrupt(struct ucsi_ccg *uc)
+{
+ int status;
+ unsigned char data;
+
+ status = ccg_read(uc, CCGX_RAB_INTR_REG, &data, sizeof(data));
+ if (status < 0)
+ return status;
+
+ return ccg_write(uc, CCGX_RAB_INTR_REG, &data, sizeof(data));
+}
+
+static int ucsi_ccg_sync(struct ucsi_ppm *ppm)
+{
+ struct ucsi_ccg *uc = container_of(ppm, struct ucsi_ccg, ppm);
+ int status;
+
+ status = ucsi_ccg_recv_data(uc);
+ if (status < 0)
+ return status;
+
+ /* ack interrupt to allow next command to run */
+ return ucsi_ccg_ack_interrupt(uc);
+}
+
+static int ucsi_ccg_cmd(struct ucsi_ppm *ppm, struct ucsi_control *ctrl)
+{
+ struct ucsi_ccg *uc = container_of(ppm, struct ucsi_ccg, ppm);
+
+ ppm->data->ctrl.raw_cmd = ctrl->raw_cmd;
+ return ucsi_ccg_send_data(uc);
+}
+
+static irqreturn_t ccg_irq_handler(int irq, void *data)
+{
+ struct ucsi_ccg *uc = data;
+
+ ucsi_notify(uc->ucsi);
+
+ return IRQ_HANDLED;
+}
+
+static int ucsi_ccg_probe(struct i2c_client *client,
+ const struct i2c_device_id *id)
+{
+ struct device *dev = &client->dev;
+ struct ucsi_ccg *uc;
+ int status;
+ u16 rab;
+
+ uc = devm_kzalloc(dev, sizeof(*uc), GFP_KERNEL);
+ if (!uc)
+ return -ENOMEM;
+
+ uc->ppm.data = devm_kzalloc(dev, sizeof(struct ucsi_data), GFP_KERNEL);
+ if (!uc->ppm.data)
+ return -ENOMEM;
+
+ uc->ppm.cmd = ucsi_ccg_cmd;
+ uc->ppm.sync = ucsi_ccg_sync;
+ uc->dev = dev;
+ uc->client = client;
+
+ /* reset ccg device and initialize ucsi */
+ status = ucsi_ccg_init(uc);
+ if (status < 0) {
+ dev_err(uc->dev, "ucsi_ccg_init failed - %d\n", status);
+ return status;
+ }
+
+ status = devm_request_threaded_irq(dev, client->irq, NULL,
+ ccg_irq_handler,
+ IRQF_ONESHOT | IRQF_TRIGGER_HIGH,
+ dev_name(dev), uc);
+ if (status < 0) {
+ dev_err(uc->dev, "request_threaded_irq failed - %d\n", status);
+ return status;
+ }
+
+ uc->ucsi = ucsi_register_ppm(dev, &uc->ppm);
+ if (IS_ERR(uc->ucsi)) {
+ dev_err(uc->dev, "ucsi_register_ppm failed\n");
+ return PTR_ERR(uc->ucsi);
+ }
+
+ rab = CCGX_RAB_UCSI_DATA_BLOCK(offsetof(struct ucsi_data, version));
+ status = ccg_read(uc, rab, (u8 *)(uc->ppm.data) +
+ offsetof(struct ucsi_data, version),
+ sizeof(uc->ppm.data->version));
+ if (status < 0) {
+ ucsi_unregister_ppm(uc->ucsi);
+ return status;
+ }
+
+ i2c_set_clientdata(client, uc);
+ return 0;
+}
+
+static int ucsi_ccg_remove(struct i2c_client *client)
+{
+ struct ucsi_ccg *uc = i2c_get_clientdata(client);
+
+ ucsi_unregister_ppm(uc->ucsi);
+
+ return 0;
+}
+
+static const struct i2c_device_id ucsi_ccg_device_id[] = {
+ {"ccgx-ucsi", 0},
+ {}
+};
+MODULE_DEVICE_TABLE(i2c, ucsi_ccg_device_id);
+
+static struct i2c_driver ucsi_ccg_driver = {
+ .driver = {
+ .name = "ucsi_ccg",
+ },
+ .probe = ucsi_ccg_probe,
+ .remove = ucsi_ccg_remove,
+ .id_table = ucsi_ccg_device_id,
+};
+
+module_i2c_driver(ucsi_ccg_driver);
+
+MODULE_AUTHOR("Ajay Gupta <ajayg@nvidia.com>");
+MODULE_DESCRIPTION("UCSI driver for Cypress CCGx Type-C controller");
+MODULE_LICENSE("GPL v2");
if (!sock || !buf || !size)
return -EINVAL;
- iov_iter_kvec(&msg.msg_iter, READ|ITER_KVEC, &iov, 1, size);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, size);
usbip_dbg_xmit("enter\n");
int vs_events_nr; /* num of pending events, protected by vq->mutex */
};
+/*
+ * Context for processing request and control queue operations.
+ */
+struct vhost_scsi_ctx {
+ int head;
+ unsigned int out, in;
+ size_t req_size, rsp_size;
+ size_t out_size, in_size;
+ u8 *target, *lunp;
+ void *req;
+ struct iov_iter out_iter;
+};
+
static struct workqueue_struct *vhost_scsi_workqueue;
/* Global spinlock to protect vhost_scsi TPG list for vhost IOCTL access */
pr_err("Faulted on virtio_scsi_cmd_resp\n");
}
+static int
+vhost_scsi_get_desc(struct vhost_scsi *vs, struct vhost_virtqueue *vq,
+ struct vhost_scsi_ctx *vc)
+{
+ int ret = -ENXIO;
+
+ vc->head = vhost_get_vq_desc(vq, vq->iov,
+ ARRAY_SIZE(vq->iov), &vc->out, &vc->in,
+ NULL, NULL);
+
+ pr_debug("vhost_get_vq_desc: head: %d, out: %u in: %u\n",
+ vc->head, vc->out, vc->in);
+
+ /* On error, stop handling until the next kick. */
+ if (unlikely(vc->head < 0))
+ goto done;
+
+ /* Nothing new? Wait for eventfd to tell us they refilled. */
+ if (vc->head == vq->num) {
+ if (unlikely(vhost_enable_notify(&vs->dev, vq))) {
+ vhost_disable_notify(&vs->dev, vq);
+ ret = -EAGAIN;
+ }
+ goto done;
+ }
+
+ /*
+ * Get the size of request and response buffers.
+ * FIXME: Not correct for BIDI operation
+ */
+ vc->out_size = iov_length(vq->iov, vc->out);
+ vc->in_size = iov_length(&vq->iov[vc->out], vc->in);
+
+ /*
+ * Copy over the virtio-scsi request header, which for a
+ * ANY_LAYOUT enabled guest may span multiple iovecs, or a
+ * single iovec may contain both the header + outgoing
+ * WRITE payloads.
+ *
+ * copy_from_iter() will advance out_iter, so that it will
+ * point at the start of the outgoing WRITE payload, if
+ * DMA_TO_DEVICE is set.
+ */
+ iov_iter_init(&vc->out_iter, WRITE, vq->iov, vc->out, vc->out_size);
+ ret = 0;
+
+done:
+ return ret;
+}
+
+static int
+vhost_scsi_chk_size(struct vhost_virtqueue *vq, struct vhost_scsi_ctx *vc)
+{
+ if (unlikely(vc->in_size < vc->rsp_size)) {
+ vq_err(vq,
+ "Response buf too small, need min %zu bytes got %zu",
+ vc->rsp_size, vc->in_size);
+ return -EINVAL;
+ } else if (unlikely(vc->out_size < vc->req_size)) {
+ vq_err(vq,
+ "Request buf too small, need min %zu bytes got %zu",
+ vc->req_size, vc->out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+static int
+vhost_scsi_get_req(struct vhost_virtqueue *vq, struct vhost_scsi_ctx *vc,
+ struct vhost_scsi_tpg **tpgp)
+{
+ int ret = -EIO;
+
+ if (unlikely(!copy_from_iter_full(vc->req, vc->req_size,
+ &vc->out_iter))) {
+ vq_err(vq, "Faulted on copy_from_iter\n");
+ } else if (unlikely(*vc->lunp != 1)) {
+ /* virtio-scsi spec requires byte 0 of the lun to be 1 */
+ vq_err(vq, "Illegal virtio-scsi lun: %u\n", *vc->lunp);
+ } else {
+ struct vhost_scsi_tpg **vs_tpg, *tpg;
+
+ vs_tpg = vq->private_data; /* validated at handler entry */
+
+ tpg = READ_ONCE(vs_tpg[*vc->target]);
+ if (unlikely(!tpg)) {
+ vq_err(vq, "Target 0x%x does not exist\n", *vc->target);
+ } else {
+ if (tpgp)
+ *tpgp = tpg;
+ ret = 0;
+ }
+ }
+
+ return ret;
+}
+
static void
vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
{
struct vhost_scsi_tpg **vs_tpg, *tpg;
struct virtio_scsi_cmd_req v_req;
struct virtio_scsi_cmd_req_pi v_req_pi;
+ struct vhost_scsi_ctx vc;
struct vhost_scsi_cmd *cmd;
- struct iov_iter out_iter, in_iter, prot_iter, data_iter;
+ struct iov_iter in_iter, prot_iter, data_iter;
u64 tag;
u32 exp_data_len, data_direction;
- unsigned int out = 0, in = 0;
- int head, ret, prot_bytes;
- size_t req_size, rsp_size = sizeof(struct virtio_scsi_cmd_resp);
- size_t out_size, in_size;
+ int ret, prot_bytes;
u16 lun;
- u8 *target, *lunp, task_attr;
+ u8 task_attr;
bool t10_pi = vhost_has_feature(vq, VIRTIO_SCSI_F_T10_PI);
- void *req, *cdb;
+ void *cdb;
mutex_lock(&vq->mutex);
/*
if (!vs_tpg)
goto out;
+ memset(&vc, 0, sizeof(vc));
+ vc.rsp_size = sizeof(struct virtio_scsi_cmd_resp);
+
vhost_disable_notify(&vs->dev, vq);
for (;;) {
- head = vhost_get_vq_desc(vq, vq->iov,
- ARRAY_SIZE(vq->iov), &out, &in,
- NULL, NULL);
- pr_debug("vhost_get_vq_desc: head: %d, out: %u in: %u\n",
- head, out, in);
- /* On error, stop handling until the next kick. */
- if (unlikely(head < 0))
- break;
- /* Nothing new? Wait for eventfd to tell us they refilled. */
- if (head == vq->num) {
- if (unlikely(vhost_enable_notify(&vs->dev, vq))) {
- vhost_disable_notify(&vs->dev, vq);
- continue;
- }
- break;
- }
- /*
- * Check for a sane response buffer so we can report early
- * errors back to the guest.
- */
- if (unlikely(vq->iov[out].iov_len < rsp_size)) {
- vq_err(vq, "Expecting at least virtio_scsi_cmd_resp"
- " size, got %zu bytes\n", vq->iov[out].iov_len);
- break;
- }
+ ret = vhost_scsi_get_desc(vs, vq, &vc);
+ if (ret)
+ goto err;
+
/*
* Setup pointers and values based upon different virtio-scsi
* request header if T10_PI is enabled in KVM guest.
*/
if (t10_pi) {
- req = &v_req_pi;
- req_size = sizeof(v_req_pi);
- lunp = &v_req_pi.lun[0];
- target = &v_req_pi.lun[1];
+ vc.req = &v_req_pi;
+ vc.req_size = sizeof(v_req_pi);
+ vc.lunp = &v_req_pi.lun[0];
+ vc.target = &v_req_pi.lun[1];
} else {
- req = &v_req;
- req_size = sizeof(v_req);
- lunp = &v_req.lun[0];
- target = &v_req.lun[1];
+ vc.req = &v_req;
+ vc.req_size = sizeof(v_req);
+ vc.lunp = &v_req.lun[0];
+ vc.target = &v_req.lun[1];
}
- /*
- * FIXME: Not correct for BIDI operation
- */
- out_size = iov_length(vq->iov, out);
- in_size = iov_length(&vq->iov[out], in);
/*
- * Copy over the virtio-scsi request header, which for a
- * ANY_LAYOUT enabled guest may span multiple iovecs, or a
- * single iovec may contain both the header + outgoing
- * WRITE payloads.
- *
- * copy_from_iter() will advance out_iter, so that it will
- * point at the start of the outgoing WRITE payload, if
- * DMA_TO_DEVICE is set.
+ * Validate the size of request and response buffers.
+ * Check for a sane response buffer so we can report
+ * early errors back to the guest.
*/
- iov_iter_init(&out_iter, WRITE, vq->iov, out, out_size);
+ ret = vhost_scsi_chk_size(vq, &vc);
+ if (ret)
+ goto err;
- if (unlikely(!copy_from_iter_full(req, req_size, &out_iter))) {
- vq_err(vq, "Faulted on copy_from_iter\n");
- vhost_scsi_send_bad_target(vs, vq, head, out);
- continue;
- }
- /* virtio-scsi spec requires byte 0 of the lun to be 1 */
- if (unlikely(*lunp != 1)) {
- vq_err(vq, "Illegal virtio-scsi lun: %u\n", *lunp);
- vhost_scsi_send_bad_target(vs, vq, head, out);
- continue;
- }
+ ret = vhost_scsi_get_req(vq, &vc, &tpg);
+ if (ret)
+ goto err;
+
+ ret = -EIO; /* bad target on any error from here on */
- tpg = READ_ONCE(vs_tpg[*target]);
- if (unlikely(!tpg)) {
- /* Target does not exist, fail the request */
- vhost_scsi_send_bad_target(vs, vq, head, out);
- continue;
- }
/*
* Determine data_direction by calculating the total outgoing
* iovec sizes + incoming iovec sizes vs. virtio-scsi request +
*/
prot_bytes = 0;
- if (out_size > req_size) {
+ if (vc.out_size > vc.req_size) {
data_direction = DMA_TO_DEVICE;
- exp_data_len = out_size - req_size;
- data_iter = out_iter;
- } else if (in_size > rsp_size) {
+ exp_data_len = vc.out_size - vc.req_size;
+ data_iter = vc.out_iter;
+ } else if (vc.in_size > vc.rsp_size) {
data_direction = DMA_FROM_DEVICE;
- exp_data_len = in_size - rsp_size;
+ exp_data_len = vc.in_size - vc.rsp_size;
- iov_iter_init(&in_iter, READ, &vq->iov[out], in,
- rsp_size + exp_data_len);
- iov_iter_advance(&in_iter, rsp_size);
+ iov_iter_init(&in_iter, READ, &vq->iov[vc.out], vc.in,
+ vc.rsp_size + exp_data_len);
+ iov_iter_advance(&in_iter, vc.rsp_size);
data_iter = in_iter;
} else {
data_direction = DMA_NONE;
if (data_direction != DMA_TO_DEVICE) {
vq_err(vq, "Received non zero pi_bytesout,"
" but wrong data_direction\n");
- vhost_scsi_send_bad_target(vs, vq, head, out);
- continue;
+ goto err;
}
prot_bytes = vhost32_to_cpu(vq, v_req_pi.pi_bytesout);
} else if (v_req_pi.pi_bytesin) {
if (data_direction != DMA_FROM_DEVICE) {
vq_err(vq, "Received non zero pi_bytesin,"
" but wrong data_direction\n");
- vhost_scsi_send_bad_target(vs, vq, head, out);
- continue;
+ goto err;
}
prot_bytes = vhost32_to_cpu(vq, v_req_pi.pi_bytesin);
}
/*
- * Set prot_iter to data_iter, and advance past any
+ * Set prot_iter to data_iter and truncate it to
+ * prot_bytes, and advance data_iter past any
* preceeding prot_bytes that may be present.
*
* Also fix up the exp_data_len to reflect only the
if (prot_bytes) {
exp_data_len -= prot_bytes;
prot_iter = data_iter;
+ iov_iter_truncate(&prot_iter, prot_bytes);
iov_iter_advance(&data_iter, prot_bytes);
}
tag = vhost64_to_cpu(vq, v_req_pi.tag);
vq_err(vq, "Received SCSI CDB with command_size: %d that"
" exceeds SCSI_MAX_VARLEN_CDB_SIZE: %d\n",
scsi_command_size(cdb), VHOST_SCSI_MAX_CDB_SIZE);
- vhost_scsi_send_bad_target(vs, vq, head, out);
- continue;
+ goto err;
}
cmd = vhost_scsi_get_tag(vq, tpg, cdb, tag, lun, task_attr,
exp_data_len + prot_bytes,
if (IS_ERR(cmd)) {
vq_err(vq, "vhost_scsi_get_tag failed %ld\n",
PTR_ERR(cmd));
- vhost_scsi_send_bad_target(vs, vq, head, out);
- continue;
+ goto err;
}
cmd->tvc_vhost = vs;
cmd->tvc_vq = vq;
- cmd->tvc_resp_iov = vq->iov[out];
- cmd->tvc_in_iovs = in;
+ cmd->tvc_resp_iov = vq->iov[vc.out];
+ cmd->tvc_in_iovs = vc.in;
pr_debug("vhost_scsi got command opcode: %#02x, lun: %d\n",
cmd->tvc_cdb[0], cmd->tvc_lun);
" %d\n", cmd, exp_data_len, prot_bytes, data_direction);
if (data_direction != DMA_NONE) {
- ret = vhost_scsi_mapal(cmd,
- prot_bytes, &prot_iter,
- exp_data_len, &data_iter);
- if (unlikely(ret)) {
+ if (unlikely(vhost_scsi_mapal(cmd, prot_bytes,
+ &prot_iter, exp_data_len,
+ &data_iter))) {
vq_err(vq, "Failed to map iov to sgl\n");
vhost_scsi_release_cmd(&cmd->tvc_se_cmd);
- vhost_scsi_send_bad_target(vs, vq, head, out);
- continue;
+ goto err;
}
}
/*
* complete the virtio-scsi request in TCM callback context via
* vhost_scsi_queue_data_in() and vhost_scsi_queue_status()
*/
- cmd->tvc_vq_desc = head;
+ cmd->tvc_vq_desc = vc.head;
/*
* Dispatch cmd descriptor for cmwq execution in process
* context provided by vhost_scsi_workqueue. This also ensures
*/
INIT_WORK(&cmd->work, vhost_scsi_submission_work);
queue_work(vhost_scsi_workqueue, &cmd->work);
+ ret = 0;
+err:
+ /*
+ * ENXIO: No more requests, or read error, wait for next kick
+ * EINVAL: Invalid response buffer, drop the request
+ * EIO: Respond with bad target
+ * EAGAIN: Pending request
+ */
+ if (ret == -ENXIO)
+ break;
+ else if (ret == -EIO)
+ vhost_scsi_send_bad_target(vs, vq, vc.head, vc.out);
+ }
+out:
+ mutex_unlock(&vq->mutex);
+}
+
+static void
+vhost_scsi_send_tmf_reject(struct vhost_scsi *vs,
+ struct vhost_virtqueue *vq,
+ struct vhost_scsi_ctx *vc)
+{
+ struct virtio_scsi_ctrl_tmf_resp __user *resp;
+ struct virtio_scsi_ctrl_tmf_resp rsp;
+ int ret;
+
+ pr_debug("%s\n", __func__);
+ memset(&rsp, 0, sizeof(rsp));
+ rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
+ resp = vq->iov[vc->out].iov_base;
+ ret = __copy_to_user(resp, &rsp, sizeof(rsp));
+ if (!ret)
+ vhost_add_used_and_signal(&vs->dev, vq, vc->head, 0);
+ else
+ pr_err("Faulted on virtio_scsi_ctrl_tmf_resp\n");
+}
+
+static void
+vhost_scsi_send_an_resp(struct vhost_scsi *vs,
+ struct vhost_virtqueue *vq,
+ struct vhost_scsi_ctx *vc)
+{
+ struct virtio_scsi_ctrl_an_resp __user *resp;
+ struct virtio_scsi_ctrl_an_resp rsp;
+ int ret;
+
+ pr_debug("%s\n", __func__);
+ memset(&rsp, 0, sizeof(rsp)); /* event_actual = 0 */
+ rsp.response = VIRTIO_SCSI_S_OK;
+ resp = vq->iov[vc->out].iov_base;
+ ret = __copy_to_user(resp, &rsp, sizeof(rsp));
+ if (!ret)
+ vhost_add_used_and_signal(&vs->dev, vq, vc->head, 0);
+ else
+ pr_err("Faulted on virtio_scsi_ctrl_an_resp\n");
+}
+
+static void
+vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
+{
+ union {
+ __virtio32 type;
+ struct virtio_scsi_ctrl_an_req an;
+ struct virtio_scsi_ctrl_tmf_req tmf;
+ } v_req;
+ struct vhost_scsi_ctx vc;
+ size_t typ_size;
+ int ret;
+
+ mutex_lock(&vq->mutex);
+ /*
+ * We can handle the vq only after the endpoint is setup by calling the
+ * VHOST_SCSI_SET_ENDPOINT ioctl.
+ */
+ if (!vq->private_data)
+ goto out;
+
+ memset(&vc, 0, sizeof(vc));
+
+ vhost_disable_notify(&vs->dev, vq);
+
+ for (;;) {
+ ret = vhost_scsi_get_desc(vs, vq, &vc);
+ if (ret)
+ goto err;
+
+ /*
+ * Get the request type first in order to setup
+ * other parameters dependent on the type.
+ */
+ vc.req = &v_req.type;
+ typ_size = sizeof(v_req.type);
+
+ if (unlikely(!copy_from_iter_full(vc.req, typ_size,
+ &vc.out_iter))) {
+ vq_err(vq, "Faulted on copy_from_iter tmf type\n");
+ /*
+ * The size of the response buffer depends on the
+ * request type and must be validated against it.
+ * Since the request type is not known, don't send
+ * a response.
+ */
+ continue;
+ }
+
+ switch (v_req.type) {
+ case VIRTIO_SCSI_T_TMF:
+ vc.req = &v_req.tmf;
+ vc.req_size = sizeof(struct virtio_scsi_ctrl_tmf_req);
+ vc.rsp_size = sizeof(struct virtio_scsi_ctrl_tmf_resp);
+ vc.lunp = &v_req.tmf.lun[0];
+ vc.target = &v_req.tmf.lun[1];
+ break;
+ case VIRTIO_SCSI_T_AN_QUERY:
+ case VIRTIO_SCSI_T_AN_SUBSCRIBE:
+ vc.req = &v_req.an;
+ vc.req_size = sizeof(struct virtio_scsi_ctrl_an_req);
+ vc.rsp_size = sizeof(struct virtio_scsi_ctrl_an_resp);
+ vc.lunp = &v_req.an.lun[0];
+ vc.target = NULL;
+ break;
+ default:
+ vq_err(vq, "Unknown control request %d", v_req.type);
+ continue;
+ }
+
+ /*
+ * Validate the size of request and response buffers.
+ * Check for a sane response buffer so we can report
+ * early errors back to the guest.
+ */
+ ret = vhost_scsi_chk_size(vq, &vc);
+ if (ret)
+ goto err;
+
+ /*
+ * Get the rest of the request now that its size is known.
+ */
+ vc.req += typ_size;
+ vc.req_size -= typ_size;
+
+ ret = vhost_scsi_get_req(vq, &vc, NULL);
+ if (ret)
+ goto err;
+
+ if (v_req.type == VIRTIO_SCSI_T_TMF)
+ vhost_scsi_send_tmf_reject(vs, vq, &vc);
+ else
+ vhost_scsi_send_an_resp(vs, vq, &vc);
+err:
+ /*
+ * ENXIO: No more requests, or read error, wait for next kick
+ * EINVAL: Invalid response buffer, drop the request
+ * EIO: Respond with bad target
+ * EAGAIN: Pending request
+ */
+ if (ret == -ENXIO)
+ break;
+ else if (ret == -EIO)
+ vhost_scsi_send_bad_target(vs, vq, vc.head, vc.out);
}
out:
mutex_unlock(&vq->mutex);
static void vhost_scsi_ctl_handle_kick(struct vhost_work *work)
{
+ struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+ poll.work);
+ struct vhost_scsi *vs = container_of(vq->dev, struct vhost_scsi, dev);
+
pr_debug("%s: The handling func for control queue.\n", __func__);
+ vhost_scsi_ctl_handle_vq(vs, vq);
}
static void
#include <linux/sched/mm.h>
#include <linux/sched/signal.h>
#include <linux/interval_tree_generic.h>
+#include <linux/nospec.h>
#include "vhost.h"
if (msg->iova <= vq_msg->iova &&
msg->iova + msg->size - 1 >= vq_msg->iova &&
vq_msg->type == VHOST_IOTLB_MISS) {
- mutex_lock(&node->vq->mutex);
vhost_poll_queue(&node->vq->poll);
- mutex_unlock(&node->vq->mutex);
-
list_del(&node->node);
kfree(node);
}
if (idx >= d->nvqs)
return -ENOBUFS;
+ idx = array_index_nospec(idx, d->nvqs);
vq = d->vqs[idx];
mutex_lock(&vq->mutex);
#include <net/sock.h>
#include <linux/virtio_vsock.h>
#include <linux/vhost.h>
+#include <linux/hashtable.h>
#include <net/af_vsock.h>
#include "vhost.h"
/* Used to track all the vhost_vsock instances on the system. */
static DEFINE_SPINLOCK(vhost_vsock_lock);
-static LIST_HEAD(vhost_vsock_list);
+static DEFINE_READ_MOSTLY_HASHTABLE(vhost_vsock_hash, 8);
struct vhost_vsock {
struct vhost_dev dev;
struct vhost_virtqueue vqs[2];
- /* Link to global vhost_vsock_list, protected by vhost_vsock_lock */
- struct list_head list;
+ /* Link to global vhost_vsock_hash, writes use vhost_vsock_lock */
+ struct hlist_node hash;
struct vhost_work send_pkt_work;
spinlock_t send_pkt_list_lock;
return VHOST_VSOCK_DEFAULT_HOST_CID;
}
-static struct vhost_vsock *__vhost_vsock_get(u32 guest_cid)
+/* Callers that dereference the return value must hold vhost_vsock_lock or the
+ * RCU read lock.
+ */
+static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
{
struct vhost_vsock *vsock;
- list_for_each_entry(vsock, &vhost_vsock_list, list) {
+ hash_for_each_possible_rcu(vhost_vsock_hash, vsock, hash, guest_cid) {
u32 other_cid = vsock->guest_cid;
/* Skip instances that have no CID yet */
return NULL;
}
-static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
-{
- struct vhost_vsock *vsock;
-
- spin_lock_bh(&vhost_vsock_lock);
- vsock = __vhost_vsock_get(guest_cid);
- spin_unlock_bh(&vhost_vsock_lock);
-
- return vsock;
-}
-
static void
vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
struct vhost_virtqueue *vq)
struct vhost_vsock *vsock;
int len = pkt->len;
+ rcu_read_lock();
+
/* Find the vhost_vsock according to guest context id */
vsock = vhost_vsock_get(le64_to_cpu(pkt->hdr.dst_cid));
if (!vsock) {
+ rcu_read_unlock();
virtio_transport_free_pkt(pkt);
return -ENODEV;
}
spin_unlock_bh(&vsock->send_pkt_list_lock);
vhost_work_queue(&vsock->dev, &vsock->send_pkt_work);
+
+ rcu_read_unlock();
return len;
}
struct vhost_vsock *vsock;
struct virtio_vsock_pkt *pkt, *n;
int cnt = 0;
+ int ret = -ENODEV;
LIST_HEAD(freeme);
+ rcu_read_lock();
+
/* Find the vhost_vsock according to guest context id */
vsock = vhost_vsock_get(vsk->remote_addr.svm_cid);
if (!vsock)
- return -ENODEV;
+ goto out;
spin_lock_bh(&vsock->send_pkt_list_lock);
list_for_each_entry_safe(pkt, n, &vsock->send_pkt_list, list) {
vhost_poll_queue(&tx_vq->poll);
}
- return 0;
+ ret = 0;
+out:
+ rcu_read_unlock();
+ return ret;
}
static struct virtio_vsock_pkt *
spin_lock_init(&vsock->send_pkt_list_lock);
INIT_LIST_HEAD(&vsock->send_pkt_list);
vhost_work_init(&vsock->send_pkt_work, vhost_transport_send_pkt_work);
-
- spin_lock_bh(&vhost_vsock_lock);
- list_add_tail(&vsock->list, &vhost_vsock_list);
- spin_unlock_bh(&vhost_vsock_lock);
return 0;
out:
* executing.
*/
- if (!vhost_vsock_get(vsk->remote_addr.svm_cid)) {
- sock_set_flag(sk, SOCK_DONE);
- vsk->peer_shutdown = SHUTDOWN_MASK;
- sk->sk_state = SS_UNCONNECTED;
- sk->sk_err = ECONNRESET;
- sk->sk_error_report(sk);
- }
+ /* If the peer is still valid, no need to reset connection */
+ if (vhost_vsock_get(vsk->remote_addr.svm_cid))
+ return;
+
+ /* If the close timeout is pending, let it expire. This avoids races
+ * with the timeout callback.
+ */
+ if (vsk->close_work_scheduled)
+ return;
+
+ sock_set_flag(sk, SOCK_DONE);
+ vsk->peer_shutdown = SHUTDOWN_MASK;
+ sk->sk_state = SS_UNCONNECTED;
+ sk->sk_err = ECONNRESET;
+ sk->sk_error_report(sk);
}
static int vhost_vsock_dev_release(struct inode *inode, struct file *file)
struct vhost_vsock *vsock = file->private_data;
spin_lock_bh(&vhost_vsock_lock);
- list_del(&vsock->list);
+ if (vsock->guest_cid)
+ hash_del_rcu(&vsock->hash);
spin_unlock_bh(&vhost_vsock_lock);
+ /* Wait for other CPUs to finish using vsock */
+ synchronize_rcu();
+
/* Iterating over all connections for all CIDs to find orphans is
* inefficient. Room for improvement here. */
vsock_for_each_connected_socket(vhost_vsock_reset_orphans);
/* Refuse if CID is already in use */
spin_lock_bh(&vhost_vsock_lock);
- other = __vhost_vsock_get(guest_cid);
+ other = vhost_vsock_get(guest_cid);
if (other && other != vsock) {
spin_unlock_bh(&vhost_vsock_lock);
return -EADDRINUSE;
}
+
+ if (vsock->guest_cid)
+ hash_del_rcu(&vsock->hash);
+
vsock->guest_cid = guest_cid;
+ hash_add_rcu(vhost_vsock_hash, &vsock->hash, guest_cid);
spin_unlock_bh(&vhost_vsock_lock);
return 0;
goto err_alloc;
}
- if (!data->levels) {
+ if (data->levels) {
+ /*
+ * For the DT case, only when brightness levels is defined
+ * data->levels is filled. For the non-DT case, data->levels
+ * can come from platform data, however is not usual.
+ */
+ for (i = 0; i <= data->max_brightness; i++) {
+ if (data->levels[i] > pb->scale)
+ pb->scale = data->levels[i];
+
+ pb->levels = data->levels;
+ }
+ } else if (!data->max_brightness) {
+ /*
+ * If no brightness levels are provided and max_brightness is
+ * not set, use the default brightness table. For the DT case,
+ * max_brightness is set to 0 when brightness levels is not
+ * specified. For the non-DT case, max_brightness is usually
+ * set to some value.
+ */
+
+ /* Get the PWM period (in nanoseconds) */
+ pwm_get_state(pb->pwm, &state);
+
ret = pwm_backlight_brightness_default(&pdev->dev, data,
state.period);
if (ret < 0) {
"failed to setup default brightness table\n");
goto err_alloc;
}
- }
- for (i = 0; i <= data->max_brightness; i++) {
- if (data->levels[i] > pb->scale)
- pb->scale = data->levels[i];
+ for (i = 0; i <= data->max_brightness; i++) {
+ if (data->levels[i] > pb->scale)
+ pb->scale = data->levels[i];
- pb->levels = data->levels;
+ pb->levels = data->levels;
+ }
+ } else {
+ /*
+ * That only happens for the non-DT case, where platform data
+ * sets the max_brightness value.
+ */
+ pb->scale = data->max_brightness;
}
pb->lth_brightness = data->lth_brightness * (state.period / pb->scale);
#define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
#define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
+#define VIRTIO_BALLOON_FREE_PAGE_ALLOC_FLAG (__GFP_NORETRY | __GFP_NOWARN | \
+ __GFP_NOMEMALLOC)
+/* The order of free page blocks to report to host */
+#define VIRTIO_BALLOON_FREE_PAGE_ORDER (MAX_ORDER - 1)
+/* The size of a free page block in bytes */
+#define VIRTIO_BALLOON_FREE_PAGE_SIZE \
+ (1 << (VIRTIO_BALLOON_FREE_PAGE_ORDER + PAGE_SHIFT))
+
#ifdef CONFIG_BALLOON_COMPACTION
static struct vfsmount *balloon_mnt;
#endif
+enum virtio_balloon_vq {
+ VIRTIO_BALLOON_VQ_INFLATE,
+ VIRTIO_BALLOON_VQ_DEFLATE,
+ VIRTIO_BALLOON_VQ_STATS,
+ VIRTIO_BALLOON_VQ_FREE_PAGE,
+ VIRTIO_BALLOON_VQ_MAX
+};
+
struct virtio_balloon {
struct virtio_device *vdev;
- struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
+ struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *free_page_vq;
+
+ /* Balloon's own wq for cpu-intensive work items */
+ struct workqueue_struct *balloon_wq;
+ /* The free page reporting work item submitted to the balloon wq */
+ struct work_struct report_free_page_work;
/* The balloon servicing is delegated to a freezable workqueue. */
struct work_struct update_balloon_stats_work;
spinlock_t stop_update_lock;
bool stop_update;
+ /* The list of allocated free pages, waiting to be given back to mm */
+ struct list_head free_page_list;
+ spinlock_t free_page_list_lock;
+ /* The number of free page blocks on the above list */
+ unsigned long num_free_page_blocks;
+ /* The cmd id received from host */
+ u32 cmd_id_received;
+ /* The cmd id that is actively in use */
+ __virtio32 cmd_id_active;
+ /* Buffer to store the stop sign */
+ __virtio32 cmd_id_stop;
+
/* Waiting for host to ack the pages we released. */
wait_queue_head_t acked;
virtqueue_kick(vq);
}
-static void virtballoon_changed(struct virtio_device *vdev)
-{
- struct virtio_balloon *vb = vdev->priv;
- unsigned long flags;
-
- spin_lock_irqsave(&vb->stop_update_lock, flags);
- if (!vb->stop_update)
- queue_work(system_freezable_wq, &vb->update_balloon_size_work);
- spin_unlock_irqrestore(&vb->stop_update_lock, flags);
-}
-
static inline s64 towards_target(struct virtio_balloon *vb)
{
s64 target;
return target - vb->num_pages;
}
+/* Gives back @num_to_return blocks of free pages to mm. */
+static unsigned long return_free_pages_to_mm(struct virtio_balloon *vb,
+ unsigned long num_to_return)
+{
+ struct page *page;
+ unsigned long num_returned;
+
+ spin_lock_irq(&vb->free_page_list_lock);
+ for (num_returned = 0; num_returned < num_to_return; num_returned++) {
+ page = balloon_page_pop(&vb->free_page_list);
+ if (!page)
+ break;
+ free_pages((unsigned long)page_address(page),
+ VIRTIO_BALLOON_FREE_PAGE_ORDER);
+ }
+ vb->num_free_page_blocks -= num_returned;
+ spin_unlock_irq(&vb->free_page_list_lock);
+
+ return num_returned;
+}
+
+static void virtballoon_changed(struct virtio_device *vdev)
+{
+ struct virtio_balloon *vb = vdev->priv;
+ unsigned long flags;
+ s64 diff = towards_target(vb);
+
+ if (diff) {
+ spin_lock_irqsave(&vb->stop_update_lock, flags);
+ if (!vb->stop_update)
+ queue_work(system_freezable_wq,
+ &vb->update_balloon_size_work);
+ spin_unlock_irqrestore(&vb->stop_update_lock, flags);
+ }
+
+ if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
+ virtio_cread(vdev, struct virtio_balloon_config,
+ free_page_report_cmd_id, &vb->cmd_id_received);
+ if (vb->cmd_id_received == VIRTIO_BALLOON_CMD_ID_DONE) {
+ /* Pass ULONG_MAX to give back all the free pages */
+ return_free_pages_to_mm(vb, ULONG_MAX);
+ } else if (vb->cmd_id_received != VIRTIO_BALLOON_CMD_ID_STOP &&
+ vb->cmd_id_received !=
+ virtio32_to_cpu(vdev, vb->cmd_id_active)) {
+ spin_lock_irqsave(&vb->stop_update_lock, flags);
+ if (!vb->stop_update) {
+ queue_work(vb->balloon_wq,
+ &vb->report_free_page_work);
+ }
+ spin_unlock_irqrestore(&vb->stop_update_lock, flags);
+ }
+ }
+}
+
static void update_balloon_size(struct virtio_balloon *vb)
{
u32 actual = vb->num_pages;
static int init_vqs(struct virtio_balloon *vb)
{
- struct virtqueue *vqs[3];
- vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request };
- static const char * const names[] = { "inflate", "deflate", "stats" };
- int err, nvqs;
+ struct virtqueue *vqs[VIRTIO_BALLOON_VQ_MAX];
+ vq_callback_t *callbacks[VIRTIO_BALLOON_VQ_MAX];
+ const char *names[VIRTIO_BALLOON_VQ_MAX];
+ int err;
/*
- * We expect two virtqueues: inflate and deflate, and
- * optionally stat.
+ * Inflateq and deflateq are used unconditionally. The names[]
+ * will be NULL if the related feature is not enabled, which will
+ * cause no allocation for the corresponding virtqueue in find_vqs.
*/
- nvqs = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ) ? 3 : 2;
- err = virtio_find_vqs(vb->vdev, nvqs, vqs, callbacks, names, NULL);
+ callbacks[VIRTIO_BALLOON_VQ_INFLATE] = balloon_ack;
+ names[VIRTIO_BALLOON_VQ_INFLATE] = "inflate";
+ callbacks[VIRTIO_BALLOON_VQ_DEFLATE] = balloon_ack;
+ names[VIRTIO_BALLOON_VQ_DEFLATE] = "deflate";
+ names[VIRTIO_BALLOON_VQ_STATS] = NULL;
+ names[VIRTIO_BALLOON_VQ_FREE_PAGE] = NULL;
+
+ if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ)) {
+ names[VIRTIO_BALLOON_VQ_STATS] = "stats";
+ callbacks[VIRTIO_BALLOON_VQ_STATS] = stats_request;
+ }
+
+ if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
+ names[VIRTIO_BALLOON_VQ_FREE_PAGE] = "free_page_vq";
+ callbacks[VIRTIO_BALLOON_VQ_FREE_PAGE] = NULL;
+ }
+
+ err = vb->vdev->config->find_vqs(vb->vdev, VIRTIO_BALLOON_VQ_MAX,
+ vqs, callbacks, names, NULL, NULL);
if (err)
return err;
- vb->inflate_vq = vqs[0];
- vb->deflate_vq = vqs[1];
+ vb->inflate_vq = vqs[VIRTIO_BALLOON_VQ_INFLATE];
+ vb->deflate_vq = vqs[VIRTIO_BALLOON_VQ_DEFLATE];
if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ)) {
struct scatterlist sg;
unsigned int num_stats;
- vb->stats_vq = vqs[2];
+ vb->stats_vq = vqs[VIRTIO_BALLOON_VQ_STATS];
/*
* Prime this virtqueue with one buffer so the hypervisor can
}
virtqueue_kick(vb->stats_vq);
}
+
+ if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
+ vb->free_page_vq = vqs[VIRTIO_BALLOON_VQ_FREE_PAGE];
+
+ return 0;
+}
+
+static int send_cmd_id_start(struct virtio_balloon *vb)
+{
+ struct scatterlist sg;
+ struct virtqueue *vq = vb->free_page_vq;
+ int err, unused;
+
+ /* Detach all the used buffers from the vq */
+ while (virtqueue_get_buf(vq, &unused))
+ ;
+
+ vb->cmd_id_active = cpu_to_virtio32(vb->vdev, vb->cmd_id_received);
+ sg_init_one(&sg, &vb->cmd_id_active, sizeof(vb->cmd_id_active));
+ err = virtqueue_add_outbuf(vq, &sg, 1, &vb->cmd_id_active, GFP_KERNEL);
+ if (!err)
+ virtqueue_kick(vq);
+ return err;
+}
+
+static int send_cmd_id_stop(struct virtio_balloon *vb)
+{
+ struct scatterlist sg;
+ struct virtqueue *vq = vb->free_page_vq;
+ int err, unused;
+
+ /* Detach all the used buffers from the vq */
+ while (virtqueue_get_buf(vq, &unused))
+ ;
+
+ sg_init_one(&sg, &vb->cmd_id_stop, sizeof(vb->cmd_id_stop));
+ err = virtqueue_add_outbuf(vq, &sg, 1, &vb->cmd_id_stop, GFP_KERNEL);
+ if (!err)
+ virtqueue_kick(vq);
+ return err;
+}
+
+static int get_free_page_and_send(struct virtio_balloon *vb)
+{
+ struct virtqueue *vq = vb->free_page_vq;
+ struct page *page;
+ struct scatterlist sg;
+ int err, unused;
+ void *p;
+
+ /* Detach all the used buffers from the vq */
+ while (virtqueue_get_buf(vq, &unused))
+ ;
+
+ page = alloc_pages(VIRTIO_BALLOON_FREE_PAGE_ALLOC_FLAG,
+ VIRTIO_BALLOON_FREE_PAGE_ORDER);
+ /*
+ * When the allocation returns NULL, it indicates that we have got all
+ * the possible free pages, so return -EINTR to stop.
+ */
+ if (!page)
+ return -EINTR;
+
+ p = page_address(page);
+ sg_init_one(&sg, p, VIRTIO_BALLOON_FREE_PAGE_SIZE);
+ /* There is always 1 entry reserved for the cmd id to use. */
+ if (vq->num_free > 1) {
+ err = virtqueue_add_inbuf(vq, &sg, 1, p, GFP_KERNEL);
+ if (unlikely(err)) {
+ free_pages((unsigned long)p,
+ VIRTIO_BALLOON_FREE_PAGE_ORDER);
+ return err;
+ }
+ virtqueue_kick(vq);
+ spin_lock_irq(&vb->free_page_list_lock);
+ balloon_page_push(&vb->free_page_list, page);
+ vb->num_free_page_blocks++;
+ spin_unlock_irq(&vb->free_page_list_lock);
+ } else {
+ /*
+ * The vq has no available entry to add this page block, so
+ * just free it.
+ */
+ free_pages((unsigned long)p, VIRTIO_BALLOON_FREE_PAGE_ORDER);
+ }
+
+ return 0;
+}
+
+static int send_free_pages(struct virtio_balloon *vb)
+{
+ int err;
+ u32 cmd_id_active;
+
+ while (1) {
+ /*
+ * If a stop id or a new cmd id was just received from host,
+ * stop the reporting.
+ */
+ cmd_id_active = virtio32_to_cpu(vb->vdev, vb->cmd_id_active);
+ if (cmd_id_active != vb->cmd_id_received)
+ break;
+
+ /*
+ * The free page blocks are allocated and sent to host one by
+ * one.
+ */
+ err = get_free_page_and_send(vb);
+ if (err == -EINTR)
+ break;
+ else if (unlikely(err))
+ return err;
+ }
+
return 0;
}
+static void report_free_page_func(struct work_struct *work)
+{
+ int err;
+ struct virtio_balloon *vb = container_of(work, struct virtio_balloon,
+ report_free_page_work);
+ struct device *dev = &vb->vdev->dev;
+
+ /* Start by sending the received cmd id to host with an outbuf. */
+ err = send_cmd_id_start(vb);
+ if (unlikely(err))
+ dev_err(dev, "Failed to send a start id, err = %d\n", err);
+
+ err = send_free_pages(vb);
+ if (unlikely(err))
+ dev_err(dev, "Failed to send a free page, err = %d\n", err);
+
+ /* End by sending a stop id to host with an outbuf. */
+ err = send_cmd_id_stop(vb);
+ if (unlikely(err))
+ dev_err(dev, "Failed to send a stop id, err = %d\n", err);
+}
+
#ifdef CONFIG_BALLOON_COMPACTION
/*
* virtballoon_migratepage - perform the balloon page migration on behalf of
#endif /* CONFIG_BALLOON_COMPACTION */
-static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
- struct shrink_control *sc)
+static unsigned long shrink_free_pages(struct virtio_balloon *vb,
+ unsigned long pages_to_free)
{
- unsigned long pages_to_free, pages_freed = 0;
- struct virtio_balloon *vb = container_of(shrinker,
- struct virtio_balloon, shrinker);
+ unsigned long blocks_to_free, blocks_freed;
- pages_to_free = sc->nr_to_scan * VIRTIO_BALLOON_PAGES_PER_PAGE;
+ pages_to_free = round_up(pages_to_free,
+ 1 << VIRTIO_BALLOON_FREE_PAGE_ORDER);
+ blocks_to_free = pages_to_free >> VIRTIO_BALLOON_FREE_PAGE_ORDER;
+ blocks_freed = return_free_pages_to_mm(vb, blocks_to_free);
+
+ return blocks_freed << VIRTIO_BALLOON_FREE_PAGE_ORDER;
+}
+
+static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
+ unsigned long pages_to_free)
+{
+ unsigned long pages_freed = 0;
/*
* One invocation of leak_balloon can deflate at most
* multiple times to deflate pages till reaching pages_to_free.
*/
while (vb->num_pages && pages_to_free) {
+ pages_freed += leak_balloon(vb, pages_to_free) /
+ VIRTIO_BALLOON_PAGES_PER_PAGE;
pages_to_free -= pages_freed;
- pages_freed += leak_balloon(vb, pages_to_free);
}
update_balloon_size(vb);
- return pages_freed / VIRTIO_BALLOON_PAGES_PER_PAGE;
+ return pages_freed;
+}
+
+static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
+ struct shrink_control *sc)
+{
+ unsigned long pages_to_free, pages_freed = 0;
+ struct virtio_balloon *vb = container_of(shrinker,
+ struct virtio_balloon, shrinker);
+
+ pages_to_free = sc->nr_to_scan * VIRTIO_BALLOON_PAGES_PER_PAGE;
+
+ if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
+ pages_freed = shrink_free_pages(vb, pages_to_free);
+
+ if (pages_freed >= pages_to_free)
+ return pages_freed;
+
+ pages_freed += shrink_balloon_pages(vb, pages_to_free - pages_freed);
+
+ return pages_freed;
}
static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
{
struct virtio_balloon *vb = container_of(shrinker,
struct virtio_balloon, shrinker);
+ unsigned long count;
- return vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
+ count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
+ count += vb->num_free_page_blocks >> VIRTIO_BALLOON_FREE_PAGE_ORDER;
+
+ return count;
}
static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb)
static int virtballoon_probe(struct virtio_device *vdev)
{
struct virtio_balloon *vb;
+ __u32 poison_val;
int err;
if (!vdev->config->get) {
}
vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
#endif
+ if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
+ /*
+ * There is always one entry reserved for cmd id, so the ring
+ * size needs to be at least two to report free page hints.
+ */
+ if (virtqueue_get_vring_size(vb->free_page_vq) < 2) {
+ err = -ENOSPC;
+ goto out_del_vqs;
+ }
+ vb->balloon_wq = alloc_workqueue("balloon-wq",
+ WQ_FREEZABLE | WQ_CPU_INTENSIVE, 0);
+ if (!vb->balloon_wq) {
+ err = -ENOMEM;
+ goto out_del_vqs;
+ }
+ INIT_WORK(&vb->report_free_page_work, report_free_page_func);
+ vb->cmd_id_received = VIRTIO_BALLOON_CMD_ID_STOP;
+ vb->cmd_id_active = cpu_to_virtio32(vb->vdev,
+ VIRTIO_BALLOON_CMD_ID_STOP);
+ vb->cmd_id_stop = cpu_to_virtio32(vb->vdev,
+ VIRTIO_BALLOON_CMD_ID_STOP);
+ vb->num_free_page_blocks = 0;
+ spin_lock_init(&vb->free_page_list_lock);
+ INIT_LIST_HEAD(&vb->free_page_list);
+ if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON)) {
+ memset(&poison_val, PAGE_POISON, sizeof(poison_val));
+ virtio_cwrite(vb->vdev, struct virtio_balloon_config,
+ poison_val, &poison_val);
+ }
+ }
/*
* We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a
* shrinker needs to be registered to relieve memory pressure.
if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
err = virtio_balloon_register_shrinker(vb);
if (err)
- goto out_del_vqs;
+ goto out_del_balloon_wq;
}
virtio_device_ready(vdev);
virtballoon_changed(vdev);
return 0;
+out_del_balloon_wq:
+ if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
+ destroy_workqueue(vb->balloon_wq);
out_del_vqs:
vdev->config->del_vqs(vdev);
out_free_vb:
cancel_work_sync(&vb->update_balloon_size_work);
cancel_work_sync(&vb->update_balloon_stats_work);
+ if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
+ cancel_work_sync(&vb->report_free_page_work);
+ destroy_workqueue(vb->balloon_wq);
+ }
+
remove_common(vb);
#ifdef CONFIG_BALLOON_COMPACTION
if (vb->vb_dev_info.inode)
static int virtballoon_validate(struct virtio_device *vdev)
{
+ if (!page_poisoning_enabled())
+ __virtio_clear_bit(vdev, VIRTIO_BALLOON_F_PAGE_POISON);
+
__virtio_clear_bit(vdev, VIRTIO_F_IOMMU_PLATFORM);
return 0;
}
VIRTIO_BALLOON_F_MUST_TELL_HOST,
VIRTIO_BALLOON_F_STATS_VQ,
VIRTIO_BALLOON_F_DEFLATE_ON_OOM,
+ VIRTIO_BALLOON_F_FREE_PAGE_HINT,
+ VIRTIO_BALLOON_F_PAGE_POISON,
};
static struct virtio_driver virtio_balloon_driver = {
kfree(resource);
}
-/*
- * Host memory not allocated to dom0. We can use this range for hotplug-based
- * ballooning.
- *
- * It's a type-less resource. Setting IORESOURCE_MEM will make resource
- * management algorithms (arch_remove_reservations()) look into guest e820,
- * which we don't want.
- */
-static struct resource hostmem_resource = {
- .name = "Host RAM",
-};
-
-void __attribute__((weak)) __init arch_xen_balloon_init(struct resource *res)
-{}
-
static struct resource *additional_memory_resource(phys_addr_t size)
{
- struct resource *res, *res_hostmem;
- int ret = -ENOMEM;
+ struct resource *res;
+ int ret;
res = kzalloc(sizeof(*res), GFP_KERNEL);
if (!res)
res->name = "System RAM";
res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
- res_hostmem = kzalloc(sizeof(*res), GFP_KERNEL);
- if (res_hostmem) {
- /* Try to grab a range from hostmem */
- res_hostmem->name = "Host memory";
- ret = allocate_resource(&hostmem_resource, res_hostmem,
- size, 0, -1,
- PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL);
- }
-
- if (!ret) {
- /*
- * Insert this resource into iomem. Because hostmem_resource
- * tracks portion of guest e820 marked as UNUSABLE noone else
- * should try to use it.
- */
- res->start = res_hostmem->start;
- res->end = res_hostmem->end;
- ret = insert_resource(&iomem_resource, res);
- if (ret < 0) {
- pr_err("Can't insert iomem_resource [%llx - %llx]\n",
- res->start, res->end);
- release_memory_resource(res_hostmem);
- res_hostmem = NULL;
- res->start = res->end = 0;
- }
- }
-
- if (ret) {
- ret = allocate_resource(&iomem_resource, res,
- size, 0, -1,
- PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL);
- if (ret < 0) {
- pr_err("Cannot allocate new System RAM resource\n");
- kfree(res);
- return NULL;
- }
+ ret = allocate_resource(&iomem_resource, res,
+ size, 0, -1,
+ PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL);
+ if (ret < 0) {
+ pr_err("Cannot allocate new System RAM resource\n");
+ kfree(res);
+ return NULL;
}
#ifdef CONFIG_SPARSEMEM
pr_err("New System RAM resource outside addressable RAM (%lu > %lu)\n",
pfn, limit);
release_memory_resource(res);
- release_memory_resource(res_hostmem);
return NULL;
}
}
set_online_page_callback(&xen_online_page);
register_memory_notifier(&xen_memory_nb);
register_sysctl_table(xen_root);
-
- arch_xen_balloon_init(&hostmem_resource);
#endif
#ifdef CONFIG_XEN_PV
ret = xenmem_reservation_increase(args->nr_pages, args->frames);
if (ret != args->nr_pages) {
- pr_debug("Failed to decrease reservation for DMA buffer\n");
+ pr_debug("Failed to increase reservation for DMA buffer\n");
ret = -EFAULT;
} else {
ret = 0;
MODULE_LICENSE("GPL");
-static unsigned int limit = 64;
-module_param(limit, uint, 0644);
-MODULE_PARM_DESC(limit, "Maximum number of pages that may be allocated by "
- "the privcmd-buf device per open file");
-
struct privcmd_buf_private {
struct mutex lock;
struct list_head list;
- unsigned int allocated;
};
struct privcmd_buf_vma_private {
{
unsigned int i;
- vma_priv->file_priv->allocated -= vma_priv->n_pages;
-
list_del(&vma_priv->list);
for (i = 0; i < vma_priv->n_pages; i++)
- if (vma_priv->pages[i])
- __free_page(vma_priv->pages[i]);
+ __free_page(vma_priv->pages[i]);
kfree(vma_priv);
}
unsigned int i;
int ret = 0;
- if (!(vma->vm_flags & VM_SHARED) || count > limit ||
- file_priv->allocated + count > limit)
+ if (!(vma->vm_flags & VM_SHARED))
return -EINVAL;
vma_priv = kzalloc(sizeof(*vma_priv) + count * sizeof(void *),
if (!vma_priv)
return -ENOMEM;
- vma_priv->n_pages = count;
- count = 0;
- for (i = 0; i < vma_priv->n_pages; i++) {
+ for (i = 0; i < count; i++) {
vma_priv->pages[i] = alloc_page(GFP_KERNEL | __GFP_ZERO);
if (!vma_priv->pages[i])
break;
- count++;
+ vma_priv->n_pages++;
}
mutex_lock(&file_priv->lock);
- file_priv->allocated += count;
-
vma_priv->file_priv = file_priv;
vma_priv->users = 1;
if (masked_prod < masked_cons) {
vec[0].iov_base = data->in + masked_prod;
vec[0].iov_len = wanted;
- iov_iter_kvec(&msg.msg_iter, ITER_KVEC|WRITE, vec, 1, wanted);
+ iov_iter_kvec(&msg.msg_iter, WRITE, vec, 1, wanted);
} else {
vec[0].iov_base = data->in + masked_prod;
vec[0].iov_len = array_size - masked_prod;
vec[1].iov_base = data->in;
vec[1].iov_len = wanted - vec[0].iov_len;
- iov_iter_kvec(&msg.msg_iter, ITER_KVEC|WRITE, vec, 2, wanted);
+ iov_iter_kvec(&msg.msg_iter, WRITE, vec, 2, wanted);
}
atomic_set(&map->read, 0);
if (pvcalls_mask(prod, array_size) > pvcalls_mask(cons, array_size)) {
vec[0].iov_base = data->out + pvcalls_mask(cons, array_size);
vec[0].iov_len = size;
- iov_iter_kvec(&msg.msg_iter, ITER_KVEC|READ, vec, 1, size);
+ iov_iter_kvec(&msg.msg_iter, READ, vec, 1, size);
} else {
vec[0].iov_base = data->out + pvcalls_mask(cons, array_size);
vec[0].iov_len = array_size - pvcalls_mask(cons, array_size);
vec[1].iov_base = data->out;
vec[1].iov_len = size - vec[0].iov_len;
- iov_iter_kvec(&msg.msg_iter, ITER_KVEC|READ, vec, 2, size);
+ iov_iter_kvec(&msg.msg_iter, READ, vec, 2, size);
}
atomic_set(&map->write, 0);
out_error:
if (*evtchn >= 0)
xenbus_free_evtchn(pvcalls_front_dev, *evtchn);
- kfree(map->active.data.in);
- kfree(map->active.ring);
+ free_pages((unsigned long)map->active.data.in, PVCALLS_RING_ORDER);
+ free_page((unsigned long)map->active.ring);
return ret;
}
#include <asm/xen/hypervisor.h>
#include <xen/xen.h>
+#include <xen/xen-ops.h>
#include <xen/page.h>
#include <xen/interface/xen.h>
#include <xen/interface/memory.h>
if (retval == 0)
return retval;
- iov_iter_bvec(&to, ITER_BVEC | READ, &bvec, 1, PAGE_SIZE);
+ iov_iter_bvec(&to, READ, &bvec, 1, PAGE_SIZE);
retval = p9_client_read(fid, page_offset(page), &to, &err);
if (err) {
bvec.bv_page = page;
bvec.bv_offset = 0;
bvec.bv_len = len;
- iov_iter_bvec(&from, ITER_BVEC | WRITE, &bvec, 1, len);
+ iov_iter_bvec(&from, WRITE, &bvec, 1, len);
/* We should have writeback_fid always set */
BUG_ON(!v9inode->writeback_fid);
if (rdir->tail == rdir->head) {
struct iov_iter to;
int n;
- iov_iter_kvec(&to, READ | ITER_KVEC, &kvec, 1, buflen);
+ iov_iter_kvec(&to, READ, &kvec, 1, buflen);
n = p9_client_read(file->private_data, ctx->pos, &to,
&err);
if (err)
struct iov_iter to;
int err;
- iov_iter_kvec(&to, READ | ITER_KVEC, &kvec, 1, buffer_size);
+ iov_iter_kvec(&to, READ, &kvec, 1, buffer_size);
attr_fid = p9_client_xattrwalk(fid, name, &attr_size);
if (IS_ERR(attr_fid)) {
struct iov_iter from;
int retval, err;
- iov_iter_kvec(&from, WRITE | ITER_KVEC, &kvec, 1, value_len);
+ iov_iter_kvec(&from, WRITE, &kvec, 1, value_len);
p9_debug(P9_DEBUG_VFS, "name = %s value_len = %zu flags = %d\n",
name, value_len, flags);
help
Say Y here if you want AFS data to be cached locally on disk through
the generic filesystem cache manager
+
+config AFS_DEBUG_CURSOR
+ bool "AFS server cursor debugging"
+ depends on AFS_FS
+ help
+ Say Y here to cause the contents of a server cursor to be dumped to
+ the dmesg log if the server rotation algorithm fails to successfully
+ contact a server.
+
+ See <file:Documentation/filesystems/afs.txt> for more information.
+
+ If unsure, say N.
file.o \
flock.o \
fsclient.o \
+ fs_probe.o \
inode.o \
main.o \
misc.o \
super.o \
netdevices.o \
vlclient.o \
+ vl_list.o \
+ vl_probe.o \
+ vl_rotate.o \
volume.o \
write.o \
- xattr.o
+ xattr.o \
+ yfsclient.o
kafs-$(CONFIG_PROC_FS) += proc.o
obj-$(CONFIG_AFS_FS) := kafs.o
/*
* Parse a text string consisting of delimited addresses.
*/
-struct afs_addr_list *afs_parse_text_addrs(const char *text, size_t len,
- char delim,
- unsigned short service,
- unsigned short port)
+struct afs_vlserver_list *afs_parse_text_addrs(struct afs_net *net,
+ const char *text, size_t len,
+ char delim,
+ unsigned short service,
+ unsigned short port)
{
+ struct afs_vlserver_list *vllist;
struct afs_addr_list *alist;
const char *p, *end = text + len;
+ const char *problem;
unsigned int nr = 0;
+ int ret = -ENOMEM;
_enter("%*.*s,%c", (int)len, (int)len, text, delim);
- if (!len)
+ if (!len) {
+ _leave(" = -EDESTADDRREQ [empty]");
return ERR_PTR(-EDESTADDRREQ);
+ }
if (delim == ':' && (memchr(text, ',', len) || !memchr(text, '.', len)))
delim = ',';
/* Count the addresses */
p = text;
do {
- if (!*p)
- return ERR_PTR(-EINVAL);
+ if (!*p) {
+ problem = "nul";
+ goto inval;
+ }
if (*p == delim)
continue;
nr++;
if (*p == '[') {
p++;
- if (p == end)
- return ERR_PTR(-EINVAL);
+ if (p == end) {
+ problem = "brace1";
+ goto inval;
+ }
p = memchr(p, ']', end - p);
- if (!p)
- return ERR_PTR(-EINVAL);
+ if (!p) {
+ problem = "brace2";
+ goto inval;
+ }
p++;
if (p >= end)
break;
_debug("%u/%u addresses", nr, AFS_MAX_ADDRESSES);
- alist = afs_alloc_addrlist(nr, service, port);
- if (!alist)
+ vllist = afs_alloc_vlserver_list(1);
+ if (!vllist)
return ERR_PTR(-ENOMEM);
+ vllist->nr_servers = 1;
+ vllist->servers[0].server = afs_alloc_vlserver("<dummy>", 7, AFS_VL_PORT);
+ if (!vllist->servers[0].server)
+ goto error_vl;
+
+ alist = afs_alloc_addrlist(nr, service, AFS_VL_PORT);
+ if (!alist)
+ goto error;
+
/* Extract the addresses */
p = text;
do {
break;
}
- if (in4_pton(p, q - p, (u8 *)&x[0], -1, &stop))
+ if (in4_pton(p, q - p, (u8 *)&x[0], -1, &stop)) {
family = AF_INET;
- else if (in6_pton(p, q - p, (u8 *)x, -1, &stop))
+ } else if (in6_pton(p, q - p, (u8 *)x, -1, &stop)) {
family = AF_INET6;
- else
+ } else {
+ problem = "family";
goto bad_address;
+ }
- if (stop != q)
+ p = q;
+ if (stop != p) {
+ problem = "nostop";
goto bad_address;
+ }
- p = q;
if (q < end && *q == ']')
p++;
/* Port number specification "+1234" */
xport = 0;
p++;
- if (p >= end || !isdigit(*p))
+ if (p >= end || !isdigit(*p)) {
+ problem = "port";
goto bad_address;
+ }
do {
xport *= 10;
xport += *p - '0';
- if (xport > 65535)
+ if (xport > 65535) {
+ problem = "pval";
goto bad_address;
+ }
p++;
} while (p < end && isdigit(*p));
} else if (*p == delim) {
p++;
} else {
+ problem = "weird";
goto bad_address;
}
}
} while (p < end);
+ rcu_assign_pointer(vllist->servers[0].server->addresses, alist);
_leave(" = [nr %u]", alist->nr_addrs);
- return alist;
+ return vllist;
-bad_address:
- kfree(alist);
+inval:
+ _leave(" = -EINVAL [%s %zu %*.*s]",
+ problem, p - text, (int)len, (int)len, text);
return ERR_PTR(-EINVAL);
+bad_address:
+ _leave(" = -EINVAL [%s %zu %*.*s]",
+ problem, p - text, (int)len, (int)len, text);
+ ret = -EINVAL;
+error:
+ afs_put_addrlist(alist);
+error_vl:
+ afs_put_vlserverlist(net, vllist);
+ return ERR_PTR(ret);
}
/*
/*
* Perform a DNS query for VL servers and build a up an address list.
*/
-struct afs_addr_list *afs_dns_query(struct afs_cell *cell, time64_t *_expiry)
+struct afs_vlserver_list *afs_dns_query(struct afs_cell *cell, time64_t *_expiry)
{
- struct afs_addr_list *alist;
- char *vllist = NULL;
+ struct afs_vlserver_list *vllist;
+ char *result = NULL;
int ret;
_enter("%s", cell->name);
- ret = dns_query("afsdb", cell->name, cell->name_len,
- "", &vllist, _expiry);
- if (ret < 0)
+ ret = dns_query("afsdb", cell->name, cell->name_len, "srv=1",
+ &result, _expiry);
+ if (ret < 0) {
+ _leave(" = %d [dns]", ret);
return ERR_PTR(ret);
-
- alist = afs_parse_text_addrs(vllist, strlen(vllist), ',',
- VL_SERVICE, AFS_VL_PORT);
- if (IS_ERR(alist)) {
- kfree(vllist);
- if (alist != ERR_PTR(-ENOMEM))
- pr_err("Failed to parse DNS data\n");
- return alist;
}
- kfree(vllist);
- return alist;
+ if (*_expiry == 0)
+ *_expiry = ktime_get_real_seconds() + 60;
+
+ if (ret > 1 && result[0] == 0)
+ vllist = afs_extract_vlserver_list(cell, result, ret);
+ else
+ vllist = afs_parse_text_addrs(cell->net, result, ret, ',',
+ VL_SERVICE, AFS_VL_PORT);
+ kfree(result);
+ if (IS_ERR(vllist) && vllist != ERR_PTR(-ENOMEM))
+ pr_err("Failed to parse DNS data %ld\n", PTR_ERR(vllist));
+
+ return vllist;
}
/*
sizeof(alist->addrs[0]) * (alist->nr_addrs - i));
srx = &alist->addrs[i];
+ srx->srx_family = AF_RXRPC;
+ srx->transport_type = SOCK_DGRAM;
srx->transport_len = sizeof(srx->transport.sin);
srx->transport.sin.sin_family = AF_INET;
srx->transport.sin.sin_port = htons(port);
sizeof(alist->addrs[0]) * (alist->nr_addrs - i));
srx = &alist->addrs[i];
+ srx->srx_family = AF_RXRPC;
+ srx->transport_type = SOCK_DGRAM;
srx->transport_len = sizeof(srx->transport.sin6);
srx->transport.sin6.sin6_family = AF_INET6;
srx->transport.sin6.sin6_port = htons(port);
*/
bool afs_iterate_addresses(struct afs_addr_cursor *ac)
{
- _enter("%hu+%hd", ac->start, (short)ac->index);
+ unsigned long set, failed;
+ int index;
if (!ac->alist)
return false;
- if (ac->begun) {
- ac->index++;
- if (ac->index == ac->alist->nr_addrs)
- ac->index = 0;
+ set = ac->alist->responded;
+ failed = ac->alist->failed;
+ _enter("%lx-%lx-%lx,%d", set, failed, ac->tried, ac->index);
- if (ac->index == ac->start) {
- ac->error = -EDESTADDRREQ;
- return false;
- }
- }
+ ac->nr_iterations++;
+
+ set &= ~(failed | ac->tried);
+
+ if (!set)
+ return false;
- ac->begun = true;
+ index = READ_ONCE(ac->alist->preferred);
+ if (test_bit(index, &set))
+ goto selected;
+
+ index = __ffs(set);
+
+selected:
+ ac->index = index;
+ set_bit(index, &ac->tried);
ac->responded = false;
- ac->addr = &ac->alist->addrs[ac->index];
return true;
}
alist = ac->alist;
if (alist) {
- if (ac->responded && ac->index != ac->start)
- WRITE_ONCE(alist->index, ac->index);
+ if (ac->responded &&
+ ac->index != alist->preferred &&
+ test_bit(ac->alist->preferred, &ac->tried))
+ WRITE_ONCE(alist->preferred, ac->index);
afs_put_addrlist(alist);
+ ac->alist = NULL;
}
- ac->addr = NULL;
- ac->alist = NULL;
- ac->begun = false;
return ac->error;
}
-
-/*
- * Set the address cursor for iterating over VL servers.
- */
-int afs_set_vl_cursor(struct afs_addr_cursor *ac, struct afs_cell *cell)
-{
- struct afs_addr_list *alist;
- int ret;
-
- if (!rcu_access_pointer(cell->vl_addrs)) {
- ret = wait_on_bit(&cell->flags, AFS_CELL_FL_NO_LOOKUP_YET,
- TASK_INTERRUPTIBLE);
- if (ret < 0)
- return ret;
-
- if (!rcu_access_pointer(cell->vl_addrs) &&
- ktime_get_real_seconds() < cell->dns_expiry)
- return cell->error;
- }
-
- read_lock(&cell->vl_addrs_lock);
- alist = rcu_dereference_protected(cell->vl_addrs,
- lockdep_is_held(&cell->vl_addrs_lock));
- if (alist->nr_addrs > 0)
- afs_get_addrlist(alist);
- else
- alist = NULL;
- read_unlock(&cell->vl_addrs_lock);
-
- if (!alist)
- return -EDESTADDRREQ;
-
- ac->alist = alist;
- ac->addr = NULL;
- ac->start = READ_ONCE(alist->index);
- ac->index = ac->start;
- ac->error = 0;
- ac->begun = false;
- return 0;
-}
#define AFSPATHMAX 1024 /* Maximum length of a pathname plus NUL */
#define AFSOPAQUEMAX 1024 /* Maximum length of an opaque field */
-typedef unsigned afs_volid_t;
-typedef unsigned afs_vnodeid_t;
-typedef unsigned long long afs_dataversion_t;
+typedef u64 afs_volid_t;
+typedef u64 afs_vnodeid_t;
+typedef u64 afs_dataversion_t;
typedef enum {
AFSVL_RWVOL, /* read/write volume */
*/
struct afs_fid {
afs_volid_t vid; /* volume ID */
- afs_vnodeid_t vnode; /* file index within volume */
- unsigned unique; /* unique ID number (file index version) */
+ afs_vnodeid_t vnode; /* Lower 64-bits of file index within volume */
+ u32 vnode_hi; /* Upper 32-bits of file index */
+ u32 unique; /* unique ID number (file index version) */
};
/*
} afs_callback_type_t;
struct afs_callback {
+ time64_t expires_at; /* Time at which expires */
unsigned version; /* Callback version */
- unsigned expiry; /* Time at which expires */
afs_callback_type_t type; /* Type of callback */
};
struct afs_callback_break {
struct afs_fid fid; /* File identifier */
- struct afs_callback cb; /* Callback details */
+ //struct afs_callback cb; /* Callback details */
};
#define AFSCBMAX 50 /* maximum callbacks transferred per bulk op */
struct afs_file_status {
u64 size; /* file size */
afs_dataversion_t data_version; /* current data version */
- time_t mtime_client; /* last time client changed data */
- time_t mtime_server; /* last time server changed data */
- unsigned abort_code; /* Abort if bulk-fetching this failed */
-
- afs_file_type_t type; /* file type */
- unsigned nlink; /* link count */
- u32 author; /* author ID */
- u32 owner; /* owner ID */
- u32 group; /* group ID */
+ struct timespec64 mtime_client; /* Last time client changed data */
+ struct timespec64 mtime_server; /* Last time server changed data */
+ s64 author; /* author ID */
+ s64 owner; /* owner ID */
+ s64 group; /* group ID */
afs_access_t caller_access; /* access rights for authenticated caller */
afs_access_t anon_access; /* access rights for unauthenticated caller */
umode_t mode; /* UNIX mode */
+ afs_file_type_t type; /* file type */
+ u32 nlink; /* link count */
s32 lock_count; /* file lock count (0=UNLK -1=WRLCK +ve=#RDLCK */
+ u32 abort_code; /* Abort if bulk-fetching this failed */
};
/*
* AFS volume synchronisation information
*/
struct afs_volsync {
- time_t creation; /* volume creation time */
+ time64_t creation; /* volume creation time */
};
/*
* AFS volume status record
*/
struct afs_volume_status {
- u32 vid; /* volume ID */
- u32 parent_id; /* parent volume ID */
+ afs_volid_t vid; /* volume ID */
+ afs_volid_t parent_id; /* parent volume ID */
u8 online; /* true if volume currently online and available */
u8 in_service; /* true if volume currently in service */
u8 blessed; /* same as in_service */
u8 needs_salvage; /* true if consistency checking required */
u32 type; /* volume type (afs_voltype_t) */
- u32 min_quota; /* minimum space set aside (blocks) */
- u32 max_quota; /* maximum space this volume may occupy (blocks) */
- u32 blocks_in_use; /* space this volume currently occupies (blocks) */
- u32 part_blocks_avail; /* space available in volume's partition */
- u32 part_max_blocks; /* size of volume's partition */
+ u64 min_quota; /* minimum space set aside (blocks) */
+ u64 max_quota; /* maximum space this volume may occupy (blocks) */
+ u64 blocks_in_use; /* space this volume currently occupies (blocks) */
+ u64 part_blocks_avail; /* space available in volume's partition */
+ u64 part_max_blocks; /* size of volume's partition */
+ s64 vol_copy_date;
+ s64 vol_backup_date;
};
#define AFS_BLOCK_SIZE 1024
struct afs_vnode *vnode = cookie_netfs_data;
struct afs_vnode_cache_aux aux;
- _enter("{%x,%x,%llx},%p,%u",
+ _enter("{%llx,%x,%llx},%p,%u",
vnode->fid.vnode, vnode->fid.unique, vnode->status.data_version,
buffer, buflen);
/*
* actually break a callback
*/
-void afs_break_callback(struct afs_vnode *vnode)
+void __afs_break_callback(struct afs_vnode *vnode)
{
_enter("");
- write_seqlock(&vnode->cb_lock);
-
clear_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags);
if (test_and_clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
vnode->cb_break++;
afs_lock_may_be_available(vnode);
spin_unlock(&vnode->lock);
}
+}
+void afs_break_callback(struct afs_vnode *vnode)
+{
+ write_seqlock(&vnode->cb_lock);
+ __afs_break_callback(vnode);
write_sequnlock(&vnode->cb_lock);
}
/* TODO: Sort the callback break list by volume ID */
for (; count > 0; callbacks++, count--) {
- _debug("- Fid { vl=%08x n=%u u=%u } CB { v=%u x=%u t=%u }",
+ _debug("- Fid { vl=%08llx n=%llu u=%u }",
callbacks->fid.vid,
callbacks->fid.vnode,
- callbacks->fid.unique,
- callbacks->cb.version,
- callbacks->cb.expiry,
- callbacks->cb.type
- );
+ callbacks->fid.unique);
afs_break_one_callback(server, &callbacks->fid);
}
#include "internal.h"
static unsigned __read_mostly afs_cell_gc_delay = 10;
+static unsigned __read_mostly afs_cell_min_ttl = 10 * 60;
+static unsigned __read_mostly afs_cell_max_ttl = 24 * 60 * 60;
static void afs_manage_cell(struct work_struct *);
*/
static struct afs_cell *afs_alloc_cell(struct afs_net *net,
const char *name, unsigned int namelen,
- const char *vllist)
+ const char *addresses)
{
struct afs_cell *cell;
int i, ret;
if (namelen == 5 && memcmp(name, "@cell", 5) == 0)
return ERR_PTR(-EINVAL);
- _enter("%*.*s,%s", namelen, namelen, name, vllist);
+ _enter("%*.*s,%s", namelen, namelen, name, addresses);
cell = kzalloc(sizeof(struct afs_cell), GFP_KERNEL);
if (!cell) {
(1 << AFS_CELL_FL_NO_LOOKUP_YET));
INIT_LIST_HEAD(&cell->proc_volumes);
rwlock_init(&cell->proc_lock);
- rwlock_init(&cell->vl_addrs_lock);
+ rwlock_init(&cell->vl_servers_lock);
/* Fill in the VL server list if we were given a list of addresses to
* use.
*/
- if (vllist) {
- struct afs_addr_list *alist;
-
- alist = afs_parse_text_addrs(vllist, strlen(vllist), ':',
- VL_SERVICE, AFS_VL_PORT);
- if (IS_ERR(alist)) {
- ret = PTR_ERR(alist);
+ if (addresses) {
+ struct afs_vlserver_list *vllist;
+
+ vllist = afs_parse_text_addrs(net,
+ addresses, strlen(addresses), ':',
+ VL_SERVICE, AFS_VL_PORT);
+ if (IS_ERR(vllist)) {
+ ret = PTR_ERR(vllist);
goto parse_failed;
}
- rcu_assign_pointer(cell->vl_addrs, alist);
+ rcu_assign_pointer(cell->vl_servers, vllist);
cell->dns_expiry = TIME64_MAX;
+ } else {
+ cell->dns_expiry = ktime_get_real_seconds();
}
_leave(" = %p", cell);
*/
static void afs_update_cell(struct afs_cell *cell)
{
- struct afs_addr_list *alist, *old;
- time64_t now, expiry;
+ struct afs_vlserver_list *vllist, *old;
+ unsigned int min_ttl = READ_ONCE(afs_cell_min_ttl);
+ unsigned int max_ttl = READ_ONCE(afs_cell_max_ttl);
+ time64_t now, expiry = 0;
_enter("%s", cell->name);
- alist = afs_dns_query(cell, &expiry);
- if (IS_ERR(alist)) {
- switch (PTR_ERR(alist)) {
+ vllist = afs_dns_query(cell, &expiry);
+
+ now = ktime_get_real_seconds();
+ if (min_ttl > max_ttl)
+ max_ttl = min_ttl;
+ if (expiry < now + min_ttl)
+ expiry = now + min_ttl;
+ else if (expiry > now + max_ttl)
+ expiry = now + max_ttl;
+
+ if (IS_ERR(vllist)) {
+ switch (PTR_ERR(vllist)) {
case -ENODATA:
- /* The DNS said that the cell does not exist */
+ case -EDESTADDRREQ:
+ /* The DNS said that the cell does not exist or there
+ * weren't any addresses to be had.
+ */
set_bit(AFS_CELL_FL_NOT_FOUND, &cell->flags);
clear_bit(AFS_CELL_FL_DNS_FAIL, &cell->flags);
- cell->dns_expiry = ktime_get_real_seconds() + 61;
+ cell->dns_expiry = expiry;
break;
case -EAGAIN:
case -ECONNREFUSED:
default:
set_bit(AFS_CELL_FL_DNS_FAIL, &cell->flags);
- cell->dns_expiry = ktime_get_real_seconds() + 10;
+ cell->dns_expiry = now + 10;
break;
}
/* Exclusion on changing vl_addrs is achieved by a
* non-reentrant work item.
*/
- old = rcu_dereference_protected(cell->vl_addrs, true);
- rcu_assign_pointer(cell->vl_addrs, alist);
+ old = rcu_dereference_protected(cell->vl_servers, true);
+ rcu_assign_pointer(cell->vl_servers, vllist);
cell->dns_expiry = expiry;
if (old)
- afs_put_addrlist(old);
+ afs_put_vlserverlist(cell->net, old);
}
if (test_and_clear_bit(AFS_CELL_FL_NO_LOOKUP_YET, &cell->flags))
ASSERTCMP(atomic_read(&cell->usage), ==, 0);
- afs_put_addrlist(rcu_access_pointer(cell->vl_addrs));
+ afs_put_vlserverlist(cell->net, rcu_access_pointer(cell->vl_servers));
key_put(cell->anonymous_key);
kfree(cell);
#include <linux/ip.h>
#include "internal.h"
#include "afs_cm.h"
+#include "protocol_yfs.h"
static int afs_deliver_cb_init_call_back_state(struct afs_call *);
static int afs_deliver_cb_init_call_back_state3(struct afs_call *);
static void SRXAFSCB_ProbeUuid(struct work_struct *);
static void SRXAFSCB_TellMeAboutYourself(struct work_struct *);
+static int afs_deliver_yfs_cb_callback(struct afs_call *);
+
#define CM_NAME(name) \
const char afs_SRXCB##name##_name[] __tracepoint_string = \
"CB." #name
.work = SRXAFSCB_TellMeAboutYourself,
};
+/*
+ * YFS CB.CallBack operation type
+ */
+static CM_NAME(YFS_CallBack);
+static const struct afs_call_type afs_SRXYFSCB_CallBack = {
+ .name = afs_SRXCBYFS_CallBack_name,
+ .deliver = afs_deliver_yfs_cb_callback,
+ .destructor = afs_cm_destructor,
+ .work = SRXAFSCB_CallBack,
+};
+
/*
* route an incoming cache manager call
* - return T if supported, F if not
*/
bool afs_cm_incoming_call(struct afs_call *call)
{
- _enter("{CB.OP %u}", call->operation_ID);
+ _enter("{%u, CB.OP %u}", call->service_id, call->operation_ID);
+
+ call->epoch = rxrpc_kernel_get_epoch(call->net->socket, call->rxcall);
switch (call->operation_ID) {
case CBCallBack:
case CBTellMeAboutYourself:
call->type = &afs_SRXCBTellMeAboutYourself;
return true;
+ case YFSCBCallBack:
+ if (call->service_id != YFS_CM_SERVICE)
+ return false;
+ call->type = &afs_SRXYFSCB_CallBack;
+ return true;
default:
return false;
}
}
+/*
+ * Record a probe to the cache manager from a server.
+ */
+static int afs_record_cm_probe(struct afs_call *call, struct afs_server *server)
+{
+ _enter("");
+
+ if (test_bit(AFS_SERVER_FL_HAVE_EPOCH, &server->flags) &&
+ !test_bit(AFS_SERVER_FL_PROBING, &server->flags)) {
+ if (server->cm_epoch == call->epoch)
+ return 0;
+
+ if (!server->probe.said_rebooted) {
+ pr_notice("kAFS: FS rebooted %pU\n", &server->uuid);
+ server->probe.said_rebooted = true;
+ }
+ }
+
+ spin_lock(&server->probe_lock);
+
+ if (!test_bit(AFS_SERVER_FL_HAVE_EPOCH, &server->flags)) {
+ server->cm_epoch = call->epoch;
+ server->probe.cm_epoch = call->epoch;
+ goto out;
+ }
+
+ if (server->probe.cm_probed &&
+ call->epoch != server->probe.cm_epoch &&
+ !server->probe.said_inconsistent) {
+ pr_notice("kAFS: FS endpoints inconsistent %pU\n",
+ &server->uuid);
+ server->probe.said_inconsistent = true;
+ }
+
+ if (!server->probe.cm_probed || call->epoch == server->cm_epoch)
+ server->probe.cm_epoch = server->cm_epoch;
+
+out:
+ server->probe.cm_probed = true;
+ spin_unlock(&server->probe_lock);
+ return 0;
+}
+
+/*
+ * Find the server record by peer address and record a probe to the cache
+ * manager from a server.
+ */
+static int afs_find_cm_server_by_peer(struct afs_call *call)
+{
+ struct sockaddr_rxrpc srx;
+ struct afs_server *server;
+
+ rxrpc_kernel_get_peer(call->net->socket, call->rxcall, &srx);
+
+ server = afs_find_server(call->net, &srx);
+ if (!server) {
+ trace_afs_cm_no_server(call, &srx);
+ return 0;
+ }
+
+ call->cm_server = server;
+ return afs_record_cm_probe(call, server);
+}
+
+/*
+ * Find the server record by server UUID and record a probe to the cache
+ * manager from a server.
+ */
+static int afs_find_cm_server_by_uuid(struct afs_call *call,
+ struct afs_uuid *uuid)
+{
+ struct afs_server *server;
+
+ rcu_read_lock();
+ server = afs_find_server_by_uuid(call->net, call->request);
+ rcu_read_unlock();
+ if (!server) {
+ trace_afs_cm_no_server_u(call, call->request);
+ return 0;
+ }
+
+ call->cm_server = server;
+ return afs_record_cm_probe(call, server);
+}
+
/*
* Clean up a cache manager call.
*/
static int afs_deliver_cb_callback(struct afs_call *call)
{
struct afs_callback_break *cb;
- struct sockaddr_rxrpc srx;
__be32 *bp;
int ret, loop;
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
/* extract the FID array and its count in two steps */
case 1:
_debug("extract FID count");
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
call->count = ntohl(call->tmp);
_debug("FID count: %u", call->count);
if (call->count > AFSCBMAX)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_cb_fid_count);
call->buffer = kmalloc(array3_size(call->count, 3, 4),
GFP_KERNEL);
if (!call->buffer)
return -ENOMEM;
- call->offset = 0;
+ afs_extract_to_buf(call, call->count * 3 * 4);
call->unmarshall++;
case 2:
_debug("extract FID array");
- ret = afs_extract_data(call, call->buffer,
- call->count * 3 * 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
cb->fid.vid = ntohl(*bp++);
cb->fid.vnode = ntohl(*bp++);
cb->fid.unique = ntohl(*bp++);
- cb->cb.type = AFSCM_CB_UNTYPED;
}
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
/* extract the callback array and its count in two steps */
case 3:
_debug("extract CB count");
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
call->count2 = ntohl(call->tmp);
_debug("CB count: %u", call->count2);
if (call->count2 != call->count && call->count2 != 0)
- return afs_protocol_error(call, -EBADMSG);
- call->offset = 0;
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_cb_count);
+ call->_iter = &call->iter;
+ iov_iter_discard(&call->iter, READ, call->count2 * 3 * 4);
call->unmarshall++;
case 4:
- _debug("extract CB array");
- ret = afs_extract_data(call, call->buffer,
- call->count2 * 3 * 4, false);
+ _debug("extract discard %zu/%u",
+ iov_iter_count(&call->iter), call->count2 * 3 * 4);
+
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
- _debug("unmarshall CB array");
- cb = call->request;
- bp = call->buffer;
- for (loop = call->count2; loop > 0; loop--, cb++) {
- cb->cb.version = ntohl(*bp++);
- cb->cb.expiry = ntohl(*bp++);
- cb->cb.type = ntohl(*bp++);
- }
-
- call->offset = 0;
call->unmarshall++;
case 5:
break;
}
if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
- return -EIO;
+ return afs_io_error(call, afs_io_error_cm_reply);
/* we'll need the file server record as that tells us which set of
* vnodes to operate upon */
- rxrpc_kernel_get_peer(call->net->socket, call->rxcall, &srx);
- call->cm_server = afs_find_server(call->net, &srx);
- if (!call->cm_server)
- trace_afs_cm_no_server(call, &srx);
-
- return afs_queue_call_work(call);
+ return afs_find_cm_server_by_peer(call);
}
/*
*/
static int afs_deliver_cb_init_call_back_state(struct afs_call *call)
{
- struct sockaddr_rxrpc srx;
int ret;
_enter("");
- rxrpc_kernel_get_peer(call->net->socket, call->rxcall, &srx);
-
- ret = afs_extract_data(call, NULL, 0, false);
+ afs_extract_discard(call, 0);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
/* we'll need the file server record as that tells us which set of
* vnodes to operate upon */
- call->cm_server = afs_find_server(call->net, &srx);
- if (!call->cm_server)
- trace_afs_cm_no_server(call, &srx);
-
- return afs_queue_call_work(call);
+ return afs_find_cm_server_by_peer(call);
}
/*
switch (call->unmarshall) {
case 0:
- call->offset = 0;
call->buffer = kmalloc_array(11, sizeof(__be32), GFP_KERNEL);
if (!call->buffer)
return -ENOMEM;
+ afs_extract_to_buf(call, 11 * sizeof(__be32));
call->unmarshall++;
case 1:
_debug("extract UUID");
- ret = afs_extract_data(call, call->buffer,
- 11 * sizeof(__be32), false);
+ ret = afs_extract_data(call, false);
switch (ret) {
case 0: break;
case -EAGAIN: return 0;
for (loop = 0; loop < 6; loop++)
r->node[loop] = ntohl(b[loop + 5]);
- call->offset = 0;
call->unmarshall++;
case 2:
}
if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
- return -EIO;
+ return afs_io_error(call, afs_io_error_cm_reply);
/* we'll need the file server record as that tells us which set of
* vnodes to operate upon */
- rcu_read_lock();
- call->cm_server = afs_find_server_by_uuid(call->net, call->request);
- rcu_read_unlock();
- if (!call->cm_server)
- trace_afs_cm_no_server_u(call, call->request);
-
- return afs_queue_call_work(call);
+ return afs_find_cm_server_by_uuid(call, call->request);
}
/*
_enter("");
- ret = afs_extract_data(call, NULL, 0, false);
+ afs_extract_discard(call, 0);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
- return -EIO;
-
- return afs_queue_call_work(call);
+ return afs_io_error(call, afs_io_error_cm_reply);
+ return afs_find_cm_server_by_peer(call);
}
/*
switch (call->unmarshall) {
case 0:
- call->offset = 0;
call->buffer = kmalloc_array(11, sizeof(__be32), GFP_KERNEL);
if (!call->buffer)
return -ENOMEM;
+ afs_extract_to_buf(call, 11 * sizeof(__be32));
call->unmarshall++;
case 1:
_debug("extract UUID");
- ret = afs_extract_data(call, call->buffer,
- 11 * sizeof(__be32), false);
+ ret = afs_extract_data(call, false);
switch (ret) {
case 0: break;
case -EAGAIN: return 0;
for (loop = 0; loop < 6; loop++)
r->node[loop] = ntohl(b[loop + 5]);
- call->offset = 0;
call->unmarshall++;
case 2:
}
if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
- return -EIO;
-
- return afs_queue_call_work(call);
+ return afs_io_error(call, afs_io_error_cm_reply);
+ return afs_find_cm_server_by_uuid(call, call->request);
}
/*
_enter("");
- ret = afs_extract_data(call, NULL, 0, false);
+ afs_extract_discard(call, 0);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
- return -EIO;
+ return afs_io_error(call, afs_io_error_cm_reply);
+ return afs_find_cm_server_by_peer(call);
+}
+
+/*
+ * deliver request data to a YFS CB.CallBack call
+ */
+static int afs_deliver_yfs_cb_callback(struct afs_call *call)
+{
+ struct afs_callback_break *cb;
+ struct yfs_xdr_YFSFid *bp;
+ size_t size;
+ int ret, loop;
+
+ _enter("{%u}", call->unmarshall);
+
+ switch (call->unmarshall) {
+ case 0:
+ afs_extract_to_tmp(call);
+ call->unmarshall++;
+
+ /* extract the FID array and its count in two steps */
+ case 1:
+ _debug("extract FID count");
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ call->count = ntohl(call->tmp);
+ _debug("FID count: %u", call->count);
+ if (call->count > YFSCBMAX)
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_cb_fid_count);
+
+ size = array_size(call->count, sizeof(struct yfs_xdr_YFSFid));
+ call->buffer = kmalloc(size, GFP_KERNEL);
+ if (!call->buffer)
+ return -ENOMEM;
+ afs_extract_to_buf(call, size);
+ call->unmarshall++;
+
+ case 2:
+ _debug("extract FID array");
+ ret = afs_extract_data(call, false);
+ if (ret < 0)
+ return ret;
+
+ _debug("unmarshall FID array");
+ call->request = kcalloc(call->count,
+ sizeof(struct afs_callback_break),
+ GFP_KERNEL);
+ if (!call->request)
+ return -ENOMEM;
+
+ cb = call->request;
+ bp = call->buffer;
+ for (loop = call->count; loop > 0; loop--, cb++) {
+ cb->fid.vid = xdr_to_u64(bp->volume);
+ cb->fid.vnode = xdr_to_u64(bp->vnode.lo);
+ cb->fid.vnode_hi = ntohl(bp->vnode.hi);
+ cb->fid.unique = ntohl(bp->vnode.unique);
+ bp++;
+ }
+
+ afs_extract_to_tmp(call);
+ call->unmarshall++;
+
+ case 3:
+ break;
+ }
+
+ if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
+ return afs_io_error(call, afs_io_error_cm_reply);
- return afs_queue_call_work(call);
+ /* We'll need the file server record as that tells us which set of
+ * vnodes to operate upon.
+ */
+ return afs_find_cm_server_by_peer(call);
}
ntohs(dbuf->blocks[tmp].hdr.magic));
trace_afs_dir_check_failed(dvnode, off, i_size);
kunmap(page);
+ trace_afs_file_error(dvnode, -EIO, afs_file_error_dir_bad_magic);
goto error;
}
retry:
i_size = i_size_read(&dvnode->vfs_inode);
if (i_size < 2048)
- return ERR_PTR(-EIO);
- if (i_size > 2048 * 1024)
+ return ERR_PTR(afs_bad(dvnode, afs_file_error_dir_small));
+ if (i_size > 2048 * 1024) {
+ trace_afs_file_error(dvnode, -EFBIG, afs_file_error_dir_big);
return ERR_PTR(-EFBIG);
+ }
_enter("%llu", i_size);
/*
* deal with one block in an AFS directory
*/
-static int afs_dir_iterate_block(struct dir_context *ctx,
+static int afs_dir_iterate_block(struct afs_vnode *dvnode,
+ struct dir_context *ctx,
union afs_xdr_dir_block *block,
unsigned blkoff)
{
" (len %u/%zu)",
blkoff / sizeof(union afs_xdr_dir_block),
offset, next, tmp, nlen);
- return -EIO;
+ return afs_bad(dvnode, afs_file_error_dir_over_end);
}
if (!(block->hdr.bitmap[next / 8] &
(1 << (next % 8)))) {
" %u unmarked extension (len %u/%zu)",
blkoff / sizeof(union afs_xdr_dir_block),
offset, next, tmp, nlen);
- return -EIO;
+ return afs_bad(dvnode, afs_file_error_dir_unmarked_ext);
}
_debug("ENT[%zu.%u]: ext %u/%zu",
*/
page = req->pages[blkoff / PAGE_SIZE];
if (!page) {
- ret = -EIO;
+ ret = afs_bad(dvnode, afs_file_error_dir_missing_page);
break;
}
mark_page_accessed(page);
do {
dblock = &dbuf->blocks[(blkoff % PAGE_SIZE) /
sizeof(union afs_xdr_dir_block)];
- ret = afs_dir_iterate_block(ctx, dblock, blkoff);
+ ret = afs_dir_iterate_block(dvnode, ctx, dblock, blkoff);
if (ret != 1) {
kunmap(page);
goto out;
}
*fid = cookie.fid;
- _leave(" = 0 { vn=%u u=%u }", fid->vnode, fid->unique);
+ _leave(" = 0 { vn=%llu u=%u }", fid->vnode, fid->unique);
return 0;
}
struct key *key;
int ret;
- _enter("{%x:%u},%p{%pd},",
+ _enter("{%llx:%llu},%p{%pd},",
dvnode->fid.vid, dvnode->fid.vnode, dentry, dentry);
ASSERTCMP(d_inode(dentry), ==, NULL);
if (d_really_is_positive(dentry)) {
vnode = AFS_FS_I(d_inode(dentry));
- _enter("{v={%x:%u} n=%pd fl=%lx},",
+ _enter("{v={%llx:%llu} n=%pd fl=%lx},",
vnode->fid.vid, vnode->fid.vnode, dentry,
vnode->flags);
} else {
/* if the vnode ID has changed, then the dirent points to a
* different file */
if (fid.vnode != vnode->fid.vnode) {
- _debug("%pd: dirent changed [%u != %u]",
+ _debug("%pd: dirent changed [%llu != %llu]",
dentry, fid.vnode,
vnode->fid.vnode);
goto not_found;
if (fc->ac.error < 0)
return;
- d_drop(new_dentry);
-
inode = afs_iget(fc->vnode->vfs_inode.i_sb, fc->key,
newfid, newstatus, newcb, fc->cbi);
if (IS_ERR(inode)) {
vnode = AFS_FS_I(inode);
set_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags);
- d_add(new_dentry, inode);
+ afs_vnode_commit_status(fc, vnode, 0);
+ d_instantiate(new_dentry, inode);
}
/*
mode |= S_IFDIR;
- _enter("{%x:%u},{%pd},%ho",
+ _enter("{%llx:%llu},{%pd},%ho",
dvnode->fid.vid, dvnode->fid.vnode, dentry, mode);
key = afs_request_key(dvnode->volume->cell);
static int afs_rmdir(struct inode *dir, struct dentry *dentry)
{
struct afs_fs_cursor fc;
- struct afs_vnode *dvnode = AFS_FS_I(dir);
+ struct afs_vnode *dvnode = AFS_FS_I(dir), *vnode = NULL;
struct key *key;
u64 data_version = dvnode->status.data_version;
int ret;
- _enter("{%x:%u},{%pd}",
+ _enter("{%llx:%llu},{%pd}",
dvnode->fid.vid, dvnode->fid.vnode, dentry);
key = afs_request_key(dvnode->volume->cell);
goto error;
}
+ /* Try to make sure we have a callback promise on the victim. */
+ if (d_really_is_positive(dentry)) {
+ vnode = AFS_FS_I(d_inode(dentry));
+ ret = afs_validate(vnode, key);
+ if (ret < 0)
+ goto error_key;
+ }
+
ret = -ERESTARTSYS;
if (afs_begin_vnode_operation(&fc, dvnode, key)) {
while (afs_select_fileserver(&fc)) {
fc.cb_break = afs_calc_vnode_cb_break(dvnode);
- afs_fs_remove(&fc, dentry->d_name.name, true,
+ afs_fs_remove(&fc, vnode, dentry->d_name.name, true,
data_version);
}
}
}
+error_key:
key_put(key);
error:
return ret;
if (d_really_is_positive(dentry)) {
struct afs_vnode *vnode = AFS_FS_I(d_inode(dentry));
- if (dir_valid) {
+ if (test_bit(AFS_VNODE_DELETED, &vnode->flags)) {
+ /* Already done */
+ } else if (dir_valid) {
drop_nlink(&vnode->vfs_inode);
if (vnode->vfs_inode.i_nlink == 0) {
set_bit(AFS_VNODE_DELETED, &vnode->flags);
static int afs_unlink(struct inode *dir, struct dentry *dentry)
{
struct afs_fs_cursor fc;
- struct afs_vnode *dvnode = AFS_FS_I(dir), *vnode;
+ struct afs_vnode *dvnode = AFS_FS_I(dir), *vnode = NULL;
struct key *key;
unsigned long d_version = (unsigned long)dentry->d_fsdata;
u64 data_version = dvnode->status.data_version;
int ret;
- _enter("{%x:%u},{%pd}",
+ _enter("{%llx:%llu},{%pd}",
dvnode->fid.vid, dvnode->fid.vnode, dentry);
if (dentry->d_name.len >= AFSNAMEMAX)
if (afs_begin_vnode_operation(&fc, dvnode, key)) {
while (afs_select_fileserver(&fc)) {
fc.cb_break = afs_calc_vnode_cb_break(dvnode);
- afs_fs_remove(&fc, dentry->d_name.name, false,
+
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc.cbi->server->flags) &&
+ !test_bit(AFS_SERVER_FL_NO_RM2, &fc.cbi->server->flags)) {
+ yfs_fs_remove_file2(&fc, vnode, dentry->d_name.name,
+ data_version);
+ if (fc.ac.error != -ECONNABORTED ||
+ fc.ac.abort_code != RXGEN_OPCODE)
+ continue;
+ set_bit(AFS_SERVER_FL_NO_RM2, &fc.cbi->server->flags);
+ }
+
+ afs_fs_remove(&fc, vnode, dentry->d_name.name, false,
data_version);
}
mode |= S_IFREG;
- _enter("{%x:%u},{%pd},%ho,",
+ _enter("{%llx:%llu},{%pd},%ho,",
dvnode->fid.vid, dvnode->fid.vnode, dentry, mode);
ret = -ENAMETOOLONG;
dvnode = AFS_FS_I(dir);
data_version = dvnode->status.data_version;
- _enter("{%x:%u},{%x:%u},{%pd}",
+ _enter("{%llx:%llu},{%llx:%llu},{%pd}",
vnode->fid.vid, vnode->fid.vnode,
dvnode->fid.vid, dvnode->fid.vnode,
dentry);
u64 data_version = dvnode->status.data_version;
int ret;
- _enter("{%x:%u},{%pd},%s",
+ _enter("{%llx:%llu},{%pd},%s",
dvnode->fid.vid, dvnode->fid.vnode, dentry,
content);
orig_data_version = orig_dvnode->status.data_version;
new_data_version = new_dvnode->status.data_version;
- _enter("{%x:%u},{%x:%u},{%x:%u},{%pd}",
+ _enter("{%llx:%llu},{%llx:%llu},{%llx:%llu},{%pd}",
orig_dvnode->fid.vid, orig_dvnode->fid.vnode,
vnode->fid.vid, vnode->fid.vnode,
new_dvnode->fid.vid, new_dvnode->fid.vnode,
{
struct afs_vnode *dvnode = AFS_FS_I(page->mapping->host);
- _enter("{{%x:%u}[%lu]}", dvnode->fid.vid, dvnode->fid.vnode, page->index);
+ _enter("{{%llx:%llu}[%lu]}", dvnode->fid.vid, dvnode->fid.vnode, page->index);
set_page_private(page, 0);
ClearPagePrivate(page);
return 0;
}
- ret = dns_query("afsdb", name, len, "", NULL, NULL);
+ ret = dns_query("afsdb", name, len, "srv=1", NULL, NULL);
if (ret == -ENODATA)
ret = -EDESTADDRREQ;
return ret;
struct inode *inode;
int ret = -ENOENT;
- _enter("%p{%pd}, {%x:%u}",
+ _enter("%p{%pd}, {%llx:%llu}",
dentry, dentry, vnode->fid.vid, vnode->fid.vnode);
if (!test_bit(AFS_VNODE_AUTOCELL, &vnode->flags))
struct key *key;
int ret;
- _enter("{%x:%u},", vnode->fid.vid, vnode->fid.vnode);
+ _enter("{%llx:%llu},", vnode->fid.vid, vnode->fid.vnode);
key = afs_request_key(vnode->volume->cell);
if (IS_ERR(key)) {
struct afs_vnode *vnode = AFS_FS_I(inode);
struct afs_file *af = file->private_data;
- _enter("{%x:%u},", vnode->fid.vid, vnode->fid.vnode);
+ _enter("{%llx:%llu},", vnode->fid.vid, vnode->fid.vnode);
if ((file->f_mode & FMODE_WRITE))
return vfs_fsync(file, 0);
struct afs_fs_cursor fc;
int ret;
- _enter("%s{%x:%u.%u},%x,,,",
+ _enter("%s{%llx:%llu.%u},%x,,,",
vnode->volume->name,
vnode->fid.vid,
vnode->fid.vnode,
struct afs_vnode *vnode = AFS_FS_I(page->mapping->host);
unsigned long priv;
- _enter("{{%x:%u}[%lu],%lx},%x",
+ _enter("{{%llx:%llu}[%lu],%lx},%x",
vnode->fid.vid, vnode->fid.vnode, page->index, page->flags,
gfp_flags);
*/
void afs_lock_may_be_available(struct afs_vnode *vnode)
{
- _enter("{%x:%u}", vnode->fid.vid, vnode->fid.vnode);
+ _enter("{%llx:%llu}", vnode->fid.vid, vnode->fid.vnode);
queue_delayed_work(afs_lock_manager, &vnode->lock_work, 0);
}
struct afs_fs_cursor fc;
int ret;
- _enter("%s{%x:%u.%u},%x,%u",
+ _enter("%s{%llx:%llu.%u},%x,%u",
vnode->volume->name,
vnode->fid.vid,
vnode->fid.vnode,
struct afs_fs_cursor fc;
int ret;
- _enter("%s{%x:%u.%u},%x",
+ _enter("%s{%llx:%llu.%u},%x",
vnode->volume->name,
vnode->fid.vid,
vnode->fid.vnode,
struct afs_fs_cursor fc;
int ret;
- _enter("%s{%x:%u.%u},%x",
+ _enter("%s{%llx:%llu.%u},%x",
vnode->volume->name,
vnode->fid.vid,
vnode->fid.vnode,
struct key *key;
int ret;
- _enter("{%x:%u}", vnode->fid.vid, vnode->fid.vnode);
+ _enter("{%llx:%llu}", vnode->fid.vid, vnode->fid.vnode);
spin_lock(&vnode->lock);
ret = afs_release_lock(vnode, vnode->lock_key);
if (ret < 0)
printk(KERN_WARNING "AFS:"
- " Failed to release lock on {%x:%x} error %d\n",
+ " Failed to release lock on {%llx:%llx} error %d\n",
vnode->fid.vid, vnode->fid.vnode, ret);
spin_lock(&vnode->lock);
key_put(key);
if (ret < 0)
- pr_warning("AFS: Failed to extend lock on {%x:%x} error %d\n",
+ pr_warning("AFS: Failed to extend lock on {%llx:%llx} error %d\n",
vnode->fid.vid, vnode->fid.vnode, ret);
spin_lock(&vnode->lock);
struct key *key = afs_file_key(file);
int ret;
- _enter("{%x:%u},%u", vnode->fid.vid, vnode->fid.vnode, fl->fl_type);
+ _enter("{%llx:%llu},%u", vnode->fid.vid, vnode->fid.vnode, fl->fl_type);
/* only whole-file locks are supported */
if (fl->fl_start != 0 || fl->fl_end != OFFSET_MAX)
struct afs_vnode *vnode = AFS_FS_I(locks_inode(file));
int ret;
- _enter("{%x:%u},%u", vnode->fid.vid, vnode->fid.vnode, fl->fl_type);
+ _enter("{%llx:%llu},%u", vnode->fid.vid, vnode->fid.vnode, fl->fl_type);
/* Flush all pending writes before doing anything with locks. */
vfs_fsync(file, 0);
{
struct afs_vnode *vnode = AFS_FS_I(locks_inode(file));
- _enter("{%x:%u},%d,{t=%x,fl=%x,r=%Ld:%Ld}",
+ _enter("{%llx:%llu},%d,{t=%x,fl=%x,r=%Ld:%Ld}",
vnode->fid.vid, vnode->fid.vnode, cmd,
fl->fl_type, fl->fl_flags,
(long long) fl->fl_start, (long long) fl->fl_end);
{
struct afs_vnode *vnode = AFS_FS_I(locks_inode(file));
- _enter("{%x:%u},%d,{t=%x,fl=%x}",
+ _enter("{%llx:%llu},%d,{t=%x,fl=%x}",
vnode->fid.vid, vnode->fid.vnode, cmd,
fl->fl_type, fl->fl_flags);
--- /dev/null
+/* AFS fileserver probing
+ *
+ * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include "afs_fs.h"
+#include "internal.h"
+#include "protocol_yfs.h"
+
+static bool afs_fs_probe_done(struct afs_server *server)
+{
+ if (!atomic_dec_and_test(&server->probe_outstanding))
+ return false;
+
+ wake_up_var(&server->probe_outstanding);
+ clear_bit_unlock(AFS_SERVER_FL_PROBING, &server->flags);
+ wake_up_bit(&server->flags, AFS_SERVER_FL_PROBING);
+ return true;
+}
+
+/*
+ * Process the result of probing a fileserver. This is called after successful
+ * or failed delivery of an FS.GetCapabilities operation.
+ */
+void afs_fileserver_probe_result(struct afs_call *call)
+{
+ struct afs_addr_list *alist = call->alist;
+ struct afs_server *server = call->reply[0];
+ unsigned int server_index = (long)call->reply[1];
+ unsigned int index = call->addr_ix;
+ unsigned int rtt = UINT_MAX;
+ bool have_result = false;
+ u64 _rtt;
+ int ret = call->error;
+
+ _enter("%pU,%u", &server->uuid, index);
+
+ spin_lock(&server->probe_lock);
+
+ switch (ret) {
+ case 0:
+ server->probe.error = 0;
+ goto responded;
+ case -ECONNABORTED:
+ if (!server->probe.responded) {
+ server->probe.abort_code = call->abort_code;
+ server->probe.error = ret;
+ }
+ goto responded;
+ case -ENOMEM:
+ case -ENONET:
+ server->probe.local_failure = true;
+ afs_io_error(call, afs_io_error_fs_probe_fail);
+ goto out;
+ case -ECONNRESET: /* Responded, but call expired. */
+ case -ERFKILL:
+ case -EADDRNOTAVAIL:
+ case -ENETUNREACH:
+ case -EHOSTUNREACH:
+ case -EHOSTDOWN:
+ case -ECONNREFUSED:
+ case -ETIMEDOUT:
+ case -ETIME:
+ default:
+ clear_bit(index, &alist->responded);
+ set_bit(index, &alist->failed);
+ if (!server->probe.responded &&
+ (server->probe.error == 0 ||
+ server->probe.error == -ETIMEDOUT ||
+ server->probe.error == -ETIME))
+ server->probe.error = ret;
+ afs_io_error(call, afs_io_error_fs_probe_fail);
+ goto out;
+ }
+
+responded:
+ set_bit(index, &alist->responded);
+ clear_bit(index, &alist->failed);
+
+ if (call->service_id == YFS_FS_SERVICE) {
+ server->probe.is_yfs = true;
+ set_bit(AFS_SERVER_FL_IS_YFS, &server->flags);
+ alist->addrs[index].srx_service = call->service_id;
+ } else {
+ server->probe.not_yfs = true;
+ if (!server->probe.is_yfs) {
+ clear_bit(AFS_SERVER_FL_IS_YFS, &server->flags);
+ alist->addrs[index].srx_service = call->service_id;
+ }
+ }
+
+ /* Get the RTT and scale it to fit into a 32-bit value that represents
+ * over a minute of time so that we can access it with one instruction
+ * on a 32-bit system.
+ */
+ _rtt = rxrpc_kernel_get_rtt(call->net->socket, call->rxcall);
+ _rtt /= 64;
+ rtt = (_rtt > UINT_MAX) ? UINT_MAX : _rtt;
+ if (rtt < server->probe.rtt) {
+ server->probe.rtt = rtt;
+ alist->preferred = index;
+ have_result = true;
+ }
+
+ smp_wmb(); /* Set rtt before responded. */
+ server->probe.responded = true;
+ set_bit(AFS_SERVER_FL_PROBED, &server->flags);
+out:
+ spin_unlock(&server->probe_lock);
+
+ _debug("probe [%u][%u] %pISpc rtt=%u ret=%d",
+ server_index, index, &alist->addrs[index].transport,
+ (unsigned int)rtt, ret);
+
+ have_result |= afs_fs_probe_done(server);
+ if (have_result) {
+ server->probe.have_result = true;
+ wake_up_var(&server->probe.have_result);
+ wake_up_all(&server->probe_wq);
+ }
+}
+
+/*
+ * Probe all of a fileserver's addresses to find out the best route and to
+ * query its capabilities.
+ */
+static int afs_do_probe_fileserver(struct afs_net *net,
+ struct afs_server *server,
+ struct key *key,
+ unsigned int server_index,
+ struct afs_error *_e)
+{
+ struct afs_addr_cursor ac = {
+ .index = 0,
+ };
+ bool in_progress = false;
+ int err;
+
+ _enter("%pU", &server->uuid);
+
+ read_lock(&server->fs_lock);
+ ac.alist = rcu_dereference_protected(server->addresses,
+ lockdep_is_held(&server->fs_lock));
+ read_unlock(&server->fs_lock);
+
+ atomic_set(&server->probe_outstanding, ac.alist->nr_addrs);
+ memset(&server->probe, 0, sizeof(server->probe));
+ server->probe.rtt = UINT_MAX;
+
+ for (ac.index = 0; ac.index < ac.alist->nr_addrs; ac.index++) {
+ err = afs_fs_get_capabilities(net, server, &ac, key, server_index,
+ true);
+ if (err == -EINPROGRESS)
+ in_progress = true;
+ else
+ afs_prioritise_error(_e, err, ac.abort_code);
+ }
+
+ if (!in_progress)
+ afs_fs_probe_done(server);
+ return in_progress;
+}
+
+/*
+ * Send off probes to all unprobed servers.
+ */
+int afs_probe_fileservers(struct afs_net *net, struct key *key,
+ struct afs_server_list *list)
+{
+ struct afs_server *server;
+ struct afs_error e;
+ bool in_progress = false;
+ int i;
+
+ e.error = 0;
+ e.responded = false;
+ for (i = 0; i < list->nr_servers; i++) {
+ server = list->servers[i].server;
+ if (test_bit(AFS_SERVER_FL_PROBED, &server->flags))
+ continue;
+
+ if (!test_and_set_bit_lock(AFS_SERVER_FL_PROBING, &server->flags) &&
+ afs_do_probe_fileserver(net, server, key, i, &e))
+ in_progress = true;
+ }
+
+ return in_progress ? 0 : e.error;
+}
+
+/*
+ * Wait for the first as-yet untried fileserver to respond.
+ */
+int afs_wait_for_fs_probes(struct afs_server_list *slist, unsigned long untried)
+{
+ struct wait_queue_entry *waits;
+ struct afs_server *server;
+ unsigned int rtt = UINT_MAX;
+ bool have_responders = false;
+ int pref = -1, i;
+
+ _enter("%u,%lx", slist->nr_servers, untried);
+
+ /* Only wait for servers that have a probe outstanding. */
+ for (i = 0; i < slist->nr_servers; i++) {
+ if (test_bit(i, &untried)) {
+ server = slist->servers[i].server;
+ if (!test_bit(AFS_SERVER_FL_PROBING, &server->flags))
+ __clear_bit(i, &untried);
+ if (server->probe.responded)
+ have_responders = true;
+ }
+ }
+ if (have_responders || !untried)
+ return 0;
+
+ waits = kmalloc(array_size(slist->nr_servers, sizeof(*waits)), GFP_KERNEL);
+ if (!waits)
+ return -ENOMEM;
+
+ for (i = 0; i < slist->nr_servers; i++) {
+ if (test_bit(i, &untried)) {
+ server = slist->servers[i].server;
+ init_waitqueue_entry(&waits[i], current);
+ add_wait_queue(&server->probe_wq, &waits[i]);
+ }
+ }
+
+ for (;;) {
+ bool still_probing = false;
+
+ set_current_state(TASK_INTERRUPTIBLE);
+ for (i = 0; i < slist->nr_servers; i++) {
+ if (test_bit(i, &untried)) {
+ server = slist->servers[i].server;
+ if (server->probe.responded)
+ goto stop;
+ if (test_bit(AFS_SERVER_FL_PROBING, &server->flags))
+ still_probing = true;
+ }
+ }
+
+ if (!still_probing || unlikely(signal_pending(current)))
+ goto stop;
+ schedule();
+ }
+
+stop:
+ set_current_state(TASK_RUNNING);
+
+ for (i = 0; i < slist->nr_servers; i++) {
+ if (test_bit(i, &untried)) {
+ server = slist->servers[i].server;
+ if (server->probe.responded &&
+ server->probe.rtt < rtt) {
+ pref = i;
+ rtt = server->probe.rtt;
+ }
+
+ remove_wait_queue(&server->probe_wq, &waits[i]);
+ }
+ }
+
+ kfree(waits);
+
+ if (pref == -1 && signal_pending(current))
+ return -ERESTARTSYS;
+
+ if (pref >= 0)
+ slist->preferred = pref;
+ return 0;
+}
#include "internal.h"
#include "afs_fs.h"
#include "xdr_fs.h"
+#include "protocol_yfs.h"
static const struct afs_fid afs_zero_fid;
-/*
- * We need somewhere to discard into in case the server helpfully returns more
- * than we asked for in FS.FetchData{,64}.
- */
-static u8 afs_discard_buffer[64];
-
static inline void afs_use_fs_server(struct afs_call *call, struct afs_cb_interest *cbi)
{
call->cbi = afs_get_cb_interest(cbi);
struct timespec64 t;
umode_t mode;
- t.tv_sec = status->mtime_client;
- t.tv_nsec = 0;
+ t = status->mtime_client;
vnode->vfs_inode.i_ctime = t;
vnode->vfs_inode.i_mtime = t;
vnode->vfs_inode.i_atime = t;
if (!(flags & AFS_VNODE_NOT_YET_SET)) {
if (expected_version &&
*expected_version != status->data_version) {
- _debug("vnode modified %llx on {%x:%u} [exp %llx]",
+ _debug("vnode modified %llx on {%llx:%llu} [exp %llx]",
(unsigned long long) status->data_version,
vnode->fid.vid, vnode->fid.vnode,
(unsigned long long) *expected_version);
if (type != status->type &&
vnode &&
!test_bit(AFS_VNODE_UNSET, &vnode->flags)) {
- pr_warning("Vnode %x:%x:%x changed type %u to %u\n",
+ pr_warning("Vnode %llx:%llx:%x changed type %u to %u\n",
vnode->fid.vid,
vnode->fid.vnode,
vnode->fid.unique,
EXTRACT_M(mode);
EXTRACT_M(group);
- status->mtime_client = ntohl(xdr->mtime_client);
- status->mtime_server = ntohl(xdr->mtime_server);
+ status->mtime_client.tv_sec = ntohl(xdr->mtime_client);
+ status->mtime_client.tv_nsec = 0;
+ status->mtime_server.tv_sec = ntohl(xdr->mtime_server);
+ status->mtime_server.tv_nsec = 0;
status->lock_count = ntohl(xdr->lock_count);
size = (u64)ntohl(xdr->size_lo);
bad:
xdr_dump_bad(*_bp);
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG, afs_eproto_bad_status);
}
/*
write_seqlock(&vnode->cb_lock);
- if (call->cb_break == afs_cb_break_sum(vnode, cbi)) {
+ if (!afs_cb_is_broken(call->cb_break, vnode, cbi)) {
vnode->cb_version = ntohl(*bp++);
cb_expiry = ntohl(*bp++);
vnode->cb_type = ntohl(*bp++);
*_bp = bp;
}
-static void xdr_decode_AFSCallBack_raw(const __be32 **_bp,
+static ktime_t xdr_decode_expiry(struct afs_call *call, u32 expiry)
+{
+ return ktime_add_ns(call->reply_time, expiry * NSEC_PER_SEC);
+}
+
+static void xdr_decode_AFSCallBack_raw(struct afs_call *call,
+ const __be32 **_bp,
struct afs_callback *cb)
{
const __be32 *bp = *_bp;
cb->version = ntohl(*bp++);
- cb->expiry = ntohl(*bp++);
+ cb->expires_at = xdr_decode_expiry(call, ntohl(*bp++));
cb->type = ntohl(*bp++);
*_bp = bp;
}
struct afs_volsync *volsync)
{
const __be32 *bp = *_bp;
+ u32 creation;
- volsync->creation = ntohl(*bp++);
+ creation = ntohl(*bp++);
bp++; /* spare2 */
bp++; /* spare3 */
bp++; /* spare4 */
bp++; /* spare5 */
bp++; /* spare6 */
*_bp = bp;
+
+ if (volsync)
+ volsync->creation = creation;
}
/*
vs->blocks_in_use = ntohl(*bp++);
vs->part_blocks_avail = ntohl(*bp++);
vs->part_max_blocks = ntohl(*bp++);
+ vs->vol_copy_date = 0;
+ vs->vol_backup_date = 0;
*_bp = bp;
}
if (ret < 0)
return ret;
- _enter("{%x:%u}", vnode->fid.vid, vnode->fid.vnode);
+ _enter("{%llx:%llu}", vnode->fid.vid, vnode->fid.vnode);
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
xdr_decode_AFSCallBack(call, vnode, &bp);
- if (call->reply[1])
- xdr_decode_AFSVolSync(&bp, call->reply[1]);
+ xdr_decode_AFSVolSync(&bp, call->reply[1]);
_leave(" = 0 [done]");
return 0;
struct afs_net *net = afs_v2net(vnode);
__be32 *bp;
- _enter(",%x,{%x:%u},,",
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_fetch_file_status(fc, volsync, new_inode);
+
+ _enter(",%x,{%llx:%llu},,",
key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
call = afs_alloc_flat_call(net, &afs_RXFSFetchStatus_vnode,
call->reply[0] = vnode;
call->reply[1] = volsync;
call->expected_version = new_inode ? 1 : vnode->status.data_version;
+ call->want_reply_time = true;
/* marshall the parameters */
bp = call->request;
struct afs_read *req = call->reply[2];
const __be32 *bp;
unsigned int size;
- void *buffer;
int ret;
- _enter("{%u,%zu/%u;%llu/%llu}",
- call->unmarshall, call->offset, call->count,
- req->remain, req->actual_len);
+ _enter("{%u,%zu/%llu}",
+ call->unmarshall, iov_iter_count(&call->iter), req->actual_len);
switch (call->unmarshall) {
case 0:
req->actual_len = 0;
- call->offset = 0;
+ req->index = 0;
+ req->offset = req->pos & (PAGE_SIZE - 1);
call->unmarshall++;
- if (call->operation_ID != FSFETCHDATA64) {
- call->unmarshall++;
- goto no_msw;
+ if (call->operation_ID == FSFETCHDATA64) {
+ afs_extract_to_tmp64(call);
+ } else {
+ call->tmp_u = htonl(0);
+ afs_extract_to_tmp(call);
}
- /* extract the upper part of the returned data length of an
- * FSFETCHDATA64 op (which should always be 0 using this
- * client) */
- case 1:
- _debug("extract data length (MSW)");
- ret = afs_extract_data(call, &call->tmp, 4, true);
- if (ret < 0)
- return ret;
-
- req->actual_len = ntohl(call->tmp);
- req->actual_len <<= 32;
- call->offset = 0;
- call->unmarshall++;
-
- no_msw:
/* extract the returned data length */
- case 2:
+ case 1:
_debug("extract data length");
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
- req->actual_len |= ntohl(call->tmp);
+ req->actual_len = be64_to_cpu(call->tmp64);
_debug("DATA length: %llu", req->actual_len);
-
- req->remain = req->actual_len;
- call->offset = req->pos & (PAGE_SIZE - 1);
- req->index = 0;
- if (req->actual_len == 0)
+ req->remain = min(req->len, req->actual_len);
+ if (req->remain == 0)
goto no_more_data;
+
call->unmarshall++;
begin_page:
ASSERTCMP(req->index, <, req->nr_pages);
- if (req->remain > PAGE_SIZE - call->offset)
- size = PAGE_SIZE - call->offset;
+ if (req->remain > PAGE_SIZE - req->offset)
+ size = PAGE_SIZE - req->offset;
else
size = req->remain;
- call->count = call->offset + size;
- ASSERTCMP(call->count, <=, PAGE_SIZE);
- req->remain -= size;
+ call->bvec[0].bv_len = size;
+ call->bvec[0].bv_offset = req->offset;
+ call->bvec[0].bv_page = req->pages[req->index];
+ iov_iter_bvec(&call->iter, READ, call->bvec, 1, size);
+ ASSERTCMP(size, <=, PAGE_SIZE);
/* extract the returned data */
- case 3:
- _debug("extract data %llu/%llu %zu/%u",
- req->remain, req->actual_len, call->offset, call->count);
+ case 2:
+ _debug("extract data %zu/%llu",
+ iov_iter_count(&call->iter), req->remain);
- buffer = kmap(req->pages[req->index]);
- ret = afs_extract_data(call, buffer, call->count, true);
- kunmap(req->pages[req->index]);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
- if (call->offset == PAGE_SIZE) {
+ req->remain -= call->bvec[0].bv_len;
+ req->offset += call->bvec[0].bv_len;
+ ASSERTCMP(req->offset, <=, PAGE_SIZE);
+ if (req->offset == PAGE_SIZE) {
+ req->offset = 0;
if (req->page_done)
req->page_done(call, req);
req->index++;
- if (req->remain > 0) {
- call->offset = 0;
- if (req->index >= req->nr_pages) {
- call->unmarshall = 4;
- goto begin_discard;
- }
+ if (req->remain > 0)
goto begin_page;
- }
}
- goto no_more_data;
+
+ ASSERTCMP(req->remain, ==, 0);
+ if (req->actual_len <= req->len)
+ goto no_more_data;
/* Discard any excess data the server gave us */
- begin_discard:
- case 4:
- size = min_t(loff_t, sizeof(afs_discard_buffer), req->remain);
- call->count = size;
- _debug("extract discard %llu/%llu %zu/%u",
- req->remain, req->actual_len, call->offset, call->count);
-
- call->offset = 0;
- ret = afs_extract_data(call, afs_discard_buffer, call->count, true);
- req->remain -= call->offset;
+ iov_iter_discard(&call->iter, READ, req->actual_len - req->len);
+ call->unmarshall = 3;
+ case 3:
+ _debug("extract discard %zu/%llu",
+ iov_iter_count(&call->iter), req->actual_len - req->len);
+
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
- if (req->remain > 0)
- goto begin_discard;
no_more_data:
- call->offset = 0;
- call->unmarshall = 5;
+ call->unmarshall = 4;
+ afs_extract_to_buf(call, (21 + 3 + 6) * 4);
/* extract the metadata */
- case 5:
- ret = afs_extract_data(call, call->buffer,
- (21 + 3 + 6) * 4, false);
+ case 4:
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode,
- &vnode->status.data_version, req) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &vnode->status.data_version, req);
+ if (ret < 0)
+ return ret;
xdr_decode_AFSCallBack(call, vnode, &bp);
- if (call->reply[1])
- xdr_decode_AFSVolSync(&bp, call->reply[1]);
+ xdr_decode_AFSVolSync(&bp, call->reply[1]);
- call->offset = 0;
call->unmarshall++;
- case 6:
+ case 5:
break;
}
for (; req->index < req->nr_pages; req->index++) {
- if (call->count < PAGE_SIZE)
+ if (req->offset < PAGE_SIZE)
zero_user_segment(req->pages[req->index],
- call->count, PAGE_SIZE);
+ req->offset, PAGE_SIZE);
if (req->page_done)
req->page_done(call, req);
- call->count = 0;
+ req->offset = 0;
}
_leave(" = 0 [done]");
call->reply[1] = NULL; /* volsync */
call->reply[2] = req;
call->expected_version = vnode->status.data_version;
+ call->want_reply_time = true;
/* marshall the parameters */
bp = call->request;
struct afs_net *net = afs_v2net(vnode);
__be32 *bp;
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_fetch_data(fc, req);
+
if (upper_32_bits(req->pos) ||
upper_32_bits(req->len) ||
upper_32_bits(req->pos + req->len))
call->reply[1] = NULL; /* volsync */
call->reply[2] = req;
call->expected_version = vnode->status.data_version;
+ call->want_reply_time = true;
/* marshall the parameters */
bp = call->request;
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
xdr_decode_AFSFid(&bp, call->reply[1]);
- if (afs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL) < 0 ||
- afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
- xdr_decode_AFSCallBack_raw(&bp, call->reply[3]);
+ ret = afs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL);
+ if (ret < 0)
+ return ret;
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ xdr_decode_AFSCallBack_raw(call, &bp, call->reply[3]);
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
_leave(" = 0 [done]");
size_t namesz, reqsz, padsz;
__be32 *bp;
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags)){
+ if (S_ISDIR(mode))
+ return yfs_fs_make_dir(fc, name, mode, current_data_version,
+ newfid, newstatus, newcb);
+ else
+ return yfs_fs_create_file(fc, name, mode, current_data_version,
+ newfid, newstatus, newcb);
+ }
+
_enter("");
namesz = strlen(name);
call->reply[2] = newstatus;
call->reply[3] = newcb;
call->expected_version = current_data_version + 1;
+ call->want_reply_time = true;
/* marshall the parameters */
bp = call->request;
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
_leave(" = 0 [done]");
/*
* remove a file or directory
*/
-int afs_fs_remove(struct afs_fs_cursor *fc, const char *name, bool isdir,
- u64 current_data_version)
+int afs_fs_remove(struct afs_fs_cursor *fc, struct afs_vnode *vnode,
+ const char *name, bool isdir, u64 current_data_version)
{
- struct afs_vnode *vnode = fc->vnode;
+ struct afs_vnode *dvnode = fc->vnode;
struct afs_call *call;
- struct afs_net *net = afs_v2net(vnode);
+ struct afs_net *net = afs_v2net(dvnode);
size_t namesz, reqsz, padsz;
__be32 *bp;
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_remove(fc, vnode, name, isdir, current_data_version);
+
_enter("");
namesz = strlen(name);
return -ENOMEM;
call->key = fc->key;
- call->reply[0] = vnode;
+ call->reply[0] = dvnode;
+ call->reply[1] = vnode;
call->expected_version = current_data_version + 1;
/* marshall the parameters */
bp = call->request;
*bp++ = htonl(isdir ? FSREMOVEDIR : FSREMOVEFILE);
- *bp++ = htonl(vnode->fid.vid);
- *bp++ = htonl(vnode->fid.vnode);
- *bp++ = htonl(vnode->fid.unique);
+ *bp++ = htonl(dvnode->fid.vid);
+ *bp++ = htonl(dvnode->fid.vnode);
+ *bp++ = htonl(dvnode->fid.unique);
*bp++ = htonl(namesz);
memcpy(bp, name, namesz);
bp = (void *) bp + namesz;
}
afs_use_fs_server(call, fc->cbi);
- trace_afs_make_fs_call(call, &vnode->fid);
+ trace_afs_make_fs_call(call, &dvnode->fid);
return afs_make_call(&fc->ac, call, GFP_NOFS, false);
}
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode, NULL, NULL) < 0 ||
- afs_decode_status(call, &bp, &dvnode->status, dvnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode, NULL, NULL);
+ if (ret < 0)
+ return ret;
+ ret = afs_decode_status(call, &bp, &dvnode->status, dvnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
_leave(" = 0 [done]");
size_t namesz, reqsz, padsz;
__be32 *bp;
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_link(fc, vnode, name, current_data_version);
+
_enter("");
namesz = strlen(name);
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
xdr_decode_AFSFid(&bp, call->reply[1]);
- if (afs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL) ||
- afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL);
+ if (ret < 0)
+ return ret;
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
_leave(" = 0 [done]");
size_t namesz, reqsz, padsz, c_namesz, c_padsz;
__be32 *bp;
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_symlink(fc, name, contents, current_data_version,
+ newfid, newstatus);
+
_enter("");
namesz = strlen(name);
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &orig_dvnode->status, orig_dvnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
- if (new_dvnode != orig_dvnode &&
- afs_decode_status(call, &bp, &new_dvnode->status, new_dvnode,
- &call->expected_version_2, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &orig_dvnode->status, orig_dvnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ if (new_dvnode != orig_dvnode) {
+ ret = afs_decode_status(call, &bp, &new_dvnode->status, new_dvnode,
+ &call->expected_version_2, NULL);
+ if (ret < 0)
+ return ret;
+ }
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
_leave(" = 0 [done]");
size_t reqsz, o_namesz, o_padsz, n_namesz, n_padsz;
__be32 *bp;
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_rename(fc, orig_name,
+ new_dvnode, new_name,
+ current_orig_data_version,
+ current_new_data_version);
+
_enter("");
o_namesz = strlen(orig_name);
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
afs_pages_written_back(vnode, call);
struct afs_net *net = afs_v2net(vnode);
__be32 *bp;
- _enter(",%x,{%x:%u},,",
+ _enter(",%x,{%llx:%llu},,",
key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
call = afs_alloc_flat_call(net, &afs_RXFSStoreData64,
loff_t size, pos, i_size;
__be32 *bp;
- _enter(",%x,{%x:%u},,",
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_store_data(fc, mapping, first, last, offset, to);
+
+ _enter(",%x,{%llx:%llu},,",
key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
size = (loff_t)to - (loff_t)offset;
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
_leave(" = 0 [done]");
struct afs_net *net = afs_v2net(vnode);
__be32 *bp;
- _enter(",%x,{%x:%u},,",
+ _enter(",%x,{%llx:%llu},,",
key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
ASSERT(attr->ia_valid & ATTR_SIZE);
struct afs_net *net = afs_v2net(vnode);
__be32 *bp;
- _enter(",%x,{%x:%u},,",
+ _enter(",%x,{%llx:%llu},,",
key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
ASSERT(attr->ia_valid & ATTR_SIZE);
struct afs_net *net = afs_v2net(vnode);
__be32 *bp;
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_setattr(fc, attr);
+
if (attr->ia_valid & ATTR_SIZE)
return afs_fs_setattr_size(fc, attr);
- _enter(",%x,{%x:%u},,",
+ _enter(",%x,{%llx:%llu},,",
key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
call = afs_alloc_flat_call(net, &afs_RXFSStoreStatus,
{
const __be32 *bp;
char *p;
+ u32 size;
int ret;
_enter("{%u}", call->unmarshall);
switch (call->unmarshall) {
case 0:
- call->offset = 0;
call->unmarshall++;
+ afs_extract_to_buf(call, 12 * 4);
/* extract the returned status record */
case 1:
_debug("extract status");
- ret = afs_extract_data(call, call->buffer,
- 12 * 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
bp = call->buffer;
xdr_decode_AFSFetchVolumeStatus(&bp, call->reply[1]);
- call->offset = 0;
call->unmarshall++;
+ afs_extract_to_tmp(call);
/* extract the volume name length */
case 2:
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
call->count = ntohl(call->tmp);
_debug("volname length: %u", call->count);
if (call->count >= AFSNAMEMAX)
- return afs_protocol_error(call, -EBADMSG);
- call->offset = 0;
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_volname_len);
+ size = (call->count + 3) & ~3; /* It's padded */
+ afs_extract_begin(call, call->reply[2], size);
call->unmarshall++;
/* extract the volume name */
case 3:
_debug("extract volname");
- if (call->count > 0) {
- ret = afs_extract_data(call, call->reply[2],
- call->count, true);
- if (ret < 0)
- return ret;
- }
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
p = call->reply[2];
p[call->count] = 0;
_debug("volname '%s'", p);
-
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
- /* extract the volume name padding */
- if ((call->count & 3) == 0) {
- call->unmarshall++;
- goto no_volname_padding;
- }
- call->count = 4 - (call->count & 3);
-
- case 4:
- ret = afs_extract_data(call, call->buffer,
- call->count, true);
- if (ret < 0)
- return ret;
-
- call->offset = 0;
- call->unmarshall++;
- no_volname_padding:
-
/* extract the offline message length */
- case 5:
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ case 4:
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
call->count = ntohl(call->tmp);
_debug("offline msg length: %u", call->count);
if (call->count >= AFSNAMEMAX)
- return afs_protocol_error(call, -EBADMSG);
- call->offset = 0;
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_offline_msg_len);
+ size = (call->count + 3) & ~3; /* It's padded */
+ afs_extract_begin(call, call->reply[2], size);
call->unmarshall++;
/* extract the offline message */
- case 6:
+ case 5:
_debug("extract offline");
- if (call->count > 0) {
- ret = afs_extract_data(call, call->reply[2],
- call->count, true);
- if (ret < 0)
- return ret;
- }
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
p = call->reply[2];
p[call->count] = 0;
_debug("offline '%s'", p);
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
- /* extract the offline message padding */
- if ((call->count & 3) == 0) {
- call->unmarshall++;
- goto no_offline_padding;
- }
- call->count = 4 - (call->count & 3);
-
- case 7:
- ret = afs_extract_data(call, call->buffer,
- call->count, true);
- if (ret < 0)
- return ret;
-
- call->offset = 0;
- call->unmarshall++;
- no_offline_padding:
-
/* extract the message of the day length */
- case 8:
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ case 6:
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
call->count = ntohl(call->tmp);
_debug("motd length: %u", call->count);
if (call->count >= AFSNAMEMAX)
- return afs_protocol_error(call, -EBADMSG);
- call->offset = 0;
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_motd_len);
+ size = (call->count + 3) & ~3; /* It's padded */
+ afs_extract_begin(call, call->reply[2], size);
call->unmarshall++;
/* extract the message of the day */
- case 9:
+ case 7:
_debug("extract motd");
- if (call->count > 0) {
- ret = afs_extract_data(call, call->reply[2],
- call->count, true);
- if (ret < 0)
- return ret;
- }
+ ret = afs_extract_data(call, false);
+ if (ret < 0)
+ return ret;
p = call->reply[2];
p[call->count] = 0;
_debug("motd '%s'", p);
- call->offset = 0;
call->unmarshall++;
- /* extract the message of the day padding */
- call->count = (4 - (call->count & 3)) & 3;
-
- case 10:
- ret = afs_extract_data(call, call->buffer,
- call->count, false);
- if (ret < 0)
- return ret;
-
- call->offset = 0;
- call->unmarshall++;
- case 11:
+ case 8:
break;
}
__be32 *bp;
void *tmpbuf;
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_get_volume_status(fc, vs);
+
_enter("");
tmpbuf = kmalloc(AFSOPAQUEMAX, GFP_KERNEL);
struct afs_net *net = afs_v2net(vnode);
__be32 *bp;
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_set_lock(fc, type);
+
_enter("");
call = afs_alloc_flat_call(net, &afs_RXFSSetLock, 5 * 4, 6 * 4);
struct afs_net *net = afs_v2net(vnode);
__be32 *bp;
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_extend_lock(fc);
+
_enter("");
call = afs_alloc_flat_call(net, &afs_RXFSExtendLock, 4 * 4, 6 * 4);
struct afs_net *net = afs_v2net(vnode);
__be32 *bp;
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_release_lock(fc);
+
_enter("");
call = afs_alloc_flat_call(net, &afs_RXFSReleaseLock, 4 * 4, 6 * 4);
u32 count;
int ret;
- _enter("{%u,%zu/%u}", call->unmarshall, call->offset, call->count);
+ _enter("{%u,%zu}", call->unmarshall, iov_iter_count(&call->iter));
-again:
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
/* Extract the capabilities word count */
case 1:
- ret = afs_extract_data(call, &call->tmp,
- 1 * sizeof(__be32),
- true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
call->count = count;
call->count2 = count;
- call->offset = 0;
+ iov_iter_discard(&call->iter, READ, count * sizeof(__be32));
call->unmarshall++;
/* Extract capabilities words */
case 2:
- count = min(call->count, 16U);
- ret = afs_extract_data(call, call->buffer,
- count * sizeof(__be32),
- call->count > 16);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
/* TODO: Examine capabilities */
- call->count -= count;
- if (call->count > 0)
- goto again;
- call->offset = 0;
call->unmarshall++;
break;
}
return 0;
}
+static void afs_destroy_fs_get_capabilities(struct afs_call *call)
+{
+ struct afs_server *server = call->reply[0];
+
+ afs_put_server(call->net, server);
+ afs_flat_call_destructor(call);
+}
+
/*
* FS.GetCapabilities operation type
*/
.name = "FS.GetCapabilities",
.op = afs_FS_GetCapabilities,
.deliver = afs_deliver_fs_get_capabilities,
- .destructor = afs_flat_call_destructor,
+ .done = afs_fileserver_probe_result,
+ .destructor = afs_destroy_fs_get_capabilities,
};
/*
int afs_fs_get_capabilities(struct afs_net *net,
struct afs_server *server,
struct afs_addr_cursor *ac,
- struct key *key)
+ struct key *key,
+ unsigned int server_index,
+ bool async)
{
struct afs_call *call;
__be32 *bp;
return -ENOMEM;
call->key = key;
+ call->reply[0] = afs_get_server(server);
+ call->reply[1] = (void *)(long)server_index;
+ call->upgrade = true;
+ call->want_reply_time = true;
/* marshall the parameters */
bp = call->request;
/* Can't take a ref on server */
trace_afs_make_fs_call(call, NULL);
- return afs_make_call(ac, call, GFP_NOFS, false);
+ return afs_make_call(ac, call, GFP_NOFS, async);
}
/*
struct afs_file_status *status = call->reply[1];
struct afs_callback *callback = call->reply[2];
struct afs_volsync *volsync = call->reply[3];
- struct afs_vnode *vnode = call->reply[0];
+ struct afs_fid *fid = call->reply[0];
const __be32 *bp;
int ret;
if (ret < 0)
return ret;
- _enter("{%x:%u}", vnode->fid.vid, vnode->fid.vnode);
+ _enter("{%llx:%llu}", fid->vid, fid->vnode);
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- afs_decode_status(call, &bp, status, vnode,
- &call->expected_version, NULL);
- callback[call->count].version = ntohl(bp[0]);
- callback[call->count].expiry = ntohl(bp[1]);
- callback[call->count].type = ntohl(bp[2]);
- if (vnode)
- xdr_decode_AFSCallBack(call, vnode, &bp);
- else
- bp += 3;
- if (volsync)
- xdr_decode_AFSVolSync(&bp, volsync);
+ ret = afs_decode_status(call, &bp, status, NULL,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ xdr_decode_AFSCallBack_raw(call, &bp, callback);
+ xdr_decode_AFSVolSync(&bp, volsync);
_leave(" = 0 [done]");
return 0;
struct afs_call *call;
__be32 *bp;
- _enter(",%x,{%x:%u},,",
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_fetch_status(fc, net, fid, status, callback, volsync);
+
+ _enter(",%x,{%llx:%llu},,",
key_serial(fc->key), fid->vid, fid->vnode);
call = afs_alloc_flat_call(net, &afs_RXFSFetchStatus, 16, (21 + 3 + 6) * 4);
}
call->key = fc->key;
- call->reply[0] = NULL; /* vnode for fid[0] */
+ call->reply[0] = fid;
call->reply[1] = status;
call->reply[2] = callback;
call->reply[3] = volsync;
call->expected_version = 1; /* vnode->status.data_version */
+ call->want_reply_time = true;
/* marshall the parameters */
bp = call->request;
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
/* Extract the file status count and array in two steps */
case 1:
_debug("extract status count");
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
tmp = ntohl(call->tmp);
_debug("status count: %u/%u", tmp, call->count2);
if (tmp != call->count2)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_ibulkst_count);
call->count = 0;
call->unmarshall++;
more_counts:
- call->offset = 0;
+ afs_extract_to_buf(call, 21 * sizeof(__be32));
case 2:
_debug("extract status array %u", call->count);
- ret = afs_extract_data(call, call->buffer, 21 * 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
bp = call->buffer;
statuses = call->reply[1];
- if (afs_decode_status(call, &bp, &statuses[call->count],
- call->count == 0 ? vnode : NULL,
- NULL, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &statuses[call->count],
+ call->count == 0 ? vnode : NULL,
+ NULL, NULL);
+ if (ret < 0)
+ return ret;
call->count++;
if (call->count < call->count2)
call->count = 0;
call->unmarshall++;
- call->offset = 0;
+ afs_extract_to_tmp(call);
/* Extract the callback count and array in two steps */
case 3:
_debug("extract CB count");
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
tmp = ntohl(call->tmp);
_debug("CB count: %u", tmp);
if (tmp != call->count2)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_ibulkst_cb_count);
call->count = 0;
call->unmarshall++;
more_cbs:
- call->offset = 0;
+ afs_extract_to_buf(call, 3 * sizeof(__be32));
case 4:
_debug("extract CB array");
- ret = afs_extract_data(call, call->buffer, 3 * 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
bp = call->buffer;
callbacks = call->reply[2];
callbacks[call->count].version = ntohl(bp[0]);
- callbacks[call->count].expiry = ntohl(bp[1]);
+ callbacks[call->count].expires_at = xdr_decode_expiry(call, ntohl(bp[1]));
callbacks[call->count].type = ntohl(bp[2]);
statuses = call->reply[1];
if (call->count == 0 && vnode && statuses[0].abort_code == 0)
if (call->count < call->count2)
goto more_cbs;
- call->offset = 0;
+ afs_extract_to_buf(call, 6 * sizeof(__be32));
call->unmarshall++;
case 5:
- ret = afs_extract_data(call, call->buffer, 6 * 4, false);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
bp = call->buffer;
- if (call->reply[3])
- xdr_decode_AFSVolSync(&bp, call->reply[3]);
+ xdr_decode_AFSVolSync(&bp, call->reply[3]);
- call->offset = 0;
call->unmarshall++;
case 6:
__be32 *bp;
int i;
- _enter(",%x,{%x:%u},%u",
+ if (test_bit(AFS_SERVER_FL_IS_YFS, &fc->cbi->server->flags))
+ return yfs_fs_inline_bulk_status(fc, net, fids, statuses, callbacks,
+ nr_fids, volsync);
+
+ _enter(",%x,{%llx:%llu},%u",
key_serial(fc->key), fids[0].vid, fids[1].vnode, nr_fids);
call = afs_alloc_flat_call(net, &afs_RXFSInlineBulkStatus,
call->reply[2] = callbacks;
call->reply[3] = volsync;
call->count2 = nr_fids;
+ call->want_reply_time = true;
/* marshall the parameters */
bp = call->request;
default:
printk("kAFS: AFS vnode with undefined type\n");
read_sequnlock_excl(&vnode->cb_lock);
- return afs_protocol_error(NULL, -EBADMSG);
+ return afs_protocol_error(NULL, -EBADMSG, afs_eproto_file_type);
}
inode->i_blocks = 0;
struct afs_fs_cursor fc;
int ret;
- _enter("%s,{%x:%u.%u,S=%lx}",
+ _enter("%s,{%llx:%llu.%u,S=%lx}",
vnode->volume->name,
vnode->fid.vid, vnode->fid.vnode, vnode->fid.unique,
vnode->flags);
int afs_iget5_test(struct inode *inode, void *opaque)
{
struct afs_iget_data *data = opaque;
+ struct afs_vnode *vnode = AFS_FS_I(inode);
- return inode->i_ino == data->fid.vnode &&
- inode->i_generation == data->fid.unique;
+ return memcmp(&vnode->fid, &data->fid, sizeof(data->fid)) == 0;
}
/*
struct afs_iget_data *data = opaque;
struct afs_vnode *vnode = AFS_FS_I(inode);
- inode->i_ino = data->fid.vnode;
- inode->i_generation = data->fid.unique;
vnode->fid = data->fid;
vnode->volume = data->volume;
+ /* YFS supports 96-bit vnode IDs, but Linux only supports
+ * 64-bit inode numbers.
+ */
+ inode->i_ino = data->fid.vnode;
+ inode->i_generation = data->fid.unique;
return 0;
}
return ERR_PTR(-ENOMEM);
}
- _debug("GOT INODE %p { ino=%lu, vl=%x, vn=%x, u=%x }",
+ _debug("GOT INODE %p { ino=%lu, vl=%llx, vn=%llx, u=%x }",
inode, inode->i_ino, data.fid.vid, data.fid.vnode,
data.fid.unique);
key.vnode_id = vnode->fid.vnode;
key.unique = vnode->fid.unique;
- key.vnode_id_ext[0] = 0;
- key.vnode_id_ext[1] = 0;
+ key.vnode_id_ext[0] = vnode->fid.vnode >> 32;
+ key.vnode_id_ext[1] = vnode->fid.vnode_hi;
aux.data_version = vnode->status.data_version;
vnode->cache = fscache_acquire_cookie(vnode->volume->cache,
struct inode *inode;
int ret;
- _enter(",{%x:%u.%u},,", fid->vid, fid->vnode, fid->unique);
+ _enter(",{%llx:%llu.%u},,", fid->vid, fid->vnode, fid->unique);
as = sb->s_fs_info;
data.volume = as->volume;
return ERR_PTR(-ENOMEM);
}
- _debug("GOT INODE %p { vl=%x vn=%x, u=%x }",
+ _debug("GOT INODE %p { vl=%llx vn=%llx, u=%x }",
inode, fid->vid, fid->vnode, fid->unique);
vnode = AFS_FS_I(inode);
* didn't give us a callback) */
vnode->cb_version = 0;
vnode->cb_type = 0;
- vnode->cb_expires_at = 0;
+ vnode->cb_expires_at = ktime_get();
} else {
vnode->cb_version = cb->version;
vnode->cb_type = cb->type;
- vnode->cb_expires_at = cb->expiry;
+ vnode->cb_expires_at = cb->expires_at;
vnode->cb_interest = afs_get_cb_interest(cbi);
set_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
}
*/
void afs_zap_data(struct afs_vnode *vnode)
{
- _enter("{%x:%u}", vnode->fid.vid, vnode->fid.vnode);
+ _enter("{%llx:%llu}", vnode->fid.vid, vnode->fid.vnode);
#ifdef CONFIG_AFS_FSCACHE
fscache_invalidate(vnode->cache);
int afs_validate(struct afs_vnode *vnode, struct key *key)
{
time64_t now = ktime_get_real_seconds();
- bool valid = false;
+ bool valid;
int ret;
- _enter("{v={%x:%u} fl=%lx},%x",
+ _enter("{v={%llx:%llu} fl=%lx},%x",
vnode->fid.vid, vnode->fid.vnode, vnode->flags,
key_serial(key));
vnode->cb_v_break = vnode->volume->cb_v_break;
valid = false;
} else if (vnode->status.type == AFS_FTYPE_DIR &&
- test_bit(AFS_VNODE_DIR_VALID, &vnode->flags) &&
- vnode->cb_expires_at - 10 > now) {
- valid = true;
- } else if (!test_bit(AFS_VNODE_ZAP_DATA, &vnode->flags) &&
- vnode->cb_expires_at - 10 > now) {
+ (!test_bit(AFS_VNODE_DIR_VALID, &vnode->flags) ||
+ vnode->cb_expires_at - 10 <= now)) {
+ valid = false;
+ } else if (test_bit(AFS_VNODE_ZAP_DATA, &vnode->flags) ||
+ vnode->cb_expires_at - 10 <= now) {
+ valid = false;
+ } else {
valid = true;
}
} else if (test_bit(AFS_VNODE_DELETED, &vnode->flags)) {
valid = true;
+ } else {
+ vnode->cb_s_break = vnode->cb_interest->server->cb_s_break;
+ vnode->cb_v_break = vnode->volume->cb_v_break;
+ valid = false;
}
read_sequnlock_excl(&vnode->cb_lock);
vnode = AFS_FS_I(inode);
- _enter("{%x:%u.%d}",
+ _enter("{%llx:%llu.%d}",
vnode->fid.vid,
vnode->fid.vnode,
vnode->fid.unique);
struct key *key;
int ret;
- _enter("{%x:%u},{n=%pd},%x",
+ _enter("{%llx:%llu},{n=%pd},%x",
vnode->fid.vid, vnode->fid.vnode, dentry,
attr->ia_valid);
#include <linux/backing-dev.h>
#include <linux/uuid.h>
#include <linux/mm_types.h>
+#include <linux/dns_resolver.h>
#include <net/net_namespace.h>
#include <net/netns/generic.h>
#include <net/sock.h>
u32 version; /* Version */
unsigned char max_addrs;
unsigned char nr_addrs;
- unsigned char index; /* Address currently in use */
+ unsigned char preferred; /* Preferred address */
unsigned char nr_ipv4; /* Number of IPv4 addresses */
+ enum dns_record_source source:8;
+ enum dns_lookup_status status:8;
unsigned long probed; /* Mask of servers that have been probed */
- unsigned long yfs; /* Mask of servers that are YFS */
+ unsigned long failed; /* Mask of addrs that failed locally/ICMP */
+ unsigned long responded; /* Mask of addrs that responded */
struct sockaddr_rxrpc addrs[];
#define AFS_MAX_ADDRESSES ((unsigned int)(sizeof(unsigned long) * 8))
};
*/
struct afs_call {
const struct afs_call_type *type; /* type of call */
+ struct afs_addr_list *alist; /* Address is alist[addr_ix] */
wait_queue_head_t waitq; /* processes awaiting completion */
struct work_struct async_work; /* async I/O processor */
struct work_struct work; /* actual work processor */
struct afs_cb_interest *cbi; /* Callback interest for server used */
void *request; /* request data (first part) */
struct address_space *mapping; /* Pages being written from */
+ struct iov_iter iter; /* Buffer iterator */
+ struct iov_iter *_iter; /* Iterator currently in use */
+ union { /* Convenience for ->iter */
+ struct kvec kvec[1];
+ struct bio_vec bvec[1];
+ };
void *buffer; /* reply receive buffer */
void *reply[4]; /* Where to put the reply */
pgoff_t first; /* first page in mapping to deal with */
pgoff_t last; /* last page in mapping to deal with */
- size_t offset; /* offset into received data store */
atomic_t usage;
enum afs_call_state state;
spinlock_t state_lock;
int error; /* error code */
u32 abort_code; /* Remote abort ID or 0 */
+ u32 epoch;
unsigned request_size; /* size of request data */
unsigned reply_max; /* maximum size of reply */
unsigned first_offset; /* offset into mapping[first] */
unsigned count2; /* count used in unmarshalling */
};
unsigned char unmarshall; /* unmarshalling phase */
+ unsigned char addr_ix; /* Address in ->alist */
bool incoming; /* T if incoming call */
bool send_pages; /* T if data from mapping should be sent */
bool need_attention; /* T if RxRPC poked us */
bool async; /* T if asynchronous */
bool ret_reply0; /* T if should return reply[0] on success */
bool upgrade; /* T to request service upgrade */
+ bool want_reply_time; /* T if want reply_time */
u16 service_id; /* Actual service ID (after upgrade) */
unsigned int debug_id; /* Trace ID */
u32 operation_ID; /* operation ID for an incoming call */
u32 count; /* count for use in unmarshalling */
- __be32 tmp; /* place to extract temporary data */
+ union { /* place to extract temporary data */
+ struct {
+ __be32 tmp_u;
+ __be32 tmp;
+ } __attribute__((packed));
+ __be64 tmp64;
+ };
afs_dataversion_t expected_version; /* Updated version expected from store */
afs_dataversion_t expected_version_2; /* 2nd updated version expected from store */
+ ktime_t reply_time; /* Time of first reply packet */
};
struct afs_call_type {
/* Work function */
void (*work)(struct work_struct *work);
+
+ /* Call done function (gets called immediately on success or failure) */
+ void (*done)(struct afs_call *call);
};
/*
refcount_t usage;
unsigned int index; /* Which page we're reading into */
unsigned int nr_pages;
+ unsigned int offset; /* offset into current page */
void (*page_done)(struct afs_call *, struct afs_read *);
struct page **pages;
struct page *array[];
rwlock_t proc_lock;
/* VL server list. */
- rwlock_t vl_addrs_lock; /* Lock on vl_addrs */
- struct afs_addr_list __rcu *vl_addrs; /* List of VL servers */
+ rwlock_t vl_servers_lock; /* Lock on vl_servers */
+ struct afs_vlserver_list __rcu *vl_servers;
+
u8 name_len; /* Length of name */
char name[64 + 1]; /* Cell name, case-flattened and NUL-padded */
};
+/*
+ * Volume Location server record.
+ */
+struct afs_vlserver {
+ struct rcu_head rcu;
+ struct afs_addr_list __rcu *addresses; /* List of addresses for this VL server */
+ unsigned long flags;
+#define AFS_VLSERVER_FL_PROBED 0 /* The VL server has been probed */
+#define AFS_VLSERVER_FL_PROBING 1 /* VL server is being probed */
+#define AFS_VLSERVER_FL_IS_YFS 2 /* Server is YFS not AFS */
+ rwlock_t lock; /* Lock on addresses */
+ atomic_t usage;
+
+ /* Probe state */
+ wait_queue_head_t probe_wq;
+ atomic_t probe_outstanding;
+ spinlock_t probe_lock;
+ struct {
+ unsigned int rtt; /* RTT as ktime/64 */
+ u32 abort_code;
+ short error;
+ bool have_result;
+ bool responded:1;
+ bool is_yfs:1;
+ bool not_yfs:1;
+ bool local_failure:1;
+ } probe;
+
+ u16 port;
+ u16 name_len; /* Length of name */
+ char name[]; /* Server name, case-flattened */
+};
+
+/*
+ * Weighted list of Volume Location servers.
+ */
+struct afs_vlserver_entry {
+ u16 priority; /* Preference (as SRV) */
+ u16 weight; /* Weight (as SRV) */
+ enum dns_record_source source:8;
+ enum dns_lookup_status status:8;
+ struct afs_vlserver *server;
+};
+
+struct afs_vlserver_list {
+ struct rcu_head rcu;
+ atomic_t usage;
+ u8 nr_servers;
+ u8 index; /* Server currently in use */
+ u8 preferred; /* Preferred server */
+ enum dns_record_source source:8;
+ enum dns_lookup_status status:8;
+ rwlock_t lock;
+ struct afs_vlserver_entry servers[];
+};
+
/*
* Cached VLDB entry.
*
#define AFS_SERVER_FL_PROBING 6 /* Fileserver is being probed */
#define AFS_SERVER_FL_NO_IBULK 7 /* Fileserver doesn't support FS.InlineBulkStatus */
#define AFS_SERVER_FL_MAY_HAVE_CB 8 /* May have callbacks on this fileserver */
+#define AFS_SERVER_FL_IS_YFS 9 /* Server is YFS not AFS */
+#define AFS_SERVER_FL_NO_RM2 10 /* Fileserver doesn't support YFS.RemoveFile2 */
+#define AFS_SERVER_FL_HAVE_EPOCH 11 /* ->epoch is valid */
atomic_t usage;
u32 addr_version; /* Address list version */
+ u32 cm_epoch; /* Server RxRPC epoch */
/* file service access */
rwlock_t fs_lock; /* access lock */
struct hlist_head cb_volumes; /* List of volume interests on this server */
unsigned cb_s_break; /* Break-everything counter. */
rwlock_t cb_break_lock; /* Volume finding lock */
+
+ /* Probe state */
+ wait_queue_head_t probe_wq;
+ atomic_t probe_outstanding;
+ spinlock_t probe_lock;
+ struct {
+ unsigned int rtt; /* RTT as ktime/64 */
+ u32 abort_code;
+ u32 cm_epoch;
+ short error;
+ bool have_result;
+ bool responded:1;
+ bool is_yfs:1;
+ bool not_yfs:1;
+ bool local_failure:1;
+ bool no_epoch:1;
+ bool cm_probed:1;
+ bool said_rebooted:1;
+ bool said_inconsistent:1;
+ } probe;
};
/*
struct afs_server_list {
refcount_t usage;
- unsigned short nr_servers;
- unsigned short index; /* Server currently in use */
+ unsigned char nr_servers;
+ unsigned char preferred; /* Preferred server */
unsigned short vnovol_mask; /* Servers to be skipped due to VNOVOL */
unsigned int seq; /* Set to ->servers_seq when installed */
rwlock_t lock;
afs_callback_type_t cb_type; /* type of callback */
};
+static inline struct fscache_cookie *afs_vnode_cache(struct afs_vnode *vnode)
+{
+#ifdef CONFIG_AFS_FSCACHE
+ return vnode->cache;
+#else
+ return NULL;
+#endif
+}
+
/*
* cached security record for one user's attempt to access a vnode
*/
unsigned mtu; /* MTU of interface */
};
+/*
+ * Error prioritisation and accumulation.
+ */
+struct afs_error {
+ short error; /* Accumulated error */
+ bool responded; /* T if server responded */
+};
+
/*
* Cursor for iterating over a server's address list.
*/
struct afs_addr_cursor {
struct afs_addr_list *alist; /* Current address list (pins ref) */
- struct sockaddr_rxrpc *addr;
+ unsigned long tried; /* Tried addresses */
+ signed char index; /* Current address */
+ bool responded; /* T if the current address responded */
+ unsigned short nr_iterations; /* Number of address iterations */
+ short error;
u32 abort_code;
- unsigned short start; /* Starting point in alist->addrs[] */
- unsigned short index; /* Wrapping offset from start to current addr */
+};
+
+/*
+ * Cursor for iterating over a set of volume location servers.
+ */
+struct afs_vl_cursor {
+ struct afs_addr_cursor ac;
+ struct afs_cell *cell; /* The cell we're querying */
+ struct afs_vlserver_list *server_list; /* Current server list (pins ref) */
+ struct afs_vlserver *server; /* Server on which this resides */
+ struct key *key; /* Key for the server */
+ unsigned long untried; /* Bitmask of untried servers */
+ short index; /* Current server */
short error;
- bool begun; /* T if we've begun iteration */
- bool responded; /* T if the current address responded */
+ unsigned short flags;
+#define AFS_VL_CURSOR_STOP 0x0001 /* Set to cease iteration */
+#define AFS_VL_CURSOR_RETRY 0x0002 /* Set to do a retry */
+#define AFS_VL_CURSOR_RETRIED 0x0004 /* Set if started a retry */
+ unsigned short nr_iterations; /* Number of server iterations */
};
/*
struct afs_server_list *server_list; /* Current server list (pins ref) */
struct afs_cb_interest *cbi; /* Server on which this resides (pins ref) */
struct key *key; /* Key for the server */
+ unsigned long untried; /* Bitmask of untried servers */
unsigned int cb_break; /* cb_break + cb_s_break before the call */
unsigned int cb_break_2; /* cb_break + cb_s_break (2nd vnode) */
- unsigned char start; /* Initial index in server list */
- unsigned char index; /* Number of servers tried beyond start */
+ short index; /* Current server */
+ short error;
unsigned short flags;
#define AFS_FS_CURSOR_STOP 0x0001 /* Set to cease iteration */
#define AFS_FS_CURSOR_VBUSY 0x0002 /* Set if seen VBUSY */
#define AFS_FS_CURSOR_VNOVOL 0x0008 /* Set if seen VNOVOL */
#define AFS_FS_CURSOR_CUR_ONLY 0x0010 /* Set if current server only (file lock held) */
#define AFS_FS_CURSOR_NO_VSLEEP 0x0020 /* Set to prevent sleep on VBUSY, VOFFLINE, ... */
+ unsigned short nr_iterations; /* Number of server iterations */
};
/*
unsigned short,
unsigned short);
extern void afs_put_addrlist(struct afs_addr_list *);
-extern struct afs_addr_list *afs_parse_text_addrs(const char *, size_t, char,
- unsigned short, unsigned short);
-extern struct afs_addr_list *afs_dns_query(struct afs_cell *, time64_t *);
+extern struct afs_vlserver_list *afs_parse_text_addrs(struct afs_net *,
+ const char *, size_t, char,
+ unsigned short, unsigned short);
+extern struct afs_vlserver_list *afs_dns_query(struct afs_cell *, time64_t *);
extern bool afs_iterate_addresses(struct afs_addr_cursor *);
extern int afs_end_cursor(struct afs_addr_cursor *);
-extern int afs_set_vl_cursor(struct afs_addr_cursor *, struct afs_cell *);
extern void afs_merge_fs_addr4(struct afs_addr_list *, __be32, u16);
extern void afs_merge_fs_addr6(struct afs_addr_list *, __be32 *, u16);
* callback.c
*/
extern void afs_init_callback_state(struct afs_server *);
+extern void __afs_break_callback(struct afs_vnode *);
extern void afs_break_callback(struct afs_vnode *);
extern void afs_break_callbacks(struct afs_server *, size_t, struct afs_callback_break*);
return vnode->cb_break + vnode->cb_s_break + vnode->cb_v_break;
}
-static inline unsigned int afs_cb_break_sum(struct afs_vnode *vnode,
- struct afs_cb_interest *cbi)
+static inline bool afs_cb_is_broken(unsigned int cb_break,
+ const struct afs_vnode *vnode,
+ const struct afs_cb_interest *cbi)
{
- return vnode->cb_break + cbi->server->cb_s_break + vnode->volume->cb_v_break;
+ return !cbi || cb_break != (vnode->cb_break +
+ cbi->server->cb_s_break +
+ vnode->volume->cb_v_break);
}
/*
extern int afs_fs_fetch_data(struct afs_fs_cursor *, struct afs_read *);
extern int afs_fs_create(struct afs_fs_cursor *, const char *, umode_t, u64,
struct afs_fid *, struct afs_file_status *, struct afs_callback *);
-extern int afs_fs_remove(struct afs_fs_cursor *, const char *, bool, u64);
+extern int afs_fs_remove(struct afs_fs_cursor *, struct afs_vnode *, const char *, bool, u64);
extern int afs_fs_link(struct afs_fs_cursor *, struct afs_vnode *, const char *, u64);
extern int afs_fs_symlink(struct afs_fs_cursor *, const char *, const char *, u64,
struct afs_fid *, struct afs_file_status *);
extern int afs_fs_give_up_all_callbacks(struct afs_net *, struct afs_server *,
struct afs_addr_cursor *, struct key *);
extern int afs_fs_get_capabilities(struct afs_net *, struct afs_server *,
- struct afs_addr_cursor *, struct key *);
+ struct afs_addr_cursor *, struct key *, unsigned int, bool);
extern int afs_fs_inline_bulk_status(struct afs_fs_cursor *, struct afs_net *,
struct afs_fid *, struct afs_file_status *,
struct afs_callback *, unsigned int,
struct afs_fid *, struct afs_file_status *,
struct afs_callback *, struct afs_volsync *);
+/*
+ * fs_probe.c
+ */
+extern void afs_fileserver_probe_result(struct afs_call *);
+extern int afs_probe_fileservers(struct afs_net *, struct key *, struct afs_server_list *);
+extern int afs_wait_for_fs_probes(struct afs_server_list *, unsigned long);
+
/*
* inode.c
*/
* misc.c
*/
extern int afs_abort_to_error(u32);
+extern void afs_prioritise_error(struct afs_error *, int, u32);
/*
* mntpt.c
extern void __net_exit afs_close_socket(struct afs_net *);
extern void afs_charge_preallocation(struct work_struct *);
extern void afs_put_call(struct afs_call *);
-extern int afs_queue_call_work(struct afs_call *);
extern long afs_make_call(struct afs_addr_cursor *, struct afs_call *, gfp_t, bool);
extern struct afs_call *afs_alloc_flat_call(struct afs_net *,
const struct afs_call_type *,
extern void afs_flat_call_destructor(struct afs_call *);
extern void afs_send_empty_reply(struct afs_call *);
extern void afs_send_simple_reply(struct afs_call *, const void *, size_t);
-extern int afs_extract_data(struct afs_call *, void *, size_t, bool);
-extern int afs_protocol_error(struct afs_call *, int);
+extern int afs_extract_data(struct afs_call *, bool);
+extern int afs_protocol_error(struct afs_call *, int, enum afs_eproto_cause);
+
+static inline void afs_extract_begin(struct afs_call *call, void *buf, size_t size)
+{
+ call->kvec[0].iov_base = buf;
+ call->kvec[0].iov_len = size;
+ iov_iter_kvec(&call->iter, READ, call->kvec, 1, size);
+}
+
+static inline void afs_extract_to_tmp(struct afs_call *call)
+{
+ afs_extract_begin(call, &call->tmp, sizeof(call->tmp));
+}
+
+static inline void afs_extract_to_tmp64(struct afs_call *call)
+{
+ afs_extract_begin(call, &call->tmp64, sizeof(call->tmp64));
+}
+
+static inline void afs_extract_discard(struct afs_call *call, size_t size)
+{
+ iov_iter_discard(&call->iter, READ, size);
+}
+
+static inline void afs_extract_to_buf(struct afs_call *call, size_t size)
+{
+ afs_extract_begin(call, call->buffer, size);
+}
static inline int afs_transfer_reply(struct afs_call *call)
{
- return afs_extract_data(call, call->buffer, call->reply_max, false);
+ return afs_extract_data(call, false);
}
static inline bool afs_check_call_state(struct afs_call *call,
extern void afs_manage_servers(struct work_struct *);
extern void afs_servers_timer(struct timer_list *);
extern void __net_exit afs_purge_servers(struct afs_net *);
-extern bool afs_probe_fileserver(struct afs_fs_cursor *);
extern bool afs_check_server_record(struct afs_fs_cursor *, struct afs_server *);
/*
/*
* vlclient.c
*/
-extern struct afs_vldb_entry *afs_vl_get_entry_by_name_u(struct afs_net *,
- struct afs_addr_cursor *,
- struct key *, const char *, int);
-extern struct afs_addr_list *afs_vl_get_addrs_u(struct afs_net *, struct afs_addr_cursor *,
- struct key *, const uuid_t *);
-extern int afs_vl_get_capabilities(struct afs_net *, struct afs_addr_cursor *, struct key *);
-extern struct afs_addr_list *afs_yfsvl_get_endpoints(struct afs_net *, struct afs_addr_cursor *,
- struct key *, const uuid_t *);
+extern struct afs_vldb_entry *afs_vl_get_entry_by_name_u(struct afs_vl_cursor *,
+ const char *, int);
+extern struct afs_addr_list *afs_vl_get_addrs_u(struct afs_vl_cursor *, const uuid_t *);
+extern int afs_vl_get_capabilities(struct afs_net *, struct afs_addr_cursor *, struct key *,
+ struct afs_vlserver *, unsigned int, bool);
+extern struct afs_addr_list *afs_yfsvl_get_endpoints(struct afs_vl_cursor *, const uuid_t *);
+
+/*
+ * vl_probe.c
+ */
+extern void afs_vlserver_probe_result(struct afs_call *);
+extern int afs_send_vl_probes(struct afs_net *, struct key *, struct afs_vlserver_list *);
+extern int afs_wait_for_vl_probes(struct afs_vlserver_list *, unsigned long);
+
+/*
+ * vl_rotate.c
+ */
+extern bool afs_begin_vlserver_operation(struct afs_vl_cursor *,
+ struct afs_cell *, struct key *);
+extern bool afs_select_vlserver(struct afs_vl_cursor *);
+extern bool afs_select_current_vlserver(struct afs_vl_cursor *);
+extern int afs_end_vlserver_operation(struct afs_vl_cursor *);
+
+/*
+ * vlserver_list.c
+ */
+static inline struct afs_vlserver *afs_get_vlserver(struct afs_vlserver *vlserver)
+{
+ atomic_inc(&vlserver->usage);
+ return vlserver;
+}
+
+static inline struct afs_vlserver_list *afs_get_vlserverlist(struct afs_vlserver_list *vllist)
+{
+ if (vllist)
+ atomic_inc(&vllist->usage);
+ return vllist;
+}
+
+extern struct afs_vlserver *afs_alloc_vlserver(const char *, size_t, unsigned short);
+extern void afs_put_vlserver(struct afs_net *, struct afs_vlserver *);
+extern struct afs_vlserver_list *afs_alloc_vlserver_list(unsigned int);
+extern void afs_put_vlserverlist(struct afs_net *, struct afs_vlserver_list *);
+extern struct afs_vlserver_list *afs_extract_vlserver_list(struct afs_cell *,
+ const void *, size_t);
/*
* volume.c
extern const struct xattr_handler *afs_xattr_handlers[];
extern ssize_t afs_listxattr(struct dentry *, char *, size_t);
+/*
+ * yfsclient.c
+ */
+extern int yfs_fs_fetch_file_status(struct afs_fs_cursor *, struct afs_volsync *, bool);
+extern int yfs_fs_fetch_data(struct afs_fs_cursor *, struct afs_read *);
+extern int yfs_fs_create_file(struct afs_fs_cursor *, const char *, umode_t, u64,
+ struct afs_fid *, struct afs_file_status *, struct afs_callback *);
+extern int yfs_fs_make_dir(struct afs_fs_cursor *, const char *, umode_t, u64,
+ struct afs_fid *, struct afs_file_status *, struct afs_callback *);
+extern int yfs_fs_remove_file2(struct afs_fs_cursor *, struct afs_vnode *, const char *, u64);
+extern int yfs_fs_remove(struct afs_fs_cursor *, struct afs_vnode *, const char *, bool, u64);
+extern int yfs_fs_link(struct afs_fs_cursor *, struct afs_vnode *, const char *, u64);
+extern int yfs_fs_symlink(struct afs_fs_cursor *, const char *, const char *, u64,
+ struct afs_fid *, struct afs_file_status *);
+extern int yfs_fs_rename(struct afs_fs_cursor *, const char *,
+ struct afs_vnode *, const char *, u64, u64);
+extern int yfs_fs_store_data(struct afs_fs_cursor *, struct address_space *,
+ pgoff_t, pgoff_t, unsigned, unsigned);
+extern int yfs_fs_setattr(struct afs_fs_cursor *, struct iattr *);
+extern int yfs_fs_get_volume_status(struct afs_fs_cursor *, struct afs_volume_status *);
+extern int yfs_fs_set_lock(struct afs_fs_cursor *, afs_lock_type_t);
+extern int yfs_fs_extend_lock(struct afs_fs_cursor *);
+extern int yfs_fs_release_lock(struct afs_fs_cursor *);
+extern int yfs_fs_fetch_status(struct afs_fs_cursor *, struct afs_net *,
+ struct afs_fid *, struct afs_file_status *,
+ struct afs_callback *, struct afs_volsync *);
+extern int yfs_fs_inline_bulk_status(struct afs_fs_cursor *, struct afs_net *,
+ struct afs_fid *, struct afs_file_status *,
+ struct afs_callback *, unsigned int,
+ struct afs_volsync *);
/*
* Miscellaneous inline functions.
}
}
+static inline int afs_io_error(struct afs_call *call, enum afs_io_error where)
+{
+ trace_afs_io_error(call->debug_id, -EIO, where);
+ return -EIO;
+}
+
+static inline int afs_bad(struct afs_vnode *vnode, enum afs_file_error where)
+{
+ trace_afs_file_error(vnode, -EIO, where);
+ return -EIO;
+}
/*****************************************************************************/
/*
default: return -EREMOTEIO;
}
}
+
+/*
+ * Select the error to report from a set of errors.
+ */
+void afs_prioritise_error(struct afs_error *e, int error, u32 abort_code)
+{
+ switch (error) {
+ case 0:
+ return;
+ default:
+ if (e->error == -ETIMEDOUT ||
+ e->error == -ETIME)
+ return;
+ case -ETIMEDOUT:
+ case -ETIME:
+ if (e->error == -ENOMEM ||
+ e->error == -ENONET)
+ return;
+ case -ENOMEM:
+ case -ENONET:
+ if (e->error == -ERFKILL)
+ return;
+ case -ERFKILL:
+ if (e->error == -EADDRNOTAVAIL)
+ return;
+ case -EADDRNOTAVAIL:
+ if (e->error == -ENETUNREACH)
+ return;
+ case -ENETUNREACH:
+ if (e->error == -EHOSTUNREACH)
+ return;
+ case -EHOSTUNREACH:
+ if (e->error == -EHOSTDOWN)
+ return;
+ case -EHOSTDOWN:
+ if (e->error == -ECONNREFUSED)
+ return;
+ case -ECONNREFUSED:
+ if (e->error == -ECONNRESET)
+ return;
+ case -ECONNRESET: /* Responded, but call expired. */
+ if (e->responded)
+ return;
+ e->error = error;
+ return;
+
+ case -ECONNABORTED:
+ e->responded = true;
+ e->error = afs_abort_to_error(abort_code);
+ return;
+ }
+}
goto error_no_page;
}
- ret = -EIO;
- if (PageError(page))
+ if (PageError(page)) {
+ ret = afs_bad(AFS_FS_I(d_inode(mntpt)), afs_file_error_mntpt);
goto error;
+ }
buf = kmap_atomic(page);
memcpy(devname, buf, size);
#include <linux/uaccess.h>
#include "internal.h"
+struct afs_vl_seq_net_private {
+ struct seq_net_private seq; /* Must be first */
+ struct afs_vlserver_list *vllist;
+};
+
static inline struct afs_net *afs_seq2net(struct seq_file *m)
{
return afs_net(seq_file_net(m));
*/
static int afs_proc_cells_show(struct seq_file *m, void *v)
{
- struct afs_cell *cell = list_entry(v, struct afs_cell, proc_link);
+ struct afs_vlserver_list *vllist;
+ struct afs_cell *cell;
if (v == SEQ_START_TOKEN) {
/* display header on line 1 */
- seq_puts(m, "USE NAME\n");
+ seq_puts(m, "USE TTL SV NAME\n");
return 0;
}
+ cell = list_entry(v, struct afs_cell, proc_link);
+ vllist = rcu_dereference(cell->vl_servers);
+
/* display one cell per line on subsequent lines */
- seq_printf(m, "%3u %s\n", atomic_read(&cell->usage), cell->name);
+ seq_printf(m, "%3u %6lld %2u %s\n",
+ atomic_read(&cell->usage),
+ cell->dns_expiry - ktime_get_real_seconds(),
+ vllist ? vllist->nr_servers : 0,
+ cell->name);
return 0;
}
return 0;
}
- seq_printf(m, "%3d %08x %s\n",
+ seq_printf(m, "%3d %08llx %s\n",
atomic_read(&vol->usage), vol->vid,
afs_vol_types[vol->type]);
.show = afs_proc_cell_volumes_show,
};
+static const char *const dns_record_sources[NR__dns_record_source + 1] = {
+ [DNS_RECORD_UNAVAILABLE] = "unav",
+ [DNS_RECORD_FROM_CONFIG] = "cfg",
+ [DNS_RECORD_FROM_DNS_A] = "A",
+ [DNS_RECORD_FROM_DNS_AFSDB] = "AFSDB",
+ [DNS_RECORD_FROM_DNS_SRV] = "SRV",
+ [DNS_RECORD_FROM_NSS] = "nss",
+ [NR__dns_record_source] = "[weird]"
+};
+
+static const char *const dns_lookup_statuses[NR__dns_lookup_status + 1] = {
+ [DNS_LOOKUP_NOT_DONE] = "no-lookup",
+ [DNS_LOOKUP_GOOD] = "good",
+ [DNS_LOOKUP_GOOD_WITH_BAD] = "good/bad",
+ [DNS_LOOKUP_BAD] = "bad",
+ [DNS_LOOKUP_GOT_NOT_FOUND] = "not-found",
+ [DNS_LOOKUP_GOT_LOCAL_FAILURE] = "local-failure",
+ [DNS_LOOKUP_GOT_TEMP_FAILURE] = "temp-failure",
+ [DNS_LOOKUP_GOT_NS_FAILURE] = "ns-failure",
+ [NR__dns_lookup_status] = "[weird]"
+};
+
/*
* Display the list of Volume Location servers we're using for a cell.
*/
static int afs_proc_cell_vlservers_show(struct seq_file *m, void *v)
{
- struct sockaddr_rxrpc *addr = v;
+ const struct afs_vl_seq_net_private *priv = m->private;
+ const struct afs_vlserver_list *vllist = priv->vllist;
+ const struct afs_vlserver_entry *entry;
+ const struct afs_vlserver *vlserver;
+ const struct afs_addr_list *alist;
+ int i;
- /* display header on line 1 */
- if (v == (void *)1) {
- seq_puts(m, "ADDRESS\n");
+ if (v == SEQ_START_TOKEN) {
+ seq_printf(m, "# source %s, status %s\n",
+ dns_record_sources[vllist->source],
+ dns_lookup_statuses[vllist->status]);
return 0;
}
- /* display one cell per line on subsequent lines */
- seq_printf(m, "%pISp\n", &addr->transport);
+ entry = v;
+ vlserver = entry->server;
+ alist = rcu_dereference(vlserver->addresses);
+
+ seq_printf(m, "%s [p=%hu w=%hu s=%s,%s]:\n",
+ vlserver->name, entry->priority, entry->weight,
+ dns_record_sources[alist ? alist->source : entry->source],
+ dns_lookup_statuses[alist ? alist->status : entry->status]);
+ if (alist) {
+ for (i = 0; i < alist->nr_addrs; i++)
+ seq_printf(m, " %c %pISpc\n",
+ alist->preferred == i ? '>' : '-',
+ &alist->addrs[i].transport);
+ }
return 0;
}
static void *afs_proc_cell_vlservers_start(struct seq_file *m, loff_t *_pos)
__acquires(rcu)
{
- struct afs_addr_list *alist;
+ struct afs_vl_seq_net_private *priv = m->private;
+ struct afs_vlserver_list *vllist;
struct afs_cell *cell = PDE_DATA(file_inode(m->file));
loff_t pos = *_pos;
rcu_read_lock();
- alist = rcu_dereference(cell->vl_addrs);
+ vllist = rcu_dereference(cell->vl_servers);
+ priv->vllist = vllist;
- /* allow for the header line */
- if (!pos)
- return (void *) 1;
- pos--;
+ if (pos < 0)
+ *_pos = pos = 0;
+ if (pos == 0)
+ return SEQ_START_TOKEN;
- if (!alist || pos >= alist->nr_addrs)
+ if (!vllist || pos - 1 >= vllist->nr_servers)
return NULL;
- return alist->addrs + pos;
+ return &vllist->servers[pos - 1];
}
static void *afs_proc_cell_vlservers_next(struct seq_file *m, void *v,
loff_t *_pos)
{
- struct afs_addr_list *alist;
- struct afs_cell *cell = PDE_DATA(file_inode(m->file));
+ struct afs_vl_seq_net_private *priv = m->private;
+ struct afs_vlserver_list *vllist = priv->vllist;
loff_t pos;
- alist = rcu_dereference(cell->vl_addrs);
-
pos = *_pos;
- (*_pos)++;
- if (!alist || pos >= alist->nr_addrs)
+ pos++;
+ *_pos = pos;
+ if (!vllist || pos - 1 >= vllist->nr_servers)
return NULL;
- return alist->addrs + pos;
+ return &vllist->servers[pos - 1];
}
static void afs_proc_cell_vlservers_stop(struct seq_file *m, void *v)
&server->uuid,
atomic_read(&server->usage),
&alist->addrs[0].transport,
- alist->index == 0 ? "*" : "");
+ alist->preferred == 0 ? "*" : "");
for (i = 1; i < alist->nr_addrs; i++)
seq_printf(m, " %pISpc%s\n",
&alist->addrs[i].transport,
- alist->index == i ? "*" : "");
+ alist->preferred == i ? "*" : "");
return 0;
}
if (!proc_create_net_data("vlservers", 0444, dir,
&afs_proc_cell_vlservers_ops,
- sizeof(struct seq_net_private),
+ sizeof(struct afs_vl_seq_net_private),
cell) ||
!proc_create_net_data("volumes", 0444, dir,
&afs_proc_cell_volumes_ops,
--- /dev/null
+/* YFS protocol bits
+ *
+ * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#define YFS_FS_SERVICE 2500
+#define YFS_CM_SERVICE 2501
+
+#define YFSCBMAX 1024
+
+enum YFS_CM_Operations {
+ YFSCBProbe = 206, /* probe client */
+ YFSCBGetLock = 207, /* get contents of CM lock table */
+ YFSCBXStatsVersion = 209, /* get version of extended statistics */
+ YFSCBGetXStats = 210, /* get contents of extended statistics data */
+ YFSCBInitCallBackState3 = 213, /* initialise callback state, version 3 */
+ YFSCBProbeUuid = 214, /* check the client hasn't rebooted */
+ YFSCBGetServerPrefs = 215,
+ YFSCBGetCellServDV = 216,
+ YFSCBGetLocalCell = 217,
+ YFSCBGetCacheConfig = 218,
+ YFSCBGetCellByNum = 65537,
+ YFSCBTellMeAboutYourself = 65538, /* get client capabilities */
+ YFSCBCallBack = 64204,
+};
+
+enum YFS_FS_Operations {
+ YFSFETCHACL = 64131, /* YFS Fetch file ACL */
+ YFSFETCHSTATUS = 64132, /* YFS Fetch file status */
+ YFSSTOREACL = 64134, /* YFS Store file ACL */
+ YFSSTORESTATUS = 64135, /* YFS Store file status */
+ YFSREMOVEFILE = 64136, /* YFS Remove a file */
+ YFSCREATEFILE = 64137, /* YFS Create a file */
+ YFSRENAME = 64138, /* YFS Rename or move a file or directory */
+ YFSSYMLINK = 64139, /* YFS Create a symbolic link */
+ YFSLINK = 64140, /* YFS Create a hard link */
+ YFSMAKEDIR = 64141, /* YFS Create a directory */
+ YFSREMOVEDIR = 64142, /* YFS Remove a directory */
+ YFSGETVOLUMESTATUS = 64149, /* YFS Get volume status information */
+ YFSSETVOLUMESTATUS = 64150, /* YFS Set volume status information */
+ YFSSETLOCK = 64156, /* YFS Request a file lock */
+ YFSEXTENDLOCK = 64157, /* YFS Extend a file lock */
+ YFSRELEASELOCK = 64158, /* YFS Release a file lock */
+ YFSLOOKUP = 64161, /* YFS lookup file in directory */
+ YFSFLUSHCPS = 64165,
+ YFSFETCHOPAQUEACL = 64168,
+ YFSWHOAMI = 64170,
+ YFSREMOVEACL = 64171,
+ YFSREMOVEFILE2 = 64173,
+ YFSSTOREOPAQUEACL2 = 64174,
+ YFSINLINEBULKSTATUS = 64536, /* YFS Fetch multiple file statuses with errors */
+ YFSFETCHDATA64 = 64537, /* YFS Fetch file data */
+ YFSSTOREDATA64 = 64538, /* YFS Store file data */
+ YFSUPDATESYMLINK = 64540,
+};
+
+struct yfs_xdr_u64 {
+ __be32 msw;
+ __be32 lsw;
+} __packed;
+
+static inline u64 xdr_to_u64(const struct yfs_xdr_u64 x)
+{
+ return ((u64)ntohl(x.msw) << 32) | ntohl(x.lsw);
+}
+
+static inline struct yfs_xdr_u64 u64_to_xdr(const u64 x)
+{
+ return (struct yfs_xdr_u64){ .msw = htonl(x >> 32), .lsw = htonl(x) };
+}
+
+struct yfs_xdr_vnode {
+ struct yfs_xdr_u64 lo;
+ __be32 hi;
+ __be32 unique;
+} __packed;
+
+struct yfs_xdr_YFSFid {
+ struct yfs_xdr_u64 volume;
+ struct yfs_xdr_vnode vnode;
+} __packed;
+
+
+struct yfs_xdr_YFSFetchStatus {
+ __be32 type;
+ __be32 nlink;
+ struct yfs_xdr_u64 size;
+ struct yfs_xdr_u64 data_version;
+ struct yfs_xdr_u64 author;
+ struct yfs_xdr_u64 owner;
+ struct yfs_xdr_u64 group;
+ __be32 mode;
+ __be32 caller_access;
+ __be32 anon_access;
+ struct yfs_xdr_vnode parent;
+ __be32 data_access_protocol;
+ struct yfs_xdr_u64 mtime_client;
+ struct yfs_xdr_u64 mtime_server;
+ __be32 lock_count;
+ __be32 abort_code;
+} __packed;
+
+struct yfs_xdr_YFSCallBack {
+ __be32 version;
+ struct yfs_xdr_u64 expiration_time;
+ __be32 type;
+} __packed;
+
+struct yfs_xdr_YFSStoreStatus {
+ __be32 mask;
+ __be32 mode;
+ struct yfs_xdr_u64 mtime_client;
+ struct yfs_xdr_u64 owner;
+ struct yfs_xdr_u64 group;
+} __packed;
+
+struct yfs_xdr_RPCFlags {
+ __be32 rpc_flags;
+} __packed;
+
+struct yfs_xdr_YFSVolSync {
+ struct yfs_xdr_u64 vol_creation_date;
+ struct yfs_xdr_u64 vol_update_date;
+ struct yfs_xdr_u64 max_quota;
+ struct yfs_xdr_u64 blocks_in_use;
+ struct yfs_xdr_u64 blocks_avail;
+} __packed;
+
+enum yfs_volume_type {
+ yfs_volume_type_ro = 0,
+ yfs_volume_type_rw = 1,
+};
+
+#define yfs_FVSOnline 0x1
+#define yfs_FVSInservice 0x2
+#define yfs_FVSBlessed 0x4
+#define yfs_FVSNeedsSalvage 0x8
+
+struct yfs_xdr_YFSFetchVolumeStatus {
+ struct yfs_xdr_u64 vid;
+ struct yfs_xdr_u64 parent_id;
+ __be32 flags;
+ __be32 type;
+ struct yfs_xdr_u64 max_quota;
+ struct yfs_xdr_u64 blocks_in_use;
+ struct yfs_xdr_u64 part_blocks_avail;
+ struct yfs_xdr_u64 part_max_blocks;
+ struct yfs_xdr_u64 vol_copy_date;
+ struct yfs_xdr_u64 vol_backup_date;
+} __packed;
+
+struct yfs_xdr_YFSStoreVolumeStatus {
+ __be32 mask;
+ struct yfs_xdr_u64 min_quota;
+ struct yfs_xdr_u64 max_quota;
+ struct yfs_xdr_u64 file_quota;
+} __packed;
#include "internal.h"
#include "afs_fs.h"
-/*
- * Initialise a filesystem server cursor for iterating over FS servers.
- */
-static void afs_init_fs_cursor(struct afs_fs_cursor *fc, struct afs_vnode *vnode)
-{
- memset(fc, 0, sizeof(*fc));
-}
-
/*
* Begin an operation on the fileserver.
*
bool afs_begin_vnode_operation(struct afs_fs_cursor *fc, struct afs_vnode *vnode,
struct key *key)
{
- afs_init_fs_cursor(fc, vnode);
+ memset(fc, 0, sizeof(*fc));
fc->vnode = vnode;
fc->key = key;
fc->ac.error = SHRT_MAX;
+ fc->error = -EDESTADDRREQ;
if (mutex_lock_interruptible(&vnode->io_lock) < 0) {
- fc->ac.error = -EINTR;
+ fc->error = -EINTR;
fc->flags |= AFS_FS_CURSOR_STOP;
return false;
}
fc->server_list = afs_get_serverlist(vnode->volume->servers);
read_unlock(&vnode->volume->servers_lock);
+ fc->untried = (1UL << fc->server_list->nr_servers) - 1;
+ fc->index = READ_ONCE(fc->server_list->preferred);
+
cbi = vnode->cb_interest;
if (cbi) {
/* See if the vnode's preferred record is still available */
for (i = 0; i < fc->server_list->nr_servers; i++) {
if (fc->server_list->servers[i].cb_interest == cbi) {
- fc->start = i;
+ fc->index = i;
goto found_interest;
}
}
* and have to return an error.
*/
if (fc->flags & AFS_FS_CURSOR_CUR_ONLY) {
- fc->ac.error = -ESTALE;
+ fc->error = -ESTALE;
return false;
}
afs_put_cb_interest(afs_v2net(vnode), cbi);
cbi = NULL;
- } else {
- fc->start = READ_ONCE(fc->server_list->index);
}
found_interest:
- fc->index = fc->start;
return true;
}
default: m = "busy"; break;
}
- pr_notice("kAFS: Volume %u '%s' is %s\n", volume->vid, volume->name, m);
+ pr_notice("kAFS: Volume %llu '%s' is %s\n", volume->vid, volume->name, m);
}
/*
{
msleep_interruptible(1000);
if (signal_pending(current)) {
- fc->ac.error = -ERESTARTSYS;
+ fc->error = -ERESTARTSYS;
return false;
}
struct afs_addr_list *alist;
struct afs_server *server;
struct afs_vnode *vnode = fc->vnode;
+ struct afs_error e;
+ u32 rtt;
+ int error = fc->ac.error, i;
- _enter("%u/%u,%u/%u,%d,%d",
- fc->index, fc->start,
- fc->ac.index, fc->ac.start,
- fc->ac.error, fc->ac.abort_code);
+ _enter("%lx[%d],%lx[%d],%d,%d",
+ fc->untried, fc->index,
+ fc->ac.tried, fc->ac.index,
+ error, fc->ac.abort_code);
if (fc->flags & AFS_FS_CURSOR_STOP) {
_leave(" = f [stopped]");
return false;
}
+ fc->nr_iterations++;
+
/* Evaluate the result of the previous operation, if there was one. */
- switch (fc->ac.error) {
+ switch (error) {
case SHRT_MAX:
goto start;
case 0:
default:
/* Success or local failure. Stop. */
+ fc->error = error;
fc->flags |= AFS_FS_CURSOR_STOP;
- _leave(" = f [okay/local %d]", fc->ac.error);
+ _leave(" = f [okay/local %d]", error);
return false;
case -ECONNABORTED:
* - May indicate that the fileserver couldn't attach to the vol.
*/
if (fc->flags & AFS_FS_CURSOR_VNOVOL) {
- fc->ac.error = -EREMOTEIO;
+ fc->error = -EREMOTEIO;
goto next_server;
}
write_unlock(&vnode->volume->servers_lock);
set_bit(AFS_VOLUME_NEEDS_UPDATE, &vnode->volume->flags);
- fc->ac.error = afs_check_volume_status(vnode->volume, fc->key);
- if (fc->ac.error < 0)
- goto failed;
+ error = afs_check_volume_status(vnode->volume, fc->key);
+ if (error < 0)
+ goto failed_set_error;
if (test_bit(AFS_VOLUME_DELETED, &vnode->volume->flags)) {
- fc->ac.error = -ENOMEDIUM;
+ fc->error = -ENOMEDIUM;
goto failed;
}
* it's the fileserver having trouble.
*/
if (vnode->volume->servers == fc->server_list) {
- fc->ac.error = -EREMOTEIO;
+ fc->error = -EREMOTEIO;
goto next_server;
}
case VONLINE:
case VDISKFULL:
case VOVERQUOTA:
- fc->ac.error = afs_abort_to_error(fc->ac.abort_code);
+ fc->error = afs_abort_to_error(fc->ac.abort_code);
goto next_server;
case VOFFLINE:
clear_bit(AFS_VOLUME_BUSY, &vnode->volume->flags);
}
if (fc->flags & AFS_FS_CURSOR_NO_VSLEEP) {
- fc->ac.error = -EADV;
+ fc->error = -EADV;
goto failed;
}
if (fc->flags & AFS_FS_CURSOR_CUR_ONLY) {
- fc->ac.error = -ESTALE;
+ fc->error = -ESTALE;
goto failed;
}
goto busy;
* have a file lock we need to maintain.
*/
if (fc->flags & AFS_FS_CURSOR_NO_VSLEEP) {
- fc->ac.error = -EBUSY;
+ fc->error = -EBUSY;
goto failed;
}
if (!test_and_set_bit(AFS_VOLUME_BUSY, &vnode->volume->flags)) {
* honour, just in case someone sets up a loop.
*/
if (fc->flags & AFS_FS_CURSOR_VMOVED) {
- fc->ac.error = -EREMOTEIO;
+ fc->error = -EREMOTEIO;
goto failed;
}
fc->flags |= AFS_FS_CURSOR_VMOVED;
set_bit(AFS_VOLUME_WAIT, &vnode->volume->flags);
set_bit(AFS_VOLUME_NEEDS_UPDATE, &vnode->volume->flags);
- fc->ac.error = afs_check_volume_status(vnode->volume, fc->key);
- if (fc->ac.error < 0)
- goto failed;
+ error = afs_check_volume_status(vnode->volume, fc->key);
+ if (error < 0)
+ goto failed_set_error;
/* If the server list didn't change, then the VLDB is
* out of sync with the fileservers. This is hopefully
* TODO: Retry a few times with sleeps.
*/
if (vnode->volume->servers == fc->server_list) {
- fc->ac.error = -ENOMEDIUM;
+ fc->error = -ENOMEDIUM;
goto failed;
}
default:
clear_bit(AFS_VOLUME_OFFLINE, &vnode->volume->flags);
clear_bit(AFS_VOLUME_BUSY, &vnode->volume->flags);
- fc->ac.error = afs_abort_to_error(fc->ac.abort_code);
+ fc->error = afs_abort_to_error(fc->ac.abort_code);
goto failed;
}
+ case -ETIMEDOUT:
+ case -ETIME:
+ if (fc->error != -EDESTADDRREQ)
+ goto iterate_address;
+ /* Fall through */
+ case -ERFKILL:
+ case -EADDRNOTAVAIL:
case -ENETUNREACH:
case -EHOSTUNREACH:
+ case -EHOSTDOWN:
case -ECONNREFUSED:
- case -ETIMEDOUT:
- case -ETIME:
_debug("no conn");
+ fc->error = error;
goto iterate_address;
case -ECONNRESET:
_debug("call reset");
+ fc->error = error;
goto failed;
}
/* See if we need to do an update of the volume record. Note that the
* volume may have moved or even have been deleted.
*/
- fc->ac.error = afs_check_volume_status(vnode->volume, fc->key);
- if (fc->ac.error < 0)
- goto failed;
+ error = afs_check_volume_status(vnode->volume, fc->key);
+ if (error < 0)
+ goto failed_set_error;
if (!afs_start_fs_iteration(fc, vnode))
goto failed;
-use_server:
- _debug("use");
+ _debug("__ VOL %llx __", vnode->volume->vid);
+ error = afs_probe_fileservers(afs_v2net(vnode), fc->key, fc->server_list);
+ if (error < 0)
+ goto failed_set_error;
+
+pick_server:
+ _debug("pick [%lx]", fc->untried);
+
+ error = afs_wait_for_fs_probes(fc->server_list, fc->untried);
+ if (error < 0)
+ goto failed_set_error;
+
+ /* Pick the untried server with the lowest RTT. If we have outstanding
+ * callbacks, we stick with the server we're already using if we can.
+ */
+ if (fc->cbi) {
+ _debug("cbi %u", fc->index);
+ if (test_bit(fc->index, &fc->untried))
+ goto selected_server;
+ afs_put_cb_interest(afs_v2net(vnode), fc->cbi);
+ fc->cbi = NULL;
+ _debug("nocbi");
+ }
+
+ fc->index = -1;
+ rtt = U32_MAX;
+ for (i = 0; i < fc->server_list->nr_servers; i++) {
+ struct afs_server *s = fc->server_list->servers[i].server;
+
+ if (!test_bit(i, &fc->untried) || !s->probe.responded)
+ continue;
+ if (s->probe.rtt < rtt) {
+ fc->index = i;
+ rtt = s->probe.rtt;
+ }
+ }
+
+ if (fc->index == -1)
+ goto no_more_servers;
+
+selected_server:
+ _debug("use %d", fc->index);
+ __clear_bit(fc->index, &fc->untried);
+
/* We're starting on a different fileserver from the list. We need to
* check it, create a callback intercept, find its address list and
* probe its capabilities before we use it.
* break request before we've finished decoding the reply and
* installing the vnode.
*/
- fc->ac.error = afs_register_server_cb_interest(vnode, fc->server_list,
- fc->index);
- if (fc->ac.error < 0)
- goto failed;
+ error = afs_register_server_cb_interest(vnode, fc->server_list,
+ fc->index);
+ if (error < 0)
+ goto failed_set_error;
fc->cbi = afs_get_cb_interest(vnode->cb_interest);
memset(&fc->ac, 0, sizeof(fc->ac));
- /* Probe the current fileserver if we haven't done so yet. */
- if (!test_bit(AFS_SERVER_FL_PROBED, &server->flags)) {
- fc->ac.alist = afs_get_addrlist(alist);
-
- if (!afs_probe_fileserver(fc)) {
- switch (fc->ac.error) {
- case -ENOMEM:
- case -ERESTARTSYS:
- case -EINTR:
- goto failed;
- default:
- goto next_server;
- }
- }
- }
-
if (!fc->ac.alist)
fc->ac.alist = alist;
else
afs_put_addrlist(alist);
- fc->ac.start = READ_ONCE(alist->index);
- fc->ac.index = fc->ac.start;
+ fc->ac.index = -1;
iterate_address:
ASSERT(fc->ac.alist);
- _debug("iterate %d/%d", fc->ac.index, fc->ac.alist->nr_addrs);
/* Iterate over the current server's address list to try and find an
* address on which it will respond to us.
*/
if (!afs_iterate_addresses(&fc->ac))
goto next_server;
+ _debug("address [%u] %u/%u", fc->index, fc->ac.index, fc->ac.alist->nr_addrs);
+
_leave(" = t");
return true;
next_server:
_debug("next");
afs_end_cursor(&fc->ac);
- afs_put_cb_interest(afs_v2net(vnode), fc->cbi);
- fc->cbi = NULL;
- fc->index++;
- if (fc->index >= fc->server_list->nr_servers)
- fc->index = 0;
- if (fc->index != fc->start)
- goto use_server;
+ goto pick_server;
+no_more_servers:
/* That's all the servers poked to no good effect. Try again if some
* of them were busy.
*/
if (fc->flags & AFS_FS_CURSOR_VBUSY)
goto restart_from_beginning;
- fc->ac.error = -EDESTADDRREQ;
- goto failed;
+ e.error = -EDESTADDRREQ;
+ e.responded = false;
+ for (i = 0; i < fc->server_list->nr_servers; i++) {
+ struct afs_server *s = fc->server_list->servers[i].server;
+
+ afs_prioritise_error(&e, READ_ONCE(s->probe.error),
+ s->probe.abort_code);
+ }
+failed_set_error:
+ fc->error = error;
failed:
fc->flags |= AFS_FS_CURSOR_STOP;
afs_end_cursor(&fc->ac);
- _leave(" = f [failed %d]", fc->ac.error);
+ _leave(" = f [failed %d]", fc->error);
return false;
}
struct afs_vnode *vnode = fc->vnode;
struct afs_cb_interest *cbi = vnode->cb_interest;
struct afs_addr_list *alist;
+ int error = fc->ac.error;
_enter("");
- switch (fc->ac.error) {
+ switch (error) {
case SHRT_MAX:
if (!cbi) {
- fc->ac.error = -ESTALE;
+ fc->error = -ESTALE;
fc->flags |= AFS_FS_CURSOR_STOP;
return false;
}
afs_get_addrlist(alist);
read_unlock(&cbi->server->fs_lock);
if (!alist) {
- fc->ac.error = -ESTALE;
+ fc->error = -ESTALE;
fc->flags |= AFS_FS_CURSOR_STOP;
return false;
}
memset(&fc->ac, 0, sizeof(fc->ac));
fc->ac.alist = alist;
- fc->ac.start = READ_ONCE(alist->index);
- fc->ac.index = fc->ac.start;
+ fc->ac.index = -1;
goto iterate_address;
case 0:
default:
/* Success or local failure. Stop. */
+ fc->error = error;
fc->flags |= AFS_FS_CURSOR_STOP;
- _leave(" = f [okay/local %d]", fc->ac.error);
+ _leave(" = f [okay/local %d]", error);
return false;
case -ECONNABORTED:
+ fc->error = afs_abort_to_error(fc->ac.abort_code);
fc->flags |= AFS_FS_CURSOR_STOP;
_leave(" = f [abort]");
return false;
+ case -ERFKILL:
+ case -EADDRNOTAVAIL:
case -ENETUNREACH:
case -EHOSTUNREACH:
+ case -EHOSTDOWN:
case -ECONNREFUSED:
case -ETIMEDOUT:
case -ETIME:
_debug("no conn");
+ fc->error = error;
goto iterate_address;
}
return false;
}
+/*
+ * Dump cursor state in the case of the error being EDESTADDRREQ.
+ */
+static void afs_dump_edestaddrreq(const struct afs_fs_cursor *fc)
+{
+ static int count;
+ int i;
+
+ if (!IS_ENABLED(CONFIG_AFS_DEBUG_CURSOR) || count > 3)
+ return;
+ count++;
+
+ rcu_read_lock();
+
+ pr_notice("EDESTADDR occurred\n");
+ pr_notice("FC: cbb=%x cbb2=%x fl=%hx err=%hd\n",
+ fc->cb_break, fc->cb_break_2, fc->flags, fc->error);
+ pr_notice("FC: ut=%lx ix=%d ni=%u\n",
+ fc->untried, fc->index, fc->nr_iterations);
+
+ if (fc->server_list) {
+ const struct afs_server_list *sl = fc->server_list;
+ pr_notice("FC: SL nr=%u pr=%u vnov=%hx\n",
+ sl->nr_servers, sl->preferred, sl->vnovol_mask);
+ for (i = 0; i < sl->nr_servers; i++) {
+ const struct afs_server *s = sl->servers[i].server;
+ pr_notice("FC: server fl=%lx av=%u %pU\n",
+ s->flags, s->addr_version, &s->uuid);
+ if (s->addresses) {
+ const struct afs_addr_list *a =
+ rcu_dereference(s->addresses);
+ pr_notice("FC: - av=%u nr=%u/%u/%u pr=%u\n",
+ a->version,
+ a->nr_ipv4, a->nr_addrs, a->max_addrs,
+ a->preferred);
+ pr_notice("FC: - pr=%lx R=%lx F=%lx\n",
+ a->probed, a->responded, a->failed);
+ if (a == fc->ac.alist)
+ pr_notice("FC: - current\n");
+ }
+ }
+ }
+
+ pr_notice("AC: t=%lx ax=%u ac=%d er=%d r=%u ni=%u\n",
+ fc->ac.tried, fc->ac.index, fc->ac.abort_code, fc->ac.error,
+ fc->ac.responded, fc->ac.nr_iterations);
+ rcu_read_unlock();
+}
+
/*
* Tidy up a filesystem cursor and unlock the vnode.
*/
int afs_end_vnode_operation(struct afs_fs_cursor *fc)
{
struct afs_net *net = afs_v2net(fc->vnode);
- int ret;
+
+ if (fc->error == -EDESTADDRREQ ||
+ fc->error == -EADDRNOTAVAIL ||
+ fc->error == -ENETUNREACH ||
+ fc->error == -EHOSTUNREACH)
+ afs_dump_edestaddrreq(fc);
mutex_unlock(&fc->vnode->io_lock);
afs_put_cb_interest(net, fc->cbi);
afs_put_serverlist(net, fc->server_list);
- ret = fc->ac.error;
- if (ret == -ECONNABORTED)
- afs_abort_to_error(fc->ac.abort_code);
+ if (fc->error == -ECONNABORTED)
+ fc->error = afs_abort_to_error(fc->ac.abort_code);
- return fc->ac.error;
+ return fc->error;
}
#include <net/af_rxrpc.h>
#include "internal.h"
#include "afs_cm.h"
+#include "protocol_yfs.h"
struct workqueue_struct *afs_async_calls;
if (ret < 0)
goto error_2;
+ srx.srx_service = YFS_CM_SERVICE;
+ ret = kernel_bind(socket, (struct sockaddr *) &srx, sizeof(srx));
+ if (ret < 0)
+ goto error_2;
+
+ /* Ideally, we'd turn on service upgrade here, but we can't because
+ * OpenAFS is buggy and leaks the userStatus field from packet to
+ * packet and between FS packets and CB packets - so if we try to do an
+ * upgrade on an FS packet, OpenAFS will leak that into the CB packet
+ * it sends back to us.
+ */
+
rxrpc_kernel_new_call_notification(socket, afs_rx_new_call,
afs_rx_discard_new_call);
INIT_WORK(&call->async_work, afs_process_async_call);
init_waitqueue_head(&call->waitq);
spin_lock_init(&call->state_lock);
+ call->_iter = &call->iter;
o = atomic_inc_return(&net->nr_outstanding_calls);
trace_afs_call(call, afs_call_trace_alloc, 1, o,
afs_put_server(call->net, call->cm_server);
afs_put_cb_interest(call->net, call->cbi);
+ afs_put_addrlist(call->alist);
kfree(call->request);
trace_afs_call(call, afs_call_trace_free, 0, o,
}
/*
- * Queue the call for actual work. Returns 0 unconditionally for convenience.
+ * Queue the call for actual work.
*/
-int afs_queue_call_work(struct afs_call *call)
+static void afs_queue_call_work(struct afs_call *call)
{
- int u = atomic_inc_return(&call->usage);
+ if (call->type->work) {
+ int u = atomic_inc_return(&call->usage);
- trace_afs_call(call, afs_call_trace_work, u,
- atomic_read(&call->net->nr_outstanding_calls),
- __builtin_return_address(0));
+ trace_afs_call(call, afs_call_trace_work, u,
+ atomic_read(&call->net->nr_outstanding_calls),
+ __builtin_return_address(0));
- INIT_WORK(&call->work, call->type->work);
+ INIT_WORK(&call->work, call->type->work);
- if (!queue_work(afs_wq, &call->work))
- afs_put_call(call);
- return 0;
+ if (!queue_work(afs_wq, &call->work))
+ afs_put_call(call);
+ }
}
/*
goto nomem_free;
}
+ afs_extract_to_buf(call, call->reply_max);
call->operation_ID = type->op;
init_waitqueue_head(&call->waitq);
return call;
offset = 0;
}
- iov_iter_bvec(&msg->msg_iter, WRITE | ITER_BVEC, bv, nr, bytes);
+ iov_iter_bvec(&msg->msg_iter, WRITE, bv, nr, bytes);
}
/*
long afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call,
gfp_t gfp, bool async)
{
- struct sockaddr_rxrpc *srx = ac->addr;
+ struct sockaddr_rxrpc *srx = &ac->alist->addrs[ac->index];
struct rxrpc_call *rxcall;
struct msghdr msg;
struct kvec iov[1];
atomic_read(&call->net->nr_outstanding_calls));
call->async = async;
+ call->addr_ix = ac->index;
+ call->alist = afs_get_addrlist(ac->alist);
/* Work out the length we're going to transmit. This is awkward for
* calls such as FS.StoreData where there's an extra injection of data
call->debug_id);
if (IS_ERR(rxcall)) {
ret = PTR_ERR(rxcall);
+ call->error = ret;
goto error_kill_call;
}
msg.msg_name = NULL;
msg.msg_namelen = 0;
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, iov, 1,
- call->request_size);
+ iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, call->request_size);
msg.msg_control = NULL;
msg.msg_controllen = 0;
msg.msg_flags = MSG_WAITALL | (call->send_pages ? MSG_MORE : 0);
rxrpc_kernel_abort_call(call->net->socket, rxcall,
RX_USER_ABORT, ret, "KSD");
} else {
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, NULL, 0, 0);
+ iov_iter_kvec(&msg.msg_iter, READ, NULL, 0, 0);
rxrpc_kernel_recv_data(call->net->socket, rxcall,
&msg.msg_iter, false,
&call->abort_code, &call->service_id);
call->error = ret;
trace_afs_call_done(call);
error_kill_call:
+ if (call->type->done)
+ call->type->done(call);
afs_put_call(call);
ac->error = ret;
_leave(" = %d", ret);
state == AFS_CALL_SV_AWAIT_ACK
) {
if (state == AFS_CALL_SV_AWAIT_ACK) {
- struct iov_iter iter;
-
- iov_iter_kvec(&iter, READ | ITER_KVEC, NULL, 0, 0);
+ iov_iter_kvec(&call->iter, READ, NULL, 0, 0);
ret = rxrpc_kernel_recv_data(call->net->socket,
- call->rxcall, &iter, false,
- &remote_abort,
+ call->rxcall, &call->iter,
+ false, &remote_abort,
&call->service_id);
- trace_afs_recv_data(call, 0, 0, false, ret);
+ trace_afs_receive_data(call, &call->iter, false, ret);
if (ret == -EINPROGRESS || ret == -EAGAIN)
return;
return;
}
+ if (call->want_reply_time &&
+ rxrpc_kernel_get_reply_time(call->net->socket,
+ call->rxcall,
+ &call->reply_time))
+ call->want_reply_time = false;
+
ret = call->type->deliver(call);
state = READ_ONCE(call->state);
switch (ret) {
case 0:
+ afs_queue_call_work(call);
if (state == AFS_CALL_CL_PROC_REPLY) {
if (call->cbi)
set_bit(AFS_SERVER_FL_MAY_HAVE_CB,
case -EINPROGRESS:
case -EAGAIN:
goto out;
- case -EIO:
case -ECONNABORTED:
ASSERTCMP(state, ==, AFS_CALL_COMPLETE);
goto done;
rxrpc_kernel_abort_call(call->net->socket, call->rxcall,
abort_code, ret, "KIV");
goto local_abort;
+ case -EIO:
+ pr_err("kAFS: Call %u in bad state %u\n",
+ call->debug_id, state);
+ /* Fall through */
case -ENODATA:
case -EBADMSG:
case -EMSGSIZE:
if (state != AFS_CALL_CL_AWAIT_REPLY)
abort_code = RXGEN_SS_UNMARSHAL;
rxrpc_kernel_abort_call(call->net->socket, call->rxcall,
- abort_code, -EBADMSG, "KUM");
+ abort_code, ret, "KUM");
goto local_abort;
}
}
done:
+ if (call->type->done)
+ call->type->done(call);
if (state == AFS_CALL_COMPLETE && call->incoming)
afs_put_call(call);
out:
{
signed long rtt2, timeout;
long ret;
+ bool stalled = false;
u64 rtt;
u32 life, last_life;
life = rxrpc_kernel_check_life(call->net->socket, call->rxcall);
if (timeout == 0 &&
- life == last_life && signal_pending(current))
+ life == last_life && signal_pending(current)) {
+ if (stalled)
break;
+ __set_current_state(TASK_RUNNING);
+ rxrpc_kernel_probe_life(call->net->socket, call->rxcall);
+ timeout = rtt2;
+ stalled = true;
+ continue;
+ }
if (life != last_life) {
timeout = rtt2;
last_life = life;
+ stalled = false;
}
timeout = schedule_timeout(timeout);
call->async = true;
call->state = AFS_CALL_SV_AWAIT_OP_ID;
init_waitqueue_head(&call->waitq);
+ afs_extract_to_tmp(call);
}
if (rxrpc_kernel_charge_accept(net->socket,
{
int ret;
- _enter("{%zu}", call->offset);
-
- ASSERTCMP(call->offset, <, 4);
+ _enter("{%zu}", iov_iter_count(call->_iter));
/* the operation ID forms the first four bytes of the request data */
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
call->operation_ID = ntohl(call->tmp);
afs_set_call_state(call, AFS_CALL_SV_AWAIT_OP_ID, AFS_CALL_SV_AWAIT_REQUEST);
- call->offset = 0;
/* ask the cache manager to route the call (it'll change the call type
* if successful) */
msg.msg_name = NULL;
msg.msg_namelen = 0;
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, NULL, 0, 0);
+ iov_iter_kvec(&msg.msg_iter, WRITE, NULL, 0, 0);
msg.msg_control = NULL;
msg.msg_controllen = 0;
msg.msg_flags = 0;
iov[0].iov_len = len;
msg.msg_name = NULL;
msg.msg_namelen = 0;
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, iov, 1, len);
+ iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, len);
msg.msg_control = NULL;
msg.msg_controllen = 0;
msg.msg_flags = 0;
/*
* Extract a piece of data from the received data socket buffers.
*/
-int afs_extract_data(struct afs_call *call, void *buf, size_t count,
- bool want_more)
+int afs_extract_data(struct afs_call *call, bool want_more)
{
struct afs_net *net = call->net;
- struct iov_iter iter;
- struct kvec iov;
+ struct iov_iter *iter = call->_iter;
enum afs_call_state state;
u32 remote_abort = 0;
int ret;
- _enter("{%s,%zu},,%zu,%d",
- call->type->name, call->offset, count, want_more);
-
- ASSERTCMP(call->offset, <=, count);
-
- iov.iov_base = buf + call->offset;
- iov.iov_len = count - call->offset;
- iov_iter_kvec(&iter, ITER_KVEC | READ, &iov, 1, count - call->offset);
+ _enter("{%s,%zu},%d", call->type->name, iov_iter_count(iter), want_more);
- ret = rxrpc_kernel_recv_data(net->socket, call->rxcall, &iter,
+ ret = rxrpc_kernel_recv_data(net->socket, call->rxcall, iter,
want_more, &remote_abort,
&call->service_id);
- call->offset += (count - call->offset) - iov_iter_count(&iter);
- trace_afs_recv_data(call, count, call->offset, want_more, ret);
if (ret == 0 || ret == -EAGAIN)
return ret;
break;
case AFS_CALL_COMPLETE:
kdebug("prem complete %d", call->error);
- return -EIO;
+ return afs_io_error(call, afs_io_error_extract);
default:
break;
}
/*
* Log protocol error production.
*/
-noinline int afs_protocol_error(struct afs_call *call, int error)
+noinline int afs_protocol_error(struct afs_call *call, int error,
+ enum afs_eproto_cause cause)
{
- trace_afs_protocol_error(call, error, __builtin_return_address(0));
+ trace_afs_protocol_error(call, error, cause);
return error;
}
bool changed = false;
int i, j;
- _enter("{%x:%u},%x,%x",
+ _enter("{%llx:%llu},%x,%x",
vnode->fid.vid, vnode->fid.vnode, key_serial(key), caller_access);
rcu_read_lock();
break;
}
- if (cb_break != afs_cb_break_sum(vnode, vnode->cb_interest)) {
+ if (afs_cb_is_broken(cb_break, vnode,
+ vnode->cb_interest)) {
changed = true;
break;
}
}
}
- if (cb_break != afs_cb_break_sum(vnode, vnode->cb_interest))
+ if (afs_cb_is_broken(cb_break, vnode, vnode->cb_interest))
goto someone_else_changed_it;
/* We need a ref on any permits list we want to copy as we'll have to
spin_lock(&vnode->lock);
zap = rcu_access_pointer(vnode->permit_cache);
- if (cb_break == afs_cb_break_sum(vnode, vnode->cb_interest) &&
+ if (!afs_cb_is_broken(cb_break, vnode, vnode->cb_interest) &&
zap == permits)
rcu_assign_pointer(vnode->permit_cache, replacement);
else
bool valid = false;
int i, ret;
- _enter("{%x:%u},%x",
+ _enter("{%llx:%llu},%x",
vnode->fid.vid, vnode->fid.vnode, key_serial(key));
/* check the permits to see if we've got one yet */
if (mask & MAY_NOT_BLOCK)
return -ECHILD;
- _enter("{{%x:%u},%lx},%x,",
+ _enter("{{%llx:%llu},%lx},%x,",
vnode->fid.vid, vnode->fid.vnode, vnode->flags, mask);
key = afs_request_key(vnode->volume->cell);
#include <linux/slab.h>
#include "afs_fs.h"
#include "internal.h"
+#include "protocol_yfs.h"
static unsigned afs_server_gc_delay = 10; /* Server record timeout in seconds */
static unsigned afs_server_update_delay = 30; /* Time till VLDB recheck in secs */
rwlock_init(&server->fs_lock);
INIT_HLIST_HEAD(&server->cb_volumes);
rwlock_init(&server->cb_break_lock);
+ init_waitqueue_head(&server->probe_wq);
+ spin_lock_init(&server->probe_lock);
afs_inc_servers_outstanding(net);
_leave(" = %p", server);
static struct afs_addr_list *afs_vl_lookup_addrs(struct afs_cell *cell,
struct key *key, const uuid_t *uuid)
{
- struct afs_addr_cursor ac;
- struct afs_addr_list *alist;
+ struct afs_vl_cursor vc;
+ struct afs_addr_list *alist = NULL;
int ret;
- ret = afs_set_vl_cursor(&ac, cell);
- if (ret < 0)
- return ERR_PTR(ret);
-
- while (afs_iterate_addresses(&ac)) {
- if (test_bit(ac.index, &ac.alist->yfs))
- alist = afs_yfsvl_get_endpoints(cell->net, &ac, key, uuid);
- else
- alist = afs_vl_get_addrs_u(cell->net, &ac, key, uuid);
- switch (ac.error) {
- case 0:
- afs_end_cursor(&ac);
- return alist;
- case -ECONNABORTED:
- ac.error = afs_abort_to_error(ac.abort_code);
- goto error;
- case -ENOMEM:
- case -ENONET:
- goto error;
- case -ENETUNREACH:
- case -EHOSTUNREACH:
- case -ECONNREFUSED:
- break;
- default:
- ac.error = -EIO;
- goto error;
+ ret = -ERESTARTSYS;
+ if (afs_begin_vlserver_operation(&vc, cell, key)) {
+ while (afs_select_vlserver(&vc)) {
+ if (test_bit(AFS_VLSERVER_FL_IS_YFS, &vc.server->flags))
+ alist = afs_yfsvl_get_endpoints(&vc, uuid);
+ else
+ alist = afs_vl_get_addrs_u(&vc, uuid);
}
+
+ ret = afs_end_vlserver_operation(&vc);
}
-error:
- return ERR_PTR(afs_end_cursor(&ac));
+ return ret < 0 ? ERR_PTR(ret) : alist;
}
/*
struct afs_addr_list *alist = rcu_access_pointer(server->addresses);
struct afs_addr_cursor ac = {
.alist = alist,
- .start = alist->index,
- .index = 0,
- .addr = &alist->addrs[alist->index],
+ .index = alist->preferred,
.error = 0,
};
_enter("%p", server);
if (test_bit(AFS_SERVER_FL_MAY_HAVE_CB, &server->flags))
afs_fs_give_up_all_callbacks(net, server, &ac, NULL);
+ wait_var_event(&server->probe_outstanding,
+ atomic_read(&server->probe_outstanding) == 0);
+
call_rcu(&server->rcu, afs_server_rcu);
afs_dec_servers_outstanding(net);
}
_leave("");
}
-/*
- * Probe a fileserver to find its capabilities.
- *
- * TODO: Try service upgrade.
- */
-static bool afs_do_probe_fileserver(struct afs_fs_cursor *fc)
-{
- _enter("");
-
- fc->ac.addr = NULL;
- fc->ac.start = READ_ONCE(fc->ac.alist->index);
- fc->ac.index = fc->ac.start;
- fc->ac.error = 0;
- fc->ac.begun = false;
-
- while (afs_iterate_addresses(&fc->ac)) {
- afs_fs_get_capabilities(afs_v2net(fc->vnode), fc->cbi->server,
- &fc->ac, fc->key);
- switch (fc->ac.error) {
- case 0:
- afs_end_cursor(&fc->ac);
- set_bit(AFS_SERVER_FL_PROBED, &fc->cbi->server->flags);
- return true;
- case -ECONNABORTED:
- fc->ac.error = afs_abort_to_error(fc->ac.abort_code);
- goto error;
- case -ENOMEM:
- case -ENONET:
- goto error;
- case -ENETUNREACH:
- case -EHOSTUNREACH:
- case -ECONNREFUSED:
- case -ETIMEDOUT:
- case -ETIME:
- break;
- default:
- fc->ac.error = -EIO;
- goto error;
- }
- }
-
-error:
- afs_end_cursor(&fc->ac);
- return false;
-}
-
-/*
- * If we haven't already, try probing the fileserver to get its capabilities.
- * We try not to instigate parallel probes, but it's possible that the parallel
- * probes will fail due to authentication failure when ours would succeed.
- *
- * TODO: Try sending an anonymous probe if an authenticated probe fails.
- */
-bool afs_probe_fileserver(struct afs_fs_cursor *fc)
-{
- bool success;
- int ret, retries = 0;
-
- _enter("");
-
-retry:
- if (test_bit(AFS_SERVER_FL_PROBED, &fc->cbi->server->flags)) {
- _leave(" = t");
- return true;
- }
-
- if (!test_and_set_bit_lock(AFS_SERVER_FL_PROBING, &fc->cbi->server->flags)) {
- success = afs_do_probe_fileserver(fc);
- clear_bit_unlock(AFS_SERVER_FL_PROBING, &fc->cbi->server->flags);
- wake_up_bit(&fc->cbi->server->flags, AFS_SERVER_FL_PROBING);
- _leave(" = t");
- return success;
- }
-
- _debug("wait");
- ret = wait_on_bit(&fc->cbi->server->flags, AFS_SERVER_FL_PROBING,
- TASK_INTERRUPTIBLE);
- if (ret == -ERESTARTSYS) {
- fc->ac.error = ret;
- _leave(" = f [%d]", ret);
- return false;
- }
-
- retries++;
- if (retries == 4) {
- fc->ac.error = -ESTALE;
- _leave(" = f [stale]");
- return false;
- }
- _debug("retry");
- goto retry;
-}
-
/*
* Get an update for a server's address list.
*/
return false;
changed:
- /* Maintain the same current server as before if possible. */
- cur = old->servers[old->index].server;
+ /* Maintain the same preferred server as before if possible. */
+ cur = old->servers[old->preferred].server;
for (j = 0; j < new->nr_servers; j++) {
if (new->servers[j].server == cur) {
- new->index = j;
+ new->preferred = j;
break;
}
}
inode = afs_iget_pseudo_dir(sb, true);
sb->s_flags |= SB_RDONLY;
} else {
- sprintf(sb->s_id, "%u", as->volume->vid);
+ sprintf(sb->s_id, "%llu", as->volume->vid);
afs_activate_volume(as->volume);
fid.vid = as->volume->vid;
fid.vnode = 1;
+ fid.vnode_hi = 0;
fid.unique = 1;
inode = afs_iget(sb, params->key, &fid, NULL, NULL, NULL);
}
{
struct afs_vnode *vnode = AFS_FS_I(inode);
- _enter("%p{%x:%u}", inode, vnode->fid.vid, vnode->fid.vnode);
+ _enter("%p{%llx:%llu}", inode, vnode->fid.vid, vnode->fid.vnode);
_debug("DESTROY INODE %p", inode);
--- /dev/null
+/* AFS vlserver list management.
+ *
+ * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include "internal.h"
+
+struct afs_vlserver *afs_alloc_vlserver(const char *name, size_t name_len,
+ unsigned short port)
+{
+ struct afs_vlserver *vlserver;
+
+ vlserver = kzalloc(struct_size(vlserver, name, name_len + 1),
+ GFP_KERNEL);
+ if (vlserver) {
+ atomic_set(&vlserver->usage, 1);
+ rwlock_init(&vlserver->lock);
+ init_waitqueue_head(&vlserver->probe_wq);
+ spin_lock_init(&vlserver->probe_lock);
+ vlserver->name_len = name_len;
+ vlserver->port = port;
+ memcpy(vlserver->name, name, name_len);
+ }
+ return vlserver;
+}
+
+static void afs_vlserver_rcu(struct rcu_head *rcu)
+{
+ struct afs_vlserver *vlserver = container_of(rcu, struct afs_vlserver, rcu);
+
+ afs_put_addrlist(rcu_access_pointer(vlserver->addresses));
+ kfree_rcu(vlserver, rcu);
+}
+
+void afs_put_vlserver(struct afs_net *net, struct afs_vlserver *vlserver)
+{
+ if (vlserver) {
+ unsigned int u = atomic_dec_return(&vlserver->usage);
+ //_debug("VL PUT %p{%u}", vlserver, u);
+
+ if (u == 0)
+ call_rcu(&vlserver->rcu, afs_vlserver_rcu);
+ }
+}
+
+struct afs_vlserver_list *afs_alloc_vlserver_list(unsigned int nr_servers)
+{
+ struct afs_vlserver_list *vllist;
+
+ vllist = kzalloc(struct_size(vllist, servers, nr_servers), GFP_KERNEL);
+ if (vllist) {
+ atomic_set(&vllist->usage, 1);
+ rwlock_init(&vllist->lock);
+ }
+
+ return vllist;
+}
+
+void afs_put_vlserverlist(struct afs_net *net, struct afs_vlserver_list *vllist)
+{
+ if (vllist) {
+ unsigned int u = atomic_dec_return(&vllist->usage);
+
+ //_debug("VLLS PUT %p{%u}", vllist, u);
+ if (u == 0) {
+ int i;
+
+ for (i = 0; i < vllist->nr_servers; i++) {
+ afs_put_vlserver(net, vllist->servers[i].server);
+ }
+ kfree_rcu(vllist, rcu);
+ }
+ }
+}
+
+static u16 afs_extract_le16(const u8 **_b)
+{
+ u16 val;
+
+ val = (u16)*(*_b)++ << 0;
+ val |= (u16)*(*_b)++ << 8;
+ return val;
+}
+
+/*
+ * Build a VL server address list from a DNS queried server list.
+ */
+static struct afs_addr_list *afs_extract_vl_addrs(const u8 **_b, const u8 *end,
+ u8 nr_addrs, u16 port)
+{
+ struct afs_addr_list *alist;
+ const u8 *b = *_b;
+ int ret = -EINVAL;
+
+ alist = afs_alloc_addrlist(nr_addrs, VL_SERVICE, port);
+ if (!alist)
+ return ERR_PTR(-ENOMEM);
+ if (nr_addrs == 0)
+ return alist;
+
+ for (; nr_addrs > 0 && end - b >= nr_addrs; nr_addrs--) {
+ struct dns_server_list_v1_address hdr;
+ __be32 x[4];
+
+ hdr.address_type = *b++;
+
+ switch (hdr.address_type) {
+ case DNS_ADDRESS_IS_IPV4:
+ if (end - b < 4) {
+ _leave(" = -EINVAL [short inet]");
+ goto error;
+ }
+ memcpy(x, b, 4);
+ afs_merge_fs_addr4(alist, x[0], port);
+ b += 4;
+ break;
+
+ case DNS_ADDRESS_IS_IPV6:
+ if (end - b < 16) {
+ _leave(" = -EINVAL [short inet6]");
+ goto error;
+ }
+ memcpy(x, b, 16);
+ afs_merge_fs_addr6(alist, x, port);
+ b += 16;
+ break;
+
+ default:
+ _leave(" = -EADDRNOTAVAIL [unknown af %u]",
+ hdr.address_type);
+ ret = -EADDRNOTAVAIL;
+ goto error;
+ }
+ }
+
+ /* Start with IPv6 if available. */
+ if (alist->nr_ipv4 < alist->nr_addrs)
+ alist->preferred = alist->nr_ipv4;
+
+ *_b = b;
+ return alist;
+
+error:
+ *_b = b;
+ afs_put_addrlist(alist);
+ return ERR_PTR(ret);
+}
+
+/*
+ * Build a VL server list from a DNS queried server list.
+ */
+struct afs_vlserver_list *afs_extract_vlserver_list(struct afs_cell *cell,
+ const void *buffer,
+ size_t buffer_size)
+{
+ const struct dns_server_list_v1_header *hdr = buffer;
+ struct dns_server_list_v1_server bs;
+ struct afs_vlserver_list *vllist, *previous;
+ struct afs_addr_list *addrs;
+ struct afs_vlserver *server;
+ const u8 *b = buffer, *end = buffer + buffer_size;
+ int ret = -ENOMEM, nr_servers, i, j;
+
+ _enter("");
+
+ /* Check that it's a server list, v1 */
+ if (end - b < sizeof(*hdr) ||
+ hdr->hdr.content != DNS_PAYLOAD_IS_SERVER_LIST ||
+ hdr->hdr.version != 1) {
+ pr_notice("kAFS: Got DNS record [%u,%u] len %zu\n",
+ hdr->hdr.content, hdr->hdr.version, end - b);
+ ret = -EDESTADDRREQ;
+ goto dump;
+ }
+
+ nr_servers = hdr->nr_servers;
+
+ vllist = afs_alloc_vlserver_list(nr_servers);
+ if (!vllist)
+ return ERR_PTR(-ENOMEM);
+
+ vllist->source = (hdr->source < NR__dns_record_source) ?
+ hdr->source : NR__dns_record_source;
+ vllist->status = (hdr->status < NR__dns_lookup_status) ?
+ hdr->status : NR__dns_lookup_status;
+
+ read_lock(&cell->vl_servers_lock);
+ previous = afs_get_vlserverlist(
+ rcu_dereference_protected(cell->vl_servers,
+ lockdep_is_held(&cell->vl_servers_lock)));
+ read_unlock(&cell->vl_servers_lock);
+
+ b += sizeof(*hdr);
+ while (end - b >= sizeof(bs)) {
+ bs.name_len = afs_extract_le16(&b);
+ bs.priority = afs_extract_le16(&b);
+ bs.weight = afs_extract_le16(&b);
+ bs.port = afs_extract_le16(&b);
+ bs.source = *b++;
+ bs.status = *b++;
+ bs.protocol = *b++;
+ bs.nr_addrs = *b++;
+
+ _debug("extract %u %u %u %u %u %u %*.*s",
+ bs.name_len, bs.priority, bs.weight,
+ bs.port, bs.protocol, bs.nr_addrs,
+ bs.name_len, bs.name_len, b);
+
+ if (end - b < bs.name_len)
+ break;
+
+ ret = -EPROTONOSUPPORT;
+ if (bs.protocol == DNS_SERVER_PROTOCOL_UNSPECIFIED) {
+ bs.protocol = DNS_SERVER_PROTOCOL_UDP;
+ } else if (bs.protocol != DNS_SERVER_PROTOCOL_UDP) {
+ _leave(" = [proto %u]", bs.protocol);
+ goto error;
+ }
+
+ if (bs.port == 0)
+ bs.port = AFS_VL_PORT;
+ if (bs.source > NR__dns_record_source)
+ bs.source = NR__dns_record_source;
+ if (bs.status > NR__dns_lookup_status)
+ bs.status = NR__dns_lookup_status;
+
+ server = NULL;
+ if (previous) {
+ /* See if we can update an old server record */
+ for (i = 0; i < previous->nr_servers; i++) {
+ struct afs_vlserver *p = previous->servers[i].server;
+
+ if (p->name_len == bs.name_len &&
+ p->port == bs.port &&
+ strncasecmp(b, p->name, bs.name_len) == 0) {
+ server = afs_get_vlserver(p);
+ break;
+ }
+ }
+ }
+
+ if (!server) {
+ ret = -ENOMEM;
+ server = afs_alloc_vlserver(b, bs.name_len, bs.port);
+ if (!server)
+ goto error;
+ }
+
+ b += bs.name_len;
+
+ /* Extract the addresses - note that we can't skip this as we
+ * have to advance the payload pointer.
+ */
+ addrs = afs_extract_vl_addrs(&b, end, bs.nr_addrs, bs.port);
+ if (IS_ERR(addrs)) {
+ ret = PTR_ERR(addrs);
+ goto error_2;
+ }
+
+ if (vllist->nr_servers >= nr_servers) {
+ _debug("skip %u >= %u", vllist->nr_servers, nr_servers);
+ afs_put_addrlist(addrs);
+ afs_put_vlserver(cell->net, server);
+ continue;
+ }
+
+ addrs->source = bs.source;
+ addrs->status = bs.status;
+
+ if (addrs->nr_addrs == 0) {
+ afs_put_addrlist(addrs);
+ if (!rcu_access_pointer(server->addresses)) {
+ afs_put_vlserver(cell->net, server);
+ continue;
+ }
+ } else {
+ struct afs_addr_list *old = addrs;
+
+ write_lock(&server->lock);
+ rcu_swap_protected(server->addresses, old,
+ lockdep_is_held(&server->lock));
+ write_unlock(&server->lock);
+ afs_put_addrlist(old);
+ }
+
+
+ /* TODO: Might want to check for duplicates */
+
+ /* Insertion-sort by priority and weight */
+ for (j = 0; j < vllist->nr_servers; j++) {
+ if (bs.priority < vllist->servers[j].priority)
+ break; /* Lower preferable */
+ if (bs.priority == vllist->servers[j].priority &&
+ bs.weight > vllist->servers[j].weight)
+ break; /* Higher preferable */
+ }
+
+ if (j < vllist->nr_servers) {
+ memmove(vllist->servers + j + 1,
+ vllist->servers + j,
+ (vllist->nr_servers - j) * sizeof(struct afs_vlserver_entry));
+ }
+
+ clear_bit(AFS_VLSERVER_FL_PROBED, &server->flags);
+
+ vllist->servers[j].priority = bs.priority;
+ vllist->servers[j].weight = bs.weight;
+ vllist->servers[j].server = server;
+ vllist->nr_servers++;
+ }
+
+ if (b != end) {
+ _debug("parse error %zd", b - end);
+ goto error;
+ }
+
+ afs_put_vlserverlist(cell->net, previous);
+ _leave(" = ok [%u]", vllist->nr_servers);
+ return vllist;
+
+error_2:
+ afs_put_vlserver(cell->net, server);
+error:
+ afs_put_vlserverlist(cell->net, vllist);
+ afs_put_vlserverlist(cell->net, previous);
+dump:
+ if (ret != -ENOMEM) {
+ printk(KERN_DEBUG "DNS: at %zu\n", (const void *)b - buffer);
+ print_hex_dump_bytes("DNS: ", DUMP_PREFIX_NONE, buffer, buffer_size);
+ }
+ return ERR_PTR(ret);
+}
--- /dev/null
+/* AFS vlserver probing
+ *
+ * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include "afs_fs.h"
+#include "internal.h"
+#include "protocol_yfs.h"
+
+static bool afs_vl_probe_done(struct afs_vlserver *server)
+{
+ if (!atomic_dec_and_test(&server->probe_outstanding))
+ return false;
+
+ wake_up_var(&server->probe_outstanding);
+ clear_bit_unlock(AFS_VLSERVER_FL_PROBING, &server->flags);
+ wake_up_bit(&server->flags, AFS_VLSERVER_FL_PROBING);
+ return true;
+}
+
+/*
+ * Process the result of probing a vlserver. This is called after successful
+ * or failed delivery of an VL.GetCapabilities operation.
+ */
+void afs_vlserver_probe_result(struct afs_call *call)
+{
+ struct afs_addr_list *alist = call->alist;
+ struct afs_vlserver *server = call->reply[0];
+ unsigned int server_index = (long)call->reply[1];
+ unsigned int index = call->addr_ix;
+ unsigned int rtt = UINT_MAX;
+ bool have_result = false;
+ u64 _rtt;
+ int ret = call->error;
+
+ _enter("%s,%u,%u,%d,%d", server->name, server_index, index, ret, call->abort_code);
+
+ spin_lock(&server->probe_lock);
+
+ switch (ret) {
+ case 0:
+ server->probe.error = 0;
+ goto responded;
+ case -ECONNABORTED:
+ if (!server->probe.responded) {
+ server->probe.abort_code = call->abort_code;
+ server->probe.error = ret;
+ }
+ goto responded;
+ case -ENOMEM:
+ case -ENONET:
+ server->probe.local_failure = true;
+ afs_io_error(call, afs_io_error_vl_probe_fail);
+ goto out;
+ case -ECONNRESET: /* Responded, but call expired. */
+ case -ERFKILL:
+ case -EADDRNOTAVAIL:
+ case -ENETUNREACH:
+ case -EHOSTUNREACH:
+ case -EHOSTDOWN:
+ case -ECONNREFUSED:
+ case -ETIMEDOUT:
+ case -ETIME:
+ default:
+ clear_bit(index, &alist->responded);
+ set_bit(index, &alist->failed);
+ if (!server->probe.responded &&
+ (server->probe.error == 0 ||
+ server->probe.error == -ETIMEDOUT ||
+ server->probe.error == -ETIME))
+ server->probe.error = ret;
+ afs_io_error(call, afs_io_error_vl_probe_fail);
+ goto out;
+ }
+
+responded:
+ set_bit(index, &alist->responded);
+ clear_bit(index, &alist->failed);
+
+ if (call->service_id == YFS_VL_SERVICE) {
+ server->probe.is_yfs = true;
+ set_bit(AFS_VLSERVER_FL_IS_YFS, &server->flags);
+ alist->addrs[index].srx_service = call->service_id;
+ } else {
+ server->probe.not_yfs = true;
+ if (!server->probe.is_yfs) {
+ clear_bit(AFS_VLSERVER_FL_IS_YFS, &server->flags);
+ alist->addrs[index].srx_service = call->service_id;
+ }
+ }
+
+ /* Get the RTT and scale it to fit into a 32-bit value that represents
+ * over a minute of time so that we can access it with one instruction
+ * on a 32-bit system.
+ */
+ _rtt = rxrpc_kernel_get_rtt(call->net->socket, call->rxcall);
+ _rtt /= 64;
+ rtt = (_rtt > UINT_MAX) ? UINT_MAX : _rtt;
+ if (rtt < server->probe.rtt) {
+ server->probe.rtt = rtt;
+ alist->preferred = index;
+ have_result = true;
+ }
+
+ smp_wmb(); /* Set rtt before responded. */
+ server->probe.responded = true;
+ set_bit(AFS_VLSERVER_FL_PROBED, &server->flags);
+out:
+ spin_unlock(&server->probe_lock);
+
+ _debug("probe [%u][%u] %pISpc rtt=%u ret=%d",
+ server_index, index, &alist->addrs[index].transport,
+ (unsigned int)rtt, ret);
+
+ have_result |= afs_vl_probe_done(server);
+ if (have_result) {
+ server->probe.have_result = true;
+ wake_up_var(&server->probe.have_result);
+ wake_up_all(&server->probe_wq);
+ }
+}
+
+/*
+ * Probe all of a vlserver's addresses to find out the best route and to
+ * query its capabilities.
+ */
+static bool afs_do_probe_vlserver(struct afs_net *net,
+ struct afs_vlserver *server,
+ struct key *key,
+ unsigned int server_index,
+ struct afs_error *_e)
+{
+ struct afs_addr_cursor ac = {
+ .index = 0,
+ };
+ bool in_progress = false;
+ int err;
+
+ _enter("%s", server->name);
+
+ read_lock(&server->lock);
+ ac.alist = rcu_dereference_protected(server->addresses,
+ lockdep_is_held(&server->lock));
+ read_unlock(&server->lock);
+
+ atomic_set(&server->probe_outstanding, ac.alist->nr_addrs);
+ memset(&server->probe, 0, sizeof(server->probe));
+ server->probe.rtt = UINT_MAX;
+
+ for (ac.index = 0; ac.index < ac.alist->nr_addrs; ac.index++) {
+ err = afs_vl_get_capabilities(net, &ac, key, server,
+ server_index, true);
+ if (err == -EINPROGRESS)
+ in_progress = true;
+ else
+ afs_prioritise_error(_e, err, ac.abort_code);
+ }
+
+ if (!in_progress)
+ afs_vl_probe_done(server);
+ return in_progress;
+}
+
+/*
+ * Send off probes to all unprobed servers.
+ */
+int afs_send_vl_probes(struct afs_net *net, struct key *key,
+ struct afs_vlserver_list *vllist)
+{
+ struct afs_vlserver *server;
+ struct afs_error e;
+ bool in_progress = false;
+ int i;
+
+ e.error = 0;
+ e.responded = false;
+ for (i = 0; i < vllist->nr_servers; i++) {
+ server = vllist->servers[i].server;
+ if (test_bit(AFS_VLSERVER_FL_PROBED, &server->flags))
+ continue;
+
+ if (!test_and_set_bit_lock(AFS_VLSERVER_FL_PROBING, &server->flags) &&
+ afs_do_probe_vlserver(net, server, key, i, &e))
+ in_progress = true;
+ }
+
+ return in_progress ? 0 : e.error;
+}
+
+/*
+ * Wait for the first as-yet untried server to respond.
+ */
+int afs_wait_for_vl_probes(struct afs_vlserver_list *vllist,
+ unsigned long untried)
+{
+ struct wait_queue_entry *waits;
+ struct afs_vlserver *server;
+ unsigned int rtt = UINT_MAX;
+ bool have_responders = false;
+ int pref = -1, i;
+
+ _enter("%u,%lx", vllist->nr_servers, untried);
+
+ /* Only wait for servers that have a probe outstanding. */
+ for (i = 0; i < vllist->nr_servers; i++) {
+ if (test_bit(i, &untried)) {
+ server = vllist->servers[i].server;
+ if (!test_bit(AFS_VLSERVER_FL_PROBING, &server->flags))
+ __clear_bit(i, &untried);
+ if (server->probe.responded)
+ have_responders = true;
+ }
+ }
+ if (have_responders || !untried)
+ return 0;
+
+ waits = kmalloc(array_size(vllist->nr_servers, sizeof(*waits)), GFP_KERNEL);
+ if (!waits)
+ return -ENOMEM;
+
+ for (i = 0; i < vllist->nr_servers; i++) {
+ if (test_bit(i, &untried)) {
+ server = vllist->servers[i].server;
+ init_waitqueue_entry(&waits[i], current);
+ add_wait_queue(&server->probe_wq, &waits[i]);
+ }
+ }
+
+ for (;;) {
+ bool still_probing = false;
+
+ set_current_state(TASK_INTERRUPTIBLE);
+ for (i = 0; i < vllist->nr_servers; i++) {
+ if (test_bit(i, &untried)) {
+ server = vllist->servers[i].server;
+ if (server->probe.responded)
+ goto stop;
+ if (test_bit(AFS_VLSERVER_FL_PROBING, &server->flags))
+ still_probing = true;
+ }
+ }
+
+ if (!still_probing || unlikely(signal_pending(current)))
+ goto stop;
+ schedule();
+ }
+
+stop:
+ set_current_state(TASK_RUNNING);
+
+ for (i = 0; i < vllist->nr_servers; i++) {
+ if (test_bit(i, &untried)) {
+ server = vllist->servers[i].server;
+ if (server->probe.responded &&
+ server->probe.rtt < rtt) {
+ pref = i;
+ rtt = server->probe.rtt;
+ }
+
+ remove_wait_queue(&server->probe_wq, &waits[i]);
+ }
+ }
+
+ kfree(waits);
+
+ if (pref == -1 && signal_pending(current))
+ return -ERESTARTSYS;
+
+ if (pref >= 0)
+ vllist->preferred = pref;
+
+ _leave(" = 0 [%u]", pref);
+ return 0;
+}
--- /dev/null
+/* Handle vlserver selection and rotation.
+ *
+ * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/sched/signal.h>
+#include "internal.h"
+#include "afs_vl.h"
+
+/*
+ * Begin an operation on a volume location server.
+ */
+bool afs_begin_vlserver_operation(struct afs_vl_cursor *vc, struct afs_cell *cell,
+ struct key *key)
+{
+ memset(vc, 0, sizeof(*vc));
+ vc->cell = cell;
+ vc->key = key;
+ vc->error = -EDESTADDRREQ;
+ vc->ac.error = SHRT_MAX;
+
+ if (signal_pending(current)) {
+ vc->error = -EINTR;
+ vc->flags |= AFS_VL_CURSOR_STOP;
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * Begin iteration through a server list, starting with the last used server if
+ * possible, or the last recorded good server if not.
+ */
+static bool afs_start_vl_iteration(struct afs_vl_cursor *vc)
+{
+ struct afs_cell *cell = vc->cell;
+
+ if (wait_on_bit(&cell->flags, AFS_CELL_FL_NO_LOOKUP_YET,
+ TASK_INTERRUPTIBLE)) {
+ vc->error = -ERESTARTSYS;
+ return false;
+ }
+
+ read_lock(&cell->vl_servers_lock);
+ vc->server_list = afs_get_vlserverlist(
+ rcu_dereference_protected(cell->vl_servers,
+ lockdep_is_held(&cell->vl_servers_lock)));
+ read_unlock(&cell->vl_servers_lock);
+ if (!vc->server_list || !vc->server_list->nr_servers)
+ return false;
+
+ vc->untried = (1UL << vc->server_list->nr_servers) - 1;
+ vc->index = -1;
+ return true;
+}
+
+/*
+ * Select the vlserver to use. May be called multiple times to rotate
+ * through the vlservers.
+ */
+bool afs_select_vlserver(struct afs_vl_cursor *vc)
+{
+ struct afs_addr_list *alist;
+ struct afs_vlserver *vlserver;
+ struct afs_error e;
+ u32 rtt;
+ int error = vc->ac.error, i;
+
+ _enter("%lx[%d],%lx[%d],%d,%d",
+ vc->untried, vc->index,
+ vc->ac.tried, vc->ac.index,
+ error, vc->ac.abort_code);
+
+ if (vc->flags & AFS_VL_CURSOR_STOP) {
+ _leave(" = f [stopped]");
+ return false;
+ }
+
+ vc->nr_iterations++;
+
+ /* Evaluate the result of the previous operation, if there was one. */
+ switch (error) {
+ case SHRT_MAX:
+ goto start;
+
+ default:
+ case 0:
+ /* Success or local failure. Stop. */
+ vc->error = error;
+ vc->flags |= AFS_VL_CURSOR_STOP;
+ _leave(" = f [okay/local %d]", vc->ac.error);
+ return false;
+
+ case -ECONNABORTED:
+ /* The far side rejected the operation on some grounds. This
+ * might involve the server being busy or the volume having been moved.
+ */
+ switch (vc->ac.abort_code) {
+ case AFSVL_IO:
+ case AFSVL_BADVOLOPER:
+ case AFSVL_NOMEM:
+ /* The server went weird. */
+ vc->error = -EREMOTEIO;
+ //write_lock(&vc->cell->vl_servers_lock);
+ //vc->server_list->weird_mask |= 1 << vc->index;
+ //write_unlock(&vc->cell->vl_servers_lock);
+ goto next_server;
+
+ default:
+ vc->error = afs_abort_to_error(vc->ac.abort_code);
+ goto failed;
+ }
+
+ case -ERFKILL:
+ case -EADDRNOTAVAIL:
+ case -ENETUNREACH:
+ case -EHOSTUNREACH:
+ case -EHOSTDOWN:
+ case -ECONNREFUSED:
+ case -ETIMEDOUT:
+ case -ETIME:
+ _debug("no conn %d", error);
+ vc->error = error;
+ goto iterate_address;
+
+ case -ECONNRESET:
+ _debug("call reset");
+ vc->error = error;
+ vc->flags |= AFS_VL_CURSOR_RETRY;
+ goto next_server;
+ }
+
+restart_from_beginning:
+ _debug("restart");
+ afs_end_cursor(&vc->ac);
+ afs_put_vlserverlist(vc->cell->net, vc->server_list);
+ vc->server_list = NULL;
+ if (vc->flags & AFS_VL_CURSOR_RETRIED)
+ goto failed;
+ vc->flags |= AFS_VL_CURSOR_RETRIED;
+start:
+ _debug("start");
+
+ if (!afs_start_vl_iteration(vc))
+ goto failed;
+
+ error = afs_send_vl_probes(vc->cell->net, vc->key, vc->server_list);
+ if (error < 0)
+ goto failed_set_error;
+
+pick_server:
+ _debug("pick [%lx]", vc->untried);
+
+ error = afs_wait_for_vl_probes(vc->server_list, vc->untried);
+ if (error < 0)
+ goto failed_set_error;
+
+ /* Pick the untried server with the lowest RTT. */
+ vc->index = vc->server_list->preferred;
+ if (test_bit(vc->index, &vc->untried))
+ goto selected_server;
+
+ vc->index = -1;
+ rtt = U32_MAX;
+ for (i = 0; i < vc->server_list->nr_servers; i++) {
+ struct afs_vlserver *s = vc->server_list->servers[i].server;
+
+ if (!test_bit(i, &vc->untried) || !s->probe.responded)
+ continue;
+ if (s->probe.rtt < rtt) {
+ vc->index = i;
+ rtt = s->probe.rtt;
+ }
+ }
+
+ if (vc->index == -1)
+ goto no_more_servers;
+
+selected_server:
+ _debug("use %d", vc->index);
+ __clear_bit(vc->index, &vc->untried);
+
+ /* We're starting on a different vlserver from the list. We need to
+ * check it, find its address list and probe its capabilities before we
+ * use it.
+ */
+ ASSERTCMP(vc->ac.alist, ==, NULL);
+ vlserver = vc->server_list->servers[vc->index].server;
+ vc->server = vlserver;
+
+ _debug("USING VLSERVER: %s", vlserver->name);
+
+ read_lock(&vlserver->lock);
+ alist = rcu_dereference_protected(vlserver->addresses,
+ lockdep_is_held(&vlserver->lock));
+ afs_get_addrlist(alist);
+ read_unlock(&vlserver->lock);
+
+ memset(&vc->ac, 0, sizeof(vc->ac));
+
+ if (!vc->ac.alist)
+ vc->ac.alist = alist;
+ else
+ afs_put_addrlist(alist);
+
+ vc->ac.index = -1;
+
+iterate_address:
+ ASSERT(vc->ac.alist);
+ /* Iterate over the current server's address list to try and find an
+ * address on which it will respond to us.
+ */
+ if (!afs_iterate_addresses(&vc->ac))
+ goto next_server;
+
+ _debug("VL address %d/%d", vc->ac.index, vc->ac.alist->nr_addrs);
+
+ _leave(" = t %pISpc", &vc->ac.alist->addrs[vc->ac.index].transport);
+ return true;
+
+next_server:
+ _debug("next");
+ afs_end_cursor(&vc->ac);
+ goto pick_server;
+
+no_more_servers:
+ /* That's all the servers poked to no good effect. Try again if some
+ * of them were busy.
+ */
+ if (vc->flags & AFS_VL_CURSOR_RETRY)
+ goto restart_from_beginning;
+
+ e.error = -EDESTADDRREQ;
+ e.responded = false;
+ for (i = 0; i < vc->server_list->nr_servers; i++) {
+ struct afs_vlserver *s = vc->server_list->servers[i].server;
+
+ afs_prioritise_error(&e, READ_ONCE(s->probe.error),
+ s->probe.abort_code);
+ }
+
+failed_set_error:
+ vc->error = error;
+failed:
+ vc->flags |= AFS_VL_CURSOR_STOP;
+ afs_end_cursor(&vc->ac);
+ _leave(" = f [failed %d]", vc->error);
+ return false;
+}
+
+/*
+ * Dump cursor state in the case of the error being EDESTADDRREQ.
+ */
+static void afs_vl_dump_edestaddrreq(const struct afs_vl_cursor *vc)
+{
+ static int count;
+ int i;
+
+ if (!IS_ENABLED(CONFIG_AFS_DEBUG_CURSOR) || count > 3)
+ return;
+ count++;
+
+ rcu_read_lock();
+ pr_notice("EDESTADDR occurred\n");
+ pr_notice("VC: ut=%lx ix=%u ni=%hu fl=%hx err=%hd\n",
+ vc->untried, vc->index, vc->nr_iterations, vc->flags, vc->error);
+
+ if (vc->server_list) {
+ const struct afs_vlserver_list *sl = vc->server_list;
+ pr_notice("VC: SL nr=%u ix=%u\n",
+ sl->nr_servers, sl->index);
+ for (i = 0; i < sl->nr_servers; i++) {
+ const struct afs_vlserver *s = sl->servers[i].server;
+ pr_notice("VC: server %s+%hu fl=%lx E=%hd\n",
+ s->name, s->port, s->flags, s->probe.error);
+ if (s->addresses) {
+ const struct afs_addr_list *a =
+ rcu_dereference(s->addresses);
+ pr_notice("VC: - nr=%u/%u/%u pf=%u\n",
+ a->nr_ipv4, a->nr_addrs, a->max_addrs,
+ a->preferred);
+ pr_notice("VC: - pr=%lx R=%lx F=%lx\n",
+ a->probed, a->responded, a->failed);
+ if (a == vc->ac.alist)
+ pr_notice("VC: - current\n");
+ }
+ }
+ }
+
+ pr_notice("AC: t=%lx ax=%u ac=%d er=%d r=%u ni=%u\n",
+ vc->ac.tried, vc->ac.index, vc->ac.abort_code, vc->ac.error,
+ vc->ac.responded, vc->ac.nr_iterations);
+ rcu_read_unlock();
+}
+
+/*
+ * Tidy up a volume location server cursor and unlock the vnode.
+ */
+int afs_end_vlserver_operation(struct afs_vl_cursor *vc)
+{
+ struct afs_net *net = vc->cell->net;
+
+ if (vc->error == -EDESTADDRREQ ||
+ vc->error == -EADDRNOTAVAIL ||
+ vc->error == -ENETUNREACH ||
+ vc->error == -EHOSTUNREACH)
+ afs_vl_dump_edestaddrreq(vc);
+
+ afs_end_cursor(&vc->ac);
+ afs_put_vlserverlist(net, vc->server_list);
+
+ if (vc->error == -ECONNABORTED)
+ vc->error = afs_abort_to_error(vc->ac.abort_code);
+
+ return vc->error;
+}
* Dispatch a get volume entry by name or ID operation (uuid variant). If the
* volname is a decimal number then it's a volume ID not a volume name.
*/
-struct afs_vldb_entry *afs_vl_get_entry_by_name_u(struct afs_net *net,
- struct afs_addr_cursor *ac,
- struct key *key,
+struct afs_vldb_entry *afs_vl_get_entry_by_name_u(struct afs_vl_cursor *vc,
const char *volname,
int volnamesz)
{
struct afs_vldb_entry *entry;
struct afs_call *call;
+ struct afs_net *net = vc->cell->net;
size_t reqsz, padsz;
__be32 *bp;
return ERR_PTR(-ENOMEM);
}
- call->key = key;
+ call->key = vc->key;
call->reply[0] = entry;
call->ret_reply0 = true;
memset((void *)bp + volnamesz, 0, padsz);
trace_afs_make_vl_call(call);
- return (struct afs_vldb_entry *)afs_make_call(ac, call, GFP_KERNEL, false);
+ return (struct afs_vldb_entry *)afs_make_call(&vc->ac, call, GFP_KERNEL, false);
}
/*
u32 uniquifier, nentries, count;
int i, ret;
- _enter("{%u,%zu/%u}", call->unmarshall, call->offset, call->count);
+ _enter("{%u,%zu/%u}",
+ call->unmarshall, iov_iter_count(call->_iter), call->count);
-again:
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_buf(call,
+ sizeof(struct afs_uuid__xdr) + 3 * sizeof(__be32));
call->unmarshall++;
/* Extract the returned uuid, uniquifier, nentries and blkaddrs size */
case 1:
- ret = afs_extract_data(call, call->buffer,
- sizeof(struct afs_uuid__xdr) + 3 * sizeof(__be32),
- true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
call->reply[0] = alist;
call->count = count;
call->count2 = nentries;
- call->offset = 0;
call->unmarshall++;
+ more_entries:
+ count = min(call->count, 4U);
+ afs_extract_to_buf(call, count * sizeof(__be32));
+
/* Extract entries */
case 2:
- count = min(call->count, 4U);
- ret = afs_extract_data(call, call->buffer,
- count * sizeof(__be32),
- call->count > 4);
+ ret = afs_extract_data(call, call->count > 4);
if (ret < 0)
return ret;
alist = call->reply[0];
bp = call->buffer;
+ count = min(call->count, 4U);
for (i = 0; i < count; i++)
if (alist->nr_addrs < call->count2)
afs_merge_fs_addr4(alist, *bp++, AFS_FS_PORT);
call->count -= count;
if (call->count > 0)
- goto again;
- call->offset = 0;
+ goto more_entries;
call->unmarshall++;
break;
}
* Dispatch an operation to get the addresses for a server, where the server is
* nominated by UUID.
*/
-struct afs_addr_list *afs_vl_get_addrs_u(struct afs_net *net,
- struct afs_addr_cursor *ac,
- struct key *key,
+struct afs_addr_list *afs_vl_get_addrs_u(struct afs_vl_cursor *vc,
const uuid_t *uuid)
{
struct afs_ListAddrByAttributes__xdr *r;
const struct afs_uuid *u = (const struct afs_uuid *)uuid;
struct afs_call *call;
+ struct afs_net *net = vc->cell->net;
__be32 *bp;
int i;
if (!call)
return ERR_PTR(-ENOMEM);
- call->key = key;
+ call->key = vc->key;
call->reply[0] = NULL;
call->ret_reply0 = true;
r->uuid.node[i] = htonl(u->node[i]);
trace_afs_make_vl_call(call);
- return (struct afs_addr_list *)afs_make_call(ac, call, GFP_KERNEL, false);
+ return (struct afs_addr_list *)afs_make_call(&vc->ac, call, GFP_KERNEL, false);
}
/*
u32 count;
int ret;
- _enter("{%u,%zu/%u}", call->unmarshall, call->offset, call->count);
+ _enter("{%u,%zu/%u}",
+ call->unmarshall, iov_iter_count(call->_iter), call->count);
-again:
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
/* Extract the capabilities word count */
case 1:
- ret = afs_extract_data(call, &call->tmp,
- 1 * sizeof(__be32),
- true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
count = ntohl(call->tmp);
-
call->count = count;
call->count2 = count;
- call->offset = 0;
+
call->unmarshall++;
+ afs_extract_discard(call, count * sizeof(__be32));
/* Extract capabilities words */
case 2:
- count = min(call->count, 16U);
- ret = afs_extract_data(call, call->buffer,
- count * sizeof(__be32),
- call->count > 16);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
/* TODO: Examine capabilities */
- call->count -= count;
- if (call->count > 0)
- goto again;
- call->offset = 0;
call->unmarshall++;
break;
}
- call->reply[0] = (void *)(unsigned long)call->service_id;
-
_leave(" = 0 [done]");
return 0;
}
+static void afs_destroy_vl_get_capabilities(struct afs_call *call)
+{
+ struct afs_vlserver *server = call->reply[0];
+
+ afs_put_vlserver(call->net, server);
+ afs_flat_call_destructor(call);
+}
+
/*
* VL.GetCapabilities operation type
*/
.name = "VL.GetCapabilities",
.op = afs_VL_GetCapabilities,
.deliver = afs_deliver_vl_get_capabilities,
- .destructor = afs_flat_call_destructor,
+ .done = afs_vlserver_probe_result,
+ .destructor = afs_destroy_vl_get_capabilities,
};
/*
- * Probe a fileserver for the capabilities that it supports. This can
+ * Probe a volume server for the capabilities that it supports. This can
* return up to 196 words.
*
* We use this to probe for service upgrade to determine what the server at the
*/
int afs_vl_get_capabilities(struct afs_net *net,
struct afs_addr_cursor *ac,
- struct key *key)
+ struct key *key,
+ struct afs_vlserver *server,
+ unsigned int server_index,
+ bool async)
{
struct afs_call *call;
__be32 *bp;
return -ENOMEM;
call->key = key;
- call->upgrade = true; /* Let's see if this is a YFS server */
- call->reply[0] = (void *)VLGETCAPABILITIES;
- call->ret_reply0 = true;
+ call->reply[0] = afs_get_vlserver(server);
+ call->reply[1] = (void *)(long)server_index;
+ call->upgrade = true;
+ call->want_reply_time = true;
/* marshall the parameters */
bp = call->request;
/* Can't take a ref on server */
trace_afs_make_vl_call(call);
- return afs_make_call(ac, call, GFP_KERNEL, false);
+ return afs_make_call(ac, call, GFP_KERNEL, async);
}
/*
u32 uniquifier, size;
int ret;
- _enter("{%u,%zu/%u,%u}", call->unmarshall, call->offset, call->count, call->count2);
+ _enter("{%u,%zu,%u}",
+ call->unmarshall, iov_iter_count(call->_iter), call->count2);
-again:
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_buf(call, sizeof(uuid_t) + 3 * sizeof(__be32));
call->unmarshall = 1;
/* Extract the returned uuid, uniquifier, fsEndpoints count and
* either the first fsEndpoint type or the volEndpoints
* count if there are no fsEndpoints. */
case 1:
- ret = afs_extract_data(call, call->buffer,
- sizeof(uuid_t) +
- 3 * sizeof(__be32),
- true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
call->count2 = ntohl(*bp); /* Type or next count */
if (call->count > YFS_MAXENDPOINTS)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_fsendpt_num);
alist = afs_alloc_addrlist(call->count, FS_SERVICE, AFS_FS_PORT);
if (!alist)
return -ENOMEM;
alist->version = uniquifier;
call->reply[0] = alist;
- call->offset = 0;
if (call->count == 0)
goto extract_volendpoints;
- call->unmarshall = 2;
-
- /* Extract fsEndpoints[] entries */
- case 2:
+ next_fsendpoint:
switch (call->count2) {
case YFS_ENDPOINT_IPV4:
size = sizeof(__be32) * (1 + 1 + 1);
size = sizeof(__be32) * (1 + 4 + 1);
break;
default:
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_fsendpt_type);
}
size += sizeof(__be32);
- ret = afs_extract_data(call, call->buffer, size, true);
+ afs_extract_to_buf(call, size);
+ call->unmarshall = 2;
+
+ /* Extract fsEndpoints[] entries */
+ case 2:
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
switch (call->count2) {
case YFS_ENDPOINT_IPV4:
if (ntohl(bp[0]) != sizeof(__be32) * 2)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_fsendpt4_len);
afs_merge_fs_addr4(alist, bp[1], ntohl(bp[2]));
bp += 3;
break;
case YFS_ENDPOINT_IPV6:
if (ntohl(bp[0]) != sizeof(__be32) * 5)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_fsendpt6_len);
afs_merge_fs_addr6(alist, bp + 1, ntohl(bp[5]));
bp += 6;
break;
default:
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_fsendpt_type);
}
/* Got either the type of the next entry or the count of
*/
call->count2 = ntohl(*bp++);
- call->offset = 0;
call->count--;
if (call->count > 0)
- goto again;
+ goto next_fsendpoint;
extract_volendpoints:
/* Extract the list of volEndpoints. */
if (!call->count)
goto end;
if (call->count > YFS_MAXENDPOINTS)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_vlendpt_type);
+ afs_extract_to_buf(call, 1 * sizeof(__be32));
call->unmarshall = 3;
/* Extract the type of volEndpoints[0]. Normally we would
* data of the current one, but this is the first...
*/
case 3:
- ret = afs_extract_data(call, call->buffer, sizeof(__be32), true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
bp = call->buffer;
- call->count2 = ntohl(*bp++);
- call->offset = 0;
- call->unmarshall = 4;
- /* Extract volEndpoints[] entries */
- case 4:
+ next_volendpoint:
+ call->count2 = ntohl(*bp++);
switch (call->count2) {
case YFS_ENDPOINT_IPV4:
size = sizeof(__be32) * (1 + 1 + 1);
size = sizeof(__be32) * (1 + 4 + 1);
break;
default:
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_vlendpt_type);
}
if (call->count > 1)
- size += sizeof(__be32);
- ret = afs_extract_data(call, call->buffer, size, true);
+ size += sizeof(__be32); /* Get next type too */
+ afs_extract_to_buf(call, size);
+ call->unmarshall = 4;
+
+ /* Extract volEndpoints[] entries */
+ case 4:
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
switch (call->count2) {
case YFS_ENDPOINT_IPV4:
if (ntohl(bp[0]) != sizeof(__be32) * 2)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_vlendpt4_len);
bp += 3;
break;
case YFS_ENDPOINT_IPV6:
if (ntohl(bp[0]) != sizeof(__be32) * 5)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_vlendpt6_len);
bp += 6;
break;
default:
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_vlendpt_type);
}
/* Got either the type of the next entry or the count of
* volEndpoints if no more fsEndpoints.
*/
- call->offset = 0;
call->count--;
- if (call->count > 0) {
- call->count2 = ntohl(*bp++);
- goto again;
- }
+ if (call->count > 0)
+ goto next_volendpoint;
end:
+ afs_extract_discard(call, 0);
call->unmarshall = 5;
/* Done */
case 5:
- ret = afs_extract_data(call, call->buffer, 0, false);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
call->unmarshall = 6;
}
alist = call->reply[0];
-
- /* Start with IPv6 if available. */
- if (alist->nr_ipv4 < alist->nr_addrs)
- alist->index = alist->nr_ipv4;
-
_leave(" = 0 [done]");
return 0;
}
* Dispatch an operation to get the addresses for a server, where the server is
* nominated by UUID.
*/
-struct afs_addr_list *afs_yfsvl_get_endpoints(struct afs_net *net,
- struct afs_addr_cursor *ac,
- struct key *key,
+struct afs_addr_list *afs_yfsvl_get_endpoints(struct afs_vl_cursor *vc,
const uuid_t *uuid)
{
struct afs_call *call;
+ struct afs_net *net = vc->cell->net;
__be32 *bp;
_enter("");
if (!call)
return ERR_PTR(-ENOMEM);
- call->key = key;
+ call->key = vc->key;
call->reply[0] = NULL;
call->ret_reply0 = true;
memcpy(bp, uuid, sizeof(*uuid)); /* Type opr_uuid */
trace_afs_make_vl_call(call);
- return (struct afs_addr_list *)afs_make_call(ac, call, GFP_KERNEL, false);
+ return (struct afs_addr_list *)afs_make_call(&vc->ac, call, GFP_KERNEL, false);
}
const char *volname,
size_t volnamesz)
{
- struct afs_addr_cursor ac;
- struct afs_vldb_entry *vldb;
+ struct afs_vldb_entry *vldb = ERR_PTR(-EDESTADDRREQ);
+ struct afs_vl_cursor vc;
int ret;
- ret = afs_set_vl_cursor(&ac, cell);
- if (ret < 0)
- return ERR_PTR(ret);
-
- while (afs_iterate_addresses(&ac)) {
- if (!test_bit(ac.index, &ac.alist->probed)) {
- ret = afs_vl_get_capabilities(cell->net, &ac, key);
- switch (ret) {
- case VL_SERVICE:
- clear_bit(ac.index, &ac.alist->yfs);
- set_bit(ac.index, &ac.alist->probed);
- ac.addr->srx_service = ret;
- break;
- case YFS_VL_SERVICE:
- set_bit(ac.index, &ac.alist->yfs);
- set_bit(ac.index, &ac.alist->probed);
- ac.addr->srx_service = ret;
- break;
- }
- }
-
- vldb = afs_vl_get_entry_by_name_u(cell->net, &ac, key,
- volname, volnamesz);
- switch (ac.error) {
- case 0:
- afs_end_cursor(&ac);
- return vldb;
- case -ECONNABORTED:
- ac.error = afs_abort_to_error(ac.abort_code);
- goto error;
- case -ENOMEM:
- case -ENONET:
- goto error;
- case -ENETUNREACH:
- case -EHOSTUNREACH:
- case -ECONNREFUSED:
- break;
- default:
- ac.error = -EIO;
- goto error;
- }
+ if (!afs_begin_vlserver_operation(&vc, cell, key))
+ return ERR_PTR(-ERESTARTSYS);
+
+ while (afs_select_vlserver(&vc)) {
+ vldb = afs_vl_get_entry_by_name_u(&vc, volname, volnamesz);
}
-error:
- return ERR_PTR(afs_end_cursor(&ac));
+ ret = afs_end_vlserver_operation(&vc);
+ return ret < 0 ? ERR_PTR(ret) : vldb;
}
/*
/* We look up an ID by passing it as a decimal string in the
* operation's name parameter.
*/
- idsz = sprintf(idbuf, "%u", volume->vid);
+ idsz = sprintf(idbuf, "%llu", volume->vid);
vldb = afs_vl_lookup_vldb(volume->cell, key, idbuf, idsz);
if (IS_ERR(vldb)) {
loff_t pos, unsigned int len, struct page *page)
{
struct afs_read *req;
+ size_t p;
+ void *data;
int ret;
_enter(",,%llu", (unsigned long long)pos);
+ if (pos >= vnode->vfs_inode.i_size) {
+ p = pos & ~PAGE_MASK;
+ ASSERTCMP(p + len, <=, PAGE_SIZE);
+ data = kmap(page);
+ memset(data + p, 0, len);
+ kunmap(page);
+ return 0;
+ }
+
req = kzalloc(sizeof(struct afs_read) + sizeof(struct page *),
GFP_KERNEL);
if (!req)
pgoff_t index = pos >> PAGE_SHIFT;
int ret;
- _enter("{%x:%u},{%lx},%u,%u",
+ _enter("{%llx:%llu},{%lx},%u,%u",
vnode->fid.vid, vnode->fid.vnode, index, from, to);
/* We want to store information about how much of a page is altered in
loff_t i_size, maybe_i_size;
int ret;
- _enter("{%x:%u},{%lx}",
+ _enter("{%llx:%llu},{%lx}",
vnode->fid.vid, vnode->fid.vnode, page->index);
maybe_i_size = pos + copied;
struct pagevec pv;
unsigned count, loop;
- _enter("{%x:%u},%lx-%lx",
+ _enter("{%llx:%llu},%lx-%lx",
vnode->fid.vid, vnode->fid.vnode, first, last);
pagevec_init(&pv);
struct pagevec pv;
unsigned count, loop;
- _enter("{%x:%u},%lx-%lx",
+ _enter("{%llx:%llu},%lx-%lx",
vnode->fid.vid, vnode->fid.vnode, first, last);
pagevec_init(&pv);
struct list_head *p;
int ret = -ENOKEY, ret2;
- _enter("%s{%x:%u.%u},%lx,%lx,%x,%x",
+ _enter("%s{%llx:%llu.%u},%lx,%lx,%x,%x",
vnode->volume->name,
vnode->fid.vid,
vnode->fid.vnode,
case -ENOENT:
case -ENOMEDIUM:
case -ENXIO:
+ trace_afs_file_error(vnode, ret, afs_file_error_writeback_fail);
afs_kill_pages(mapping, first, last);
mapping_set_error(mapping, ret);
break;
unsigned count, loop;
pgoff_t first = call->first, last = call->last;
- _enter("{%x:%u},{%lx-%lx}",
+ _enter("{%llx:%llu},{%lx-%lx}",
vnode->fid.vid, vnode->fid.vnode, first, last);
pagevec_init(&pv);
ssize_t result;
size_t count = iov_iter_count(from);
- _enter("{%x.%u},{%zu},",
+ _enter("{%llx:%llu},{%zu},",
vnode->fid.vid, vnode->fid.vnode, count);
if (IS_SWAPFILE(&vnode->vfs_inode)) {
struct inode *inode = file_inode(file);
struct afs_vnode *vnode = AFS_FS_I(inode);
- _enter("{%x:%u},{n=%pD},%d",
+ _enter("{%llx:%llu},{n=%pD},%d",
vnode->fid.vid, vnode->fid.vnode, file,
datasync);
struct afs_vnode *vnode = AFS_FS_I(inode);
unsigned long priv;
- _enter("{{%x:%u}},{%lx}",
+ _enter("{{%llx:%llu}},{%lx}",
vnode->fid.vid, vnode->fid.vnode, vmf->page->index);
sb_start_pagefault(inode->i_sb);
char text[8 + 1 + 8 + 1 + 8 + 1];
size_t len;
- len = sprintf(text, "%x:%x:%x",
+ len = sprintf(text, "%llx:%llx:%x",
vnode->fid.vid, vnode->fid.vnode, vnode->fid.unique);
if (size == 0)
return len;
--- /dev/null
+/* YFS File Server client stubs
+ *
+ * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/sched.h>
+#include <linux/circ_buf.h>
+#include <linux/iversion.h>
+#include "internal.h"
+#include "afs_fs.h"
+#include "xdr_fs.h"
+#include "protocol_yfs.h"
+
+static const struct afs_fid afs_zero_fid;
+
+static inline void afs_use_fs_server(struct afs_call *call, struct afs_cb_interest *cbi)
+{
+ call->cbi = afs_get_cb_interest(cbi);
+}
+
+#define xdr_size(x) (sizeof(*x) / sizeof(__be32))
+
+static void xdr_decode_YFSFid(const __be32 **_bp, struct afs_fid *fid)
+{
+ const struct yfs_xdr_YFSFid *x = (const void *)*_bp;
+
+ fid->vid = xdr_to_u64(x->volume);
+ fid->vnode = xdr_to_u64(x->vnode.lo);
+ fid->vnode_hi = ntohl(x->vnode.hi);
+ fid->unique = ntohl(x->vnode.unique);
+ *_bp += xdr_size(x);
+}
+
+static __be32 *xdr_encode_u32(__be32 *bp, u32 n)
+{
+ *bp++ = htonl(n);
+ return bp;
+}
+
+static __be32 *xdr_encode_u64(__be32 *bp, u64 n)
+{
+ struct yfs_xdr_u64 *x = (void *)bp;
+
+ *x = u64_to_xdr(n);
+ return bp + xdr_size(x);
+}
+
+static __be32 *xdr_encode_YFSFid(__be32 *bp, struct afs_fid *fid)
+{
+ struct yfs_xdr_YFSFid *x = (void *)bp;
+
+ x->volume = u64_to_xdr(fid->vid);
+ x->vnode.lo = u64_to_xdr(fid->vnode);
+ x->vnode.hi = htonl(fid->vnode_hi);
+ x->vnode.unique = htonl(fid->unique);
+ return bp + xdr_size(x);
+}
+
+static size_t xdr_strlen(unsigned int len)
+{
+ return sizeof(__be32) + round_up(len, sizeof(__be32));
+}
+
+static __be32 *xdr_encode_string(__be32 *bp, const char *p, unsigned int len)
+{
+ bp = xdr_encode_u32(bp, len);
+ bp = memcpy(bp, p, len);
+ if (len & 3) {
+ unsigned int pad = 4 - (len & 3);
+
+ memset((u8 *)bp + len, 0, pad);
+ len += pad;
+ }
+
+ return bp + len / sizeof(__be32);
+}
+
+static s64 linux_to_yfs_time(const struct timespec64 *t)
+{
+ /* Convert to 100ns intervals. */
+ return (u64)t->tv_sec * 10000000 + t->tv_nsec/100;
+}
+
+static __be32 *xdr_encode_YFSStoreStatus_mode(__be32 *bp, mode_t mode)
+{
+ struct yfs_xdr_YFSStoreStatus *x = (void *)bp;
+
+ x->mask = htonl(AFS_SET_MODE);
+ x->mode = htonl(mode & S_IALLUGO);
+ x->mtime_client = u64_to_xdr(0);
+ x->owner = u64_to_xdr(0);
+ x->group = u64_to_xdr(0);
+ return bp + xdr_size(x);
+}
+
+static __be32 *xdr_encode_YFSStoreStatus_mtime(__be32 *bp, const struct timespec64 *t)
+{
+ struct yfs_xdr_YFSStoreStatus *x = (void *)bp;
+ s64 mtime = linux_to_yfs_time(t);
+
+ x->mask = htonl(AFS_SET_MTIME);
+ x->mode = htonl(0);
+ x->mtime_client = u64_to_xdr(mtime);
+ x->owner = u64_to_xdr(0);
+ x->group = u64_to_xdr(0);
+ return bp + xdr_size(x);
+}
+
+/*
+ * Convert a signed 100ns-resolution 64-bit time into a timespec.
+ */
+static struct timespec64 yfs_time_to_linux(s64 t)
+{
+ struct timespec64 ts;
+ u64 abs_t;
+
+ /*
+ * Unfortunately can not use normal 64 bit division on 32 bit arch, but
+ * the alternative, do_div, does not work with negative numbers so have
+ * to special case them
+ */
+ if (t < 0) {
+ abs_t = -t;
+ ts.tv_nsec = (time64_t)(do_div(abs_t, 10000000) * 100);
+ ts.tv_nsec = -ts.tv_nsec;
+ ts.tv_sec = -abs_t;
+ } else {
+ abs_t = t;
+ ts.tv_nsec = (time64_t)do_div(abs_t, 10000000) * 100;
+ ts.tv_sec = abs_t;
+ }
+
+ return ts;
+}
+
+static struct timespec64 xdr_to_time(const struct yfs_xdr_u64 xdr)
+{
+ s64 t = xdr_to_u64(xdr);
+
+ return yfs_time_to_linux(t);
+}
+
+static void yfs_check_req(struct afs_call *call, __be32 *bp)
+{
+ size_t len = (void *)bp - call->request;
+
+ if (len > call->request_size)
+ pr_err("kAFS: %s: Request buffer overflow (%zu>%u)\n",
+ call->type->name, len, call->request_size);
+ else if (len < call->request_size)
+ pr_warning("kAFS: %s: Request buffer underflow (%zu<%u)\n",
+ call->type->name, len, call->request_size);
+}
+
+/*
+ * Dump a bad file status record.
+ */
+static void xdr_dump_bad(const __be32 *bp)
+{
+ __be32 x[4];
+ int i;
+
+ pr_notice("YFS XDR: Bad status record\n");
+ for (i = 0; i < 5 * 4 * 4; i += 16) {
+ memcpy(x, bp, 16);
+ bp += 4;
+ pr_notice("%03x: %08x %08x %08x %08x\n",
+ i, ntohl(x[0]), ntohl(x[1]), ntohl(x[2]), ntohl(x[3]));
+ }
+
+ memcpy(x, bp, 4);
+ pr_notice("0x50: %08x\n", ntohl(x[0]));
+}
+
+/*
+ * Decode a YFSFetchStatus block
+ */
+static int xdr_decode_YFSFetchStatus(struct afs_call *call,
+ const __be32 **_bp,
+ struct afs_file_status *status,
+ struct afs_vnode *vnode,
+ const afs_dataversion_t *expected_version,
+ struct afs_read *read_req)
+{
+ const struct yfs_xdr_YFSFetchStatus *xdr = (const void *)*_bp;
+ u32 type;
+ u8 flags = 0;
+
+ status->abort_code = ntohl(xdr->abort_code);
+ if (status->abort_code != 0) {
+ if (vnode && status->abort_code == VNOVNODE) {
+ set_bit(AFS_VNODE_DELETED, &vnode->flags);
+ status->nlink = 0;
+ __afs_break_callback(vnode);
+ }
+ return 0;
+ }
+
+ type = ntohl(xdr->type);
+ switch (type) {
+ case AFS_FTYPE_FILE:
+ case AFS_FTYPE_DIR:
+ case AFS_FTYPE_SYMLINK:
+ if (type != status->type &&
+ vnode &&
+ !test_bit(AFS_VNODE_UNSET, &vnode->flags)) {
+ pr_warning("Vnode %llx:%llx:%x changed type %u to %u\n",
+ vnode->fid.vid,
+ vnode->fid.vnode,
+ vnode->fid.unique,
+ status->type, type);
+ goto bad;
+ }
+ status->type = type;
+ break;
+ default:
+ goto bad;
+ }
+
+#define EXTRACT_M4(FIELD) \
+ do { \
+ u32 x = ntohl(xdr->FIELD); \
+ if (status->FIELD != x) { \
+ flags |= AFS_VNODE_META_CHANGED; \
+ status->FIELD = x; \
+ } \
+ } while (0)
+
+#define EXTRACT_M8(FIELD) \
+ do { \
+ u64 x = xdr_to_u64(xdr->FIELD); \
+ if (status->FIELD != x) { \
+ flags |= AFS_VNODE_META_CHANGED; \
+ status->FIELD = x; \
+ } \
+ } while (0)
+
+#define EXTRACT_D8(FIELD) \
+ do { \
+ u64 x = xdr_to_u64(xdr->FIELD); \
+ if (status->FIELD != x) { \
+ flags |= AFS_VNODE_DATA_CHANGED; \
+ status->FIELD = x; \
+ } \
+ } while (0)
+
+ EXTRACT_M4(nlink);
+ EXTRACT_D8(size);
+ EXTRACT_D8(data_version);
+ EXTRACT_M8(author);
+ EXTRACT_M8(owner);
+ EXTRACT_M8(group);
+ EXTRACT_M4(mode);
+ EXTRACT_M4(caller_access); /* call ticket dependent */
+ EXTRACT_M4(anon_access);
+
+ status->mtime_client = xdr_to_time(xdr->mtime_client);
+ status->mtime_server = xdr_to_time(xdr->mtime_server);
+ status->lock_count = ntohl(xdr->lock_count);
+
+ if (read_req) {
+ read_req->data_version = status->data_version;
+ read_req->file_size = status->size;
+ }
+
+ *_bp += xdr_size(xdr);
+
+ if (vnode) {
+ if (test_bit(AFS_VNODE_UNSET, &vnode->flags))
+ flags |= AFS_VNODE_NOT_YET_SET;
+ afs_update_inode_from_status(vnode, status, expected_version,
+ flags);
+ }
+
+ return 0;
+
+bad:
+ xdr_dump_bad(*_bp);
+ return afs_protocol_error(call, -EBADMSG, afs_eproto_bad_status);
+}
+
+/*
+ * Decode the file status. We need to lock the target vnode if we're going to
+ * update its status so that stat() sees the attributes update atomically.
+ */
+static int yfs_decode_status(struct afs_call *call,
+ const __be32 **_bp,
+ struct afs_file_status *status,
+ struct afs_vnode *vnode,
+ const afs_dataversion_t *expected_version,
+ struct afs_read *read_req)
+{
+ int ret;
+
+ if (!vnode)
+ return xdr_decode_YFSFetchStatus(call, _bp, status, vnode,
+ expected_version, read_req);
+
+ write_seqlock(&vnode->cb_lock);
+ ret = xdr_decode_YFSFetchStatus(call, _bp, status, vnode,
+ expected_version, read_req);
+ write_sequnlock(&vnode->cb_lock);
+ return ret;
+}
+
+/*
+ * Decode a YFSCallBack block
+ */
+static void xdr_decode_YFSCallBack(struct afs_call *call,
+ struct afs_vnode *vnode,
+ const __be32 **_bp)
+{
+ struct yfs_xdr_YFSCallBack *xdr = (void *)*_bp;
+ struct afs_cb_interest *old, *cbi = call->cbi;
+ u64 cb_expiry;
+
+ write_seqlock(&vnode->cb_lock);
+
+ if (!afs_cb_is_broken(call->cb_break, vnode, cbi)) {
+ cb_expiry = xdr_to_u64(xdr->expiration_time);
+ do_div(cb_expiry, 10 * 1000 * 1000);
+ vnode->cb_version = ntohl(xdr->version);
+ vnode->cb_type = ntohl(xdr->type);
+ vnode->cb_expires_at = cb_expiry + ktime_get_real_seconds();
+ old = vnode->cb_interest;
+ if (old != call->cbi) {
+ vnode->cb_interest = cbi;
+ cbi = old;
+ }
+ set_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
+ }
+
+ write_sequnlock(&vnode->cb_lock);
+ call->cbi = cbi;
+ *_bp += xdr_size(xdr);
+}
+
+static void xdr_decode_YFSCallBack_raw(const __be32 **_bp,
+ struct afs_callback *cb)
+{
+ struct yfs_xdr_YFSCallBack *x = (void *)*_bp;
+ u64 cb_expiry;
+
+ cb_expiry = xdr_to_u64(x->expiration_time);
+ do_div(cb_expiry, 10 * 1000 * 1000);
+ cb->version = ntohl(x->version);
+ cb->type = ntohl(x->type);
+ cb->expires_at = cb_expiry + ktime_get_real_seconds();
+
+ *_bp += xdr_size(x);
+}
+
+/*
+ * Decode a YFSVolSync block
+ */
+static void xdr_decode_YFSVolSync(const __be32 **_bp,
+ struct afs_volsync *volsync)
+{
+ struct yfs_xdr_YFSVolSync *x = (void *)*_bp;
+ u64 creation;
+
+ if (volsync) {
+ creation = xdr_to_u64(x->vol_creation_date);
+ do_div(creation, 10 * 1000 * 1000);
+ volsync->creation = creation;
+ }
+
+ *_bp += xdr_size(x);
+}
+
+/*
+ * Encode the requested attributes into a YFSStoreStatus block
+ */
+static __be32 *xdr_encode_YFS_StoreStatus(__be32 *bp, struct iattr *attr)
+{
+ struct yfs_xdr_YFSStoreStatus *x = (void *)bp;
+ s64 mtime = 0, owner = 0, group = 0;
+ u32 mask = 0, mode = 0;
+
+ mask = 0;
+ if (attr->ia_valid & ATTR_MTIME) {
+ mask |= AFS_SET_MTIME;
+ mtime = linux_to_yfs_time(&attr->ia_mtime);
+ }
+
+ if (attr->ia_valid & ATTR_UID) {
+ mask |= AFS_SET_OWNER;
+ owner = from_kuid(&init_user_ns, attr->ia_uid);
+ }
+
+ if (attr->ia_valid & ATTR_GID) {
+ mask |= AFS_SET_GROUP;
+ group = from_kgid(&init_user_ns, attr->ia_gid);
+ }
+
+ if (attr->ia_valid & ATTR_MODE) {
+ mask |= AFS_SET_MODE;
+ mode = attr->ia_mode & S_IALLUGO;
+ }
+
+ x->mask = htonl(mask);
+ x->mode = htonl(mode);
+ x->mtime_client = u64_to_xdr(mtime);
+ x->owner = u64_to_xdr(owner);
+ x->group = u64_to_xdr(group);
+ return bp + xdr_size(x);
+}
+
+/*
+ * Decode a YFSFetchVolumeStatus block.
+ */
+static void xdr_decode_YFSFetchVolumeStatus(const __be32 **_bp,
+ struct afs_volume_status *vs)
+{
+ const struct yfs_xdr_YFSFetchVolumeStatus *x = (const void *)*_bp;
+ u32 flags;
+
+ vs->vid = xdr_to_u64(x->vid);
+ vs->parent_id = xdr_to_u64(x->parent_id);
+ flags = ntohl(x->flags);
+ vs->online = flags & yfs_FVSOnline;
+ vs->in_service = flags & yfs_FVSInservice;
+ vs->blessed = flags & yfs_FVSBlessed;
+ vs->needs_salvage = flags & yfs_FVSNeedsSalvage;
+ vs->type = ntohl(x->type);
+ vs->min_quota = 0;
+ vs->max_quota = xdr_to_u64(x->max_quota);
+ vs->blocks_in_use = xdr_to_u64(x->blocks_in_use);
+ vs->part_blocks_avail = xdr_to_u64(x->part_blocks_avail);
+ vs->part_max_blocks = xdr_to_u64(x->part_max_blocks);
+ vs->vol_copy_date = xdr_to_u64(x->vol_copy_date);
+ vs->vol_backup_date = xdr_to_u64(x->vol_backup_date);
+ *_bp += sizeof(*x) / sizeof(__be32);
+}
+
+/*
+ * deliver reply data to an FS.FetchStatus
+ */
+static int yfs_deliver_fs_fetch_status_vnode(struct afs_call *call)
+{
+ struct afs_vnode *vnode = call->reply[0];
+ const __be32 *bp;
+ int ret;
+
+ ret = afs_transfer_reply(call);
+ if (ret < 0)
+ return ret;
+
+ _enter("{%llx:%llu}", vnode->fid.vid, vnode->fid.vnode);
+
+ /* unmarshall the reply once we've received all of it */
+ bp = call->buffer;
+ ret = yfs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ xdr_decode_YFSCallBack(call, vnode, &bp);
+ xdr_decode_YFSVolSync(&bp, call->reply[1]);
+
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+/*
+ * YFS.FetchStatus operation type
+ */
+static const struct afs_call_type yfs_RXYFSFetchStatus_vnode = {
+ .name = "YFS.FetchStatus(vnode)",
+ .op = yfs_FS_FetchStatus,
+ .deliver = yfs_deliver_fs_fetch_status_vnode,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Fetch the status information for a file.
+ */
+int yfs_fs_fetch_file_status(struct afs_fs_cursor *fc, struct afs_volsync *volsync,
+ bool new_inode)
+{
+ struct afs_vnode *vnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ __be32 *bp;
+
+ _enter(",%x,{%llx:%llu},,",
+ key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSFetchStatus_vnode,
+ sizeof(__be32) * 2 +
+ sizeof(struct yfs_xdr_YFSFid),
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSCallBack) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call) {
+ fc->ac.error = -ENOMEM;
+ return -ENOMEM;
+ }
+
+ call->key = fc->key;
+ call->reply[0] = vnode;
+ call->reply[1] = volsync;
+ call->expected_version = new_inode ? 1 : vnode->status.data_version;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSFETCHSTATUS);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &vnode->fid);
+ yfs_check_req(call, bp);
+
+ call->cb_break = fc->cb_break;
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to an YFS.FetchData64.
+ */
+static int yfs_deliver_fs_fetch_data64(struct afs_call *call)
+{
+ struct afs_vnode *vnode = call->reply[0];
+ struct afs_read *req = call->reply[2];
+ const __be32 *bp;
+ unsigned int size;
+ int ret;
+
+ _enter("{%u,%zu/%llu}",
+ call->unmarshall, iov_iter_count(&call->iter), req->actual_len);
+
+ switch (call->unmarshall) {
+ case 0:
+ req->actual_len = 0;
+ req->index = 0;
+ req->offset = req->pos & (PAGE_SIZE - 1);
+ afs_extract_to_tmp64(call);
+ call->unmarshall++;
+
+ /* extract the returned data length */
+ case 1:
+ _debug("extract data length");
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ req->actual_len = be64_to_cpu(call->tmp64);
+ _debug("DATA length: %llu", req->actual_len);
+ req->remain = min(req->len, req->actual_len);
+ if (req->remain == 0)
+ goto no_more_data;
+
+ call->unmarshall++;
+
+ begin_page:
+ ASSERTCMP(req->index, <, req->nr_pages);
+ if (req->remain > PAGE_SIZE - req->offset)
+ size = PAGE_SIZE - req->offset;
+ else
+ size = req->remain;
+ call->bvec[0].bv_len = size;
+ call->bvec[0].bv_offset = req->offset;
+ call->bvec[0].bv_page = req->pages[req->index];
+ iov_iter_bvec(&call->iter, READ, call->bvec, 1, size);
+ ASSERTCMP(size, <=, PAGE_SIZE);
+
+ /* extract the returned data */
+ case 2:
+ _debug("extract data %zu/%llu",
+ iov_iter_count(&call->iter), req->remain);
+
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+ req->remain -= call->bvec[0].bv_len;
+ req->offset += call->bvec[0].bv_len;
+ ASSERTCMP(req->offset, <=, PAGE_SIZE);
+ if (req->offset == PAGE_SIZE) {
+ req->offset = 0;
+ if (req->page_done)
+ req->page_done(call, req);
+ req->index++;
+ if (req->remain > 0)
+ goto begin_page;
+ }
+
+ ASSERTCMP(req->remain, ==, 0);
+ if (req->actual_len <= req->len)
+ goto no_more_data;
+
+ /* Discard any excess data the server gave us */
+ iov_iter_discard(&call->iter, READ, req->actual_len - req->len);
+ call->unmarshall = 3;
+ case 3:
+ _debug("extract discard %zu/%llu",
+ iov_iter_count(&call->iter), req->actual_len - req->len);
+
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ no_more_data:
+ call->unmarshall = 4;
+ afs_extract_to_buf(call,
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSCallBack) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+
+ /* extract the metadata */
+ case 4:
+ ret = afs_extract_data(call, false);
+ if (ret < 0)
+ return ret;
+
+ bp = call->buffer;
+ ret = yfs_decode_status(call, &bp, &vnode->status, vnode,
+ &vnode->status.data_version, req);
+ if (ret < 0)
+ return ret;
+ xdr_decode_YFSCallBack(call, vnode, &bp);
+ xdr_decode_YFSVolSync(&bp, call->reply[1]);
+
+ call->unmarshall++;
+
+ case 5:
+ break;
+ }
+
+ for (; req->index < req->nr_pages; req->index++) {
+ if (req->offset < PAGE_SIZE)
+ zero_user_segment(req->pages[req->index],
+ req->offset, PAGE_SIZE);
+ if (req->page_done)
+ req->page_done(call, req);
+ req->offset = 0;
+ }
+
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+static void yfs_fetch_data_destructor(struct afs_call *call)
+{
+ struct afs_read *req = call->reply[2];
+
+ afs_put_read(req);
+ afs_flat_call_destructor(call);
+}
+
+/*
+ * YFS.FetchData64 operation type
+ */
+static const struct afs_call_type yfs_RXYFSFetchData64 = {
+ .name = "YFS.FetchData64",
+ .op = yfs_FS_FetchData64,
+ .deliver = yfs_deliver_fs_fetch_data64,
+ .destructor = yfs_fetch_data_destructor,
+};
+
+/*
+ * Fetch data from a file.
+ */
+int yfs_fs_fetch_data(struct afs_fs_cursor *fc, struct afs_read *req)
+{
+ struct afs_vnode *vnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ __be32 *bp;
+
+ _enter(",%x,{%llx:%llu},%llx,%llx",
+ key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode,
+ req->pos, req->len);
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSFetchData64,
+ sizeof(__be32) * 2 +
+ sizeof(struct yfs_xdr_YFSFid) +
+ sizeof(struct yfs_xdr_u64) * 2,
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSCallBack) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = vnode;
+ call->reply[1] = NULL; /* volsync */
+ call->reply[2] = req;
+ call->expected_version = vnode->status.data_version;
+ call->want_reply_time = true;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSFETCHDATA64);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &vnode->fid);
+ bp = xdr_encode_u64(bp, req->pos);
+ bp = xdr_encode_u64(bp, req->len);
+ yfs_check_req(call, bp);
+
+ refcount_inc(&req->usage);
+ call->cb_break = fc->cb_break;
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data for YFS.CreateFile or YFS.MakeDir.
+ */
+static int yfs_deliver_fs_create_vnode(struct afs_call *call)
+{
+ struct afs_vnode *vnode = call->reply[0];
+ const __be32 *bp;
+ int ret;
+
+ _enter("{%u}", call->unmarshall);
+
+ ret = afs_transfer_reply(call);
+ if (ret < 0)
+ return ret;
+
+ /* unmarshall the reply once we've received all of it */
+ bp = call->buffer;
+ xdr_decode_YFSFid(&bp, call->reply[1]);
+ ret = yfs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL);
+ if (ret < 0)
+ return ret;
+ ret = yfs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ xdr_decode_YFSCallBack_raw(&bp, call->reply[3]);
+ xdr_decode_YFSVolSync(&bp, NULL);
+
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+/*
+ * FS.CreateFile and FS.MakeDir operation type
+ */
+static const struct afs_call_type afs_RXFSCreateFile = {
+ .name = "YFS.CreateFile",
+ .op = yfs_FS_CreateFile,
+ .deliver = yfs_deliver_fs_create_vnode,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Create a file.
+ */
+int yfs_fs_create_file(struct afs_fs_cursor *fc,
+ const char *name,
+ umode_t mode,
+ u64 current_data_version,
+ struct afs_fid *newfid,
+ struct afs_file_status *newstatus,
+ struct afs_callback *newcb)
+{
+ struct afs_vnode *vnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ size_t namesz, reqsz, rplsz;
+ __be32 *bp;
+
+ _enter("");
+
+ namesz = strlen(name);
+ reqsz = (sizeof(__be32) +
+ sizeof(__be32) +
+ sizeof(struct yfs_xdr_YFSFid) +
+ xdr_strlen(namesz) +
+ sizeof(struct yfs_xdr_YFSStoreStatus) +
+ sizeof(__be32));
+ rplsz = (sizeof(struct yfs_xdr_YFSFid) +
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSCallBack) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+
+ call = afs_alloc_flat_call(net, &afs_RXFSCreateFile, reqsz, rplsz);
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = vnode;
+ call->reply[1] = newfid;
+ call->reply[2] = newstatus;
+ call->reply[3] = newcb;
+ call->expected_version = current_data_version + 1;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSCREATEFILE);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &vnode->fid);
+ bp = xdr_encode_string(bp, name, namesz);
+ bp = xdr_encode_YFSStoreStatus_mode(bp, mode);
+ bp = xdr_encode_u32(bp, 0); /* ViceLockType */
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+static const struct afs_call_type yfs_RXFSMakeDir = {
+ .name = "YFS.MakeDir",
+ .op = yfs_FS_MakeDir,
+ .deliver = yfs_deliver_fs_create_vnode,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Make a directory.
+ */
+int yfs_fs_make_dir(struct afs_fs_cursor *fc,
+ const char *name,
+ umode_t mode,
+ u64 current_data_version,
+ struct afs_fid *newfid,
+ struct afs_file_status *newstatus,
+ struct afs_callback *newcb)
+{
+ struct afs_vnode *vnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ size_t namesz, reqsz, rplsz;
+ __be32 *bp;
+
+ _enter("");
+
+ namesz = strlen(name);
+ reqsz = (sizeof(__be32) +
+ sizeof(struct yfs_xdr_RPCFlags) +
+ sizeof(struct yfs_xdr_YFSFid) +
+ xdr_strlen(namesz) +
+ sizeof(struct yfs_xdr_YFSStoreStatus));
+ rplsz = (sizeof(struct yfs_xdr_YFSFid) +
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSCallBack) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+
+ call = afs_alloc_flat_call(net, &yfs_RXFSMakeDir, reqsz, rplsz);
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = vnode;
+ call->reply[1] = newfid;
+ call->reply[2] = newstatus;
+ call->reply[3] = newcb;
+ call->expected_version = current_data_version + 1;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSMAKEDIR);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &vnode->fid);
+ bp = xdr_encode_string(bp, name, namesz);
+ bp = xdr_encode_YFSStoreStatus_mode(bp, mode);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to a YFS.RemoveFile2 operation.
+ */
+static int yfs_deliver_fs_remove_file2(struct afs_call *call)
+{
+ struct afs_vnode *dvnode = call->reply[0];
+ struct afs_vnode *vnode = call->reply[1];
+ struct afs_fid fid;
+ const __be32 *bp;
+ int ret;
+
+ _enter("{%u}", call->unmarshall);
+
+ ret = afs_transfer_reply(call);
+ if (ret < 0)
+ return ret;
+
+ /* unmarshall the reply once we've received all of it */
+ bp = call->buffer;
+ ret = yfs_decode_status(call, &bp, &dvnode->status, dvnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+
+ xdr_decode_YFSFid(&bp, &fid);
+ ret = yfs_decode_status(call, &bp, &vnode->status, vnode, NULL, NULL);
+ if (ret < 0)
+ return ret;
+ /* Was deleted if vnode->status.abort_code == VNOVNODE. */
+
+ xdr_decode_YFSVolSync(&bp, NULL);
+ return 0;
+}
+
+/*
+ * YFS.RemoveFile2 operation type.
+ */
+static const struct afs_call_type yfs_RXYFSRemoveFile2 = {
+ .name = "YFS.RemoveFile2",
+ .op = yfs_FS_RemoveFile2,
+ .deliver = yfs_deliver_fs_remove_file2,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Remove a file and retrieve new file status.
+ */
+int yfs_fs_remove_file2(struct afs_fs_cursor *fc, struct afs_vnode *vnode,
+ const char *name, u64 current_data_version)
+{
+ struct afs_vnode *dvnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(dvnode);
+ size_t namesz;
+ __be32 *bp;
+
+ _enter("");
+
+ namesz = strlen(name);
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSRemoveFile2,
+ sizeof(__be32) +
+ sizeof(struct yfs_xdr_RPCFlags) +
+ sizeof(struct yfs_xdr_YFSFid) +
+ xdr_strlen(namesz),
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSFid) +
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = dvnode;
+ call->reply[1] = vnode;
+ call->expected_version = current_data_version + 1;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSREMOVEFILE2);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &dvnode->fid);
+ bp = xdr_encode_string(bp, name, namesz);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &dvnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to a YFS.RemoveFile or YFS.RemoveDir operation.
+ */
+static int yfs_deliver_fs_remove(struct afs_call *call)
+{
+ struct afs_vnode *dvnode = call->reply[0];
+ const __be32 *bp;
+ int ret;
+
+ _enter("{%u}", call->unmarshall);
+
+ ret = afs_transfer_reply(call);
+ if (ret < 0)
+ return ret;
+
+ /* unmarshall the reply once we've received all of it */
+ bp = call->buffer;
+ ret = yfs_decode_status(call, &bp, &dvnode->status, dvnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+
+ xdr_decode_YFSVolSync(&bp, NULL);
+ return 0;
+}
+
+/*
+ * FS.RemoveDir and FS.RemoveFile operation types.
+ */
+static const struct afs_call_type yfs_RXYFSRemoveFile = {
+ .name = "YFS.RemoveFile",
+ .op = yfs_FS_RemoveFile,
+ .deliver = yfs_deliver_fs_remove,
+ .destructor = afs_flat_call_destructor,
+};
+
+static const struct afs_call_type yfs_RXYFSRemoveDir = {
+ .name = "YFS.RemoveDir",
+ .op = yfs_FS_RemoveDir,
+ .deliver = yfs_deliver_fs_remove,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * remove a file or directory
+ */
+int yfs_fs_remove(struct afs_fs_cursor *fc, struct afs_vnode *vnode,
+ const char *name, bool isdir, u64 current_data_version)
+{
+ struct afs_vnode *dvnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(dvnode);
+ size_t namesz;
+ __be32 *bp;
+
+ _enter("");
+
+ namesz = strlen(name);
+ call = afs_alloc_flat_call(
+ net, isdir ? &yfs_RXYFSRemoveDir : &yfs_RXYFSRemoveFile,
+ sizeof(__be32) +
+ sizeof(struct yfs_xdr_RPCFlags) +
+ sizeof(struct yfs_xdr_YFSFid) +
+ xdr_strlen(namesz),
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = dvnode;
+ call->reply[1] = vnode;
+ call->expected_version = current_data_version + 1;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, isdir ? YFSREMOVEDIR : YFSREMOVEFILE);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &dvnode->fid);
+ bp = xdr_encode_string(bp, name, namesz);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &dvnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to a YFS.Link operation.
+ */
+static int yfs_deliver_fs_link(struct afs_call *call)
+{
+ struct afs_vnode *dvnode = call->reply[0], *vnode = call->reply[1];
+ const __be32 *bp;
+ int ret;
+
+ _enter("{%u}", call->unmarshall);
+
+ ret = afs_transfer_reply(call);
+ if (ret < 0)
+ return ret;
+
+ /* unmarshall the reply once we've received all of it */
+ bp = call->buffer;
+ ret = yfs_decode_status(call, &bp, &vnode->status, vnode, NULL, NULL);
+ if (ret < 0)
+ return ret;
+ ret = yfs_decode_status(call, &bp, &dvnode->status, dvnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ xdr_decode_YFSVolSync(&bp, NULL);
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+/*
+ * YFS.Link operation type.
+ */
+static const struct afs_call_type yfs_RXYFSLink = {
+ .name = "YFS.Link",
+ .op = yfs_FS_Link,
+ .deliver = yfs_deliver_fs_link,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Make a hard link.
+ */
+int yfs_fs_link(struct afs_fs_cursor *fc, struct afs_vnode *vnode,
+ const char *name, u64 current_data_version)
+{
+ struct afs_vnode *dvnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ size_t namesz;
+ __be32 *bp;
+
+ _enter("");
+
+ namesz = strlen(name);
+ call = afs_alloc_flat_call(net, &yfs_RXYFSLink,
+ sizeof(__be32) +
+ sizeof(struct yfs_xdr_RPCFlags) +
+ sizeof(struct yfs_xdr_YFSFid) +
+ xdr_strlen(namesz) +
+ sizeof(struct yfs_xdr_YFSFid),
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = dvnode;
+ call->reply[1] = vnode;
+ call->expected_version = current_data_version + 1;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSLINK);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &dvnode->fid);
+ bp = xdr_encode_string(bp, name, namesz);
+ bp = xdr_encode_YFSFid(bp, &vnode->fid);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to a YFS.Symlink operation.
+ */
+static int yfs_deliver_fs_symlink(struct afs_call *call)
+{
+ struct afs_vnode *vnode = call->reply[0];
+ const __be32 *bp;
+ int ret;
+
+ _enter("{%u}", call->unmarshall);
+
+ ret = afs_transfer_reply(call);
+ if (ret < 0)
+ return ret;
+
+ /* unmarshall the reply once we've received all of it */
+ bp = call->buffer;
+ xdr_decode_YFSFid(&bp, call->reply[1]);
+ ret = yfs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL);
+ if (ret < 0)
+ return ret;
+ ret = yfs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ xdr_decode_YFSVolSync(&bp, NULL);
+
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+/*
+ * YFS.Symlink operation type
+ */
+static const struct afs_call_type yfs_RXYFSSymlink = {
+ .name = "YFS.Symlink",
+ .op = yfs_FS_Symlink,
+ .deliver = yfs_deliver_fs_symlink,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Create a symbolic link.
+ */
+int yfs_fs_symlink(struct afs_fs_cursor *fc,
+ const char *name,
+ const char *contents,
+ u64 current_data_version,
+ struct afs_fid *newfid,
+ struct afs_file_status *newstatus)
+{
+ struct afs_vnode *dvnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(dvnode);
+ size_t namesz, contents_sz;
+ __be32 *bp;
+
+ _enter("");
+
+ namesz = strlen(name);
+ contents_sz = strlen(contents);
+ call = afs_alloc_flat_call(net, &yfs_RXYFSSymlink,
+ sizeof(__be32) +
+ sizeof(struct yfs_xdr_RPCFlags) +
+ sizeof(struct yfs_xdr_YFSFid) +
+ xdr_strlen(namesz) +
+ xdr_strlen(contents_sz) +
+ sizeof(struct yfs_xdr_YFSStoreStatus),
+ sizeof(struct yfs_xdr_YFSFid) +
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = dvnode;
+ call->reply[1] = newfid;
+ call->reply[2] = newstatus;
+ call->expected_version = current_data_version + 1;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSSYMLINK);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &dvnode->fid);
+ bp = xdr_encode_string(bp, name, namesz);
+ bp = xdr_encode_string(bp, contents, contents_sz);
+ bp = xdr_encode_YFSStoreStatus_mode(bp, S_IRWXUGO);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &dvnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to a YFS.Rename operation.
+ */
+static int yfs_deliver_fs_rename(struct afs_call *call)
+{
+ struct afs_vnode *orig_dvnode = call->reply[0];
+ struct afs_vnode *new_dvnode = call->reply[1];
+ const __be32 *bp;
+ int ret;
+
+ _enter("{%u}", call->unmarshall);
+
+ ret = afs_transfer_reply(call);
+ if (ret < 0)
+ return ret;
+
+ /* unmarshall the reply once we've received all of it */
+ bp = call->buffer;
+ ret = yfs_decode_status(call, &bp, &orig_dvnode->status, orig_dvnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ if (new_dvnode != orig_dvnode) {
+ ret = yfs_decode_status(call, &bp, &new_dvnode->status, new_dvnode,
+ &call->expected_version_2, NULL);
+ if (ret < 0)
+ return ret;
+ }
+
+ xdr_decode_YFSVolSync(&bp, NULL);
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+/*
+ * YFS.Rename operation type
+ */
+static const struct afs_call_type yfs_RXYFSRename = {
+ .name = "FS.Rename",
+ .op = yfs_FS_Rename,
+ .deliver = yfs_deliver_fs_rename,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Rename a file or directory.
+ */
+int yfs_fs_rename(struct afs_fs_cursor *fc,
+ const char *orig_name,
+ struct afs_vnode *new_dvnode,
+ const char *new_name,
+ u64 current_orig_data_version,
+ u64 current_new_data_version)
+{
+ struct afs_vnode *orig_dvnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(orig_dvnode);
+ size_t o_namesz, n_namesz;
+ __be32 *bp;
+
+ _enter("");
+
+ o_namesz = strlen(orig_name);
+ n_namesz = strlen(new_name);
+ call = afs_alloc_flat_call(net, &yfs_RXYFSRename,
+ sizeof(__be32) +
+ sizeof(struct yfs_xdr_RPCFlags) +
+ sizeof(struct yfs_xdr_YFSFid) +
+ xdr_strlen(o_namesz) +
+ sizeof(struct yfs_xdr_YFSFid) +
+ xdr_strlen(n_namesz),
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = orig_dvnode;
+ call->reply[1] = new_dvnode;
+ call->expected_version = current_orig_data_version + 1;
+ call->expected_version_2 = current_new_data_version + 1;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSRENAME);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &orig_dvnode->fid);
+ bp = xdr_encode_string(bp, orig_name, o_namesz);
+ bp = xdr_encode_YFSFid(bp, &new_dvnode->fid);
+ bp = xdr_encode_string(bp, new_name, n_namesz);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &orig_dvnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to a YFS.StoreData64 operation.
+ */
+static int yfs_deliver_fs_store_data(struct afs_call *call)
+{
+ struct afs_vnode *vnode = call->reply[0];
+ const __be32 *bp;
+ int ret;
+
+ _enter("");
+
+ ret = afs_transfer_reply(call);
+ if (ret < 0)
+ return ret;
+
+ /* unmarshall the reply once we've received all of it */
+ bp = call->buffer;
+ ret = yfs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ xdr_decode_YFSVolSync(&bp, NULL);
+
+ afs_pages_written_back(vnode, call);
+
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+/*
+ * YFS.StoreData64 operation type.
+ */
+static const struct afs_call_type yfs_RXYFSStoreData64 = {
+ .name = "YFS.StoreData64",
+ .op = yfs_FS_StoreData64,
+ .deliver = yfs_deliver_fs_store_data,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Store a set of pages to a large file.
+ */
+int yfs_fs_store_data(struct afs_fs_cursor *fc, struct address_space *mapping,
+ pgoff_t first, pgoff_t last,
+ unsigned offset, unsigned to)
+{
+ struct afs_vnode *vnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ loff_t size, pos, i_size;
+ __be32 *bp;
+
+ _enter(",%x,{%llx:%llu},,",
+ key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
+
+ size = (loff_t)to - (loff_t)offset;
+ if (first != last)
+ size += (loff_t)(last - first) << PAGE_SHIFT;
+ pos = (loff_t)first << PAGE_SHIFT;
+ pos += offset;
+
+ i_size = i_size_read(&vnode->vfs_inode);
+ if (pos + size > i_size)
+ i_size = size + pos;
+
+ _debug("size %llx, at %llx, i_size %llx",
+ (unsigned long long)size, (unsigned long long)pos,
+ (unsigned long long)i_size);
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSStoreData64,
+ sizeof(__be32) +
+ sizeof(__be32) +
+ sizeof(struct yfs_xdr_YFSFid) +
+ sizeof(struct yfs_xdr_YFSStoreStatus) +
+ sizeof(struct yfs_xdr_u64) * 3,
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->mapping = mapping;
+ call->reply[0] = vnode;
+ call->first = first;
+ call->last = last;
+ call->first_offset = offset;
+ call->last_to = to;
+ call->send_pages = true;
+ call->expected_version = vnode->status.data_version + 1;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSSTOREDATA64);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &vnode->fid);
+ bp = xdr_encode_YFSStoreStatus_mtime(bp, &vnode->vfs_inode.i_mtime);
+ bp = xdr_encode_u64(bp, pos);
+ bp = xdr_encode_u64(bp, size);
+ bp = xdr_encode_u64(bp, i_size);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * deliver reply data to an FS.StoreStatus
+ */
+static int yfs_deliver_fs_store_status(struct afs_call *call)
+{
+ struct afs_vnode *vnode = call->reply[0];
+ const __be32 *bp;
+ int ret;
+
+ _enter("");
+
+ ret = afs_transfer_reply(call);
+ if (ret < 0)
+ return ret;
+
+ /* unmarshall the reply once we've received all of it */
+ bp = call->buffer;
+ ret = yfs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ xdr_decode_YFSVolSync(&bp, NULL);
+
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+/*
+ * YFS.StoreStatus operation type
+ */
+static const struct afs_call_type yfs_RXYFSStoreStatus = {
+ .name = "YFS.StoreStatus",
+ .op = yfs_FS_StoreStatus,
+ .deliver = yfs_deliver_fs_store_status,
+ .destructor = afs_flat_call_destructor,
+};
+
+static const struct afs_call_type yfs_RXYFSStoreData64_as_Status = {
+ .name = "YFS.StoreData64",
+ .op = yfs_FS_StoreData64,
+ .deliver = yfs_deliver_fs_store_status,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Set the attributes on a file, using YFS.StoreData64 rather than
+ * YFS.StoreStatus so as to alter the file size also.
+ */
+static int yfs_fs_setattr_size(struct afs_fs_cursor *fc, struct iattr *attr)
+{
+ struct afs_vnode *vnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ __be32 *bp;
+
+ _enter(",%x,{%llx:%llu},,",
+ key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSStoreData64_as_Status,
+ sizeof(__be32) * 2 +
+ sizeof(struct yfs_xdr_YFSFid) +
+ sizeof(struct yfs_xdr_YFSStoreStatus) +
+ sizeof(struct yfs_xdr_u64) * 3,
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = vnode;
+ call->expected_version = vnode->status.data_version + 1;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSSTOREDATA64);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &vnode->fid);
+ bp = xdr_encode_YFS_StoreStatus(bp, attr);
+ bp = xdr_encode_u64(bp, 0); /* position of start of write */
+ bp = xdr_encode_u64(bp, 0); /* size of write */
+ bp = xdr_encode_u64(bp, attr->ia_size); /* new file length */
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Set the attributes on a file, using YFS.StoreData64 if there's a change in
+ * file size, and YFS.StoreStatus otherwise.
+ */
+int yfs_fs_setattr(struct afs_fs_cursor *fc, struct iattr *attr)
+{
+ struct afs_vnode *vnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ __be32 *bp;
+
+ if (attr->ia_valid & ATTR_SIZE)
+ return yfs_fs_setattr_size(fc, attr);
+
+ _enter(",%x,{%llx:%llu},,",
+ key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSStoreStatus,
+ sizeof(__be32) * 2 +
+ sizeof(struct yfs_xdr_YFSFid) +
+ sizeof(struct yfs_xdr_YFSStoreStatus),
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = vnode;
+ call->expected_version = vnode->status.data_version;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSSTORESTATUS);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &vnode->fid);
+ bp = xdr_encode_YFS_StoreStatus(bp, attr);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to a YFS.GetVolumeStatus operation.
+ */
+static int yfs_deliver_fs_get_volume_status(struct afs_call *call)
+{
+ const __be32 *bp;
+ char *p;
+ u32 size;
+ int ret;
+
+ _enter("{%u}", call->unmarshall);
+
+ switch (call->unmarshall) {
+ case 0:
+ call->unmarshall++;
+ afs_extract_to_buf(call, sizeof(struct yfs_xdr_YFSFetchVolumeStatus));
+
+ /* extract the returned status record */
+ case 1:
+ _debug("extract status");
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ bp = call->buffer;
+ xdr_decode_YFSFetchVolumeStatus(&bp, call->reply[1]);
+ call->unmarshall++;
+ afs_extract_to_tmp(call);
+
+ /* extract the volume name length */
+ case 2:
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ call->count = ntohl(call->tmp);
+ _debug("volname length: %u", call->count);
+ if (call->count >= AFSNAMEMAX)
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_volname_len);
+ size = (call->count + 3) & ~3; /* It's padded */
+ afs_extract_begin(call, call->reply[2], size);
+ call->unmarshall++;
+
+ /* extract the volume name */
+ case 3:
+ _debug("extract volname");
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ p = call->reply[2];
+ p[call->count] = 0;
+ _debug("volname '%s'", p);
+ afs_extract_to_tmp(call);
+ call->unmarshall++;
+
+ /* extract the offline message length */
+ case 4:
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ call->count = ntohl(call->tmp);
+ _debug("offline msg length: %u", call->count);
+ if (call->count >= AFSNAMEMAX)
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_offline_msg_len);
+ size = (call->count + 3) & ~3; /* It's padded */
+ afs_extract_begin(call, call->reply[2], size);
+ call->unmarshall++;
+
+ /* extract the offline message */
+ case 5:
+ _debug("extract offline");
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ p = call->reply[2];
+ p[call->count] = 0;
+ _debug("offline '%s'", p);
+
+ afs_extract_to_tmp(call);
+ call->unmarshall++;
+
+ /* extract the message of the day length */
+ case 6:
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ call->count = ntohl(call->tmp);
+ _debug("motd length: %u", call->count);
+ if (call->count >= AFSNAMEMAX)
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_motd_len);
+ size = (call->count + 3) & ~3; /* It's padded */
+ afs_extract_begin(call, call->reply[2], size);
+ call->unmarshall++;
+
+ /* extract the message of the day */
+ case 7:
+ _debug("extract motd");
+ ret = afs_extract_data(call, false);
+ if (ret < 0)
+ return ret;
+
+ p = call->reply[2];
+ p[call->count] = 0;
+ _debug("motd '%s'", p);
+
+ call->unmarshall++;
+
+ case 8:
+ break;
+ }
+
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+/*
+ * Destroy a YFS.GetVolumeStatus call.
+ */
+static void yfs_get_volume_status_call_destructor(struct afs_call *call)
+{
+ kfree(call->reply[2]);
+ call->reply[2] = NULL;
+ afs_flat_call_destructor(call);
+}
+
+/*
+ * YFS.GetVolumeStatus operation type
+ */
+static const struct afs_call_type yfs_RXYFSGetVolumeStatus = {
+ .name = "YFS.GetVolumeStatus",
+ .op = yfs_FS_GetVolumeStatus,
+ .deliver = yfs_deliver_fs_get_volume_status,
+ .destructor = yfs_get_volume_status_call_destructor,
+};
+
+/*
+ * fetch the status of a volume
+ */
+int yfs_fs_get_volume_status(struct afs_fs_cursor *fc,
+ struct afs_volume_status *vs)
+{
+ struct afs_vnode *vnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ __be32 *bp;
+ void *tmpbuf;
+
+ _enter("");
+
+ tmpbuf = kmalloc(AFSOPAQUEMAX, GFP_KERNEL);
+ if (!tmpbuf)
+ return -ENOMEM;
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSGetVolumeStatus,
+ sizeof(__be32) * 2 +
+ sizeof(struct yfs_xdr_u64),
+ sizeof(struct yfs_xdr_YFSFetchVolumeStatus) +
+ sizeof(__be32));
+ if (!call) {
+ kfree(tmpbuf);
+ return -ENOMEM;
+ }
+
+ call->key = fc->key;
+ call->reply[0] = vnode;
+ call->reply[1] = vs;
+ call->reply[2] = tmpbuf;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSGETVOLUMESTATUS);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_u64(bp, vnode->fid.vid);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to an YFS.SetLock, YFS.ExtendLock or YFS.ReleaseLock
+ */
+static int yfs_deliver_fs_xxxx_lock(struct afs_call *call)
+{
+ struct afs_vnode *vnode = call->reply[0];
+ const __be32 *bp;
+ int ret;
+
+ _enter("{%u}", call->unmarshall);
+
+ ret = afs_transfer_reply(call);
+ if (ret < 0)
+ return ret;
+
+ /* unmarshall the reply once we've received all of it */
+ bp = call->buffer;
+ ret = yfs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ xdr_decode_YFSVolSync(&bp, NULL);
+
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+/*
+ * YFS.SetLock operation type
+ */
+static const struct afs_call_type yfs_RXYFSSetLock = {
+ .name = "YFS.SetLock",
+ .op = yfs_FS_SetLock,
+ .deliver = yfs_deliver_fs_xxxx_lock,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * YFS.ExtendLock operation type
+ */
+static const struct afs_call_type yfs_RXYFSExtendLock = {
+ .name = "YFS.ExtendLock",
+ .op = yfs_FS_ExtendLock,
+ .deliver = yfs_deliver_fs_xxxx_lock,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * YFS.ReleaseLock operation type
+ */
+static const struct afs_call_type yfs_RXYFSReleaseLock = {
+ .name = "YFS.ReleaseLock",
+ .op = yfs_FS_ReleaseLock,
+ .deliver = yfs_deliver_fs_xxxx_lock,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Set a lock on a file
+ */
+int yfs_fs_set_lock(struct afs_fs_cursor *fc, afs_lock_type_t type)
+{
+ struct afs_vnode *vnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ __be32 *bp;
+
+ _enter("");
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSSetLock,
+ sizeof(__be32) * 2 +
+ sizeof(struct yfs_xdr_YFSFid) +
+ sizeof(__be32),
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = vnode;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSSETLOCK);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &vnode->fid);
+ bp = xdr_encode_u32(bp, type);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * extend a lock on a file
+ */
+int yfs_fs_extend_lock(struct afs_fs_cursor *fc)
+{
+ struct afs_vnode *vnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ __be32 *bp;
+
+ _enter("");
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSExtendLock,
+ sizeof(__be32) * 2 +
+ sizeof(struct yfs_xdr_YFSFid),
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = vnode;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSEXTENDLOCK);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &vnode->fid);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * release a lock on a file
+ */
+int yfs_fs_release_lock(struct afs_fs_cursor *fc)
+{
+ struct afs_vnode *vnode = fc->vnode;
+ struct afs_call *call;
+ struct afs_net *net = afs_v2net(vnode);
+ __be32 *bp;
+
+ _enter("");
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSReleaseLock,
+ sizeof(__be32) * 2 +
+ sizeof(struct yfs_xdr_YFSFid),
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call)
+ return -ENOMEM;
+
+ call->key = fc->key;
+ call->reply[0] = vnode;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSRELEASELOCK);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, &vnode->fid);
+ yfs_check_req(call, bp);
+
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &vnode->fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to an FS.FetchStatus with no vnode.
+ */
+static int yfs_deliver_fs_fetch_status(struct afs_call *call)
+{
+ struct afs_file_status *status = call->reply[1];
+ struct afs_callback *callback = call->reply[2];
+ struct afs_volsync *volsync = call->reply[3];
+ struct afs_vnode *vnode = call->reply[0];
+ const __be32 *bp;
+ int ret;
+
+ ret = afs_transfer_reply(call);
+ if (ret < 0)
+ return ret;
+
+ _enter("{%llx:%llu}", vnode->fid.vid, vnode->fid.vnode);
+
+ /* unmarshall the reply once we've received all of it */
+ bp = call->buffer;
+ ret = yfs_decode_status(call, &bp, status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ xdr_decode_YFSCallBack_raw(&bp, callback);
+ xdr_decode_YFSVolSync(&bp, volsync);
+
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+/*
+ * YFS.FetchStatus operation type
+ */
+static const struct afs_call_type yfs_RXYFSFetchStatus = {
+ .name = "YFS.FetchStatus",
+ .op = yfs_FS_FetchStatus,
+ .deliver = yfs_deliver_fs_fetch_status,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Fetch the status information for a fid without needing a vnode handle.
+ */
+int yfs_fs_fetch_status(struct afs_fs_cursor *fc,
+ struct afs_net *net,
+ struct afs_fid *fid,
+ struct afs_file_status *status,
+ struct afs_callback *callback,
+ struct afs_volsync *volsync)
+{
+ struct afs_call *call;
+ __be32 *bp;
+
+ _enter(",%x,{%llx:%llu},,",
+ key_serial(fc->key), fid->vid, fid->vnode);
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSFetchStatus,
+ sizeof(__be32) * 2 +
+ sizeof(struct yfs_xdr_YFSFid),
+ sizeof(struct yfs_xdr_YFSFetchStatus) +
+ sizeof(struct yfs_xdr_YFSCallBack) +
+ sizeof(struct yfs_xdr_YFSVolSync));
+ if (!call) {
+ fc->ac.error = -ENOMEM;
+ return -ENOMEM;
+ }
+
+ call->key = fc->key;
+ call->reply[0] = NULL; /* vnode for fid[0] */
+ call->reply[1] = status;
+ call->reply[2] = callback;
+ call->reply[3] = volsync;
+ call->expected_version = 1; /* vnode->status.data_version */
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSFETCHSTATUS);
+ bp = xdr_encode_u32(bp, 0); /* RPC flags */
+ bp = xdr_encode_YFSFid(bp, fid);
+ yfs_check_req(call, bp);
+
+ call->cb_break = fc->cb_break;
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, fid);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to an YFS.InlineBulkStatus call
+ */
+static int yfs_deliver_fs_inline_bulk_status(struct afs_call *call)
+{
+ struct afs_file_status *statuses;
+ struct afs_callback *callbacks;
+ struct afs_vnode *vnode = call->reply[0];
+ const __be32 *bp;
+ u32 tmp;
+ int ret;
+
+ _enter("{%u}", call->unmarshall);
+
+ switch (call->unmarshall) {
+ case 0:
+ afs_extract_to_tmp(call);
+ call->unmarshall++;
+
+ /* Extract the file status count and array in two steps */
+ case 1:
+ _debug("extract status count");
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ tmp = ntohl(call->tmp);
+ _debug("status count: %u/%u", tmp, call->count2);
+ if (tmp != call->count2)
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_ibulkst_count);
+
+ call->count = 0;
+ call->unmarshall++;
+ more_counts:
+ afs_extract_to_buf(call, sizeof(struct yfs_xdr_YFSFetchStatus));
+
+ case 2:
+ _debug("extract status array %u", call->count);
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ bp = call->buffer;
+ statuses = call->reply[1];
+ ret = yfs_decode_status(call, &bp, &statuses[call->count],
+ call->count == 0 ? vnode : NULL,
+ NULL, NULL);
+ if (ret < 0)
+ return ret;
+
+ call->count++;
+ if (call->count < call->count2)
+ goto more_counts;
+
+ call->count = 0;
+ call->unmarshall++;
+ afs_extract_to_tmp(call);
+
+ /* Extract the callback count and array in two steps */
+ case 3:
+ _debug("extract CB count");
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ tmp = ntohl(call->tmp);
+ _debug("CB count: %u", tmp);
+ if (tmp != call->count2)
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_ibulkst_cb_count);
+ call->count = 0;
+ call->unmarshall++;
+ more_cbs:
+ afs_extract_to_buf(call, sizeof(struct yfs_xdr_YFSCallBack));
+
+ case 4:
+ _debug("extract CB array");
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
+
+ _debug("unmarshall CB array");
+ bp = call->buffer;
+ callbacks = call->reply[2];
+ xdr_decode_YFSCallBack_raw(&bp, &callbacks[call->count]);
+ statuses = call->reply[1];
+ if (call->count == 0 && vnode && statuses[0].abort_code == 0) {
+ bp = call->buffer;
+ xdr_decode_YFSCallBack(call, vnode, &bp);
+ }
+ call->count++;
+ if (call->count < call->count2)
+ goto more_cbs;
+
+ afs_extract_to_buf(call, sizeof(struct yfs_xdr_YFSVolSync));
+ call->unmarshall++;
+
+ case 5:
+ ret = afs_extract_data(call, false);
+ if (ret < 0)
+ return ret;
+
+ bp = call->buffer;
+ xdr_decode_YFSVolSync(&bp, call->reply[3]);
+
+ call->unmarshall++;
+
+ case 6:
+ break;
+ }
+
+ _leave(" = 0 [done]");
+ return 0;
+}
+
+/*
+ * FS.InlineBulkStatus operation type
+ */
+static const struct afs_call_type yfs_RXYFSInlineBulkStatus = {
+ .name = "YFS.InlineBulkStatus",
+ .op = yfs_FS_InlineBulkStatus,
+ .deliver = yfs_deliver_fs_inline_bulk_status,
+ .destructor = afs_flat_call_destructor,
+};
+
+/*
+ * Fetch the status information for up to 1024 files
+ */
+int yfs_fs_inline_bulk_status(struct afs_fs_cursor *fc,
+ struct afs_net *net,
+ struct afs_fid *fids,
+ struct afs_file_status *statuses,
+ struct afs_callback *callbacks,
+ unsigned int nr_fids,
+ struct afs_volsync *volsync)
+{
+ struct afs_call *call;
+ __be32 *bp;
+ int i;
+
+ _enter(",%x,{%llx:%llu},%u",
+ key_serial(fc->key), fids[0].vid, fids[1].vnode, nr_fids);
+
+ call = afs_alloc_flat_call(net, &yfs_RXYFSInlineBulkStatus,
+ sizeof(__be32) +
+ sizeof(__be32) +
+ sizeof(__be32) +
+ sizeof(struct yfs_xdr_YFSFid) * nr_fids,
+ sizeof(struct yfs_xdr_YFSFetchStatus));
+ if (!call) {
+ fc->ac.error = -ENOMEM;
+ return -ENOMEM;
+ }
+
+ call->key = fc->key;
+ call->reply[0] = NULL; /* vnode for fid[0] */
+ call->reply[1] = statuses;
+ call->reply[2] = callbacks;
+ call->reply[3] = volsync;
+ call->count2 = nr_fids;
+
+ /* marshall the parameters */
+ bp = call->request;
+ bp = xdr_encode_u32(bp, YFSINLINEBULKSTATUS);
+ bp = xdr_encode_u32(bp, 0); /* RPCFlags */
+ bp = xdr_encode_u32(bp, nr_fids);
+ for (i = 0; i < nr_fids; i++)
+ bp = xdr_encode_YFSFid(bp, &fids[i]);
+ yfs_check_req(call, bp);
+
+ call->cb_break = fc->cb_break;
+ afs_use_fs_server(call, fc->cbi);
+ trace_afs_make_fs_call(call, &fids[0]);
+ return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
ret = ioprio_check_cap(iocb->aio_reqprio);
if (ret) {
pr_debug("aio ioprio check cap error: %d\n", ret);
+ fput(req->ki_filp);
return ret;
}
s->s_magic = BFS_MAGIC;
- if (le32_to_cpu(bfs_sb->s_start) > le32_to_cpu(bfs_sb->s_end)) {
+ if (le32_to_cpu(bfs_sb->s_start) > le32_to_cpu(bfs_sb->s_end) ||
+ le32_to_cpu(bfs_sb->s_start) < BFS_BSIZE) {
printf("Superblock is corrupted\n");
goto out1;
}
sizeof(struct bfs_inode)
+ BFS_ROOT_INO - 1;
imap_len = (info->si_lasti / 8) + 1;
- info->si_imap = kzalloc(imap_len, GFP_KERNEL);
- if (!info->si_imap)
+ info->si_imap = kzalloc(imap_len, GFP_KERNEL | __GFP_NOWARN);
+ if (!info->si_imap) {
+ printf("Cannot allocate %u bytes\n", imap_len);
goto out1;
+ }
for (i = 0; i < BFS_ROOT_INO; i++)
set_bit(i, info->si_imap);
dio->size = 0;
dio->multi_bio = false;
- dio->should_dirty = is_read && (iter->type == ITER_IOVEC);
+ dio->should_dirty = is_read && iter_is_iovec(iter);
blk_start_plug(&plug);
for (;;) {
int btrfs_drop_inode(struct inode *inode);
int __init btrfs_init_cachep(void);
void __cold btrfs_destroy_cachep(void);
+struct inode *btrfs_iget_path(struct super_block *s, struct btrfs_key *location,
+ struct btrfs_root *root, int *new,
+ struct btrfs_path *path);
struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
struct btrfs_root *root, int *was_new);
struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
struct btrfs_ioctl_space_info *space);
void btrfs_update_ioctl_balance_args(struct btrfs_fs_info *fs_info,
struct btrfs_ioctl_balance_args *bargs);
-int btrfs_dedupe_file_range(struct file *src_file, loff_t src_loff,
- struct file *dst_file, loff_t dst_loff,
- u64 olen);
/* file.c */
int __init btrfs_auto_defrag_init(void);
size_t num_pages, loff_t pos, size_t write_bytes,
struct extent_state **cached);
int btrfs_fdatawrite_range(struct inode *inode, loff_t start, loff_t end);
-int btrfs_clone_file_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out, u64 len);
+loff_t btrfs_remap_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t len, unsigned int remap_flags);
/* tree-defrag.c */
int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
int mirror_num = 0;
int failed_mirror = 0;
- clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
io_tree = &BTRFS_I(fs_info->btree_inode)->io_tree;
while (1) {
+ clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
ret = read_extent_buffer_pages(io_tree, eb, WAIT_COMPLETE,
mirror_num);
if (!ret) {
break;
}
- /*
- * This buffer's crc is fine, but its contents are corrupted, so
- * there is no reason to read the other copies, they won't be
- * any less wrong.
- */
- if (test_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags) ||
- ret == -EUCLEAN)
- break;
-
num_copies = btrfs_num_copies(fs_info,
eb->start, eb->len);
if (num_copies == 1)
struct btrfs_root *root = arg;
struct btrfs_fs_info *fs_info = root->fs_info;
int again;
- struct btrfs_trans_handle *trans;
- do {
+ while (1) {
again = 0;
/* Make the cleaner go to sleep early. */
*/
btrfs_delete_unused_bgs(fs_info);
sleep:
+ if (kthread_should_park())
+ kthread_parkme();
+ if (kthread_should_stop())
+ return 0;
if (!again) {
set_current_state(TASK_INTERRUPTIBLE);
- if (!kthread_should_stop())
- schedule();
+ schedule();
__set_current_state(TASK_RUNNING);
}
- } while (!kthread_should_stop());
-
- /*
- * Transaction kthread is stopped before us and wakes us up.
- * However we might have started a new transaction and COWed some
- * tree blocks when deleting unused block groups for example. So
- * make sure we commit the transaction we started to have a clean
- * shutdown when evicting the btree inode - if it has dirty pages
- * when we do the final iput() on it, eviction will trigger a
- * writeback for it which will fail with null pointer dereferences
- * since work queues and other resources were already released and
- * destroyed by the time the iput/eviction/writeback is made.
- */
- trans = btrfs_attach_transaction(root);
- if (IS_ERR(trans)) {
- if (PTR_ERR(trans) != -ENOENT)
- btrfs_err(fs_info,
- "cleaner transaction attach returned %ld",
- PTR_ERR(trans));
- } else {
- int ret;
-
- ret = btrfs_commit_transaction(trans);
- if (ret)
- btrfs_err(fs_info,
- "cleaner open transaction commit returned %d",
- ret);
}
-
- return 0;
}
static int transaction_kthread(void *arg)
int ret;
set_bit(BTRFS_FS_CLOSING_START, &fs_info->flags);
+ /*
+ * We don't want the cleaner to start new transactions, add more delayed
+ * iputs, etc. while we're closing. We can't use kthread_stop() yet
+ * because that frees the task_struct, and the transaction kthread might
+ * still try to wake up the cleaner.
+ */
+ kthread_park(fs_info->cleaner_kthread);
/* wait for the qgroup rescan worker to stop */
btrfs_qgroup_wait_for_completion(fs_info, false);
if (!sb_rdonly(fs_info->sb)) {
/*
- * If the cleaner thread is stopped and there are
- * block groups queued for removal, the deletion will be
- * skipped when we quit the cleaner thread.
+ * The cleaner kthread is stopped, so do one final pass over
+ * unused block groups.
*/
btrfs_delete_unused_bgs(fs_info);
unpin = pinned_extents;
again:
while (1) {
+ /*
+ * The btrfs_finish_extent_commit() may get the same range as
+ * ours between find_first_extent_bit and clear_extent_dirty.
+ * Hence, hold the unused_bg_unpin_mutex to avoid double unpin
+ * the same extent range.
+ */
+ mutex_lock(&fs_info->unused_bg_unpin_mutex);
ret = find_first_extent_bit(unpin, 0, &start, &end,
EXTENT_DIRTY, NULL);
- if (ret)
+ if (ret) {
+ mutex_unlock(&fs_info->unused_bg_unpin_mutex);
break;
+ }
clear_extent_dirty(unpin, start, end);
btrfs_error_unpin_extent_range(fs_info, start, end);
+ mutex_unlock(&fs_info->unused_bg_unpin_mutex);
cond_resched();
}
atomic_inc(&root->log_batch);
+ /*
+ * Before we acquired the inode's lock, someone may have dirtied more
+ * pages in the target range. We need to make sure that writeback for
+ * any such pages does not start while we are logging the inode, because
+ * if it does, any of the following might happen when we are not doing a
+ * full inode sync:
+ *
+ * 1) We log an extent after its writeback finishes but before its
+ * checksums are added to the csum tree, leading to -EIO errors
+ * when attempting to read the extent after a log replay.
+ *
+ * 2) We can end up logging an extent before its writeback finishes.
+ * Therefore after the log replay we will have a file extent item
+ * pointing to an unwritten extent (and no data checksums as well).
+ *
+ * So trigger writeback for any eventual new dirty pages and then we
+ * wait for all ordered extents to complete below.
+ */
+ ret = start_ordered_ops(inode, start, end);
+ if (ret) {
+ inode_unlock(inode);
+ goto out;
+ }
+
/*
* We have to do this here to avoid the priority inversion of waiting on
* IO of a lower priority task while holding a transaciton open.
#ifdef CONFIG_COMPAT
.compat_ioctl = btrfs_compat_ioctl,
#endif
- .clone_file_range = btrfs_clone_file_range,
- .dedupe_file_range = btrfs_dedupe_file_range,
+ .remap_file_range = btrfs_remap_file_range,
};
void __cold btrfs_auto_defrag_exit(void)
* sure NOFS is set to keep us from deadlocking.
*/
nofs_flag = memalloc_nofs_save();
- inode = btrfs_iget(fs_info->sb, &location, root, NULL);
+ inode = btrfs_iget_path(fs_info->sb, &location, root, NULL, path);
+ btrfs_release_path(path);
memalloc_nofs_restore(nofs_flag);
if (IS_ERR(inode))
return inode;
path->search_commit_root = 1;
path->skip_locking = 1;
+ /*
+ * We must pass a path with search_commit_root set to btrfs_iget in
+ * order to avoid a deadlock when allocating extents for the tree root.
+ *
+ * When we are COWing an extent buffer from the tree root, when looking
+ * for a free extent, at extent-tree.c:find_free_extent(), we can find
+ * block group without its free space cache loaded. When we find one
+ * we must load its space cache which requires reading its free space
+ * cache's inode item from the root tree. If this inode item is located
+ * in the same leaf that we started COWing before, then we end up in
+ * deadlock on the extent buffer (trying to read lock it when we
+ * previously write locked it).
+ *
+ * It's safe to read the inode item using the commit root because
+ * block groups, once loaded, stay in memory forever (until they are
+ * removed) as well as their space caches once loaded. New block groups
+ * once created get their ->cached field set to BTRFS_CACHE_FINISHED so
+ * we will never try to read their inode item while the fs is mounted.
+ */
inode = lookup_free_space_inode(fs_info, block_group, path);
if (IS_ERR(inode)) {
btrfs_free_path(path);
}
btrfs_release_path(path);
- if (cur_offset <= end && cow_start == (u64)-1) {
+ if (cur_offset <= end && cow_start == (u64)-1)
cow_start = cur_offset;
- cur_offset = end;
- }
if (cow_start != (u64)-1) {
+ cur_offset = end;
ret = cow_file_range(inode, locked_page, cow_start, end, end,
page_started, nr_written, 1, NULL);
if (ret)
/*
* read an inode from the btree into the in-memory inode
*/
-static int btrfs_read_locked_inode(struct inode *inode)
+static int btrfs_read_locked_inode(struct inode *inode,
+ struct btrfs_path *in_path)
{
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
- struct btrfs_path *path;
+ struct btrfs_path *path = in_path;
struct extent_buffer *leaf;
struct btrfs_inode_item *inode_item;
struct btrfs_root *root = BTRFS_I(inode)->root;
if (!ret)
filled = true;
- path = btrfs_alloc_path();
- if (!path)
- return -ENOMEM;
+ if (!path) {
+ path = btrfs_alloc_path();
+ if (!path)
+ return -ENOMEM;
+ }
memcpy(&location, &BTRFS_I(inode)->location, sizeof(location));
ret = btrfs_lookup_inode(NULL, root, path, &location, 0);
if (ret) {
- btrfs_free_path(path);
+ if (path != in_path)
+ btrfs_free_path(path);
return ret;
}
btrfs_ino(BTRFS_I(inode)),
root->root_key.objectid, ret);
}
- btrfs_free_path(path);
+ if (path != in_path)
+ btrfs_free_path(path);
if (!maybe_acls)
cache_no_acl(inode);
/* Get an inode object given its location and corresponding root.
* Returns in *is_new if the inode was read from disk
*/
-struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
- struct btrfs_root *root, int *new)
+struct inode *btrfs_iget_path(struct super_block *s, struct btrfs_key *location,
+ struct btrfs_root *root, int *new,
+ struct btrfs_path *path)
{
struct inode *inode;
if (inode->i_state & I_NEW) {
int ret;
- ret = btrfs_read_locked_inode(inode);
+ ret = btrfs_read_locked_inode(inode, path);
if (!ret) {
inode_tree_add(inode);
unlock_new_inode(inode);
return inode;
}
+struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
+ struct btrfs_root *root, int *new)
+{
+ return btrfs_iget_path(s, location, root, new, NULL);
+}
+
static struct inode *new_simple_dir(struct super_block *s,
struct btrfs_key *key,
struct btrfs_root *root)
const u64 sz = BTRFS_I(src)->root->fs_info->sectorsize;
len = round_down(i_size_read(src), sz) - loff;
+ if (len == 0)
+ return 0;
olen = len;
}
}
return ret;
}
-int btrfs_dedupe_file_range(struct file *src_file, loff_t src_loff,
- struct file *dst_file, loff_t dst_loff,
- u64 olen)
-{
- struct inode *src = file_inode(src_file);
- struct inode *dst = file_inode(dst_file);
- u64 bs = BTRFS_I(src)->root->fs_info->sb->s_blocksize;
-
- if (WARN_ON_ONCE(bs < PAGE_SIZE)) {
- /*
- * Btrfs does not support blocksize < page_size. As a
- * result, btrfs_cmp_data() won't correctly handle
- * this situation without an update.
- */
- return -EINVAL;
- }
-
- return btrfs_extent_same(src, src_loff, olen, dst, dst_loff);
-}
-
static int clone_finish_inode_update(struct btrfs_trans_handle *trans,
struct inode *inode,
u64 endoff,
goto out_unlock;
if (len == 0)
olen = len = src->i_size - off;
- /* if we extend to eof, continue to block boundary */
- if (off + len == src->i_size)
+ /*
+ * If we extend to eof, continue to block boundary if and only if the
+ * destination end offset matches the destination file's size, otherwise
+ * we would be corrupting data by placing the eof block into the middle
+ * of a file.
+ */
+ if (off + len == src->i_size) {
+ if (!IS_ALIGNED(len, bs) && destoff + len < inode->i_size)
+ goto out_unlock;
len = ALIGN(src->i_size, bs) - off;
+ }
if (len == 0) {
ret = 0;
return ret;
}
-int btrfs_clone_file_range(struct file *src_file, loff_t off,
- struct file *dst_file, loff_t destoff, u64 len)
+loff_t btrfs_remap_file_range(struct file *src_file, loff_t off,
+ struct file *dst_file, loff_t destoff, loff_t len,
+ unsigned int remap_flags)
{
- return btrfs_clone_files(dst_file, src_file, off, len, destoff);
+ int ret;
+
+ if (remap_flags & ~(REMAP_FILE_DEDUP | REMAP_FILE_ADVISORY))
+ return -EINVAL;
+
+ if (remap_flags & REMAP_FILE_DEDUP) {
+ struct inode *src = file_inode(src_file);
+ struct inode *dst = file_inode(dst_file);
+ u64 bs = BTRFS_I(src)->root->fs_info->sb->s_blocksize;
+
+ if (WARN_ON_ONCE(bs < PAGE_SIZE)) {
+ /*
+ * Btrfs does not support blocksize < page_size. As a
+ * result, btrfs_cmp_data() won't correctly handle
+ * this situation without an update.
+ */
+ return -EINVAL;
+ }
+
+ ret = btrfs_extent_same(src, off, len, dst, destoff);
+ } else {
+ ret = btrfs_clone_files(dst_file, src_file, off, len, destoff);
+ }
+ return ret < 0 ? ret : len;
}
static long btrfs_ioctl_default_subvol(struct file *file, void __user *argp)
int i;
u64 *i_qgroups;
struct btrfs_fs_info *fs_info = trans->fs_info;
- struct btrfs_root *quota_root = fs_info->quota_root;
+ struct btrfs_root *quota_root;
struct btrfs_qgroup *srcgroup;
struct btrfs_qgroup *dstgroup;
u32 level_size = 0;
if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags))
goto out;
+ quota_root = fs_info->quota_root;
if (!quota_root) {
ret = -EINVAL;
goto out;
restart:
if (update_backref_cache(trans, &rc->backref_cache)) {
btrfs_end_transaction(trans);
+ trans = NULL;
continue;
}
kfree(m);
}
-static void tail_append_pending_moves(struct pending_dir_move *moves,
+static void tail_append_pending_moves(struct send_ctx *sctx,
+ struct pending_dir_move *moves,
struct list_head *stack)
{
if (list_empty(&moves->list)) {
list_add_tail(&moves->list, stack);
list_splice_tail(&list, stack);
}
+ if (!RB_EMPTY_NODE(&moves->node)) {
+ rb_erase(&moves->node, &sctx->pending_dir_moves);
+ RB_CLEAR_NODE(&moves->node);
+ }
}
static int apply_children_dir_moves(struct send_ctx *sctx)
return 0;
INIT_LIST_HEAD(&stack);
- tail_append_pending_moves(pm, &stack);
+ tail_append_pending_moves(sctx, pm, &stack);
while (!list_empty(&stack)) {
pm = list_first_entry(&stack, struct pending_dir_move, list);
goto out;
pm = get_pending_dir_moves(sctx, parent_ino);
if (pm)
- tail_append_pending_moves(pm, &stack);
+ tail_append_pending_moves(sctx, pm, &stack);
}
return 0;
}
/* Used to sort the devices by max_avail(descending sort) */
-static int btrfs_cmp_device_free_bytes(const void *dev_info1,
+static inline int btrfs_cmp_device_free_bytes(const void *dev_info1,
const void *dev_info2)
{
if (((struct btrfs_device_info *)dev_info1)->max_avail >
* The helper to calc the free space on the devices that can be used to store
* file data.
*/
-static int btrfs_calc_avail_data_space(struct btrfs_fs_info *fs_info,
- u64 *free_bytes)
+static inline int btrfs_calc_avail_data_space(struct btrfs_fs_info *fs_info,
+ u64 *free_bytes)
{
struct btrfs_device_info *devices_info;
struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
vol = memdup_user((void __user *)arg, sizeof(*vol));
if (IS_ERR(vol))
return PTR_ERR(vol);
+ vol->name[BTRFS_PATH_NAME_MAX] = '\0';
switch (cmd) {
case BTRFS_IOC_SCAN_DEV:
/*
* Here we don't really care about alignment since extent allocator can
- * handle it. We care more about the size, as if one block group is
- * larger than maximum size, it's must be some obvious corruption.
+ * handle it. We care more about the size.
*/
- if (key->offset > BTRFS_MAX_DATA_CHUNK_SIZE || key->offset == 0) {
+ if (key->offset == 0) {
block_group_err(fs_info, leaf, slot,
- "invalid block group size, have %llu expect (0, %llu]",
- key->offset, BTRFS_MAX_DATA_CHUNK_SIZE);
+ "invalid block group size 0");
return -EUCLEAN;
}
type != (BTRFS_BLOCK_GROUP_METADATA |
BTRFS_BLOCK_GROUP_DATA)) {
block_group_err(fs_info, leaf, slot,
-"invalid type, have 0x%llx (%lu bits set) expect either 0x%llx, 0x%llx, 0x%llu or 0x%llx",
+"invalid type, have 0x%llx (%lu bits set) expect either 0x%llx, 0x%llx, 0x%llx or 0x%llx",
type, hweight64(type),
BTRFS_BLOCK_GROUP_DATA, BTRFS_BLOCK_GROUP_METADATA,
BTRFS_BLOCK_GROUP_SYSTEM,
logged_end = end;
list_for_each_entry_safe(em, n, &tree->modified_extents, list) {
+ /*
+ * Skip extents outside our logging range. It's important to do
+ * it for correctness because if we don't ignore them, we may
+ * log them before their ordered extent completes, and therefore
+ * we could log them without logging their respective checksums
+ * (the checksum items are added to the csum tree at the very
+ * end of btrfs_finish_ordered_io()). Also leave such extents
+ * outside of our range in the list, since we may have another
+ * ranged fsync in the near future that needs them. If an extent
+ * outside our range corresponds to a hole, log it to avoid
+ * leaving gaps between extents (fsck will complain when we are
+ * not using the NO_HOLES feature).
+ */
+ if ((em->start > end || em->start + em->len <= start) &&
+ em->block_start != EXTENT_MAP_HOLE)
+ continue;
+
list_del_init(&em->list);
/*
* Just an arbitrary number, this can be really CPU intensive
*/
bio = bio_alloc(GFP_NOIO, 1);
+ if (wbc) {
+ wbc_init_bio(wbc, bio);
+ wbc_account_io(wbc, bh->b_page, bh->b_size);
+ }
+
bio->bi_iter.bi_sector = bh->b_blocknr * (bh->b_size >> 9);
bio_set_dev(bio, bh->b_bdev);
bio->bi_write_hint = write_hint;
op_flags |= REQ_PRIO;
bio_set_op_attrs(bio, op, op_flags);
- if (wbc) {
- wbc_init_bio(wbc, bio);
- wbc_account_io(wbc, bh->b_page, bh->b_size);
- }
-
submit_bio(bio);
return 0;
}
ASSERT(!test_bit(CACHEFILES_OBJECT_ACTIVE, &xobject->flags));
- cache->cache.ops->put_object(&xobject->fscache, cachefiles_obj_put_wait_retry);
+ cache->cache.ops->put_object(&xobject->fscache,
+ (enum fscache_obj_ref_trace)cachefiles_obj_put_wait_retry);
goto try_again;
requeue:
- cache->cache.ops->put_object(&xobject->fscache, cachefiles_obj_put_wait_timeo);
+ cache->cache.ops->put_object(&xobject->fscache,
+ (enum fscache_obj_ref_trace)cachefiles_obj_put_wait_timeo);
_leave(" = -ETIMEDOUT");
return -ETIMEDOUT;
}
try_again:
/* first step is to make up a grave dentry in the graveyard */
sprintf(nbuffer, "%08x%08x",
- (uint32_t) get_seconds(),
+ (uint32_t) ktime_get_real_seconds(),
(uint32_t) atomic_inc_return(&cache->gravecounter));
/* do the multiway lock magic */
netpage->index, cachefiles_gfp);
if (ret < 0) {
if (ret == -EEXIST) {
+ put_page(backpage);
+ backpage = NULL;
put_page(netpage);
+ netpage = NULL;
fscache_retrieval_complete(op, 1);
continue;
}
netpage->index, cachefiles_gfp);
if (ret < 0) {
if (ret == -EEXIST) {
+ put_page(backpage);
+ backpage = NULL;
put_page(netpage);
+ netpage = NULL;
fscache_retrieval_complete(op, 1);
continue;
}
__releases(&object->fscache.cookie->lock)
{
struct cachefiles_object *object;
- struct cachefiles_cache *cache;
object = container_of(_object, struct cachefiles_object, fscache);
- cache = container_of(object->fscache.cache,
- struct cachefiles_cache, cache);
_enter("%p,{%lu}", object, page->index);
struct dentry *dentry = object->dentry;
int ret;
- ASSERT(dentry);
+ if (!dentry)
+ return -ESTALE;
_enter("%p,#%d", object, auxdata->len);
more = len < iov_iter_count(to);
- if (unlikely(to->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(to))) {
ret = iov_iter_get_pages_alloc(to, &pages, len,
&page_off);
if (ret <= 0) {
ret += zlen;
}
- if (unlikely(to->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(to))) {
if (ret > 0) {
iov_iter_advance(to, ret);
off += ret;
aio_req->total_len = rc + zlen;
}
- iov_iter_bvec(&i, ITER_BVEC, osd_data->bvec_pos.bvecs,
+ iov_iter_bvec(&i, READ, osd_data->bvec_pos.bvecs,
osd_data->num_bvecs,
osd_data->bvec_pos.iter.bi_size);
iov_iter_advance(&i, rc);
int zlen = min_t(size_t, len - ret,
size - pos - ret);
- iov_iter_bvec(&i, ITER_BVEC, bvecs, num_pages,
- len);
+ iov_iter_bvec(&i, READ, bvecs, num_pages, len);
iov_iter_advance(&i, ret);
iov_iter_zero(zlen, &i);
ret += zlen;
if (!prealloc_cf)
return -ENOMEM;
- /* Start by sync'ing the source file */
+ /* Start by sync'ing the source and destination files */
ret = file_write_and_wait_range(src_file, src_off, (src_off + len));
- if (ret < 0)
+ if (ret < 0) {
+ dout("failed to write src file (%zd)\n", ret);
+ goto out;
+ }
+ ret = file_write_and_wait_range(dst_file, dst_off, (dst_off + len));
+ if (ret < 0) {
+ dout("failed to write dst file (%zd)\n", ret);
goto out;
+ }
/*
* We need FILE_WR caps for dst_ci and FILE_RD for src_ci as other
info->symlink = *p;
*p += info->symlink_len;
- if (features & CEPH_FEATURE_DIRLAYOUTHASH)
- ceph_decode_copy_safe(p, end, &info->dir_layout,
- sizeof(info->dir_layout), bad);
- else
- memset(&info->dir_layout, 0, sizeof(info->dir_layout));
-
+ ceph_decode_copy_safe(p, end, &info->dir_layout,
+ sizeof(info->dir_layout), bad);
ceph_decode_32_safe(p, end, info->xattr_len, bad);
ceph_decode_need(p, end, info->xattr_len, bad);
info->xattr_data = *p;
recon_state.pagelist = pagelist;
if (session->s_con.peer_features & CEPH_FEATURE_MDSENC)
recon_state.msg_version = 3;
- else if (session->s_con.peer_features & CEPH_FEATURE_FLOCK)
- recon_state.msg_version = 2;
else
- recon_state.msg_version = 1;
+ recon_state.msg_version = 2;
err = iterate_session_caps(session, encode_caps_cb, &recon_state);
if (err < 0)
goto fail;
ceph_put_snap_realm(mdsc, realm);
realm = next;
}
- ceph_put_snap_realm(mdsc, realm);
+ if (realm)
+ ceph_put_snap_realm(mdsc, realm);
up_read(&mdsc->snap_rwsem);
return exceeded;
config CIFS_POSIX
bool "CIFS POSIX Extensions"
- depends on CIFS_XATTR
+ depends on CIFS && CIFS_ALLOW_INSECURE_LEGACY && CIFS_XATTR
help
Enabling this option will cause the cifs client to attempt to
negotiate a newer dialect with servers, such as Samba 3.0.5
seq_printf(m, "\t\tIPv6: %pI6\n", &ipv6->sin6_addr);
}
+static int cifs_debug_files_proc_show(struct seq_file *m, void *v)
+{
+ struct list_head *stmp, *tmp, *tmp1, *tmp2;
+ struct TCP_Server_Info *server;
+ struct cifs_ses *ses;
+ struct cifs_tcon *tcon;
+ struct cifsFileInfo *cfile;
+
+ seq_puts(m, "# Version:1\n");
+ seq_puts(m, "# Format:\n");
+ seq_puts(m, "# <tree id> <persistent fid> <flags> <count> <pid> <uid>");
+#ifdef CONFIG_CIFS_DEBUG2
+ seq_printf(m, " <filename> <mid>\n");
+#else
+ seq_printf(m, " <filename>\n");
+#endif /* CIFS_DEBUG2 */
+ spin_lock(&cifs_tcp_ses_lock);
+ list_for_each(stmp, &cifs_tcp_ses_list) {
+ server = list_entry(stmp, struct TCP_Server_Info,
+ tcp_ses_list);
+ list_for_each(tmp, &server->smb_ses_list) {
+ ses = list_entry(tmp, struct cifs_ses, smb_ses_list);
+ list_for_each(tmp1, &ses->tcon_list) {
+ tcon = list_entry(tmp1, struct cifs_tcon, tcon_list);
+ spin_lock(&tcon->open_file_lock);
+ list_for_each(tmp2, &tcon->openFileList) {
+ cfile = list_entry(tmp2, struct cifsFileInfo,
+ tlist);
+ seq_printf(m,
+ "0x%x 0x%llx 0x%x %d %d %d %s",
+ tcon->tid,
+ cfile->fid.persistent_fid,
+ cfile->f_flags,
+ cfile->count,
+ cfile->pid,
+ from_kuid(&init_user_ns, cfile->uid),
+ cfile->dentry->d_name.name);
+#ifdef CONFIG_CIFS_DEBUG2
+ seq_printf(m, " 0x%llx\n", cfile->fid.mid);
+#else
+ seq_printf(m, "\n");
+#endif /* CIFS_DEBUG2 */
+ }
+ spin_unlock(&tcon->open_file_lock);
+ }
+ }
+ }
+ spin_unlock(&cifs_tcp_ses_lock);
+ seq_putc(m, '\n');
+ return 0;
+}
+
static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
{
struct list_head *tmp1, *tmp2, *tmp3;
proc_create_single("DebugData", 0, proc_fs_cifs,
cifs_debug_data_proc_show);
+ proc_create_single("open_files", 0400, proc_fs_cifs,
+ cifs_debug_files_proc_show);
+
proc_create("Stats", 0644, proc_fs_cifs, &cifs_stats_proc_fops);
proc_create("cifsFYI", 0644, proc_fs_cifs, &cifsFYI_proc_fops);
proc_create("traceSMB", 0644, proc_fs_cifs, &traceSMB_proc_fops);
return;
remove_proc_entry("DebugData", proc_fs_cifs);
+ remove_proc_entry("open_files", proc_fs_cifs);
remove_proc_entry("cifsFYI", proc_fs_cifs);
remove_proc_entry("traceSMB", proc_fs_cifs);
remove_proc_entry("Stats", proc_fs_cifs);
sprintf(dp, ";sec=krb5");
else if (server->sec_mskerberos)
sprintf(dp, ";sec=mskrb5");
- else
- goto out;
+ else {
+ cifs_dbg(VFS, "unknown or missing server auth type, use krb5\n");
+ sprintf(dp, ";sec=krb5");
+ }
dp = description + strlen(description);
sprintf(dp, ";uid=0x%x",
.listxattr = cifs_listxattr,
};
-static int cifs_clone_file_range(struct file *src_file, loff_t off,
- struct file *dst_file, loff_t destoff, u64 len)
+static loff_t cifs_remap_file_range(struct file *src_file, loff_t off,
+ struct file *dst_file, loff_t destoff, loff_t len,
+ unsigned int remap_flags)
{
struct inode *src_inode = file_inode(src_file);
struct inode *target_inode = file_inode(dst_file);
struct cifsFileInfo *smb_file_src = src_file->private_data;
- struct cifsFileInfo *smb_file_target = dst_file->private_data;
- struct cifs_tcon *target_tcon = tlink_tcon(smb_file_target->tlink);
+ struct cifsFileInfo *smb_file_target;
+ struct cifs_tcon *target_tcon;
unsigned int xid;
int rc;
+ if (remap_flags & ~REMAP_FILE_ADVISORY)
+ return -EINVAL;
+
cifs_dbg(FYI, "clone range\n");
xid = get_xid();
goto out;
}
+ smb_file_target = dst_file->private_data;
+ target_tcon = tlink_tcon(smb_file_target->tlink);
+
/*
* Note: cifs case is easier than btrfs since server responsible for
* checks for proper open modes and file type and if it wants
unlock_two_nondirectories(src_inode, target_inode);
out:
free_xid(xid);
- return rc;
+ return rc < 0 ? rc : len;
}
ssize_t cifs_file_copychunk_range(unsigned int xid,
.llseek = cifs_llseek,
.unlocked_ioctl = cifs_ioctl,
.copy_file_range = cifs_copy_file_range,
- .clone_file_range = cifs_clone_file_range,
+ .remap_file_range = cifs_remap_file_range,
.setlease = cifs_setlease,
.fallocate = cifs_fallocate,
};
.llseek = cifs_llseek,
.unlocked_ioctl = cifs_ioctl,
.copy_file_range = cifs_copy_file_range,
- .clone_file_range = cifs_clone_file_range,
+ .remap_file_range = cifs_remap_file_range,
.setlease = cifs_setlease,
.fallocate = cifs_fallocate,
};
const struct file_operations cifs_file_direct_ops = {
- /* BB reevaluate whether they can be done with directio, no cache */
- .read_iter = cifs_user_readv,
- .write_iter = cifs_user_writev,
+ .read_iter = cifs_direct_readv,
+ .write_iter = cifs_direct_writev,
.open = cifs_open,
.release = cifs_close,
.lock = cifs_lock,
.splice_write = iter_file_splice_write,
.unlocked_ioctl = cifs_ioctl,
.copy_file_range = cifs_copy_file_range,
- .clone_file_range = cifs_clone_file_range,
+ .remap_file_range = cifs_remap_file_range,
.llseek = cifs_llseek,
.setlease = cifs_setlease,
.fallocate = cifs_fallocate,
.llseek = cifs_llseek,
.unlocked_ioctl = cifs_ioctl,
.copy_file_range = cifs_copy_file_range,
- .clone_file_range = cifs_clone_file_range,
+ .remap_file_range = cifs_remap_file_range,
.setlease = cifs_setlease,
.fallocate = cifs_fallocate,
};
.llseek = cifs_llseek,
.unlocked_ioctl = cifs_ioctl,
.copy_file_range = cifs_copy_file_range,
- .clone_file_range = cifs_clone_file_range,
+ .remap_file_range = cifs_remap_file_range,
.setlease = cifs_setlease,
.fallocate = cifs_fallocate,
};
const struct file_operations cifs_file_direct_nobrl_ops = {
- /* BB reevaluate whether they can be done with directio, no cache */
- .read_iter = cifs_user_readv,
- .write_iter = cifs_user_writev,
+ .read_iter = cifs_direct_readv,
+ .write_iter = cifs_direct_writev,
.open = cifs_open,
.release = cifs_close,
.fsync = cifs_fsync,
.splice_write = iter_file_splice_write,
.unlocked_ioctl = cifs_ioctl,
.copy_file_range = cifs_copy_file_range,
- .clone_file_range = cifs_clone_file_range,
+ .remap_file_range = cifs_remap_file_range,
.llseek = cifs_llseek,
.setlease = cifs_setlease,
.fallocate = cifs_fallocate,
.read = generic_read_dir,
.unlocked_ioctl = cifs_ioctl,
.copy_file_range = cifs_copy_file_range,
- .clone_file_range = cifs_clone_file_range,
+ .remap_file_range = cifs_remap_file_range,
.llseek = generic_file_llseek,
.fsync = cifs_dir_fsync,
};
extern int cifs_close(struct inode *inode, struct file *file);
extern int cifs_closedir(struct inode *inode, struct file *file);
extern ssize_t cifs_user_readv(struct kiocb *iocb, struct iov_iter *to);
+extern ssize_t cifs_direct_readv(struct kiocb *iocb, struct iov_iter *to);
extern ssize_t cifs_strict_readv(struct kiocb *iocb, struct iov_iter *to);
extern ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from);
+extern ssize_t cifs_direct_writev(struct kiocb *iocb, struct iov_iter *from);
extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from);
extern int cifs_lock(struct file *, int, struct file_lock *);
extern int cifs_fsync(struct file *, loff_t, loff_t, int);
__u8 create_guid[16];
struct cifs_pending_open *pending_open;
unsigned int epoch;
+#ifdef CONFIG_CIFS_DEBUG2
+ __u64 mid;
+#endif /* CIFS_DEBUG2 */
bool purge_cache;
};
unsigned int len;
unsigned int total_len;
bool should_dirty;
+ /*
+ * Indicates if this aio_ctx is for direct_io,
+ * If yes, iter is a copy of the user passed iov_iter
+ */
+ bool direct_io;
};
struct cifs_readdata;
char PathBuffer[0];
} __attribute__((packed));
+/* Flag above */
+#define SYMLINK_FLAG_RELATIVE 0x00000001
+
/* For IO_REPARSE_TAG_NFS */
#define NFS_SPECFILE_LNK 0x00000000014B4E4C
#define NFS_SPECFILE_CHR 0x0000000000524843
{
struct msghdr smb_msg;
struct kvec iov = {.iov_base = buf, .iov_len = to_read};
- iov_iter_kvec(&smb_msg.msg_iter, READ | ITER_KVEC, &iov, 1, to_read);
+ iov_iter_kvec(&smb_msg.msg_iter, READ, &iov, 1, to_read);
return cifs_readv_from_socket(server, &smb_msg);
}
struct msghdr smb_msg;
struct bio_vec bv = {
.bv_page = page, .bv_len = to_read, .bv_offset = page_offset};
- iov_iter_bvec(&smb_msg.msg_iter, READ | ITER_BVEC, &bv, 1, to_read);
+ iov_iter_bvec(&smb_msg.msg_iter, READ, &bv, 1, to_read);
return cifs_readv_from_socket(server, &smb_msg);
}
cifs_dbg(FYI, "using cifs_sb prepath <%s>\n", cifs_sb->prepath);
memcpy(full_path+dfsplen+1, cifs_sb->prepath, pplen-1);
- full_path[dfsplen] = '\\';
+ full_path[dfsplen] = dirsep;
for (i = 0; i < pplen-1; i++)
if (full_path[dfsplen+1+i] == '/')
full_path[dfsplen+1+i] = CIFS_DIR_SEP(cifs_sb);
* Set the byte-range lock (mandatory style). Returns:
* 1) 0, if we set the lock and don't need to request to the server;
* 2) 1, if no locks prevent us but we need to request to the server;
- * 3) -EACCESS, if there is a lock that prevents us and wait is false.
+ * 3) -EACCES, if there is a lock that prevents us and wait is false.
*/
static int
cifs_lock_add_if(struct cifsFileInfo *cfile, struct cifsLockInfo *lock,
return 0;
}
+static int
+cifs_resend_wdata(struct cifs_writedata *wdata, struct list_head *wdata_list,
+ struct cifs_aio_ctx *ctx)
+{
+ unsigned int wsize, credits;
+ int rc;
+ struct TCP_Server_Info *server =
+ tlink_tcon(wdata->cfile->tlink)->ses->server;
+
+ /*
+ * Wait for credits to resend this wdata.
+ * Note: we are attempting to resend the whole wdata not in segments
+ */
+ do {
+ rc = server->ops->wait_mtu_credits(
+ server, wdata->bytes, &wsize, &credits);
+
+ if (rc)
+ goto out;
+
+ if (wsize < wdata->bytes) {
+ add_credits_and_wake_if(server, credits, 0);
+ msleep(1000);
+ }
+ } while (wsize < wdata->bytes);
+
+ rc = -EAGAIN;
+ while (rc == -EAGAIN) {
+ rc = 0;
+ if (wdata->cfile->invalidHandle)
+ rc = cifs_reopen_file(wdata->cfile, false);
+ if (!rc)
+ rc = server->ops->async_writev(wdata,
+ cifs_uncached_writedata_release);
+ }
+
+ if (!rc) {
+ list_add_tail(&wdata->list, wdata_list);
+ return 0;
+ }
+
+ add_credits_and_wake_if(server, wdata->credits, 0);
+out:
+ kref_put(&wdata->refcount, cifs_uncached_writedata_release);
+
+ return rc;
+}
+
static int
cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
struct cifsFileInfo *open_file,
loff_t saved_offset = offset;
pid_t pid;
struct TCP_Server_Info *server;
+ struct page **pagevec;
+ size_t start;
if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_RWPIDFORWARD)
pid = open_file->pid;
if (rc)
break;
- nr_pages = get_numpages(wsize, len, &cur_len);
- wdata = cifs_writedata_alloc(nr_pages,
+ if (ctx->direct_io) {
+ ssize_t result;
+
+ result = iov_iter_get_pages_alloc(
+ from, &pagevec, wsize, &start);
+ if (result < 0) {
+ cifs_dbg(VFS,
+ "direct_writev couldn't get user pages "
+ "(rc=%zd) iter type %d iov_offset %zd "
+ "count %zd\n",
+ result, from->type,
+ from->iov_offset, from->count);
+ dump_stack();
+ break;
+ }
+ cur_len = (size_t)result;
+ iov_iter_advance(from, cur_len);
+
+ nr_pages =
+ (cur_len + start + PAGE_SIZE - 1) / PAGE_SIZE;
+
+ wdata = cifs_writedata_direct_alloc(pagevec,
cifs_uncached_writev_complete);
- if (!wdata) {
- rc = -ENOMEM;
- add_credits_and_wake_if(server, credits, 0);
- break;
- }
+ if (!wdata) {
+ rc = -ENOMEM;
+ add_credits_and_wake_if(server, credits, 0);
+ break;
+ }
- rc = cifs_write_allocate_pages(wdata->pages, nr_pages);
- if (rc) {
- kfree(wdata);
- add_credits_and_wake_if(server, credits, 0);
- break;
- }
- num_pages = nr_pages;
- rc = wdata_fill_from_iovec(wdata, from, &cur_len, &num_pages);
- if (rc) {
- for (i = 0; i < nr_pages; i++)
- put_page(wdata->pages[i]);
- kfree(wdata);
- add_credits_and_wake_if(server, credits, 0);
- break;
- }
+ wdata->page_offset = start;
+ wdata->tailsz =
+ nr_pages > 1 ?
+ cur_len - (PAGE_SIZE - start) -
+ (nr_pages - 2) * PAGE_SIZE :
+ cur_len;
+ } else {
+ nr_pages = get_numpages(wsize, len, &cur_len);
+ wdata = cifs_writedata_alloc(nr_pages,
+ cifs_uncached_writev_complete);
+ if (!wdata) {
+ rc = -ENOMEM;
+ add_credits_and_wake_if(server, credits, 0);
+ break;
+ }
- /*
- * Bring nr_pages down to the number of pages we actually used,
- * and free any pages that we didn't use.
- */
- for ( ; nr_pages > num_pages; nr_pages--)
- put_page(wdata->pages[nr_pages - 1]);
+ rc = cifs_write_allocate_pages(wdata->pages, nr_pages);
+ if (rc) {
+ kfree(wdata);
+ add_credits_and_wake_if(server, credits, 0);
+ break;
+ }
+
+ num_pages = nr_pages;
+ rc = wdata_fill_from_iovec(
+ wdata, from, &cur_len, &num_pages);
+ if (rc) {
+ for (i = 0; i < nr_pages; i++)
+ put_page(wdata->pages[i]);
+ kfree(wdata);
+ add_credits_and_wake_if(server, credits, 0);
+ break;
+ }
+
+ /*
+ * Bring nr_pages down to the number of pages we
+ * actually used, and free any pages that we didn't use.
+ */
+ for ( ; nr_pages > num_pages; nr_pages--)
+ put_page(wdata->pages[nr_pages - 1]);
+
+ wdata->tailsz = cur_len - ((nr_pages - 1) * PAGE_SIZE);
+ }
wdata->sync_mode = WB_SYNC_ALL;
wdata->nr_pages = nr_pages;
wdata->pid = pid;
wdata->bytes = cur_len;
wdata->pagesz = PAGE_SIZE;
- wdata->tailsz = cur_len - ((nr_pages - 1) * PAGE_SIZE);
wdata->credits = credits;
wdata->ctx = ctx;
kref_get(&ctx->refcount);
INIT_LIST_HEAD(&tmp_list);
list_del_init(&wdata->list);
- iov_iter_advance(&tmp_from,
+ if (ctx->direct_io)
+ rc = cifs_resend_wdata(
+ wdata, &tmp_list, ctx);
+ else {
+ iov_iter_advance(&tmp_from,
wdata->offset - ctx->pos);
- rc = cifs_write_from_iter(wdata->offset,
+ rc = cifs_write_from_iter(wdata->offset,
wdata->bytes, &tmp_from,
ctx->cfile, cifs_sb, &tmp_list,
ctx);
+ }
list_splice(&tmp_list, &ctx->list);
kref_put(&wdata->refcount, cifs_uncached_writedata_release);
}
- for (i = 0; i < ctx->npages; i++)
- put_page(ctx->bv[i].bv_page);
+ if (!ctx->direct_io)
+ for (i = 0; i < ctx->npages; i++)
+ put_page(ctx->bv[i].bv_page);
cifs_stats_bytes_written(tcon, ctx->total_len);
set_bit(CIFS_INO_INVALID_MAPPING, &CIFS_I(dentry->d_inode)->flags);
complete(&ctx->done);
}
-ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from)
+static ssize_t __cifs_writev(
+ struct kiocb *iocb, struct iov_iter *from, bool direct)
{
struct file *file = iocb->ki_filp;
ssize_t total_written = 0;
struct cifs_sb_info *cifs_sb;
struct cifs_aio_ctx *ctx;
struct iov_iter saved_from = *from;
+ size_t len = iov_iter_count(from);
int rc;
/*
- * BB - optimize the way when signing is disabled. We can drop this
- * extra memory-to-memory copying and use iovec buffers for constructing
- * write request.
+ * iov_iter_get_pages_alloc doesn't work with ITER_KVEC.
+ * In this case, fall back to non-direct write function.
+ * this could be improved by getting pages directly in ITER_KVEC
*/
+ if (direct && from->type & ITER_KVEC) {
+ cifs_dbg(FYI, "use non-direct cifs_writev for kvec I/O\n");
+ direct = false;
+ }
rc = generic_write_checks(iocb, from);
if (rc <= 0)
ctx->pos = iocb->ki_pos;
- rc = setup_aio_ctx_iter(ctx, from, WRITE);
- if (rc) {
- kref_put(&ctx->refcount, cifs_aio_ctx_release);
- return rc;
+ if (direct) {
+ ctx->direct_io = true;
+ ctx->iter = *from;
+ ctx->len = len;
+ } else {
+ rc = setup_aio_ctx_iter(ctx, from, WRITE);
+ if (rc) {
+ kref_put(&ctx->refcount, cifs_aio_ctx_release);
+ return rc;
+ }
}
/* grab a lock here due to read response handlers can access ctx */
return total_written;
}
+ssize_t cifs_direct_writev(struct kiocb *iocb, struct iov_iter *from)
+{
+ return __cifs_writev(iocb, from, true);
+}
+
+ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from)
+{
+ return __cifs_writev(iocb, from, false);
+}
+
static ssize_t
cifs_writev(struct kiocb *iocb, struct iov_iter *from)
{
kref_put(&rdata->ctx->refcount, cifs_aio_ctx_release);
for (i = 0; i < rdata->nr_pages; i++) {
put_page(rdata->pages[i]);
- rdata->pages[i] = NULL;
}
cifs_readdata_release(refcount);
}
size_t copy = min_t(size_t, remaining, PAGE_SIZE);
size_t written;
- if (unlikely(iter->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(iter))) {
void *addr = kmap_atomic(page);
written = copy_to_iter(addr, copy, iter);
return uncached_fill_pages(server, rdata, iter, iter->count);
}
+static int cifs_resend_rdata(struct cifs_readdata *rdata,
+ struct list_head *rdata_list,
+ struct cifs_aio_ctx *ctx)
+{
+ unsigned int rsize, credits;
+ int rc;
+ struct TCP_Server_Info *server =
+ tlink_tcon(rdata->cfile->tlink)->ses->server;
+
+ /*
+ * Wait for credits to resend this rdata.
+ * Note: we are attempting to resend the whole rdata not in segments
+ */
+ do {
+ rc = server->ops->wait_mtu_credits(server, rdata->bytes,
+ &rsize, &credits);
+
+ if (rc)
+ goto out;
+
+ if (rsize < rdata->bytes) {
+ add_credits_and_wake_if(server, credits, 0);
+ msleep(1000);
+ }
+ } while (rsize < rdata->bytes);
+
+ rc = -EAGAIN;
+ while (rc == -EAGAIN) {
+ rc = 0;
+ if (rdata->cfile->invalidHandle)
+ rc = cifs_reopen_file(rdata->cfile, true);
+ if (!rc)
+ rc = server->ops->async_readv(rdata);
+ }
+
+ if (!rc) {
+ /* Add to aio pending list */
+ list_add_tail(&rdata->list, rdata_list);
+ return 0;
+ }
+
+ add_credits_and_wake_if(server, rdata->credits, 0);
+out:
+ kref_put(&rdata->refcount,
+ cifs_uncached_readdata_release);
+
+ return rc;
+}
+
static int
cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
struct cifs_sb_info *cifs_sb, struct list_head *rdata_list,
int rc;
pid_t pid;
struct TCP_Server_Info *server;
+ struct page **pagevec;
+ size_t start;
+ struct iov_iter direct_iov = ctx->iter;
server = tlink_tcon(open_file->tlink)->ses->server;
else
pid = current->tgid;
+ if (ctx->direct_io)
+ iov_iter_advance(&direct_iov, offset - ctx->pos);
+
do {
rc = server->ops->wait_mtu_credits(server, cifs_sb->rsize,
&rsize, &credits);
break;
cur_len = min_t(const size_t, len, rsize);
- npages = DIV_ROUND_UP(cur_len, PAGE_SIZE);
- /* allocate a readdata struct */
- rdata = cifs_readdata_alloc(npages,
+ if (ctx->direct_io) {
+ ssize_t result;
+
+ result = iov_iter_get_pages_alloc(
+ &direct_iov, &pagevec,
+ cur_len, &start);
+ if (result < 0) {
+ cifs_dbg(VFS,
+ "couldn't get user pages (cur_len=%zd)"
+ " iter type %d"
+ " iov_offset %zd count %zd\n",
+ result, direct_iov.type,
+ direct_iov.iov_offset,
+ direct_iov.count);
+ dump_stack();
+ break;
+ }
+ cur_len = (size_t)result;
+ iov_iter_advance(&direct_iov, cur_len);
+
+ rdata = cifs_readdata_direct_alloc(
+ pagevec, cifs_uncached_readv_complete);
+ if (!rdata) {
+ add_credits_and_wake_if(server, credits, 0);
+ rc = -ENOMEM;
+ break;
+ }
+
+ npages = (cur_len + start + PAGE_SIZE-1) / PAGE_SIZE;
+ rdata->page_offset = start;
+ rdata->tailsz = npages > 1 ?
+ cur_len-(PAGE_SIZE-start)-(npages-2)*PAGE_SIZE :
+ cur_len;
+
+ } else {
+
+ npages = DIV_ROUND_UP(cur_len, PAGE_SIZE);
+ /* allocate a readdata struct */
+ rdata = cifs_readdata_alloc(npages,
cifs_uncached_readv_complete);
- if (!rdata) {
- add_credits_and_wake_if(server, credits, 0);
- rc = -ENOMEM;
- break;
- }
+ if (!rdata) {
+ add_credits_and_wake_if(server, credits, 0);
+ rc = -ENOMEM;
+ break;
+ }
- rc = cifs_read_allocate_pages(rdata, npages);
- if (rc)
- goto error;
+ rc = cifs_read_allocate_pages(rdata, npages);
+ if (rc)
+ goto error;
+
+ rdata->tailsz = PAGE_SIZE;
+ }
rdata->cfile = cifsFileInfo_get(open_file);
rdata->nr_pages = npages;
rdata->bytes = cur_len;
rdata->pid = pid;
rdata->pagesz = PAGE_SIZE;
- rdata->tailsz = PAGE_SIZE;
rdata->read_into_pages = cifs_uncached_read_into_pages;
rdata->copy_into_pages = cifs_uncached_copy_into_pages;
rdata->credits = credits;
if (rc) {
add_credits_and_wake_if(server, rdata->credits, 0);
kref_put(&rdata->refcount,
- cifs_uncached_readdata_release);
- if (rc == -EAGAIN)
+ cifs_uncached_readdata_release);
+ if (rc == -EAGAIN) {
+ iov_iter_revert(&direct_iov, cur_len);
continue;
+ }
break;
}
* reading.
*/
if (got_bytes && got_bytes < rdata->bytes) {
- rc = cifs_readdata_to_iov(rdata, to);
+ rc = 0;
+ if (!ctx->direct_io)
+ rc = cifs_readdata_to_iov(rdata, to);
if (rc) {
kref_put(&rdata->refcount,
- cifs_uncached_readdata_release);
+ cifs_uncached_readdata_release);
continue;
}
}
- rc = cifs_send_async_read(
+ if (ctx->direct_io) {
+ /*
+ * Re-use rdata as this is a
+ * direct I/O
+ */
+ rc = cifs_resend_rdata(
+ rdata,
+ &tmp_list, ctx);
+ } else {
+ rc = cifs_send_async_read(
rdata->offset + got_bytes,
rdata->bytes - got_bytes,
rdata->cfile, cifs_sb,
&tmp_list, ctx);
+ kref_put(&rdata->refcount,
+ cifs_uncached_readdata_release);
+ }
+
list_splice(&tmp_list, &ctx->list);
- kref_put(&rdata->refcount,
- cifs_uncached_readdata_release);
goto again;
} else if (rdata->result)
rc = rdata->result;
- else
+ else if (!ctx->direct_io)
rc = cifs_readdata_to_iov(rdata, to);
/* if there was a short read -- discard anything left */
if (rdata->got_bytes && rdata->got_bytes < rdata->bytes)
rc = -ENODATA;
+
+ ctx->total_len += rdata->got_bytes;
}
list_del_init(&rdata->list);
kref_put(&rdata->refcount, cifs_uncached_readdata_release);
}
- for (i = 0; i < ctx->npages; i++) {
- if (ctx->should_dirty)
- set_page_dirty(ctx->bv[i].bv_page);
- put_page(ctx->bv[i].bv_page);
- }
+ if (!ctx->direct_io) {
+ for (i = 0; i < ctx->npages; i++) {
+ if (ctx->should_dirty)
+ set_page_dirty(ctx->bv[i].bv_page);
+ put_page(ctx->bv[i].bv_page);
+ }
- ctx->total_len = ctx->len - iov_iter_count(to);
+ ctx->total_len = ctx->len - iov_iter_count(to);
+ }
cifs_stats_bytes_read(tcon, ctx->total_len);
complete(&ctx->done);
}
-ssize_t cifs_user_readv(struct kiocb *iocb, struct iov_iter *to)
+static ssize_t __cifs_readv(
+ struct kiocb *iocb, struct iov_iter *to, bool direct)
{
- struct file *file = iocb->ki_filp;
- ssize_t rc;
size_t len;
- ssize_t total_read = 0;
- loff_t offset = iocb->ki_pos;
+ struct file *file = iocb->ki_filp;
struct cifs_sb_info *cifs_sb;
- struct cifs_tcon *tcon;
struct cifsFileInfo *cfile;
+ struct cifs_tcon *tcon;
+ ssize_t rc, total_read = 0;
+ loff_t offset = iocb->ki_pos;
struct cifs_aio_ctx *ctx;
+ /*
+ * iov_iter_get_pages_alloc() doesn't work with ITER_KVEC,
+ * fall back to data copy read path
+ * this could be improved by getting pages directly in ITER_KVEC
+ */
+ if (direct && to->type & ITER_KVEC) {
+ cifs_dbg(FYI, "use non-direct cifs_user_readv for kvec I/O\n");
+ direct = false;
+ }
+
len = iov_iter_count(to);
if (!len)
return 0;
if (!is_sync_kiocb(iocb))
ctx->iocb = iocb;
- if (to->type == ITER_IOVEC)
+ if (iter_is_iovec(to))
ctx->should_dirty = true;
- rc = setup_aio_ctx_iter(ctx, to, READ);
- if (rc) {
- kref_put(&ctx->refcount, cifs_aio_ctx_release);
- return rc;
+ if (direct) {
+ ctx->pos = offset;
+ ctx->direct_io = true;
+ ctx->iter = *to;
+ ctx->len = len;
+ } else {
+ rc = setup_aio_ctx_iter(ctx, to, READ);
+ if (rc) {
+ kref_put(&ctx->refcount, cifs_aio_ctx_release);
+ return rc;
+ }
+ len = ctx->len;
}
- len = ctx->len;
-
/* grab a lock here due to read response handlers can access ctx */
mutex_lock(&ctx->aio_mutex);
return rc;
}
+ssize_t cifs_direct_readv(struct kiocb *iocb, struct iov_iter *to)
+{
+ return __cifs_readv(iocb, to, true);
+}
+
+ssize_t cifs_user_readv(struct kiocb *iocb, struct iov_iter *to)
+{
+ return __cifs_readv(iocb, to, false);
+}
+
ssize_t
cifs_strict_readv(struct kiocb *iocb, struct iov_iter *to)
{
/*
* If d_inode(dentry) is null (usually meaning the cached dentry
* is a negative dentry) then we would attempt a standard SMB delete, but
- * if that fails we can not attempt the fall back mechanisms on EACCESS
- * but will return the EACCESS to the caller. Note that the VFS does not call
+ * if that fails we can not attempt the fall back mechanisms on EACCES
+ * but will return the EACCES to the caller. Note that the VFS does not call
* unlink on negative dentries currently.
*/
int cifs_unlink(struct inode *dir, struct dentry *dentry)
struct page **pages = NULL;
struct bio_vec *bv = NULL;
- if (iter->type & ITER_KVEC) {
+ if (iov_iter_is_kvec(iter)) {
memcpy(&ctx->iter, iter, sizeof(struct iov_iter));
ctx->len = count;
iov_iter_advance(iter, count);
ctx->bv = bv;
ctx->len = saved_len - count;
ctx->npages = npages;
- iov_iter_bvec(&ctx->iter, ITER_BVEC | rw, ctx->bv, npages, ctx->len);
+ iov_iter_bvec(&ctx->iter, rw, ctx->bv, npages, ctx->len);
return 0;
}
int rc = 0;
unsigned int ea_name_len = ea_name ? strlen(ea_name) : 0;
char *name, *value;
+ size_t buf_size = dst_size;
size_t name_len, value_len, user_name_len;
while (src_size > 0) {
/* 'user.' plus a terminating null */
user_name_len = 5 + 1 + name_len;
- rc += user_name_len;
-
- if (dst_size >= user_name_len) {
+ if (buf_size == 0) {
+ /* skip copy - calc size only */
+ rc += user_name_len;
+ } else if (dst_size >= user_name_len) {
dst_size -= user_name_len;
memcpy(dst, "user.", 5);
dst += 5;
dst += name_len;
*dst = 0;
++dst;
- } else if (dst_size == 0) {
- /* skip copy - calc size only */
+ rc += user_name_len;
} else {
/* stop before overrun buffer */
rc = -ERANGE;
cfile->fid.persistent_fid = fid->persistent_fid;
cfile->fid.volatile_fid = fid->volatile_fid;
+#ifdef CONFIG_CIFS_DEBUG2
+ cfile->fid.mid = fid->mid;
+#endif /* CIFS_DEBUG2 */
server->ops->set_oplock_level(cinode, oplock, fid->epoch,
&fid->purge_cache);
cinode->can_cache_brlcks = CIFS_CACHE_WRITE(cinode);
return 0;
}
- iov_iter_bvec(&iter, WRITE | ITER_BVEC, bvec, npages, data_len);
+ iov_iter_bvec(&iter, WRITE, bvec, npages, data_len);
} else if (buf_len >= data_offset + data_len) {
/* read response payload is in buf */
WARN_ONCE(npages > 0, "read data can be either in buf or in pages");
iov.iov_base = buf + data_offset;
iov.iov_len = data_len;
- iov_iter_kvec(&iter, WRITE | ITER_KVEC, &iov, 1, data_len);
+ iov_iter_kvec(&iter, WRITE, &iov, 1, data_len);
} else {
/* read response payload cannot be in both buf and pages */
WARN_ONCE(1, "buf can not contain only a part of read data");
rc = cifs_send_recv(xid, ses, &rqst, &resp_buftype, flags, &rsp_iov);
cifs_small_buf_release(req);
rsp = (struct smb2_tree_connect_rsp *)rsp_iov.iov_base;
-
+ trace_smb3_tcon(xid, tcon->tid, ses->Suid, tree, rc);
if (rc != 0) {
if (tcon) {
cifs_stats_fail_inc(tcon, SMB2_TREE_CONNECT_HE);
if (tcon->ses->server->ops->validate_negotiate)
rc = tcon->ses->server->ops->validate_negotiate(xid, tcon);
tcon_exit:
+
free_rsp_buf(resp_buftype, rsp);
kfree(unc_path);
return rc;
atomic_inc(&tcon->num_remote_opens);
oparms->fid->persistent_fid = rsp->PersistentFileId;
oparms->fid->volatile_fid = rsp->VolatileFileId;
+#ifdef CONFIG_CIFS_DEBUG2
+ oparms->fid->mid = le64_to_cpu(rsp->sync_hdr.MessageId);
+#endif /* CIFS_DEBUG2 */
if (buf) {
memcpy(buf, &rsp->CreationTime, 32);
/* Integrity flags for above */
#define FSCTL_INTEGRITY_FLAG_CHECKSUM_ENFORCEMENT_OFF 0x00000001
+/* Reparse structures - see MS-FSCC 2.1.2 */
+
+/* struct fsctl_reparse_info_req is empty, only response structs (see below) */
+
+struct reparse_data_buffer {
+ __le32 ReparseTag;
+ __le16 ReparseDataLength;
+ __u16 Reserved;
+ __u8 DataBuffer[0]; /* Variable Length */
+} __packed;
+
+struct reparse_guid_data_buffer {
+ __le32 ReparseTag;
+ __le16 ReparseDataLength;
+ __u16 Reserved;
+ __u8 ReparseGuid[16];
+ __u8 DataBuffer[0]; /* Variable Length */
+} __packed;
+
+struct reparse_mount_point_data_buffer {
+ __le32 ReparseTag;
+ __le16 ReparseDataLength;
+ __u16 Reserved;
+ __le16 SubstituteNameOffset;
+ __le16 SubstituteNameLength;
+ __le16 PrintNameOffset;
+ __le16 PrintNameLength;
+ __u8 PathBuffer[0]; /* Variable Length */
+} __packed;
+
+/* See MS-FSCC 2.1.2.4 and cifspdu.h for struct reparse_symlink_data */
+
+/* See MS-FSCC 2.1.2.6 and cifspdu.h for struct reparse_posix_data */
+
+
/* See MS-DFSC 2.2.2 */
struct fsctl_get_dfs_referral_req {
__le16 MaxReferralLevel;
info->smbd_recv_pending++;
- switch (msg->msg_iter.type) {
- case READ | ITER_KVEC:
+ if (iov_iter_rw(&msg->msg_iter) == WRITE) {
+ /* It's a bug in upper layer to get there */
+ cifs_dbg(VFS, "CIFS: invalid msg iter dir %u\n",
+ iov_iter_rw(&msg->msg_iter));
+ rc = -EINVAL;
+ goto out;
+ }
+
+ switch (iov_iter_type(&msg->msg_iter)) {
+ case ITER_KVEC:
buf = msg->msg_iter.kvec->iov_base;
to_read = msg->msg_iter.kvec->iov_len;
rc = smbd_recv_buf(info, buf, to_read);
break;
- case READ | ITER_BVEC:
+ case ITER_BVEC:
page = msg->msg_iter.bvec->bv_page;
page_offset = msg->msg_iter.bvec->bv_offset;
to_read = msg->msg_iter.bvec->bv_len;
default:
/* It's a bug in upper layer to get there */
cifs_dbg(VFS, "CIFS: invalid msg type %d\n",
- msg->msg_iter.type);
+ iov_iter_type(&msg->msg_iter));
rc = -EINVAL;
}
+out:
info->smbd_recv_pending--;
wake_up(&info->wait_smbd_recv_pending);
DEFINE_SMB3_ENTER_EXIT_EVENT(enter);
DEFINE_SMB3_ENTER_EXIT_EVENT(exit_done);
+/*
+ * For SMB2/SMB3 tree connect
+ */
+
+DECLARE_EVENT_CLASS(smb3_tcon_class,
+ TP_PROTO(unsigned int xid,
+ __u32 tid,
+ __u64 sesid,
+ const char *unc_name,
+ int rc),
+ TP_ARGS(xid, tid, sesid, unc_name, rc),
+ TP_STRUCT__entry(
+ __field(unsigned int, xid)
+ __field(__u32, tid)
+ __field(__u64, sesid)
+ __field(const char *, unc_name)
+ __field(int, rc)
+ ),
+ TP_fast_assign(
+ __entry->xid = xid;
+ __entry->tid = tid;
+ __entry->sesid = sesid;
+ __entry->unc_name = unc_name;
+ __entry->rc = rc;
+ ),
+ TP_printk("xid=%u sid=0x%llx tid=0x%x unc_name=%s rc=%d",
+ __entry->xid, __entry->sesid, __entry->tid,
+ __entry->unc_name, __entry->rc)
+)
+
+#define DEFINE_SMB3_TCON_EVENT(name) \
+DEFINE_EVENT(smb3_tcon_class, smb3_##name, \
+ TP_PROTO(unsigned int xid, \
+ __u32 tid, \
+ __u64 sesid, \
+ const char *unc_name, \
+ int rc), \
+ TP_ARGS(xid, tid, sesid, unc_name, rc))
+
+DEFINE_SMB3_TCON_EVENT(tcon);
+
+
/*
* For smb2/smb3 open call
*/
.iov_base = &rfc1002_marker,
.iov_len = 4
};
- iov_iter_kvec(&smb_msg.msg_iter, WRITE | ITER_KVEC, &hiov,
- 1, 4);
+ iov_iter_kvec(&smb_msg.msg_iter, WRITE, &hiov, 1, 4);
rc = smb_send_kvec(server, &smb_msg, &sent);
if (rc < 0)
goto uncork;
size += iov[i].iov_len;
}
- iov_iter_kvec(&smb_msg.msg_iter, WRITE | ITER_KVEC,
- iov, n_vec, size);
+ iov_iter_kvec(&smb_msg.msg_iter, WRITE, iov, n_vec, size);
rc = smb_send_kvec(server, &smb_msg, &sent);
if (rc < 0)
rqst_page_get_length(&rqst[j], i, &bvec.bv_len,
&bvec.bv_offset);
- iov_iter_bvec(&smb_msg.msg_iter, WRITE | ITER_BVEC,
+ iov_iter_bvec(&smb_msg.msg_iter, WRITE,
&bvec, 1, bvec.bv_len);
rc = smb_send_kvec(server, &smb_msg, &sent);
if (rc < 0)
return xa_mk_value(flags | (pfn_t_to_pfn(pfn) << DAX_SHIFT));
}
-static void *dax_make_page_entry(struct page *page)
-{
- pfn_t pfn = page_to_pfn_t(page);
- return dax_make_entry(pfn, PageHead(page) ? DAX_PMD : 0);
-}
-
static bool dax_is_locked(void *entry)
{
return xa_to_value(entry) & DAX_LOCKED;
return 0;
}
-static int dax_is_pmd_entry(void *entry)
+static unsigned long dax_is_pmd_entry(void *entry)
{
return xa_to_value(entry) & DAX_PMD;
}
-static int dax_is_pte_entry(void *entry)
+static bool dax_is_pte_entry(void *entry)
{
return !(xa_to_value(entry) & DAX_PMD);
}
ewait.wait.func = wake_exceptional_entry_func;
for (;;) {
- entry = xas_load(xas);
- if (!entry || xa_is_internal(entry) ||
- WARN_ON_ONCE(!xa_is_value(entry)) ||
+ entry = xas_find_conflict(xas);
+ if (!entry || WARN_ON_ONCE(!xa_is_value(entry)) ||
!dax_is_locked(entry))
return entry;
}
}
+/*
+ * The only thing keeping the address space around is the i_pages lock
+ * (it's cycled in clear_inode() after removing the entries from i_pages)
+ * After we call xas_unlock_irq(), we cannot touch xas->xa.
+ */
+static void wait_entry_unlocked(struct xa_state *xas, void *entry)
+{
+ struct wait_exceptional_entry_queue ewait;
+ wait_queue_head_t *wq;
+
+ init_wait(&ewait.wait);
+ ewait.wait.func = wake_exceptional_entry_func;
+
+ wq = dax_entry_waitqueue(xas, entry, &ewait.key);
+ prepare_to_wait_exclusive(wq, &ewait.wait, TASK_UNINTERRUPTIBLE);
+ xas_unlock_irq(xas);
+ schedule();
+ finish_wait(wq, &ewait.wait);
+
+ /*
+ * Entry lock waits are exclusive. Wake up the next waiter since
+ * we aren't sure we will acquire the entry lock and thus wake
+ * the next waiter up on unlock.
+ */
+ if (waitqueue_active(wq))
+ __wake_up(wq, TASK_NORMAL, 1, &ewait.key);
+}
+
static void put_unlocked_entry(struct xa_state *xas, void *entry)
{
/* If we were the only waiter woken, wake the next one */
{
void *old;
+ BUG_ON(dax_is_locked(entry));
xas_reset(xas);
xas_lock_irq(xas);
old = xas_store(xas, entry);
return NULL;
}
-bool dax_lock_mapping_entry(struct page *page)
+/*
+ * dax_lock_mapping_entry - Lock the DAX entry corresponding to a page
+ * @page: The page whose entry we want to lock
+ *
+ * Context: Process context.
+ * Return: A cookie to pass to dax_unlock_page() or 0 if the entry could
+ * not be locked.
+ */
+dax_entry_t dax_lock_page(struct page *page)
{
XA_STATE(xas, NULL, 0);
void *entry;
+ /* Ensure page->mapping isn't freed while we look at it */
+ rcu_read_lock();
for (;;) {
struct address_space *mapping = READ_ONCE(page->mapping);
- if (!dax_mapping(mapping))
- return false;
+ entry = NULL;
+ if (!mapping || !dax_mapping(mapping))
+ break;
/*
* In the device-dax case there's no need to lock, a
* otherwise we would not have a valid pfn_to_page()
* translation.
*/
+ entry = (void *)~0UL;
if (S_ISCHR(mapping->host->i_mode))
- return true;
+ break;
xas.xa = &mapping->i_pages;
xas_lock_irq(&xas);
xas_set(&xas, page->index);
entry = xas_load(&xas);
if (dax_is_locked(entry)) {
- entry = get_unlocked_entry(&xas);
- /* Did the page move while we slept? */
- if (dax_to_pfn(entry) != page_to_pfn(page)) {
- xas_unlock_irq(&xas);
- continue;
- }
+ rcu_read_unlock();
+ wait_entry_unlocked(&xas, entry);
+ rcu_read_lock();
+ continue;
}
dax_lock_entry(&xas, entry);
xas_unlock_irq(&xas);
- return true;
+ break;
}
+ rcu_read_unlock();
+ return (dax_entry_t)entry;
}
-void dax_unlock_mapping_entry(struct page *page)
+void dax_unlock_page(struct page *page, dax_entry_t cookie)
{
struct address_space *mapping = page->mapping;
XA_STATE(xas, &mapping->i_pages, page->index);
if (S_ISCHR(mapping->host->i_mode))
return;
- dax_unlock_entry(&xas, dax_make_page_entry(page));
+ dax_unlock_entry(&xas, (void *)cookie);
}
/*
retry:
xas_lock_irq(xas);
entry = get_unlocked_entry(xas);
- if (xa_is_internal(entry))
- goto fallback;
if (entry) {
- if (WARN_ON_ONCE(!xa_is_value(entry))) {
+ if (!xa_is_value(entry)) {
xas_set_err(xas, EIO);
goto out_unlock;
}
/* Did we race with someone splitting entry or so? */
if (!entry ||
(order == 0 && !dax_is_pte_entry(entry)) ||
- (order == PMD_ORDER && (xa_is_internal(entry) ||
- !dax_is_pmd_entry(entry)))) {
+ (order == PMD_ORDER && !dax_is_pmd_entry(entry))) {
put_unlocked_entry(&xas, entry);
xas_unlock_irq(&xas);
trace_dax_insert_pfn_mkwrite_no_entry(mapping->host, vmf,
*/
dio->iocb->ki_pos += transferred;
- if (dio->op == REQ_OP_WRITE)
- ret = generic_write_sync(dio->iocb, transferred);
+ if (ret > 0 && dio->op == REQ_OP_WRITE)
+ ret = generic_write_sync(dio->iocb, ret);
dio->iocb->ki_complete(dio->iocb, ret, 0);
}
spin_lock_init(&dio->bio_lock);
dio->refcount = 1;
- dio->should_dirty = (iter->type == ITER_IOVEC);
+ dio->should_dirty = iter_is_iovec(iter) && iov_iter_rw(iter) == READ;
sdio.iter = iter;
sdio.final_block_in_request = end >> blkbits;
nvec = 2;
}
len = iov[0].iov_len + iov[1].iov_len;
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, iov, nvec, len);
+ iov_iter_kvec(&msg.msg_iter, READ, iov, nvec, len);
r = ret = sock_recvmsg(con->sock, &msg, MSG_DONTWAIT | MSG_NOSIGNAL);
if (ret <= 0)
token = match_token(p, tokens, args);
switch (token) {
case Opt_name:
+ kfree(opts->dev_name);
opts->dev_name = match_strdup(&args[0]);
if (unlikely(!opts->dev_name)) {
EXOFS_ERR("Error allocating dev_name");
EXOFS_MIN_PID);
return -EINVAL;
}
- s_pid = 1;
+ s_pid = true;
break;
case Opt_to:
if (match_int(&args[0], &option))
int ret;
ret = parse_options(data, &opts);
- if (ret)
+ if (ret) {
+ kfree(opts.dev_name);
return ERR_PTR(ret);
+ }
if (!opts.dev_name)
opts.dev_name = dev_name;
struct dentry *parent = dget_parent(dentry);
dput(dentry);
- if (IS_ROOT(dentry)) {
+ if (dentry == parent) {
dput(parent);
return false;
}
tmp = lookup_one_len_unlocked(nbuf, parent, strlen(nbuf));
if (IS_ERR(tmp)) {
dprintk("%s: lookup failed: %d\n", __func__, PTR_ERR(tmp));
+ err = PTR_ERR(tmp);
goto out_err;
}
if (tmp != dentry) {
if (sb->s_magic != EXT2_SUPER_MAGIC)
goto cantfind_ext2;
+ opts.s_mount_opt = 0;
/* Set defaults before we parse the mount options */
def_mount_opts = le32_to_cpu(es->s_default_mount_opts);
if (def_mount_opts & EXT2_DEFM_DEBUG)
}
cleanup:
- brelse(bh);
if (!(bh && header == HDR(bh)))
kfree(header);
+ brelse(bh);
up_write(&EXT2_I(inode)->xattr_sem);
return error;
#include <linux/compiler.h>
-/* Until this gets included into linux/compiler-gcc.h */
-#ifndef __nonstring
-#if defined(GCC_VERSION) && (GCC_VERSION >= 80000)
-#define __nonstring __attribute__((nonstring))
-#else
-#define __nonstring
-#endif
-#endif
-
/*
* The fourth extended filesystem constants/structures
*/
bit = (ino - 1) % EXT4_INODES_PER_GROUP(sb);
bitmap_bh = ext4_read_inode_bitmap(sb, block_group);
if (IS_ERR(bitmap_bh))
- return (struct inode *) bitmap_bh;
+ return ERR_CAST(bitmap_bh);
/* Having the inode bit set should be a 100% indicator that this
* is a valid orphan (no e2fsck run on fs). Orphans also include
{
int err = 0;
- if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
+ if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb)))) {
+ put_bh(iloc->bh);
return -EIO;
-
+ }
if (IS_I_VERSION(inode))
inode_inc_iversion(inode);
if (!is_dx_block && type == INDEX) {
ext4_error_inode(inode, func, line, block,
"directory leaf block found instead of index block");
+ brelse(bh);
return ERR_PTR(-EFSCORRUPTED);
}
if (!ext4_has_metadata_csum(inode->i_sb) ||
bh = ext4_find_entry(dir, &dentry->d_name, &de, NULL);
if (IS_ERR(bh))
- return (struct dentry *) bh;
+ return ERR_CAST(bh);
inode = NULL;
if (bh) {
__u32 ino = le32_to_cpu(de->inode);
bh = ext4_find_entry(d_inode(child), &dotdot, &de, NULL);
if (IS_ERR(bh))
- return (struct dentry *) bh;
+ return ERR_CAST(bh);
if (!bh)
return ERR_PTR(-ENOENT);
ino = le32_to_cpu(de->inode);
list_del_init(&EXT4_I(inode)->i_orphan);
mutex_unlock(&sbi->s_orphan_lock);
}
- }
+ } else
+ brelse(iloc.bh);
+
jbd_debug(4, "superblock will point to %lu\n", inode->i_ino);
jbd_debug(4, "orphan inode %lu will point to %d\n",
inode->i_ino, NEXT_ORPHAN(inode));
bio = bio_alloc(GFP_NOIO, BIO_MAX_PAGES);
if (!bio)
return -ENOMEM;
+ wbc_init_bio(io->io_wbc, bio);
bio->bi_iter.bi_sector = bh->b_blocknr * (bh->b_size >> 9);
bio_set_dev(bio, bh->b_bdev);
bio->bi_end_io = ext4_end_bio;
bio->bi_private = ext4_get_io_end(io->io_end);
io->io_bio = bio;
io->io_next_block = bh->b_blocknr;
- wbc_init_bio(io->io_wbc, bio);
return 0;
}
BUFFER_TRACE(bh, "get_write_access");
err = ext4_journal_get_write_access(handle, bh);
- if (err)
+ if (err) {
+ brelse(bh);
return err;
+ }
ext4_debug("mark block bitmap %#04llx (+%llu/%u)\n",
first_cluster, first_cluster - start, count2);
ext4_set_bits(bh->b_data, first_cluster - start, count2);
err = ext4_handle_dirty_metadata(handle, NULL, bh);
+ brelse(bh);
if (unlikely(err))
return err;
- brelse(bh);
}
return 0;
bh = bclean(handle, sb, block);
if (IS_ERR(bh)) {
err = PTR_ERR(bh);
- bh = NULL;
goto out;
}
overhead = ext4_group_overhead_blocks(sb, group);
ext4_mark_bitmap_end(EXT4_B2C(sbi, group_data[i].blocks_count),
sb->s_blocksize * 8, bh->b_data);
err = ext4_handle_dirty_metadata(handle, NULL, bh);
+ brelse(bh);
if (err)
goto out;
- brelse(bh);
handle_ib:
if (bg_flags[i] & EXT4_BG_INODE_UNINIT)
bh = bclean(handle, sb, block);
if (IS_ERR(bh)) {
err = PTR_ERR(bh);
- bh = NULL;
goto out;
}
ext4_mark_bitmap_end(EXT4_INODES_PER_GROUP(sb),
sb->s_blocksize * 8, bh->b_data);
err = ext4_handle_dirty_metadata(handle, NULL, bh);
+ brelse(bh);
if (err)
goto out;
- brelse(bh);
}
- bh = NULL;
/* Mark group tables in block bitmap */
for (j = 0; j < GROUP_TABLE_COUNT; j++) {
}
out:
- brelse(bh);
err2 = ext4_journal_stop(handle);
if (err2 && !err)
err = err2;
err = ext4_handle_dirty_metadata(handle, NULL, gdb_bh);
if (unlikely(err)) {
ext4_std_error(sb, err);
+ iloc.bh = NULL;
goto exit_inode;
}
brelse(dind);
sizeof(struct buffer_head *),
GFP_NOFS);
if (!n_group_desc) {
+ brelse(gdb_bh);
err = -ENOMEM;
ext4_warning(sb, "not enough memory for %lu groups",
gdb_num + 1);
kvfree(o_group_desc);
BUFFER_TRACE(gdb_bh, "get_write_access");
err = ext4_journal_get_write_access(handle, gdb_bh);
- if (unlikely(err))
- brelse(gdb_bh);
return err;
}
backup_block, backup_block -
ext4_group_first_block_no(sb, group));
BUFFER_TRACE(bh, "get_write_access");
- if ((err = ext4_journal_get_write_access(handle, bh)))
+ if ((err = ext4_journal_get_write_access(handle, bh))) {
+ brelse(bh);
break;
+ }
lock_buffer(bh);
memcpy(bh->b_data, data, size);
if (rest)
err = ext4_alloc_flex_bg_array(sb, n_group + 1);
if (err)
- return err;
+ goto out;
err = ext4_mb_alloc_groupinfo(sb, n_group + 1);
if (err)
n_blocks_count_retry = 0;
free_flex_gd(flex_gd);
flex_gd = NULL;
+ if (resize_inode) {
+ iput(resize_inode);
+ resize_inode = NULL;
+ }
goto retry;
}
sbi->s_groups_count = blocks_count;
sbi->s_blockfile_groups = min_t(ext4_group_t, sbi->s_groups_count,
(EXT4_MAX_BLOCK_FILE_PHYS / EXT4_BLOCKS_PER_GROUP(sb)));
+ if (((u64)sbi->s_groups_count * sbi->s_inodes_per_group) !=
+ le32_to_cpu(es->s_inodes_count)) {
+ ext4_msg(sb, KERN_ERR, "inodes count not valid: %u vs %llu",
+ le32_to_cpu(es->s_inodes_count),
+ ((u64)sbi->s_groups_count * sbi->s_inodes_per_group));
+ ret = -EINVAL;
+ goto failed_mount;
+ }
db_count = (sbi->s_groups_count + EXT4_DESC_PER_BLOCK(sb) - 1) /
EXT4_DESC_PER_BLOCK(sb);
if (ext4_has_feature_meta_bg(sb)) {
ret = -ENOMEM;
goto failed_mount;
}
- if (((u64)sbi->s_groups_count * sbi->s_inodes_per_group) !=
- le32_to_cpu(es->s_inodes_count)) {
- ext4_msg(sb, KERN_ERR, "inodes count not valid: %u vs %llu",
- le32_to_cpu(es->s_inodes_count),
- ((u64)sbi->s_groups_count * sbi->s_inodes_per_group));
- ret = -EINVAL;
- goto failed_mount;
- }
bgl_lock_init(sbi->s_blockgroup_lock);
percpu_counter_destroy(&sbi->s_freeinodes_counter);
percpu_counter_destroy(&sbi->s_dirs_counter);
percpu_counter_destroy(&sbi->s_dirtyclusters_counter);
+ percpu_free_rwsem(&sbi->s_journal_flag_rwsem);
failed_mount5:
ext4_ext_release(sb);
ext4_release_system_zone(sb);
inode_lock(ea_inode);
ret = ext4_reserve_inode_write(handle, ea_inode, &iloc);
- if (ret) {
- iloc.bh = NULL;
+ if (ret)
goto out;
- }
ref_count = ext4_xattr_inode_get_ref(ea_inode);
ref_count += ref_change;
}
ret = ext4_mark_iloc_dirty(handle, ea_inode, &iloc);
- iloc.bh = NULL;
if (ret)
ext4_warning_inode(ea_inode,
"ext4_mark_iloc_dirty() failed ret=%d", ret);
out:
- brelse(iloc.bh);
inode_unlock(ea_inode);
return ret;
}
bh = ext4_getblk(handle, ea_inode, block, 0);
if (IS_ERR(bh))
return PTR_ERR(bh);
+ if (!bh) {
+ WARN_ON_ONCE(1);
+ EXT4_ERROR_INODE(ea_inode,
+ "ext4_getblk() return bh = NULL");
+ return -EFSCORRUPTED;
+ }
ret = ext4_journal_get_write_access(handle, bh);
if (ret)
goto out;
if (!bh)
return ERR_PTR(-EIO);
error = ext4_xattr_check_block(inode, bh);
- if (error)
+ if (error) {
+ brelse(bh);
return ERR_PTR(error);
+ }
return bh;
}
error = ext4_xattr_block_set(handle, inode, &i, &bs);
} else if (error == -ENOSPC) {
if (EXT4_I(inode)->i_file_acl && !bs.s.base) {
+ brelse(bs.bh);
+ bs.bh = NULL;
error = ext4_xattr_block_find(inode, &i, &bs);
if (error)
goto cleanup;
kfree(buffer);
if (is)
brelse(is->iloc.bh);
+ if (bs)
+ brelse(bs->bh);
kfree(is);
kfree(bs);
struct ext4_inode *raw_inode, handle_t *handle)
{
struct ext4_xattr_ibody_header *header;
- struct buffer_head *bh;
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
static unsigned int mnt_count;
size_t min_offs;
* EA block can hold new_extra_isize bytes.
*/
if (EXT4_I(inode)->i_file_acl) {
+ struct buffer_head *bh;
+
bh = sb_bread(inode->i_sb, EXT4_I(inode)->i_file_acl);
error = -EIO;
if (!bh)
goto cleanup;
error = ext4_xattr_check_block(inode, bh);
- if (error)
+ if (error) {
+ brelse(bh);
goto cleanup;
+ }
base = BHDR(bh);
end = bh->b_data + bh->b_size;
min_offs = end - base;
if (awaken)
wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING);
+ if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags))
+ wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP);
+
/* Prevent a race with our last child, which has to signal EV_CLEARED
* before dropping our spinlock.
static void fuse_drop_waiting(struct fuse_conn *fc)
{
- if (fc->connected) {
- atomic_dec(&fc->num_waiting);
- } else if (atomic_dec_and_test(&fc->num_waiting)) {
+ /*
+ * lockess check of fc->connected is okay, because atomic_dec_and_test()
+ * provides a memory barrier mached with the one in fuse_wait_aborted()
+ * to ensure no wake-up is missed.
+ */
+ if (atomic_dec_and_test(&fc->num_waiting) &&
+ !READ_ONCE(fc->connected)) {
/* wake up aborters */
wake_up_all(&fc->blocked_waitq);
}
req->in.args[1].size = total_len;
err = fuse_request_send_notify_reply(fc, req, outarg->notify_unique);
- if (err)
+ if (err) {
fuse_retrieve_end(fc, req);
+ fuse_put_request(fc, req);
+ }
return err;
}
void fuse_wait_aborted(struct fuse_conn *fc)
{
+ /* matches implicit memory barrier in fuse_drop_waiting() */
+ smp_mb();
wait_event(fc->blocked_waitq, atomic_read(&fc->num_waiting) == 0);
}
ssize_t ret = 0;
/* Special case for kernel I/O: can copy directly into the buffer */
- if (ii->type & ITER_KVEC) {
+ if (iov_iter_is_kvec(ii)) {
unsigned long user_addr = fuse_get_user_addr(ii);
size_t frag_size = fuse_get_frag_size(ii, *nbytesp);
}
if (io->async) {
+ bool blocking = io->blocking;
+
fuse_aio_complete(io, ret < 0 ? ret : 0, -1);
/* we have a non-extending, async request, so return */
- if (!io->blocking)
+ if (!blocking)
return -EIOCBQUEUED;
wait_for_completion(&wait);
ret = gfs2_meta_inode_buffer(ip, &dibh);
if (ret)
goto unlock;
- iomap->private = dibh;
+ mp->mp_bh[0] = dibh;
if (gfs2_is_stuffed(ip)) {
if (flags & IOMAP_WRITE) {
len = lblock_stop - lblock + 1;
iomap->length = len << inode->i_blkbits;
- get_bh(dibh);
- mp->mp_bh[0] = dibh;
-
height = ip->i_height;
while ((lblock + 1) * sdp->sd_sb.sb_bsize > sdp->sd_heightsize[height])
height++;
iomap->bdev = inode->i_sb->s_bdev;
unlock:
up_read(&ip->i_rw_mutex);
- if (ret && dibh)
- brelse(dibh);
return ret;
do_alloc:
static int gfs2_iomap_begin_write(struct inode *inode, loff_t pos,
loff_t length, unsigned flags,
- struct iomap *iomap)
+ struct iomap *iomap,
+ struct metapath *mp)
{
- struct metapath mp = { .mp_aheight = 1, };
struct gfs2_inode *ip = GFS2_I(inode);
struct gfs2_sbd *sdp = GFS2_SB(inode);
unsigned int data_blocks = 0, ind_blocks = 0, rblocks;
unstuff = gfs2_is_stuffed(ip) &&
pos + length > gfs2_max_stuffed_size(ip);
- ret = gfs2_iomap_get(inode, pos, length, flags, iomap, &mp);
+ ret = gfs2_iomap_get(inode, pos, length, flags, iomap, mp);
if (ret)
- goto out_release;
+ goto out_unlock;
alloc_required = unstuff || iomap->type == IOMAP_HOLE;
ret = gfs2_quota_lock_check(ip, &ap);
if (ret)
- goto out_release;
+ goto out_unlock;
ret = gfs2_inplace_reserve(ip, &ap);
if (ret)
ret = gfs2_unstuff_dinode(ip, NULL);
if (ret)
goto out_trans_end;
- release_metapath(&mp);
- brelse(iomap->private);
- iomap->private = NULL;
+ release_metapath(mp);
ret = gfs2_iomap_get(inode, iomap->offset, iomap->length,
- flags, iomap, &mp);
+ flags, iomap, mp);
if (ret)
goto out_trans_end;
}
if (iomap->type == IOMAP_HOLE) {
- ret = gfs2_iomap_alloc(inode, iomap, flags, &mp);
+ ret = gfs2_iomap_alloc(inode, iomap, flags, mp);
if (ret) {
gfs2_trans_end(sdp);
gfs2_inplace_release(ip);
goto out_qunlock;
}
}
- release_metapath(&mp);
if (!gfs2_is_stuffed(ip) && gfs2_is_jdata(ip))
iomap->page_done = gfs2_iomap_journaled_page_done;
return 0;
out_qunlock:
if (alloc_required)
gfs2_quota_unlock(ip);
-out_release:
- if (iomap->private)
- brelse(iomap->private);
- release_metapath(&mp);
+out_unlock:
gfs2_write_unlock(inode);
return ret;
}
trace_gfs2_iomap_start(ip, pos, length, flags);
if ((flags & IOMAP_WRITE) && !(flags & IOMAP_DIRECT)) {
- ret = gfs2_iomap_begin_write(inode, pos, length, flags, iomap);
+ ret = gfs2_iomap_begin_write(inode, pos, length, flags, iomap, &mp);
} else {
ret = gfs2_iomap_get(inode, pos, length, flags, iomap, &mp);
- release_metapath(&mp);
+
/*
* Silently fall back to buffered I/O for stuffed files or if
* we've hot a hole (see gfs2_file_direct_write).
iomap->type != IOMAP_MAPPED)
ret = -ENOTBLK;
}
+ if (!ret) {
+ get_bh(mp.mp_bh[0]);
+ iomap->private = mp.mp_bh[0];
+ }
+ release_metapath(&mp);
trace_gfs2_iomap_end(ip, iomap, ret);
return ret;
}
if (ret < 0)
goto out;
- /* issue read-ahead on metadata */
- if (mp.mp_aheight > 1) {
- for (; ret > 1; ret--) {
- metapointer_range(&mp, mp.mp_aheight - ret,
+ /* On the first pass, issue read-ahead on metadata. */
+ if (mp.mp_aheight > 1 && strip_h == ip->i_height - 1) {
+ unsigned int height = mp.mp_aheight - 1;
+
+ /* No read-ahead for data blocks. */
+ if (mp.mp_aheight - 1 == strip_h)
+ height--;
+
+ for (; height >= mp.mp_aheight - ret; height--) {
+ metapointer_range(&mp, height,
start_list, start_aligned,
end_list, end_aligned,
&start, &end);
if (gl) {
glock_clear_object(gl, rgd);
+ gfs2_rgrp_brelse(rgd);
gfs2_glock_put(gl);
}
* @rgd: the struct gfs2_rgrpd describing the RG to read in
*
* Read in all of a Resource Group's header and bitmap blocks.
- * Caller must eventually call gfs2_rgrp_relse() to free the bitmaps.
+ * Caller must eventually call gfs2_rgrp_brelse() to free the bitmaps.
*
* Returns: errno
*/
nidx -= len * 8;
i = node->next;
- hfs_bnode_put(node);
if (!i) {
/* panic */;
pr_crit("unable to free bnode %u. bmap not found!\n",
node->this);
+ hfs_bnode_put(node);
return;
}
+ hfs_bnode_put(node);
node = hfs_bnode_find(tree, i);
if (IS_ERR(node))
return;
nidx -= len * 8;
i = node->next;
- hfs_bnode_put(node);
if (!i) {
/* panic */;
pr_crit("unable to free bnode %u. "
"bmap not found!\n",
node->this);
+ hfs_bnode_put(node);
return;
}
+ hfs_bnode_put(node);
node = hfs_bnode_find(tree, i);
if (IS_ERR(node))
return;
return LRU_REMOVED;
}
- /* recently referenced inodes get one more pass */
- if (inode->i_state & I_REFERENCED) {
+ /*
+ * Recently referenced inodes and inodes with many attached pages
+ * get one more pass.
+ */
+ if (inode->i_state & I_REFERENCED || inode->i_data.nrpages > 1) {
inode->i_state &= ~I_REFERENCED;
spin_unlock(&inode->i_lock);
return LRU_ROTATE;
u64 off, u64 olen, u64 destoff)
{
struct fd src_file = fdget(srcfd);
+ loff_t cloned;
int ret;
if (!src_file.file)
ret = -EXDEV;
if (src_file.file->f_path.mnt != dst_file->f_path.mnt)
goto fdput;
- ret = vfs_clone_file_range(src_file.file, off, dst_file, destoff, olen);
+ cloned = vfs_clone_file_range(src_file.file, off, dst_file, destoff,
+ olen, 0);
+ if (cloned < 0)
+ ret = cloned;
+ else if (olen && cloned != olen)
+ ret = -EINVAL;
+ else
+ ret = 0;
fdput:
fdput(src_file);
return ret;
return ioctl_fiemap(filp, arg);
case FIGETBSZ:
+ /* anon_bdev filesystems may not have a block size */
+ if (!inode->i_sb->s_blocksize)
+ return -EINVAL;
return put_user(inode->i_sb->s_blocksize, argp);
case FICLONE:
#include <linux/task_io_accounting_ops.h>
#include <linux/dax.h>
#include <linux/sched/signal.h>
-#include <linux/swap.h>
#include "internal.h"
iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
loff_t *pos, loff_t length, unsigned *offp, unsigned *lenp)
{
+ loff_t orig_pos = *pos;
+ loff_t isize = i_size_read(inode);
unsigned block_bits = inode->i_blkbits;
unsigned block_size = (1 << block_bits);
unsigned poff = offset_in_page(*pos);
unsigned plen = min_t(loff_t, PAGE_SIZE - poff, length);
unsigned first = poff >> block_bits;
unsigned last = (poff + plen - 1) >> block_bits;
- unsigned end = offset_in_page(i_size_read(inode)) >> block_bits;
/*
* If the block size is smaller than the page size we need to check the
* handle both halves separately so that we properly zero data in the
* page cache for blocks that are entirely outside of i_size.
*/
- if (first <= end && last > end)
- plen -= (last - end) * block_size;
+ if (orig_pos <= isize && orig_pos + length > isize) {
+ unsigned end = offset_in_page(isize - 1) >> block_bits;
+
+ if (first <= end && last > end)
+ plen -= (last - end) * block_size;
+ }
*offp = poff;
*lenp = plen;
struct bio *bio;
bool need_zeroout = false;
bool use_fua = false;
- int nr_pages, ret;
+ int nr_pages, ret = 0;
size_t copied = 0;
if ((pos | length | align) & ((1 << blkbits) - 1))
if (iomap->flags & IOMAP_F_NEW) {
need_zeroout = true;
- } else {
+ } else if (iomap->type == IOMAP_MAPPED) {
/*
- * Use a FUA write if we need datasync semantics, this
- * is a pure data IO that doesn't require any metadata
- * updates and the underlying device supports FUA. This
- * allows us to avoid cache flushes on IO completion.
+ * Use a FUA write if we need datasync semantics, this is a pure
+ * data IO that doesn't require any metadata updates (including
+ * after IO completion such as unwritten extent conversion) and
+ * the underlying device supports FUA. This allows us to avoid
+ * cache flushes on IO completion.
*/
if (!(iomap->flags & (IOMAP_F_SHARED|IOMAP_F_DIRTY)) &&
(dio->flags & IOMAP_DIO_WRITE_FUA) &&
ret = bio_iov_iter_get_pages(bio, &iter);
if (unlikely(ret)) {
+ /*
+ * We have to stop part way through an IO. We must fall
+ * through to the sub-block tail zeroing here, otherwise
+ * this short IO may expose stale data in the tail of
+ * the block we haven't written data to.
+ */
bio_put(bio);
- return copied ? copied : ret;
+ goto zero_tail;
}
n = bio->bi_iter.bi_size;
dio->submit.cookie = submit_bio(bio);
} while (nr_pages);
- if (need_zeroout) {
+ /*
+ * We need to zeroout the tail of a sub-block write if the extent type
+ * requires zeroing or the write extends beyond EOF. If we don't zero
+ * the block tail in the latter case, we can expose stale data via mmap
+ * reads of the EOF block.
+ */
+zero_tail:
+ if (need_zeroout ||
+ ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
/* zero out from the end of the write to the end of the block */
pad = pos & (fs_block_size - 1);
if (pad)
iomap_dio_zero(dio, iomap, pos, fs_block_size - pad);
}
- return copied;
+ return copied ? copied : ret;
}
static loff_t
if (pos >= dio->i_size)
goto out_free_dio;
- if (iter->type == ITER_IOVEC)
+ if (iter_is_iovec(iter) && iov_iter_rw(iter) == READ)
dio->flags |= IOMAP_DIO_DIRTY;
} else {
flags |= IOMAP_WRITE;
hlist_for_each_entry(mp, chain, m_hash) {
if (mp->m_dentry == dentry) {
- /* might be worth a WARN_ON() */
- if (d_unlinked(dentry))
- return ERR_PTR(-ENOENT);
mp->m_count++;
return mp;
}
int ret;
if (d_mountpoint(dentry)) {
+ /* might be worth a WARN_ON() */
+ if (d_unlinked(dentry))
+ return ERR_PTR(-ENOENT);
mountpoint:
read_seqlock_excl(&mount_lock);
mp = lookup_mountpoint(dentry);
namespace_lock();
lock_mount_hash();
- event++;
+ /* Recheck MNT_LOCKED with the locks held */
+ retval = -EINVAL;
+ if (mnt->mnt.mnt_flags & MNT_LOCKED)
+ goto out;
+
+ event++;
if (flags & MNT_DETACH) {
if (!list_empty(&mnt->mnt_list))
umount_tree(mnt, UMOUNT_PROPAGATE);
retval = 0;
}
}
+out:
unlock_mount_hash();
namespace_unlock();
return retval;
goto dput_and_out;
if (!check_mnt(mnt))
goto dput_and_out;
- if (mnt->mnt.mnt_flags & MNT_LOCKED)
+ if (mnt->mnt.mnt_flags & MNT_LOCKED) /* Check optimistically */
goto dput_and_out;
retval = -EPERM;
if (flags & MNT_FORCE && !capable(CAP_SYS_ADMIN))
for (s = r; s; s = next_mnt(s, r)) {
if (!(flag & CL_COPY_UNBINDABLE) &&
IS_MNT_UNBINDABLE(s)) {
- s = skip_mnt_tree(s);
- continue;
+ if (s->mnt.mnt_flags & MNT_LOCKED) {
+ /* Both unbindable and locked. */
+ q = ERR_PTR(-EPERM);
+ goto out;
+ } else {
+ s = skip_mnt_tree(s);
+ continue;
+ }
}
if (!(flag & CL_COPY_MNT_NS_FILE) &&
is_mnt_ns_file(s->mnt.mnt_root)) {
{
namespace_lock();
lock_mount_hash();
- umount_tree(real_mount(mnt), UMOUNT_SYNC);
+ umount_tree(real_mount(mnt), 0);
unlock_mount_hash();
namespace_unlock();
}
out_iput:
rcu_read_unlock();
trace_nfs4_cb_getattr(cps->clp, &args->fh, inode, -ntohl(res->status));
- iput(inode);
+ nfs_iput_and_deactive(inode);
out:
dprintk("%s: exit with status = %d\n", __func__, ntohl(res->status));
return res->status;
}
trace_nfs4_cb_recall(cps->clp, &args->fh, inode,
&args->stateid, -ntohl(res));
- iput(inode);
+ nfs_iput_and_deactive(inode);
out:
dprintk("%s: exit with status = %d\n", __func__, ntohl(res));
return res;
{
struct cb_offloadargs *args = data;
struct nfs_server *server;
- struct nfs4_copy_state *copy;
+ struct nfs4_copy_state *copy, *tmp_copy;
bool found = false;
+ copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_NOFS);
+ if (!copy)
+ return htonl(NFS4ERR_SERVERFAULT);
+
spin_lock(&cps->clp->cl_lock);
rcu_read_lock();
list_for_each_entry_rcu(server, &cps->clp->cl_superblocks,
client_link) {
- list_for_each_entry(copy, &server->ss_copies, copies) {
+ list_for_each_entry(tmp_copy, &server->ss_copies, copies) {
if (memcmp(args->coa_stateid.other,
- copy->stateid.other,
+ tmp_copy->stateid.other,
sizeof(args->coa_stateid.other)))
continue;
- nfs4_copy_cb_args(copy, args);
- complete(©->completion);
+ nfs4_copy_cb_args(tmp_copy, args);
+ complete(&tmp_copy->completion);
found = true;
goto out;
}
out:
rcu_read_unlock();
if (!found) {
- copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_NOFS);
- if (!copy) {
- spin_unlock(&cps->clp->cl_lock);
- return htonl(NFS4ERR_SERVERFAULT);
- }
memcpy(©->stateid, &args->coa_stateid, NFS4_STATEID_SIZE);
nfs4_copy_cb_args(copy, args);
list_add_tail(©->copies, &cps->clp->pending_cb_stateids);
- }
+ } else
+ kfree(copy);
spin_unlock(&cps->clp->cl_lock);
return 0;
const struct nfs_fh *fhandle)
{
struct nfs_delegation *delegation;
- struct inode *res = NULL;
+ struct inode *freeme, *res = NULL;
list_for_each_entry_rcu(delegation, &server->delegations, super_list) {
spin_lock(&delegation->lock);
if (delegation->inode != NULL &&
nfs_compare_fh(fhandle, &NFS_I(delegation->inode)->fh) == 0) {
- res = igrab(delegation->inode);
+ freeme = igrab(delegation->inode);
+ if (freeme && nfs_sb_active(freeme->i_sb))
+ res = freeme;
spin_unlock(&delegation->lock);
if (res != NULL)
return res;
+ if (freeme) {
+ rcu_read_unlock();
+ iput(freeme);
+ rcu_read_lock();
+ }
return ERR_PTR(-EAGAIN);
}
spin_unlock(&delegation->lock);
struct pnfs_ds_commit_info ds_cinfo; /* Storage for cinfo */
struct work_struct work;
int flags;
+ /* for write */
#define NFS_ODIRECT_DO_COMMIT (1) /* an unstable reply was received */
#define NFS_ODIRECT_RESCHED_WRITES (2) /* write verification failed */
+ /* for read */
+#define NFS_ODIRECT_SHOULD_DIRTY (3) /* dirty user-space page after read */
struct nfs_writeverf verf; /* unstable write verifier */
};
struct nfs_page *req = nfs_list_entry(hdr->pages.next);
struct page *page = req->wb_page;
- if (!PageCompound(page) && bytes < hdr->good_bytes)
+ if (!PageCompound(page) && bytes < hdr->good_bytes &&
+ (dreq->flags == NFS_ODIRECT_SHOULD_DIRTY))
set_page_dirty(page);
bytes += req->wb_bytes;
nfs_list_remove_request(req);
if (!is_sync_kiocb(iocb))
dreq->iocb = iocb;
+ if (iter_is_iovec(iter))
+ dreq->flags = NFS_ODIRECT_SHOULD_DIRTY;
+
nfs_start_io_direct(inode);
NFS_I(inode)->read_io += count;
task))
return;
- if (ff_layout_read_prepare_common(task, hdr))
- return;
-
- if (nfs4_set_rw_stateid(&hdr->args.stateid, hdr->args.context,
- hdr->args.lock_context, FMODE_READ) == -EIO)
- rpc_exit(task, -EIO); /* lost lock, terminate I/O */
+ ff_layout_read_prepare_common(task, hdr);
}
static void ff_layout_read_call_done(struct rpc_task *task, void *data)
task))
return;
- if (ff_layout_write_prepare_common(task, hdr))
- return;
-
- if (nfs4_set_rw_stateid(&hdr->args.stateid, hdr->args.context,
- hdr->args.lock_context, FMODE_WRITE) == -EIO)
- rpc_exit(task, -EIO); /* lost lock, terminate I/O */
+ ff_layout_write_prepare_common(task, hdr);
}
static void ff_layout_write_call_done(struct rpc_task *task, void *data)
fh = nfs4_ff_layout_select_ds_fh(lseg, idx);
if (fh)
hdr->args.fh = fh;
+
+ if (vers == 4 &&
+ !nfs4_ff_layout_select_ds_stateid(lseg, idx, &hdr->args.stateid))
+ goto out_failed;
+
/*
* Note that if we ever decide to split across DSes,
* then we may need to handle dense-like offsets.
if (fh)
hdr->args.fh = fh;
+ if (vers == 4 &&
+ !nfs4_ff_layout_select_ds_stateid(lseg, idx, &hdr->args.stateid))
+ goto out_failed;
+
/*
* Note that if we ever decide to split across DSes,
* then we may need to handle dense-like offsets.
unsigned int maxnum);
struct nfs_fh *
nfs4_ff_layout_select_ds_fh(struct pnfs_layout_segment *lseg, u32 mirror_idx);
+int
+nfs4_ff_layout_select_ds_stateid(struct pnfs_layout_segment *lseg,
+ u32 mirror_idx,
+ nfs4_stateid *stateid);
struct nfs4_pnfs_ds *
nfs4_ff_layout_prepare_ds(struct pnfs_layout_segment *lseg, u32 ds_idx,
return fh;
}
+int
+nfs4_ff_layout_select_ds_stateid(struct pnfs_layout_segment *lseg,
+ u32 mirror_idx,
+ nfs4_stateid *stateid)
+{
+ struct nfs4_ff_layout_mirror *mirror = FF_LAYOUT_COMP(lseg, mirror_idx);
+
+ if (!ff_layout_mirror_valid(lseg, mirror, false)) {
+ pr_err_ratelimited("NFS: %s: No data server for mirror offset index %d\n",
+ __func__, mirror_idx);
+ goto out;
+ }
+
+ nfs4_stateid_copy(stateid, &mirror->stateid);
+ return 1;
+out:
+ return 0;
+}
+
/**
* nfs4_ff_layout_prepare_ds - prepare a DS connection for an RPC call
* @lseg: the layout segment we're operating on
struct file *dst,
nfs4_stateid *src_stateid)
{
- struct nfs4_copy_state *copy;
+ struct nfs4_copy_state *copy, *tmp_copy;
int status = NFS4_OK;
bool found_pending = false;
struct nfs_open_context *ctx = nfs_file_open_context(dst);
+ copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_NOFS);
+ if (!copy)
+ return -ENOMEM;
+
spin_lock(&server->nfs_client->cl_lock);
- list_for_each_entry(copy, &server->nfs_client->pending_cb_stateids,
+ list_for_each_entry(tmp_copy, &server->nfs_client->pending_cb_stateids,
copies) {
- if (memcmp(&res->write_res.stateid, ©->stateid,
+ if (memcmp(&res->write_res.stateid, &tmp_copy->stateid,
NFS4_STATEID_SIZE))
continue;
found_pending = true;
- list_del(©->copies);
+ list_del(&tmp_copy->copies);
break;
}
if (found_pending) {
spin_unlock(&server->nfs_client->cl_lock);
+ kfree(copy);
+ copy = tmp_copy;
goto out;
}
- copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_NOFS);
- if (!copy) {
- spin_unlock(&server->nfs_client->cl_lock);
- return -ENOMEM;
- }
memcpy(©->stateid, &res->write_res.stateid, NFS4_STATEID_SIZE);
init_completion(©->completion);
copy->parent_state = ctx->state;
NFS4CLNT_MOVED,
NFS4CLNT_LEASE_MOVED,
NFS4CLNT_DELEGATION_EXPIRED,
+ NFS4CLNT_RUN_MANAGER,
+ NFS4CLNT_DELEGRETURN_RUNNING,
};
#define NFS4_RENEW_TIMEOUT 0x01
return nfs42_proc_allocate(filep, offset, len);
}
-static int nfs42_clone_file_range(struct file *src_file, loff_t src_off,
- struct file *dst_file, loff_t dst_off, u64 count)
+static loff_t nfs42_remap_file_range(struct file *src_file, loff_t src_off,
+ struct file *dst_file, loff_t dst_off, loff_t count,
+ unsigned int remap_flags)
{
struct inode *dst_inode = file_inode(dst_file);
struct nfs_server *server = NFS_SERVER(dst_inode);
bool same_inode = false;
int ret;
+ if (remap_flags & ~REMAP_FILE_ADVISORY)
+ return -EINVAL;
+
/* check alignment w.r.t. clone_blksize */
ret = -EINVAL;
if (bs) {
inode_unlock(src_inode);
}
out:
- return ret;
+ return ret < 0 ? ret : count;
}
#endif /* CONFIG_NFS_V4_2 */
.copy_file_range = nfs4_copy_file_range,
.llseek = nfs4_file_llseek,
.fallocate = nfs42_fallocate,
- .clone_file_range = nfs42_clone_file_range,
+ .remap_file_range = nfs42_remap_file_range,
#else
.llseek = nfs_file_llseek,
#endif
}
/*
- * -EACCESS could mean that the user doesn't have correct permissions
+ * -EACCES could mean that the user doesn't have correct permissions
* to access the mount. It could also mean that we tried to mount
* with a gss auth flavor, but rpc.gssd isn't running. Either way,
* existing mount programs don't handle -EACCES very well so it should
struct task_struct *task;
char buf[INET6_ADDRSTRLEN + sizeof("-manager") + 1];
+ set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
if (test_and_set_bit(NFS4CLNT_MANAGER_RUNNING, &clp->cl_state) != 0)
return;
__module_get(THIS_MODULE);
/* Ensure exclusive access to NFSv4 state */
do {
+ clear_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
if (test_bit(NFS4CLNT_PURGE_STATE, &clp->cl_state)) {
section = "purge state";
status = nfs4_purge_lease(clp);
}
nfs4_end_drain_session(clp);
- if (test_and_clear_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state)) {
- nfs_client_return_marked_delegations(clp);
- continue;
+ nfs4_clear_state_manager_bit(clp);
+
+ if (!test_and_set_bit(NFS4CLNT_DELEGRETURN_RUNNING, &clp->cl_state)) {
+ if (test_and_clear_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state)) {
+ nfs_client_return_marked_delegations(clp);
+ set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
+ }
+ clear_bit(NFS4CLNT_DELEGRETURN_RUNNING, &clp->cl_state);
}
- nfs4_clear_state_manager_bit(clp);
/* Did we race with an attempt to give us more work? */
- if (clp->cl_state == 0)
- break;
+ if (!test_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state))
+ return;
if (test_and_set_bit(NFS4CLNT_MANAGER_RUNNING, &clp->cl_state) != 0)
- break;
- } while (refcount_read(&clp->cl_count) > 1);
- return;
+ return;
+ } while (refcount_read(&clp->cl_count) > 1 && !signalled());
+ goto out_drain;
+
out_error:
if (strlen(section))
section_sep = ": ";
" with error %d\n", section_sep, section,
clp->cl_hostname, -status);
ssleep(1);
+out_drain:
nfs4_end_drain_session(clp);
nfs4_clear_state_manager_bit(clp);
}
{
__be32 status;
+ if (!cstate->save_fh.fh_dentry)
+ return nfserr_nofilehandle;
+
status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->save_fh,
src_stateid, RD_STATE, src, NULL);
if (status) {
__be32 nfsd4_clone_file_range(struct file *src, u64 src_pos, struct file *dst,
u64 dst_pos, u64 count)
{
- return nfserrno(vfs_clone_file_range(src, src_pos, dst, dst_pos,
- count));
+ loff_t cloned;
+
+ cloned = vfs_clone_file_range(src, src_pos, dst, dst_pos, count, 0);
+ if (count && cloned != count)
+ cloned = -EINVAL;
+ return nfserrno(cloned < 0 ? cloned : 0);
}
ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst,
int host_err;
trace_nfsd_read_vector(rqstp, fhp, offset, *count);
- iov_iter_kvec(&iter, READ | ITER_KVEC, vec, vlen, *count);
+ iov_iter_kvec(&iter, READ, vec, vlen, *count);
host_err = vfs_iter_read(file, &iter, &offset, 0);
return nfsd_finish_read(rqstp, fhp, file, offset, count, host_err);
}
if (stable && !use_wgather)
flags |= RWF_SYNC;
- iov_iter_kvec(&iter, WRITE | ITER_KVEC, vec, vlen, *cnt);
+ iov_iter_kvec(&iter, WRITE, vec, vlen, *cnt);
host_err = vfs_iter_write(file, &iter, &pos, flags);
if (host_err < 0)
goto out_nfserr;
return;
if (nbh == NULL) { /* blocksize == pagesize */
- xa_lock_irq(&btnc->i_pages);
- __xa_erase(&btnc->i_pages, newkey);
- xa_unlock_irq(&btnc->i_pages);
+ xa_erase_irq(&btnc->i_pages, newkey);
unlock_page(ctxt->bh->b_page);
} else
brelse(nbh);
continue;
mark = iter_info->marks[type];
/*
- * if the event is for a child and this inode doesn't care about
- * events on the child, don't send it!
+ * If the event is for a child and this mark doesn't care about
+ * events on a child, don't send it!
*/
- if (type == FSNOTIFY_OBJ_TYPE_INODE &&
- (event_mask & FS_EVENT_ON_CHILD) &&
- !(mark->mask & FS_EVENT_ON_CHILD))
+ if (event_mask & FS_EVENT_ON_CHILD &&
+ (type != FSNOTIFY_OBJ_TYPE_INODE ||
+ !(mark->mask & FS_EVENT_ON_CHILD)))
continue;
marks_mask |= mark->mask;
parent = dget_parent(dentry);
p_inode = parent->d_inode;
- if (unlikely(!fsnotify_inode_watches_children(p_inode)))
+ if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
__fsnotify_update_child_dentry_flags(p_inode);
- else if (p_inode->i_fsnotify_mask & mask) {
+ } else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
struct name_snapshot name;
/* we are notifying a parent so come up with the new mask which
sb = mnt->mnt.mnt_sb;
mnt_or_sb_mask = mnt->mnt_fsnotify_mask | sb->s_fsnotify_mask;
}
+ /* An event "on child" is not intended for a mount/sb mark */
+ if (mask & FS_EVENT_ON_CHILD)
+ mnt_or_sb_mask = 0;
/*
* Optimization: srcu_read_lock() has a memory barrier which can
/* Get the mft record of the inode belonging to the child dentry. */
mrec = map_mft_record(ni);
if (IS_ERR(mrec))
- return (struct dentry *)mrec;
+ return ERR_CAST(mrec);
/* Find the first file name attribute in the mft record. */
ctx = ntfs_attr_get_search_ctx(ni, mrec);
if (unlikely(!ctx)) {
/* this io's submitter should not have unlocked this before we could */
BUG_ON(!ocfs2_iocb_is_rw_locked(iocb));
- if (bytes > 0 && private)
- ret = ocfs2_dio_end_io_write(inode, private, offset, bytes);
+ if (bytes <= 0)
+ mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld",
+ (long long)bytes);
+ if (private) {
+ if (bytes > 0)
+ ret = ocfs2_dio_end_io_write(inode, private, offset,
+ bytes);
+ else
+ ocfs2_dio_free_write_ctx(inode, private);
+ }
ocfs2_iocb_clear_rw_locked(iocb);
return ret;
}
+/* Caller must provide a bhs[] with all NULL or non-NULL entries, so it
+ * will be easier to handle read failure.
+ */
int ocfs2_read_blocks_sync(struct ocfs2_super *osb, u64 block,
unsigned int nr, struct buffer_head *bhs[])
{
int status = 0;
unsigned int i;
struct buffer_head *bh;
+ int new_bh = 0;
trace_ocfs2_read_blocks_sync((unsigned long long)block, nr);
if (!nr)
goto bail;
+ /* Don't put buffer head and re-assign it to NULL if it is allocated
+ * outside since the caller can't be aware of this alternation!
+ */
+ new_bh = (bhs[0] == NULL);
+
for (i = 0 ; i < nr ; i++) {
if (bhs[i] == NULL) {
bhs[i] = sb_getblk(osb->sb, block++);
if (bhs[i] == NULL) {
status = -ENOMEM;
mlog_errno(status);
- goto bail;
+ break;
}
}
bh = bhs[i];
submit_bh(REQ_OP_READ, 0, bh);
}
+read_failure:
for (i = nr; i > 0; i--) {
bh = bhs[i - 1];
+ if (unlikely(status)) {
+ if (new_bh && bh) {
+ /* If middle bh fails, let previous bh
+ * finish its read and then put it to
+ * aovoid bh leak
+ */
+ if (!buffer_jbd(bh))
+ wait_on_buffer(bh);
+ put_bh(bh);
+ bhs[i - 1] = NULL;
+ } else if (bh && buffer_uptodate(bh)) {
+ clear_buffer_uptodate(bh);
+ }
+ continue;
+ }
+
/* No need to wait on the buffer if it's managed by JBD. */
if (!buffer_jbd(bh))
wait_on_buffer(bh);
* so we can safely record this and loop back
* to cleanup the other buffers. */
status = -EIO;
- put_bh(bh);
- bhs[i - 1] = NULL;
+ goto read_failure;
}
}
return status;
}
+/* Caller must provide a bhs[] with all NULL or non-NULL entries, so it
+ * will be easier to handle read failure.
+ */
int ocfs2_read_blocks(struct ocfs2_caching_info *ci, u64 block, int nr,
struct buffer_head *bhs[], int flags,
int (*validate)(struct super_block *sb,
int i, ignore_cache = 0;
struct buffer_head *bh;
struct super_block *sb = ocfs2_metadata_cache_get_super(ci);
+ int new_bh = 0;
trace_ocfs2_read_blocks_begin(ci, (unsigned long long)block, nr, flags);
goto bail;
}
+ /* Don't put buffer head and re-assign it to NULL if it is allocated
+ * outside since the caller can't be aware of this alternation!
+ */
+ new_bh = (bhs[0] == NULL);
+
ocfs2_metadata_cache_io_lock(ci);
for (i = 0 ; i < nr ; i++) {
if (bhs[i] == NULL) {
ocfs2_metadata_cache_io_unlock(ci);
status = -ENOMEM;
mlog_errno(status);
- goto bail;
+ /* Don't forget to put previous bh! */
+ break;
}
}
bh = bhs[i];
}
}
- status = 0;
-
+read_failure:
for (i = (nr - 1); i >= 0; i--) {
bh = bhs[i];
if (!(flags & OCFS2_BH_READAHEAD)) {
- if (status) {
- /* Clear the rest of the buffers on error */
- put_bh(bh);
- bhs[i] = NULL;
+ if (unlikely(status)) {
+ /* Clear the buffers on error including those
+ * ever succeeded in reading
+ */
+ if (new_bh && bh) {
+ /* If middle bh fails, let previous bh
+ * finish its read and then put it to
+ * aovoid bh leak
+ */
+ if (!buffer_jbd(bh))
+ wait_on_buffer(bh);
+ put_bh(bh);
+ bhs[i] = NULL;
+ } else if (bh && buffer_uptodate(bh)) {
+ clear_buffer_uptodate(bh);
+ }
continue;
}
/* We know this can't have changed as we hold the
* uptodate. */
status = -EIO;
clear_buffer_needs_validate(bh);
- put_bh(bh);
- bhs[i] = NULL;
- continue;
+ goto read_failure;
}
if (buffer_needs_validate(bh)) {
BUG_ON(buffer_jbd(bh));
clear_buffer_needs_validate(bh);
status = validate(sb, bh);
- if (status) {
- put_bh(bh);
- bhs[i] = NULL;
- continue;
- }
+ if (status)
+ goto read_failure;
}
}
##__VA_ARGS__); \
} while (0)
+#define mlog_ratelimited(mask, fmt, ...) \
+do { \
+ static DEFINE_RATELIMIT_STATE(_rs, \
+ DEFAULT_RATELIMIT_INTERVAL, \
+ DEFAULT_RATELIMIT_BURST); \
+ if (__ratelimit(&_rs)) \
+ mlog(mask, fmt, ##__VA_ARGS__); \
+} while (0)
+
#define mlog_errno(st) ({ \
int _st = (st); \
if (_st != -ERESTARTSYS && _st != -EINTR && \
{
struct kvec vec = { .iov_len = len, .iov_base = data, };
struct msghdr msg = { .msg_flags = MSG_DONTWAIT, };
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &vec, 1, len);
+ iov_iter_kvec(&msg.msg_iter, READ, &vec, 1, len);
return sock_recvmsg(sock, &msg, MSG_DONTWAIT);
}
/* On error, skip the f_pos to the
next block. */
ctx->pos = (ctx->pos | (sb->s_blocksize - 1)) + 1;
- brelse(bh);
- continue;
+ break;
}
if (le64_to_cpu(de->inode)) {
unsigned char d_type = DT_UNKNOWN;
/* LVB only has room for 64 bits of time here so we pack it for
* now. */
-static u64 ocfs2_pack_timespec(struct timespec *spec)
+static u64 ocfs2_pack_timespec(struct timespec64 *spec)
{
u64 res;
- u64 sec = spec->tv_sec;
+ u64 sec = clamp_t(time64_t, spec->tv_sec, 0, 0x3ffffffffull);
u32 nsec = spec->tv_nsec;
res = (sec << OCFS2_SEC_SHIFT) | (nsec & OCFS2_NSEC_MASK);
struct ocfs2_inode_info *oi = OCFS2_I(inode);
struct ocfs2_lock_res *lockres = &oi->ip_inode_lockres;
struct ocfs2_meta_lvb *lvb;
- struct timespec ts;
lvb = ocfs2_dlm_lvb(&lockres->l_lksb);
lvb->lvb_igid = cpu_to_be32(i_gid_read(inode));
lvb->lvb_imode = cpu_to_be16(inode->i_mode);
lvb->lvb_inlink = cpu_to_be16(inode->i_nlink);
- ts = timespec64_to_timespec(inode->i_atime);
lvb->lvb_iatime_packed =
- cpu_to_be64(ocfs2_pack_timespec(&ts));
- ts = timespec64_to_timespec(inode->i_ctime);
+ cpu_to_be64(ocfs2_pack_timespec(&inode->i_atime));
lvb->lvb_ictime_packed =
- cpu_to_be64(ocfs2_pack_timespec(&ts));
- ts = timespec64_to_timespec(inode->i_mtime);
+ cpu_to_be64(ocfs2_pack_timespec(&inode->i_ctime));
lvb->lvb_imtime_packed =
- cpu_to_be64(ocfs2_pack_timespec(&ts));
+ cpu_to_be64(ocfs2_pack_timespec(&inode->i_mtime));
lvb->lvb_iattr = cpu_to_be32(oi->ip_attr);
lvb->lvb_idynfeatures = cpu_to_be16(oi->ip_dyn_features);
lvb->lvb_igeneration = cpu_to_be32(inode->i_generation);
mlog_meta_lvb(0, lockres);
}
-static void ocfs2_unpack_timespec(struct timespec *spec,
+static void ocfs2_unpack_timespec(struct timespec64 *spec,
u64 packed_time)
{
spec->tv_sec = packed_time >> OCFS2_SEC_SHIFT;
static void ocfs2_refresh_inode_from_lvb(struct inode *inode)
{
- struct timespec ts;
struct ocfs2_inode_info *oi = OCFS2_I(inode);
struct ocfs2_lock_res *lockres = &oi->ip_inode_lockres;
struct ocfs2_meta_lvb *lvb;
i_gid_write(inode, be32_to_cpu(lvb->lvb_igid));
inode->i_mode = be16_to_cpu(lvb->lvb_imode);
set_nlink(inode, be16_to_cpu(lvb->lvb_inlink));
- ocfs2_unpack_timespec(&ts,
+ ocfs2_unpack_timespec(&inode->i_atime,
be64_to_cpu(lvb->lvb_iatime_packed));
- inode->i_atime = timespec_to_timespec64(ts);
- ocfs2_unpack_timespec(&ts,
+ ocfs2_unpack_timespec(&inode->i_mtime,
be64_to_cpu(lvb->lvb_imtime_packed));
- inode->i_mtime = timespec_to_timespec64(ts);
- ocfs2_unpack_timespec(&ts,
+ ocfs2_unpack_timespec(&inode->i_ctime,
be64_to_cpu(lvb->lvb_ictime_packed));
- inode->i_ctime = timespec_to_timespec64(ts);
spin_unlock(&oi->ip_lock);
}
* we can recover correctly from node failure. Otherwise, we may get
* invalid LVB in LKB, but without DLM_SBF_VALNOTVALID being set.
*/
- if (!ocfs2_is_o2cb_active() &&
+ if (ocfs2_userspace_stack(osb) &&
lockres->l_ops->flags & LOCK_TYPE_USES_LVB)
lvb = 1;
check_gen:
if (handle->ih_generation != inode->i_generation) {
- iput(inode);
trace_ocfs2_get_dentry_generation((unsigned long long)blkno,
handle->ih_generation,
inode->i_generation);
+ iput(inode);
result = ERR_PTR(-ESTALE);
goto bail;
}
written = __generic_file_write_iter(iocb, from);
/* buffered aio wouldn't have proper lock coverage today */
- BUG_ON(written == -EIOCBQUEUED && !(iocb->ki_flags & IOCB_DIRECT));
+ BUG_ON(written == -EIOCBQUEUED && !direct_io);
/*
* deep in g_f_a_w_n()->ocfs2_direct_IO we pass in a ocfs2_dio_end_io
trace_generic_file_read_iter_ret(ret);
/* buffered aio wouldn't have proper lock coverage today */
- BUG_ON(ret == -EIOCBQUEUED && !(iocb->ki_flags & IOCB_DIRECT));
+ BUG_ON(ret == -EIOCBQUEUED && !direct_io);
/* see ocfs2_file_write_iter */
if (ret == -EIOCBQUEUED || !ocfs2_iocb_is_rw_locked(iocb)) {
return offset;
}
-static int ocfs2_file_clone_range(struct file *file_in,
- loff_t pos_in,
- struct file *file_out,
- loff_t pos_out,
- u64 len)
+static loff_t ocfs2_remap_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t len, unsigned int remap_flags)
{
- return ocfs2_reflink_remap_range(file_in, pos_in, file_out, pos_out,
- len, false);
-}
+ struct inode *inode_in = file_inode(file_in);
+ struct inode *inode_out = file_inode(file_out);
+ struct ocfs2_super *osb = OCFS2_SB(inode_in->i_sb);
+ struct buffer_head *in_bh = NULL, *out_bh = NULL;
+ bool same_inode = (inode_in == inode_out);
+ loff_t remapped = 0;
+ ssize_t ret;
-static int ocfs2_file_dedupe_range(struct file *file_in,
- loff_t pos_in,
- struct file *file_out,
- loff_t pos_out,
- u64 len)
-{
- return ocfs2_reflink_remap_range(file_in, pos_in, file_out, pos_out,
- len, true);
+ if (remap_flags & ~(REMAP_FILE_DEDUP | REMAP_FILE_ADVISORY))
+ return -EINVAL;
+ if (!ocfs2_refcount_tree(osb))
+ return -EOPNOTSUPP;
+ if (ocfs2_is_hard_readonly(osb) || ocfs2_is_soft_readonly(osb))
+ return -EROFS;
+
+ /* Lock both files against IO */
+ ret = ocfs2_reflink_inodes_lock(inode_in, &in_bh, inode_out, &out_bh);
+ if (ret)
+ return ret;
+
+ /* Check file eligibility and prepare for block sharing. */
+ ret = -EINVAL;
+ if ((OCFS2_I(inode_in)->ip_flags & OCFS2_INODE_SYSTEM_FILE) ||
+ (OCFS2_I(inode_out)->ip_flags & OCFS2_INODE_SYSTEM_FILE))
+ goto out_unlock;
+
+ ret = generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out,
+ &len, remap_flags);
+ if (ret < 0 || len == 0)
+ goto out_unlock;
+
+ /* Lock out changes to the allocation maps and remap. */
+ down_write(&OCFS2_I(inode_in)->ip_alloc_sem);
+ if (!same_inode)
+ down_write_nested(&OCFS2_I(inode_out)->ip_alloc_sem,
+ SINGLE_DEPTH_NESTING);
+
+ /* Zap any page cache for the destination file's range. */
+ truncate_inode_pages_range(&inode_out->i_data,
+ round_down(pos_out, PAGE_SIZE),
+ round_up(pos_out + len, PAGE_SIZE) - 1);
+
+ remapped = ocfs2_reflink_remap_blocks(inode_in, in_bh, pos_in,
+ inode_out, out_bh, pos_out, len);
+ up_write(&OCFS2_I(inode_in)->ip_alloc_sem);
+ if (!same_inode)
+ up_write(&OCFS2_I(inode_out)->ip_alloc_sem);
+ if (remapped < 0) {
+ ret = remapped;
+ mlog_errno(ret);
+ goto out_unlock;
+ }
+
+ /*
+ * Empty the extent map so that we may get the right extent
+ * record from the disk.
+ */
+ ocfs2_extent_map_trunc(inode_in, 0);
+ ocfs2_extent_map_trunc(inode_out, 0);
+
+ ret = ocfs2_reflink_update_dest(inode_out, out_bh, pos_out + len);
+ if (ret) {
+ mlog_errno(ret);
+ goto out_unlock;
+ }
+
+out_unlock:
+ ocfs2_reflink_inodes_unlock(inode_in, in_bh, inode_out, out_bh);
+ return remapped > 0 ? remapped : ret;
}
const struct inode_operations ocfs2_file_iops = {
.splice_read = generic_file_splice_read,
.splice_write = iter_file_splice_write,
.fallocate = ocfs2_fallocate,
- .clone_file_range = ocfs2_file_clone_range,
- .dedupe_file_range = ocfs2_file_dedupe_range,
+ .remap_file_range = ocfs2_remap_file_range,
};
const struct file_operations ocfs2_dops = {
.splice_read = generic_file_splice_read,
.splice_write = iter_file_splice_write,
.fallocate = ocfs2_fallocate,
- .clone_file_range = ocfs2_file_clone_range,
- .dedupe_file_range = ocfs2_file_dedupe_range,
+ .remap_file_range = ocfs2_remap_file_range,
};
const struct file_operations ocfs2_dops_no_plocks = {
int rm_quota_used = 0, i;
struct ocfs2_quota_recovery *qrec;
+ /* Whether the quota supported. */
+ int quota_enabled = OCFS2_HAS_RO_COMPAT_FEATURE(osb->sb,
+ OCFS2_FEATURE_RO_COMPAT_USRQUOTA)
+ || OCFS2_HAS_RO_COMPAT_FEATURE(osb->sb,
+ OCFS2_FEATURE_RO_COMPAT_GRPQUOTA);
+
status = ocfs2_wait_on_mount(osb);
if (status < 0) {
goto bail;
}
- rm_quota = kcalloc(osb->max_slots, sizeof(int), GFP_NOFS);
- if (!rm_quota) {
- status = -ENOMEM;
- goto bail;
+ if (quota_enabled) {
+ rm_quota = kcalloc(osb->max_slots, sizeof(int), GFP_NOFS);
+ if (!rm_quota) {
+ status = -ENOMEM;
+ goto bail;
+ }
}
restart:
status = ocfs2_super_lock(osb, 1);
* then quota usage would be out of sync until some node takes
* the slot. So we remember which nodes need quota recovery
* and when everything else is done, we recover quotas. */
- for (i = 0; i < rm_quota_used && rm_quota[i] != slot_num; i++);
- if (i == rm_quota_used)
- rm_quota[rm_quota_used++] = slot_num;
+ if (quota_enabled) {
+ for (i = 0; i < rm_quota_used
+ && rm_quota[i] != slot_num; i++)
+ ;
+
+ if (i == rm_quota_used)
+ rm_quota[rm_quota_used++] = slot_num;
+ }
status = ocfs2_recover_node(osb, node_num, slot_num);
skip_recovery:
/* Now it is right time to recover quotas... We have to do this under
* superblock lock so that no one can start using the slot (and crash)
* before we recover it */
- for (i = 0; i < rm_quota_used; i++) {
- qrec = ocfs2_begin_quota_recovery(osb, rm_quota[i]);
- if (IS_ERR(qrec)) {
- status = PTR_ERR(qrec);
- mlog_errno(status);
- continue;
+ if (quota_enabled) {
+ for (i = 0; i < rm_quota_used; i++) {
+ qrec = ocfs2_begin_quota_recovery(osb, rm_quota[i]);
+ if (IS_ERR(qrec)) {
+ status = PTR_ERR(qrec);
+ mlog_errno(status);
+ continue;
+ }
+ ocfs2_queue_recovery_completion(osb->journal,
+ rm_quota[i],
+ NULL, NULL, qrec,
+ ORPHAN_NEED_TRUNCATE);
}
- ocfs2_queue_recovery_completion(osb->journal, rm_quota[i],
- NULL, NULL, qrec,
- ORPHAN_NEED_TRUNCATE);
}
ocfs2_super_unlock(osb, 1);
mutex_unlock(&osb->recovery_lock);
- kfree(rm_quota);
+ if (quota_enabled)
+ kfree(rm_quota);
/* no one is callint kthread_stop() for us so the kthread() api
* requires that we call do_exit(). And it isn't exported, but
#include "ocfs2_ioctl.h"
#include "alloc.h"
+#include "localalloc.h"
#include "aops.h"
#include "dlmglue.h"
#include "extent_map.h"
}
/*
- * lock allocators, and reserving appropriate number of bits for
- * meta blocks and data clusters.
- *
- * in some cases, we don't need to reserve clusters, just let data_ac
- * be NULL.
+ * lock allocator, and reserve appropriate number of bits for
+ * meta blocks.
*/
-static int ocfs2_lock_allocators_move_extents(struct inode *inode,
+static int ocfs2_lock_meta_allocator_move_extents(struct inode *inode,
struct ocfs2_extent_tree *et,
u32 clusters_to_move,
u32 extents_to_split,
struct ocfs2_alloc_context **meta_ac,
- struct ocfs2_alloc_context **data_ac,
int extra_blocks,
int *credits)
{
goto out;
}
- if (data_ac) {
- ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
- if (ret) {
- mlog_errno(ret);
- goto out;
- }
- }
*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
struct ocfs2_refcount_tree *ref_tree = NULL;
u32 new_phys_cpos, new_len;
u64 phys_blkno = ocfs2_clusters_to_blocks(inode->i_sb, phys_cpos);
+ int need_free = 0;
if ((ext_flags & OCFS2_EXT_REFCOUNTED) && *len) {
BUG_ON(!ocfs2_is_refcount_inode(inode));
}
}
- ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1,
- &context->meta_ac,
- &context->data_ac,
- extra_blocks, &credits);
+ ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
+ *len, 1,
+ &context->meta_ac,
+ extra_blocks, &credits);
if (ret) {
mlog_errno(ret);
goto out;
}
}
+ /*
+ * Make sure ocfs2_reserve_cluster is called after
+ * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
+ *
+ * If ocfs2_reserve_cluster is called
+ * before __ocfs2_flush_truncate_log, dead lock on global bitmap
+ * may happen.
+ *
+ */
+ ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
+ if (ret) {
+ mlog_errno(ret);
+ goto out_unlock_mutex;
+ }
+
handle = ocfs2_start_trans(osb, credits);
if (IS_ERR(handle)) {
ret = PTR_ERR(handle);
if (!partial) {
context->range->me_flags &= ~OCFS2_MOVE_EXT_FL_COMPLETE;
ret = -ENOSPC;
+ need_free = 1;
goto out_commit;
}
}
mlog_errno(ret);
out_commit:
+ if (need_free && context->data_ac) {
+ struct ocfs2_alloc_context *data_ac = context->data_ac;
+
+ if (context->data_ac->ac_which == OCFS2_AC_USE_LOCAL)
+ ocfs2_free_local_alloc_bits(osb, handle, data_ac,
+ new_phys_cpos, new_len);
+ else
+ ocfs2_free_clusters(handle,
+ data_ac->ac_inode,
+ data_ac->ac_bh,
+ ocfs2_clusters_to_blocks(osb->sb, new_phys_cpos),
+ new_len);
+ }
+
ocfs2_commit_trans(osb, handle);
out_unlock_mutex:
}
}
- ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
- &context->meta_ac,
- NULL, extra_blocks, &credits);
+ ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
+ len, 1,
+ &context->meta_ac,
+ extra_blocks, &credits);
if (ret) {
mlog_errno(ret);
goto out;
}
/* Update destination inode size, if necessary. */
-static int ocfs2_reflink_update_dest(struct inode *dest,
- struct buffer_head *d_bh,
- loff_t newlen)
+int ocfs2_reflink_update_dest(struct inode *dest,
+ struct buffer_head *d_bh,
+ loff_t newlen)
{
handle_t *handle;
int ret;
}
/* Remap the range pos_in:len in s_inode to pos_out:len in t_inode. */
-static int ocfs2_reflink_remap_extent(struct inode *s_inode,
- struct buffer_head *s_bh,
- loff_t pos_in,
- struct inode *t_inode,
- struct buffer_head *t_bh,
- loff_t pos_out,
- loff_t len,
- struct ocfs2_cached_dealloc_ctxt *dealloc)
+static loff_t ocfs2_reflink_remap_extent(struct inode *s_inode,
+ struct buffer_head *s_bh,
+ loff_t pos_in,
+ struct inode *t_inode,
+ struct buffer_head *t_bh,
+ loff_t pos_out,
+ loff_t len,
+ struct ocfs2_cached_dealloc_ctxt *dealloc)
{
struct ocfs2_extent_tree s_et;
struct ocfs2_extent_tree t_et;
struct buffer_head *ref_root_bh = NULL;
struct ocfs2_refcount_tree *ref_tree;
struct ocfs2_super *osb;
+ loff_t remapped_bytes = 0;
loff_t pstart, plen;
- u32 p_cluster, num_clusters, slast, spos, tpos;
+ u32 p_cluster, num_clusters, slast, spos, tpos, remapped_clus = 0;
unsigned int ext_flags;
int ret = 0;
next_loop:
spos += num_clusters;
tpos += num_clusters;
+ remapped_clus += num_clusters;
}
-out:
- return ret;
+ goto out;
out_unlock_refcount:
ocfs2_unlock_refcount_tree(osb, ref_tree, 1);
brelse(ref_root_bh);
- return ret;
+out:
+ remapped_bytes = ocfs2_clusters_to_bytes(t_inode->i_sb, remapped_clus);
+ remapped_bytes = min_t(loff_t, len, remapped_bytes);
+
+ return remapped_bytes > 0 ? remapped_bytes : ret;
}
/* Set up refcount tree and remap s_inode to t_inode. */
-static int ocfs2_reflink_remap_blocks(struct inode *s_inode,
- struct buffer_head *s_bh,
- loff_t pos_in,
- struct inode *t_inode,
- struct buffer_head *t_bh,
- loff_t pos_out,
- loff_t len)
+loff_t ocfs2_reflink_remap_blocks(struct inode *s_inode,
+ struct buffer_head *s_bh,
+ loff_t pos_in,
+ struct inode *t_inode,
+ struct buffer_head *t_bh,
+ loff_t pos_out,
+ loff_t len)
{
struct ocfs2_cached_dealloc_ctxt dealloc;
struct ocfs2_super *osb;
struct ocfs2_dinode *dis;
struct ocfs2_dinode *dit;
- int ret;
+ loff_t ret;
osb = OCFS2_SB(s_inode->i_sb);
dis = (struct ocfs2_dinode *)s_bh->b_data;
/* Actually remap extents now. */
ret = ocfs2_reflink_remap_extent(s_inode, s_bh, pos_in, t_inode, t_bh,
pos_out, len, &dealloc);
- if (ret) {
+ if (ret < 0) {
mlog_errno(ret);
goto out;
}
}
/* Lock an inode and grab a bh pointing to the inode. */
-static int ocfs2_reflink_inodes_lock(struct inode *s_inode,
- struct buffer_head **bh1,
- struct inode *t_inode,
- struct buffer_head **bh2)
+int ocfs2_reflink_inodes_lock(struct inode *s_inode,
+ struct buffer_head **bh1,
+ struct inode *t_inode,
+ struct buffer_head **bh2)
{
struct inode *inode1;
struct inode *inode2;
}
/* Unlock both inodes and release buffers. */
-static void ocfs2_reflink_inodes_unlock(struct inode *s_inode,
- struct buffer_head *s_bh,
- struct inode *t_inode,
- struct buffer_head *t_bh)
+void ocfs2_reflink_inodes_unlock(struct inode *s_inode,
+ struct buffer_head *s_bh,
+ struct inode *t_inode,
+ struct buffer_head *t_bh)
{
ocfs2_inode_unlock(s_inode, 1);
ocfs2_rw_unlock(s_inode, 1);
}
unlock_two_nondirectories(s_inode, t_inode);
}
-
-/* Link a range of blocks from one file to another. */
-int ocfs2_reflink_remap_range(struct file *file_in,
- loff_t pos_in,
- struct file *file_out,
- loff_t pos_out,
- u64 len,
- bool is_dedupe)
-{
- struct inode *inode_in = file_inode(file_in);
- struct inode *inode_out = file_inode(file_out);
- struct ocfs2_super *osb = OCFS2_SB(inode_in->i_sb);
- struct buffer_head *in_bh = NULL, *out_bh = NULL;
- bool same_inode = (inode_in == inode_out);
- ssize_t ret;
-
- if (!ocfs2_refcount_tree(osb))
- return -EOPNOTSUPP;
- if (ocfs2_is_hard_readonly(osb) || ocfs2_is_soft_readonly(osb))
- return -EROFS;
-
- /* Lock both files against IO */
- ret = ocfs2_reflink_inodes_lock(inode_in, &in_bh, inode_out, &out_bh);
- if (ret)
- return ret;
-
- /* Check file eligibility and prepare for block sharing. */
- ret = -EINVAL;
- if ((OCFS2_I(inode_in)->ip_flags & OCFS2_INODE_SYSTEM_FILE) ||
- (OCFS2_I(inode_out)->ip_flags & OCFS2_INODE_SYSTEM_FILE))
- goto out_unlock;
-
- ret = vfs_clone_file_prep_inodes(inode_in, pos_in, inode_out, pos_out,
- &len, is_dedupe);
- if (ret <= 0)
- goto out_unlock;
-
- /* Lock out changes to the allocation maps and remap. */
- down_write(&OCFS2_I(inode_in)->ip_alloc_sem);
- if (!same_inode)
- down_write_nested(&OCFS2_I(inode_out)->ip_alloc_sem,
- SINGLE_DEPTH_NESTING);
-
- ret = ocfs2_reflink_remap_blocks(inode_in, in_bh, pos_in, inode_out,
- out_bh, pos_out, len);
-
- /* Zap any page cache for the destination file's range. */
- if (!ret)
- truncate_inode_pages_range(&inode_out->i_data, pos_out,
- PAGE_ALIGN(pos_out + len) - 1);
-
- up_write(&OCFS2_I(inode_in)->ip_alloc_sem);
- if (!same_inode)
- up_write(&OCFS2_I(inode_out)->ip_alloc_sem);
- if (ret) {
- mlog_errno(ret);
- goto out_unlock;
- }
-
- /*
- * Empty the extent map so that we may get the right extent
- * record from the disk.
- */
- ocfs2_extent_map_trunc(inode_in, 0);
- ocfs2_extent_map_trunc(inode_out, 0);
-
- ret = ocfs2_reflink_update_dest(inode_out, out_bh, pos_out + len);
- if (ret) {
- mlog_errno(ret);
- goto out_unlock;
- }
-
- ocfs2_reflink_inodes_unlock(inode_in, in_bh, inode_out, out_bh);
- return 0;
-
-out_unlock:
- ocfs2_reflink_inodes_unlock(inode_in, in_bh, inode_out, out_bh);
- return ret;
-}
const char __user *oldname,
const char __user *newname,
bool preserve);
-int ocfs2_reflink_remap_range(struct file *file_in,
- loff_t pos_in,
- struct file *file_out,
- loff_t pos_out,
- u64 len,
- bool is_dedupe);
+loff_t ocfs2_reflink_remap_blocks(struct inode *s_inode,
+ struct buffer_head *s_bh,
+ loff_t pos_in,
+ struct inode *t_inode,
+ struct buffer_head *t_bh,
+ loff_t pos_out,
+ loff_t len);
+int ocfs2_reflink_inodes_lock(struct inode *s_inode,
+ struct buffer_head **bh1,
+ struct inode *t_inode,
+ struct buffer_head **bh2);
+void ocfs2_reflink_inodes_unlock(struct inode *s_inode,
+ struct buffer_head *s_bh,
+ struct inode *t_inode,
+ struct buffer_head *t_bh);
+int ocfs2_reflink_update_dest(struct inode *dest,
+ struct buffer_head *d_bh,
+ loff_t newlen);
#endif /* OCFS2_REFCOUNTTREE_H */
*/
static struct ocfs2_stack_plugin *active_stack;
-inline int ocfs2_is_o2cb_active(void)
-{
- return !strcmp(active_stack->sp_name, OCFS2_STACK_PLUGIN_O2CB);
-}
-EXPORT_SYMBOL_GPL(ocfs2_is_o2cb_active);
-
static struct ocfs2_stack_plugin *ocfs2_stack_lookup(const char *name)
{
struct ocfs2_stack_plugin *p;
int ocfs2_stack_glue_register(struct ocfs2_stack_plugin *plugin);
void ocfs2_stack_glue_unregister(struct ocfs2_stack_plugin *plugin);
-/* In ocfs2_downconvert_lock(), we need to know which stack we are using */
-int ocfs2_is_o2cb_active(void);
-
extern struct kset *ocfs2_kset;
#endif /* STACKGLUE_H */
struct iov_iter to;
struct bio_vec bv = {.bv_page = page, .bv_len = PAGE_SIZE};
- iov_iter_bvec(&to, ITER_BVEC | READ, &bv, 1, PAGE_SIZE);
+ iov_iter_bvec(&to, READ, &bv, 1, PAGE_SIZE);
gossip_debug(GOSSIP_INODE_DEBUG,
"orangefs_readpage called with page %p\n",
struct file *new_file;
loff_t old_pos = 0;
loff_t new_pos = 0;
+ loff_t cloned;
int error = 0;
if (len == 0)
}
/* Try to use clone_file_range to clone up within the same fs */
- error = do_clone_file_range(old_file, 0, new_file, 0, len);
- if (!error)
+ cloned = do_clone_file_range(old_file, 0, new_file, 0, len, 0);
+ if (cloned == len)
goto out;
/* Couldn't clone, so now we try to copy the data */
- error = 0;
/* FIXME: copy up sparse files efficiently */
while (len) {
struct dentry *destdir;
struct qstr destname;
struct dentry *workdir;
- bool tmpfile;
bool origin;
bool indexed;
bool metacopy;
return err;
}
-static int ovl_install_temp(struct ovl_copy_up_ctx *c, struct dentry *temp,
- struct dentry **newdentry)
-{
- int err;
- struct dentry *upper;
- struct inode *udir = d_inode(c->destdir);
-
- upper = lookup_one_len(c->destname.name, c->destdir, c->destname.len);
- if (IS_ERR(upper))
- return PTR_ERR(upper);
-
- if (c->tmpfile)
- err = ovl_do_link(temp, udir, upper);
- else
- err = ovl_do_rename(d_inode(c->workdir), temp, udir, upper, 0);
-
- if (!err)
- *newdentry = dget(c->tmpfile ? upper : temp);
- dput(upper);
-
- return err;
-}
-
-static struct dentry *ovl_get_tmpfile(struct ovl_copy_up_ctx *c)
-{
- int err;
- struct dentry *temp;
- const struct cred *old_creds = NULL;
- struct cred *new_creds = NULL;
- struct ovl_cattr cattr = {
- /* Can't properly set mode on creation because of the umask */
- .mode = c->stat.mode & S_IFMT,
- .rdev = c->stat.rdev,
- .link = c->link
- };
-
- err = security_inode_copy_up(c->dentry, &new_creds);
- temp = ERR_PTR(err);
- if (err < 0)
- goto out;
-
- if (new_creds)
- old_creds = override_creds(new_creds);
-
- if (c->tmpfile)
- temp = ovl_do_tmpfile(c->workdir, c->stat.mode);
- else
- temp = ovl_create_temp(c->workdir, &cattr);
-out:
- if (new_creds) {
- revert_creds(old_creds);
- put_cred(new_creds);
- }
-
- return temp;
-}
-
static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
{
int err;
return err;
}
-static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
+struct ovl_cu_creds {
+ const struct cred *old;
+ struct cred *new;
+};
+
+static int ovl_prep_cu_creds(struct dentry *dentry, struct ovl_cu_creds *cc)
+{
+ int err;
+
+ cc->old = cc->new = NULL;
+ err = security_inode_copy_up(dentry, &cc->new);
+ if (err < 0)
+ return err;
+
+ if (cc->new)
+ cc->old = override_creds(cc->new);
+
+ return 0;
+}
+
+static void ovl_revert_cu_creds(struct ovl_cu_creds *cc)
+{
+ if (cc->new) {
+ revert_creds(cc->old);
+ put_cred(cc->new);
+ }
+}
+
+/*
+ * Copyup using workdir to prepare temp file. Used when copying up directories,
+ * special files or when upper fs doesn't support O_TMPFILE.
+ */
+static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
{
- struct inode *udir = c->destdir->d_inode;
struct inode *inode;
- struct dentry *newdentry = NULL;
- struct dentry *temp;
+ struct inode *udir = d_inode(c->destdir), *wdir = d_inode(c->workdir);
+ struct dentry *temp, *upper;
+ struct ovl_cu_creds cc;
int err;
+ struct ovl_cattr cattr = {
+ /* Can't properly set mode on creation because of the umask */
+ .mode = c->stat.mode & S_IFMT,
+ .rdev = c->stat.rdev,
+ .link = c->link
+ };
+
+ err = ovl_lock_rename_workdir(c->workdir, c->destdir);
+ if (err)
+ return err;
+
+ err = ovl_prep_cu_creds(c->dentry, &cc);
+ if (err)
+ goto unlock;
- temp = ovl_get_tmpfile(c);
+ temp = ovl_create_temp(c->workdir, &cattr);
+ ovl_revert_cu_creds(&cc);
+
+ err = PTR_ERR(temp);
if (IS_ERR(temp))
- return PTR_ERR(temp);
+ goto unlock;
err = ovl_copy_up_inode(c, temp);
if (err)
- goto out;
+ goto cleanup;
if (S_ISDIR(c->stat.mode) && c->indexed) {
err = ovl_create_index(c->dentry, c->lowerpath.dentry, temp);
if (err)
- goto out;
+ goto cleanup;
}
- if (c->tmpfile) {
- inode_lock_nested(udir, I_MUTEX_PARENT);
- err = ovl_install_temp(c, temp, &newdentry);
- inode_unlock(udir);
- } else {
- err = ovl_install_temp(c, temp, &newdentry);
- }
+ upper = lookup_one_len(c->destname.name, c->destdir, c->destname.len);
+ err = PTR_ERR(upper);
+ if (IS_ERR(upper))
+ goto cleanup;
+
+ err = ovl_do_rename(wdir, temp, udir, upper, 0);
+ dput(upper);
if (err)
- goto out;
+ goto cleanup;
if (!c->metacopy)
ovl_set_upperdata(d_inode(c->dentry));
inode = d_inode(c->dentry);
- ovl_inode_update(inode, newdentry);
+ ovl_inode_update(inode, temp);
if (S_ISDIR(inode->i_mode))
ovl_set_flag(OVL_WHITEOUTS, inode);
+unlock:
+ unlock_rename(c->workdir, c->destdir);
-out:
- if (err && !c->tmpfile)
- ovl_cleanup(d_inode(c->workdir), temp);
- dput(temp);
return err;
+cleanup:
+ ovl_cleanup(wdir, temp);
+ dput(temp);
+ goto unlock;
+}
+
+/* Copyup using O_TMPFILE which does not require cross dir locking */
+static int ovl_copy_up_tmpfile(struct ovl_copy_up_ctx *c)
+{
+ struct inode *udir = d_inode(c->destdir);
+ struct dentry *temp, *upper;
+ struct ovl_cu_creds cc;
+ int err;
+
+ err = ovl_prep_cu_creds(c->dentry, &cc);
+ if (err)
+ return err;
+
+ temp = ovl_do_tmpfile(c->workdir, c->stat.mode);
+ ovl_revert_cu_creds(&cc);
+
+ if (IS_ERR(temp))
+ return PTR_ERR(temp);
+
+ err = ovl_copy_up_inode(c, temp);
+ if (err)
+ goto out_dput;
+
+ inode_lock_nested(udir, I_MUTEX_PARENT);
+
+ upper = lookup_one_len(c->destname.name, c->destdir, c->destname.len);
+ err = PTR_ERR(upper);
+ if (!IS_ERR(upper)) {
+ err = ovl_do_link(temp, udir, upper);
+ dput(upper);
+ }
+ inode_unlock(udir);
+
+ if (err)
+ goto out_dput;
+
+ if (!c->metacopy)
+ ovl_set_upperdata(d_inode(c->dentry));
+ ovl_inode_update(d_inode(c->dentry), temp);
+
+ return 0;
+
+out_dput:
+ dput(temp);
+ return err;
}
/*
}
/* Should we copyup with O_TMPFILE or with workdir? */
- if (S_ISREG(c->stat.mode) && ofs->tmpfile) {
- c->tmpfile = true;
- err = ovl_copy_up_locked(c);
- } else {
- err = ovl_lock_rename_workdir(c->workdir, c->destdir);
- if (!err) {
- err = ovl_copy_up_locked(c);
- unlock_rename(c->workdir, c->destdir);
- }
- }
-
-
+ if (S_ISREG(c->stat.mode) && ofs->tmpfile)
+ err = ovl_copy_up_tmpfile(c);
+ else
+ err = ovl_copy_up_workdir(c);
if (err)
goto out;
if (!IS_ENABLED(CONFIG_FS_POSIX_ACL) || !acl)
return 0;
- size = posix_acl_to_xattr(NULL, acl, NULL, 0);
+ size = posix_acl_xattr_size(acl->a_count);
buffer = kmalloc(size, GFP_KERNEL);
if (!buffer)
return -ENOMEM;
- size = posix_acl_to_xattr(&init_user_ns, acl, buffer, size);
- err = size;
+ err = posix_acl_to_xattr(&init_user_ns, acl, buffer, size);
if (err < 0)
goto out_free;
if (IS_ERR(upper))
goto out_unlock;
+ err = -ESTALE;
+ if (d_is_negative(upper) || !IS_WHITEOUT(d_inode(upper)))
+ goto out_dput;
+
newdentry = ovl_create_temp(workdir, cattr);
err = PTR_ERR(newdentry);
if (IS_ERR(newdentry))
struct dentry *new)
{
int err;
- bool locked = false;
struct inode *inode;
err = ovl_want_write(old);
if (err)
goto out_drop_write;
+ err = ovl_copy_up(new->d_parent);
+ if (err)
+ goto out_drop_write;
+
if (ovl_is_metacopy_dentry(old)) {
err = ovl_set_redirect(old, false);
if (err)
goto out_drop_write;
}
- err = ovl_nlink_start(old, &locked);
+ err = ovl_nlink_start(old);
if (err)
goto out_drop_write;
if (err)
iput(inode);
- ovl_nlink_end(old, locked);
+ ovl_nlink_end(old);
out_drop_write:
ovl_drop_write(old);
out:
static int ovl_do_remove(struct dentry *dentry, bool is_dir)
{
int err;
- bool locked = false;
const struct cred *old_cred;
struct dentry *upperdentry;
bool lower_positive = ovl_lower_positive(dentry);
if (err)
goto out_drop_write;
- err = ovl_nlink_start(dentry, &locked);
+ err = ovl_nlink_start(dentry);
if (err)
goto out_drop_write;
else
drop_nlink(dentry->d_inode);
}
- ovl_nlink_end(dentry, locked);
+ ovl_nlink_end(dentry);
/*
* Copy ctime
unsigned int flags)
{
int err;
- bool locked = false;
struct dentry *old_upperdir;
struct dentry *new_upperdir;
struct dentry *olddentry;
bool old_opaque;
bool new_opaque;
bool cleanup_whiteout = false;
+ bool update_nlink = false;
bool overwrite = !(flags & RENAME_EXCHANGE);
bool is_dir = d_is_dir(old);
bool new_is_dir = d_is_dir(new);
err = ovl_copy_up(new);
if (err)
goto out_drop_write;
- } else {
- err = ovl_nlink_start(new, &locked);
+ } else if (d_inode(new)) {
+ err = ovl_nlink_start(new);
if (err)
goto out_drop_write;
+
+ update_nlink = true;
}
old_cred = ovl_override_creds(old->d_sb);
unlock_rename(new_upperdir, old_upperdir);
out_revert_creds:
revert_creds(old_cred);
- ovl_nlink_end(new, locked);
+ if (update_nlink)
+ ovl_nlink_end(new);
out_drop_write:
ovl_drop_write(old);
out:
OVL_DEDUPE,
};
-static ssize_t ovl_copyfile(struct file *file_in, loff_t pos_in,
+static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
struct file *file_out, loff_t pos_out,
- u64 len, unsigned int flags, enum ovl_copyop op)
+ loff_t len, unsigned int flags, enum ovl_copyop op)
{
struct inode *inode_out = file_inode(file_out);
struct fd real_in, real_out;
const struct cred *old_cred;
- ssize_t ret;
+ loff_t ret;
ret = ovl_real_fdget(file_out, &real_out);
if (ret)
case OVL_CLONE:
ret = vfs_clone_file_range(real_in.file, pos_in,
- real_out.file, pos_out, len);
+ real_out.file, pos_out, len, flags);
break;
case OVL_DEDUPE:
ret = vfs_dedupe_file_range_one(real_in.file, pos_in,
- real_out.file, pos_out, len);
+ real_out.file, pos_out, len,
+ flags);
break;
}
revert_creds(old_cred);
OVL_COPY);
}
-static int ovl_clone_file_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out, u64 len)
+static loff_t ovl_remap_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t len, unsigned int remap_flags)
{
- return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
- OVL_CLONE);
-}
+ enum ovl_copyop op;
+
+ if (remap_flags & ~(REMAP_FILE_DEDUP | REMAP_FILE_ADVISORY))
+ return -EINVAL;
+
+ if (remap_flags & REMAP_FILE_DEDUP)
+ op = OVL_DEDUPE;
+ else
+ op = OVL_CLONE;
-static int ovl_dedupe_file_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out, u64 len)
-{
/*
* Don't copy up because of a dedupe request, this wouldn't make sense
* most of the time (data would be duplicated instead of deduplicated).
*/
- if (!ovl_inode_upper(file_inode(file_in)) ||
- !ovl_inode_upper(file_inode(file_out)))
+ if (op == OVL_DEDUPE &&
+ (!ovl_inode_upper(file_inode(file_in)) ||
+ !ovl_inode_upper(file_inode(file_out))))
return -EPERM;
- return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
- OVL_DEDUPE);
+ return ovl_copyfile(file_in, pos_in, file_out, pos_out, len,
+ remap_flags, op);
}
const struct file_operations ovl_file_operations = {
.compat_ioctl = ovl_compat_ioctl,
.copy_file_range = ovl_copy_file_range,
- .clone_file_range = ovl_clone_file_range,
- .dedupe_file_range = ovl_dedupe_file_range,
+ .remap_file_range = ovl_remap_file_range,
};
if (err)
return err;
- old_cred = ovl_override_creds(inode->i_sb);
- if (!upperinode &&
- !special_file(realinode->i_mode) && mask & MAY_WRITE) {
+ /* No need to do any access on underlying for special files */
+ if (special_file(realinode->i_mode))
+ return 0;
+
+ /* No need to access underlying for execute */
+ mask &= ~MAY_EXEC;
+ if ((mask & (MAY_READ | MAY_WRITE)) == 0)
+ return 0;
+
+ /* Lower files get copied up, so turn write access into read */
+ if (!upperinode && mask & MAY_WRITE) {
mask &= ~(MAY_WRITE | MAY_APPEND);
- /* Make sure mounter can read file for copy up later */
mask |= MAY_READ;
}
+
+ old_cred = ovl_override_creds(inode->i_sb);
err = inode_permission(realinode, mask);
revert_creds(old_cred);
fh = ovl_encode_real_fh(real, is_upper);
err = PTR_ERR(fh);
- if (IS_ERR(fh))
+ if (IS_ERR(fh)) {
+ fh = NULL;
goto fail;
+ }
err = ovl_verify_fh(dentry, name, fh);
if (set && err == -ENODATA)
bool ovl_inuse_trylock(struct dentry *dentry);
void ovl_inuse_unlock(struct dentry *dentry);
bool ovl_need_index(struct dentry *dentry);
-int ovl_nlink_start(struct dentry *dentry, bool *locked);
-void ovl_nlink_end(struct dentry *dentry, bool locked);
+int ovl_nlink_start(struct dentry *dentry);
+void ovl_nlink_end(struct dentry *dentry);
int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
int ovl_check_metacopy_xattr(struct dentry *dentry);
bool ovl_is_metacopy_dentry(struct dentry *dentry);
return ofs->xino_bits;
}
+static inline int ovl_inode_lock(struct inode *inode)
+{
+ return mutex_lock_interruptible(&OVL_I(inode)->lock);
+}
+
+static inline void ovl_inode_unlock(struct inode *inode)
+{
+ mutex_unlock(&OVL_I(inode)->lock);
+}
+
/* namei.c */
int ovl_check_fh_len(struct ovl_fh *fh, int fh_len);
{
char *p;
int err;
+ bool metacopy_opt = false, redirect_opt = false;
config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
if (!config->redirect_mode)
config->redirect_mode = match_strdup(&args[0]);
if (!config->redirect_mode)
return -ENOMEM;
+ redirect_opt = true;
break;
case OPT_INDEX_ON:
case OPT_METACOPY_ON:
config->metacopy = true;
+ metacopy_opt = true;
break;
case OPT_METACOPY_OFF:
if (err)
return err;
- /* metacopy feature with upper requires redirect_dir=on */
- if (config->upperdir && config->metacopy && !config->redirect_dir) {
- pr_warn("overlayfs: metadata only copy up requires \"redirect_dir=on\", falling back to metacopy=off.\n");
- config->metacopy = false;
- } else if (config->metacopy && !config->redirect_follow) {
- pr_warn("overlayfs: metadata only copy up requires \"redirect_dir=follow\" on non-upper mount, falling back to metacopy=off.\n");
- config->metacopy = false;
+ /*
+ * This is to make the logic below simpler. It doesn't make any other
+ * difference, since config->redirect_dir is only used for upper.
+ */
+ if (!config->upperdir && config->redirect_follow)
+ config->redirect_dir = true;
+
+ /* Resolve metacopy -> redirect_dir dependency */
+ if (config->metacopy && !config->redirect_dir) {
+ if (metacopy_opt && redirect_opt) {
+ pr_err("overlayfs: conflicting options: metacopy=on,redirect_dir=%s\n",
+ config->redirect_mode);
+ return -EINVAL;
+ }
+ if (redirect_opt) {
+ /*
+ * There was an explicit redirect_dir=... that resulted
+ * in this conflict.
+ */
+ pr_info("overlayfs: disabling metacopy due to redirect_dir=%s\n",
+ config->redirect_mode);
+ config->metacopy = false;
+ } else {
+ /* Automatically enable redirect otherwise. */
+ config->redirect_follow = config->redirect_dir = true;
+ }
}
return 0;
return err;
}
+static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
+{
+ unsigned int i;
+
+ if (!ofs->config.nfs_export && !(ofs->config.index && ofs->upper_mnt))
+ return true;
+
+ for (i = 0; i < ofs->numlowerfs; i++) {
+ /*
+ * We use uuid to associate an overlay lower file handle with a
+ * lower layer, so we can accept lower fs with null uuid as long
+ * as all lower layers with null uuid are on the same fs.
+ */
+ if (uuid_equal(&ofs->lower_fs[i].sb->s_uuid, uuid))
+ return false;
+ }
+ return true;
+}
+
/* Get a unique fsid for the layer */
-static int ovl_get_fsid(struct ovl_fs *ofs, struct super_block *sb)
+static int ovl_get_fsid(struct ovl_fs *ofs, const struct path *path)
{
+ struct super_block *sb = path->mnt->mnt_sb;
unsigned int i;
dev_t dev;
int err;
return i + 1;
}
+ if (!ovl_lower_uuid_ok(ofs, &sb->s_uuid)) {
+ ofs->config.index = false;
+ ofs->config.nfs_export = false;
+ pr_warn("overlayfs: %s uuid detected in lower fs '%pd2', falling back to index=off,nfs_export=off.\n",
+ uuid_is_null(&sb->s_uuid) ? "null" : "conflicting",
+ path->dentry);
+ }
+
err = get_anon_bdev(&dev);
if (err) {
pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
struct vfsmount *mnt;
int fsid;
- err = fsid = ovl_get_fsid(ofs, stack[i].mnt->mnt_sb);
+ err = fsid = ovl_get_fsid(ofs, &stack[i]);
if (err < 0)
goto out;
*/
int ovl_can_decode_fh(struct super_block *sb)
{
- if (!sb->s_export_op || !sb->s_export_op->fh_to_dentry ||
- uuid_is_null(&sb->s_uuid))
+ if (!sb->s_export_op || !sb->s_export_op->fh_to_dentry)
return 0;
return sb->s_export_op->encode_fh ? -1 : FILEID_INO32_GEN;
int ovl_copy_up_start(struct dentry *dentry, int flags)
{
- struct ovl_inode *oi = OVL_I(d_inode(dentry));
+ struct inode *inode = d_inode(dentry);
int err;
- err = mutex_lock_interruptible(&oi->lock);
+ err = ovl_inode_lock(inode);
if (!err && ovl_already_copied_up_locked(dentry, flags)) {
err = 1; /* Already copied up */
- mutex_unlock(&oi->lock);
+ ovl_inode_unlock(inode);
}
return err;
void ovl_copy_up_end(struct dentry *dentry)
{
- mutex_unlock(&OVL_I(d_inode(dentry))->lock);
+ ovl_inode_unlock(d_inode(dentry));
}
bool ovl_check_origin_xattr(struct dentry *dentry)
* Operations that change overlay inode and upper inode nlink need to be
* synchronized with copy up for persistent nlink accounting.
*/
-int ovl_nlink_start(struct dentry *dentry, bool *locked)
+int ovl_nlink_start(struct dentry *dentry)
{
- struct ovl_inode *oi = OVL_I(d_inode(dentry));
+ struct inode *inode = d_inode(dentry);
const struct cred *old_cred;
int err;
- if (!d_inode(dentry))
- return 0;
+ if (WARN_ON(!inode))
+ return -ENOENT;
/*
* With inodes index is enabled, we store the union overlay nlink
return err;
}
- err = mutex_lock_interruptible(&oi->lock);
+ err = ovl_inode_lock(inode);
if (err)
return err;
- if (d_is_dir(dentry) || !ovl_test_flag(OVL_INDEX, d_inode(dentry)))
+ if (d_is_dir(dentry) || !ovl_test_flag(OVL_INDEX, inode))
goto out;
old_cred = ovl_override_creds(dentry->d_sb);
out:
if (err)
- mutex_unlock(&oi->lock);
- else
- *locked = true;
+ ovl_inode_unlock(inode);
return err;
}
-void ovl_nlink_end(struct dentry *dentry, bool locked)
+void ovl_nlink_end(struct dentry *dentry)
{
- if (locked) {
- if (ovl_test_flag(OVL_INDEX, d_inode(dentry)) &&
- d_inode(dentry)->i_nlink == 0) {
- const struct cred *old_cred;
+ struct inode *inode = d_inode(dentry);
- old_cred = ovl_override_creds(dentry->d_sb);
- ovl_cleanup_index(dentry);
- revert_creds(old_cred);
- }
+ if (ovl_test_flag(OVL_INDEX, inode) && inode->i_nlink == 0) {
+ const struct cred *old_cred;
- mutex_unlock(&OVL_I(d_inode(dentry))->lock);
+ old_cred = ovl_override_creds(dentry->d_sb);
+ ovl_cleanup_index(dentry);
+ revert_creds(old_cred);
}
+
+ ovl_inode_unlock(inode);
}
int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir)
}
#endif /* CONFIG_LIVEPATCH */
+#ifdef CONFIG_STACKLEAK_METRICS
+static int proc_stack_depth(struct seq_file *m, struct pid_namespace *ns,
+ struct pid *pid, struct task_struct *task)
+{
+ unsigned long prev_depth = THREAD_SIZE -
+ (task->prev_lowest_stack & (THREAD_SIZE - 1));
+ unsigned long depth = THREAD_SIZE -
+ (task->lowest_stack & (THREAD_SIZE - 1));
+
+ seq_printf(m, "previous stack depth: %lu\nstack depth: %lu\n",
+ prev_depth, depth);
+ return 0;
+}
+#endif /* CONFIG_STACKLEAK_METRICS */
+
/*
* Thread groups
*/
#ifdef CONFIG_LIVEPATCH
ONE("patch_state", S_IRUSR, proc_pid_patch_state),
#endif
+#ifdef CONFIG_STACKLEAK_METRICS
+ ONE("stack_depth", S_IRUGO, proc_stack_depth),
+#endif
};
static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
cxt->pstore.data = cxt;
/*
- * Console can handle any buffer size, so prefer LOG_LINE_MAX. If we
- * have to handle dumps, we must have at least record_size buffer. And
- * for ftrace, bufsize is irrelevant (if bufsize is 0, buf will be
- * ZERO_SIZE_PTR).
+ * Since bufsize is only used for dmesg crash dumps, it
+ * must match the size of the dprz record (after PRZ header
+ * and ECC bytes have been accounted for).
*/
- if (cxt->console_size)
- cxt->pstore.bufsize = 1024; /* LOG_LINE_MAX */
- cxt->pstore.bufsize = max(cxt->record_size, cxt->pstore.bufsize);
- cxt->pstore.buf = kmalloc(cxt->pstore.bufsize, GFP_KERNEL);
+ cxt->pstore.bufsize = cxt->dprzs[0]->buffer_size;
+ cxt->pstore.buf = kzalloc(cxt->pstore.bufsize, GFP_KERNEL);
if (!cxt->pstore.buf) {
- pr_err("cannot allocate pstore buffer\n");
+ pr_err("cannot allocate pstore crash dump buffer\n");
err = -ENOMEM;
goto fail_clear;
}
goto fput_in;
if (!(out.file->f_mode & FMODE_WRITE))
goto fput_out;
- retval = -EINVAL;
in_inode = file_inode(in.file);
out_inode = file_inode(out.file);
out_pos = out.file->f_pos;
* Try cloning first, this is supported by more file systems, and
* more efficient if both clone and copy are supported (e.g. NFS).
*/
- if (file_in->f_op->clone_file_range) {
- ret = file_in->f_op->clone_file_range(file_in, pos_in,
- file_out, pos_out, len);
- if (ret == 0) {
- ret = len;
+ if (file_in->f_op->remap_file_range) {
+ loff_t cloned;
+
+ cloned = file_in->f_op->remap_file_range(file_in, pos_in,
+ file_out, pos_out,
+ min_t(loff_t, MAX_RW_COUNT, len),
+ REMAP_FILE_CAN_SHORTEN);
+ if (cloned > 0) {
+ ret = cloned;
goto done;
}
}
return ret;
}
-static int clone_verify_area(struct file *file, loff_t pos, u64 len, bool write)
+static int remap_verify_area(struct file *file, loff_t pos, loff_t len,
+ bool write)
{
struct inode *inode = file_inode(file);
- if (unlikely(pos < 0))
+ if (unlikely(pos < 0 || len < 0))
return -EINVAL;
if (unlikely((loff_t) (pos + len) < 0))
return security_file_permission(file, write ? MAY_WRITE : MAY_READ);
}
+/*
+ * Ensure that we don't remap a partial EOF block in the middle of something
+ * else. Assume that the offsets have already been checked for block
+ * alignment.
+ *
+ * For deduplication we always scale down to the previous block because we
+ * can't meaningfully compare post-EOF contents.
+ *
+ * For clone we only link a partial EOF block above the destination file's EOF.
+ *
+ * Shorten the request if possible.
+ */
+static int generic_remap_check_len(struct inode *inode_in,
+ struct inode *inode_out,
+ loff_t pos_out,
+ loff_t *len,
+ unsigned int remap_flags)
+{
+ u64 blkmask = i_blocksize(inode_in) - 1;
+ loff_t new_len = *len;
+
+ if ((*len & blkmask) == 0)
+ return 0;
+
+ if ((remap_flags & REMAP_FILE_DEDUP) ||
+ pos_out + *len < i_size_read(inode_out))
+ new_len &= ~blkmask;
+
+ if (new_len == *len)
+ return 0;
+
+ if (remap_flags & REMAP_FILE_CAN_SHORTEN) {
+ *len = new_len;
+ return 0;
+ }
+
+ return (remap_flags & REMAP_FILE_DEDUP) ? -EBADE : -EINVAL;
+}
+
+/*
+ * Read a page's worth of file data into the page cache. Return the page
+ * locked.
+ */
+static struct page *vfs_dedupe_get_page(struct inode *inode, loff_t offset)
+{
+ struct page *page;
+
+ page = read_mapping_page(inode->i_mapping, offset >> PAGE_SHIFT, NULL);
+ if (IS_ERR(page))
+ return page;
+ if (!PageUptodate(page)) {
+ put_page(page);
+ return ERR_PTR(-EIO);
+ }
+ lock_page(page);
+ return page;
+}
+
+/*
+ * Compare extents of two files to see if they are the same.
+ * Caller must have locked both inodes to prevent write races.
+ */
+static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
+ struct inode *dest, loff_t destoff,
+ loff_t len, bool *is_same)
+{
+ loff_t src_poff;
+ loff_t dest_poff;
+ void *src_addr;
+ void *dest_addr;
+ struct page *src_page;
+ struct page *dest_page;
+ loff_t cmp_len;
+ bool same;
+ int error;
+
+ error = -EINVAL;
+ same = true;
+ while (len) {
+ src_poff = srcoff & (PAGE_SIZE - 1);
+ dest_poff = destoff & (PAGE_SIZE - 1);
+ cmp_len = min(PAGE_SIZE - src_poff,
+ PAGE_SIZE - dest_poff);
+ cmp_len = min(cmp_len, len);
+ if (cmp_len <= 0)
+ goto out_error;
+
+ src_page = vfs_dedupe_get_page(src, srcoff);
+ if (IS_ERR(src_page)) {
+ error = PTR_ERR(src_page);
+ goto out_error;
+ }
+ dest_page = vfs_dedupe_get_page(dest, destoff);
+ if (IS_ERR(dest_page)) {
+ error = PTR_ERR(dest_page);
+ unlock_page(src_page);
+ put_page(src_page);
+ goto out_error;
+ }
+ src_addr = kmap_atomic(src_page);
+ dest_addr = kmap_atomic(dest_page);
+
+ flush_dcache_page(src_page);
+ flush_dcache_page(dest_page);
+
+ if (memcmp(src_addr + src_poff, dest_addr + dest_poff, cmp_len))
+ same = false;
+
+ kunmap_atomic(dest_addr);
+ kunmap_atomic(src_addr);
+ unlock_page(dest_page);
+ unlock_page(src_page);
+ put_page(dest_page);
+ put_page(src_page);
+
+ if (!same)
+ break;
+
+ srcoff += cmp_len;
+ destoff += cmp_len;
+ len -= cmp_len;
+ }
+
+ *is_same = same;
+ return 0;
+
+out_error:
+ return error;
+}
/*
* Check that the two inodes are eligible for cloning, the ranges make
* sense, and then flush all dirty data. Caller must ensure that the
* inodes have been locked against any other modifications.
*
- * Returns: 0 for "nothing to clone", 1 for "something to clone", or
- * the usual negative error code.
+ * If there's an error, then the usual negative error code is returned.
+ * Otherwise returns 0 with *len set to the request length.
*/
-int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in,
- struct inode *inode_out, loff_t pos_out,
- u64 *len, bool is_dedupe)
+int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t *len, unsigned int remap_flags)
{
- loff_t bs = inode_out->i_sb->s_blocksize;
- loff_t blen;
- loff_t isize;
+ struct inode *inode_in = file_inode(file_in);
+ struct inode *inode_out = file_inode(file_out);
bool same_inode = (inode_in == inode_out);
int ret;
if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode))
return -EINVAL;
- /* Are we going all the way to the end? */
- isize = i_size_read(inode_in);
- if (isize == 0)
- return 0;
-
/* Zero length dedupe exits immediately; reflink goes to EOF. */
if (*len == 0) {
- if (is_dedupe || pos_in == isize)
+ loff_t isize = i_size_read(inode_in);
+
+ if ((remap_flags & REMAP_FILE_DEDUP) || pos_in == isize)
return 0;
if (pos_in > isize)
return -EINVAL;
*len = isize - pos_in;
+ if (*len == 0)
+ return 0;
}
- /* Ensure offsets don't wrap and the input is inside i_size */
- if (pos_in + *len < pos_in || pos_out + *len < pos_out ||
- pos_in + *len > isize)
- return -EINVAL;
-
- /* Don't allow dedupe past EOF in the dest file */
- if (is_dedupe) {
- loff_t disize;
-
- disize = i_size_read(inode_out);
- if (pos_out >= disize || pos_out + *len > disize)
- return -EINVAL;
- }
-
- /* If we're linking to EOF, continue to the block boundary. */
- if (pos_in + *len == isize)
- blen = ALIGN(isize, bs) - pos_in;
- else
- blen = *len;
-
- /* Only reflink if we're aligned to block boundaries */
- if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) ||
- !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs))
- return -EINVAL;
-
- /* Don't allow overlapped reflink within the same file */
- if (same_inode) {
- if (pos_out + blen > pos_in && pos_out < pos_in + blen)
- return -EINVAL;
- }
+ /* Check that we don't violate system file offset limits. */
+ ret = generic_remap_checks(file_in, pos_in, file_out, pos_out, len,
+ remap_flags);
+ if (ret)
+ return ret;
/* Wait for the completion of any pending IOs on both files */
inode_dio_wait(inode_in);
/*
* Check that the extents are the same.
*/
- if (is_dedupe) {
+ if (remap_flags & REMAP_FILE_DEDUP) {
bool is_same = false;
ret = vfs_dedupe_file_range_compare(inode_in, pos_in,
return -EBADE;
}
- return 1;
+ ret = generic_remap_check_len(inode_in, inode_out, pos_out, len,
+ remap_flags);
+ if (ret)
+ return ret;
+
+ /* If can't alter the file contents, we're done. */
+ if (!(remap_flags & REMAP_FILE_DEDUP)) {
+ /* Update the timestamps, since we can alter file contents. */
+ if (!(file_out->f_mode & FMODE_NOCMTIME)) {
+ ret = file_update_time(file_out);
+ if (ret)
+ return ret;
+ }
+
+ /*
+ * Clear the security bits if the process is not being run by
+ * root. This keeps people from modifying setuid and setgid
+ * binaries.
+ */
+ ret = file_remove_privs(file_out);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
}
-EXPORT_SYMBOL(vfs_clone_file_prep_inodes);
+EXPORT_SYMBOL(generic_remap_file_range_prep);
-int do_clone_file_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out, u64 len)
+loff_t do_clone_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t len, unsigned int remap_flags)
{
struct inode *inode_in = file_inode(file_in);
struct inode *inode_out = file_inode(file_out);
- int ret;
+ loff_t ret;
+
+ WARN_ON_ONCE(remap_flags & REMAP_FILE_DEDUP);
if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
return -EISDIR;
(file_out->f_flags & O_APPEND))
return -EBADF;
- if (!file_in->f_op->clone_file_range)
+ if (!file_in->f_op->remap_file_range)
return -EOPNOTSUPP;
- ret = clone_verify_area(file_in, pos_in, len, false);
+ ret = remap_verify_area(file_in, pos_in, len, false);
if (ret)
return ret;
- ret = clone_verify_area(file_out, pos_out, len, true);
+ ret = remap_verify_area(file_out, pos_out, len, true);
if (ret)
return ret;
- if (pos_in + len > i_size_read(inode_in))
- return -EINVAL;
-
- ret = file_in->f_op->clone_file_range(file_in, pos_in,
- file_out, pos_out, len);
- if (!ret) {
- fsnotify_access(file_in);
- fsnotify_modify(file_out);
- }
+ ret = file_in->f_op->remap_file_range(file_in, pos_in,
+ file_out, pos_out, len, remap_flags);
+ if (ret < 0)
+ return ret;
+ fsnotify_access(file_in);
+ fsnotify_modify(file_out);
return ret;
}
EXPORT_SYMBOL(do_clone_file_range);
-int vfs_clone_file_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out, u64 len)
+loff_t vfs_clone_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t len, unsigned int remap_flags)
{
- int ret;
+ loff_t ret;
file_start_write(file_out);
- ret = do_clone_file_range(file_in, pos_in, file_out, pos_out, len);
+ ret = do_clone_file_range(file_in, pos_in, file_out, pos_out, len,
+ remap_flags);
file_end_write(file_out);
return ret;
}
EXPORT_SYMBOL(vfs_clone_file_range);
-/*
- * Read a page's worth of file data into the page cache. Return the page
- * locked.
- */
-static struct page *vfs_dedupe_get_page(struct inode *inode, loff_t offset)
+/* Check whether we are allowed to dedupe the destination file */
+static bool allow_file_dedupe(struct file *file)
{
- struct address_space *mapping;
- struct page *page;
- pgoff_t n;
-
- n = offset >> PAGE_SHIFT;
- mapping = inode->i_mapping;
- page = read_mapping_page(mapping, n, NULL);
- if (IS_ERR(page))
- return page;
- if (!PageUptodate(page)) {
- put_page(page);
- return ERR_PTR(-EIO);
- }
- lock_page(page);
- return page;
+ if (capable(CAP_SYS_ADMIN))
+ return true;
+ if (file->f_mode & FMODE_WRITE)
+ return true;
+ if (uid_eq(current_fsuid(), file_inode(file)->i_uid))
+ return true;
+ if (!inode_permission(file_inode(file), MAY_WRITE))
+ return true;
+ return false;
}
-/*
- * Compare extents of two files to see if they are the same.
- * Caller must have locked both inodes to prevent write races.
- */
-int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
- struct inode *dest, loff_t destoff,
- loff_t len, bool *is_same)
+loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
+ struct file *dst_file, loff_t dst_pos,
+ loff_t len, unsigned int remap_flags)
{
- loff_t src_poff;
- loff_t dest_poff;
- void *src_addr;
- void *dest_addr;
- struct page *src_page;
- struct page *dest_page;
- loff_t cmp_len;
- bool same;
- int error;
+ loff_t ret;
- error = -EINVAL;
- same = true;
- while (len) {
- src_poff = srcoff & (PAGE_SIZE - 1);
- dest_poff = destoff & (PAGE_SIZE - 1);
- cmp_len = min(PAGE_SIZE - src_poff,
- PAGE_SIZE - dest_poff);
- cmp_len = min(cmp_len, len);
- if (cmp_len <= 0)
- goto out_error;
-
- src_page = vfs_dedupe_get_page(src, srcoff);
- if (IS_ERR(src_page)) {
- error = PTR_ERR(src_page);
- goto out_error;
- }
- dest_page = vfs_dedupe_get_page(dest, destoff);
- if (IS_ERR(dest_page)) {
- error = PTR_ERR(dest_page);
- unlock_page(src_page);
- put_page(src_page);
- goto out_error;
- }
- src_addr = kmap_atomic(src_page);
- dest_addr = kmap_atomic(dest_page);
-
- flush_dcache_page(src_page);
- flush_dcache_page(dest_page);
-
- if (memcmp(src_addr + src_poff, dest_addr + dest_poff, cmp_len))
- same = false;
-
- kunmap_atomic(dest_addr);
- kunmap_atomic(src_addr);
- unlock_page(dest_page);
- unlock_page(src_page);
- put_page(dest_page);
- put_page(src_page);
-
- if (!same)
- break;
-
- srcoff += cmp_len;
- destoff += cmp_len;
- len -= cmp_len;
- }
-
- *is_same = same;
- return 0;
-
-out_error:
- return error;
-}
-EXPORT_SYMBOL(vfs_dedupe_file_range_compare);
-
-int vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
- struct file *dst_file, loff_t dst_pos, u64 len)
-{
- s64 ret;
+ WARN_ON_ONCE(remap_flags & ~(REMAP_FILE_DEDUP |
+ REMAP_FILE_CAN_SHORTEN));
ret = mnt_want_write_file(dst_file);
if (ret)
return ret;
- ret = clone_verify_area(dst_file, dst_pos, len, true);
+ ret = remap_verify_area(dst_file, dst_pos, len, true);
if (ret < 0)
goto out_drop_write;
- ret = -EINVAL;
- if (!(capable(CAP_SYS_ADMIN) || (dst_file->f_mode & FMODE_WRITE)))
+ ret = -EPERM;
+ if (!allow_file_dedupe(dst_file))
goto out_drop_write;
ret = -EXDEV;
goto out_drop_write;
ret = -EINVAL;
- if (!dst_file->f_op->dedupe_file_range)
+ if (!dst_file->f_op->remap_file_range)
goto out_drop_write;
- ret = dst_file->f_op->dedupe_file_range(src_file, src_pos,
- dst_file, dst_pos, len);
+ if (len == 0) {
+ ret = 0;
+ goto out_drop_write;
+ }
+
+ ret = dst_file->f_op->remap_file_range(src_file, src_pos, dst_file,
+ dst_pos, len, remap_flags | REMAP_FILE_DEDUP);
out_drop_write:
mnt_drop_write_file(dst_file);
int i;
int ret;
u16 count = same->dest_count;
- int deduped;
+ loff_t deduped;
if (!(file->f_mode & FMODE_READ))
return -EINVAL;
off = same->src_offset;
len = same->src_length;
- ret = -EISDIR;
if (S_ISDIR(src->i_mode))
- goto out;
+ return -EISDIR;
- ret = -EINVAL;
if (!S_ISREG(src->i_mode))
- goto out;
+ return -EINVAL;
- ret = clone_verify_area(file, off, len, false);
+ if (!file->f_op->remap_file_range)
+ return -EOPNOTSUPP;
+
+ ret = remap_verify_area(file, off, len, false);
if (ret < 0)
- goto out;
+ return ret;
ret = 0;
if (off + len > i_size_read(src))
}
deduped = vfs_dedupe_file_range_one(file, off, dst_file,
- info->dest_offset, len);
+ info->dest_offset, len,
+ REMAP_FILE_CAN_SHORTEN);
if (deduped == -EBADE)
info->status = FILE_DEDUPE_RANGE_DIFFERS;
else if (deduped < 0)
fdput(dst_fd);
next_loop:
if (fatal_signal_pending(current))
- goto out;
+ break;
}
-
-out:
return ret;
}
EXPORT_SYMBOL(vfs_dedupe_file_range);
struct kiocb kiocb;
int idx, ret;
- iov_iter_pipe(&to, ITER_PIPE | READ, pipe, len);
+ iov_iter_pipe(&to, READ, pipe, len);
idx = to.idx;
init_sync_kiocb(&kiocb, in);
kiocb.ki_pos = *ppos;
*/
offset = *ppos & ~PAGE_MASK;
- iov_iter_pipe(&to, ITER_PIPE | READ, pipe, len + offset);
+ iov_iter_pipe(&to, READ, pipe, len + offset);
res = iov_iter_get_pages_alloc(&to, &pages, len + offset, &base);
if (res <= 0)
left -= this_len;
}
- iov_iter_bvec(&from, ITER_BVEC | WRITE, array, n,
- sd.total_len - left);
+ iov_iter_bvec(&from, WRITE, array, n, sd.total_len - left);
ret = vfs_iter_write(out, &from, &sd.pos, 0);
if (ret <= 0)
break;
sd->flags &= ~SPLICE_F_NONBLOCK;
more = sd->flags & SPLICE_F_MORE;
+ WARN_ON_ONCE(pipe->nrbufs != 0);
+
while (len) {
size_t read_len;
loff_t pos = sd->pos, prev_pos = pos;
- ret = do_splice_to(in, &pos, pipe, len, flags);
+ /* Don't try to read more the pipe has space for. */
+ read_len = min_t(size_t, len,
+ (pipe->buffers - pipe->nrbufs) << PAGE_SHIFT);
+ ret = do_splice_to(in, &pos, pipe, read_len, flags);
if (unlikely(ret <= 0))
goto out_release;
}
}
brelse(bh);
- return 0;
+ return err;
}
int sysv_write_inode(struct inode *inode, struct writeback_control *wbc)
select CRYPTO if UBIFS_FS_ZLIB
select CRYPTO_LZO if UBIFS_FS_LZO
select CRYPTO_DEFLATE if UBIFS_FS_ZLIB
+ select CRYPTO_HASH_INFO
depends on MTD_UBI
help
UBIFS is a file system for flash devices which works on top of UBI.
the extended attribute support in advance.
If you are not using a security module, say N.
+
+config UBIFS_FS_AUTHENTICATION
+ bool "UBIFS authentication support"
+ select CRYPTO_HMAC
+ help
+ Enable authentication support for UBIFS. This feature offers protection
+ against offline changes for both data and metadata of the filesystem.
+ If you say yes here you should also select a hashing algorithm such as
+ sha256, these are not selected automatically since there are many
+ different options.
ubifs-y += misc.o
ubifs-$(CONFIG_UBIFS_FS_ENCRYPTION) += crypto.o
ubifs-$(CONFIG_UBIFS_FS_XATTR) += xattr.o
+ubifs-$(CONFIG_UBIFS_FS_AUTHENTICATION) += auth.o
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This file is part of UBIFS.
+ *
+ * Copyright (C) 2018 Pengutronix, Sascha Hauer <s.hauer@pengutronix.de>
+ */
+
+/*
+ * This file implements various helper functions for UBIFS authentication support
+ */
+
+#include <linux/crypto.h>
+#include <crypto/hash.h>
+#include <crypto/sha.h>
+#include <crypto/algapi.h>
+#include <keys/user-type.h>
+
+#include "ubifs.h"
+
+/**
+ * ubifs_node_calc_hash - calculate the hash of a UBIFS node
+ * @c: UBIFS file-system description object
+ * @node: the node to calculate a hash for
+ * @hash: the returned hash
+ *
+ * Returns 0 for success or a negative error code otherwise.
+ */
+int __ubifs_node_calc_hash(const struct ubifs_info *c, const void *node,
+ u8 *hash)
+{
+ const struct ubifs_ch *ch = node;
+ SHASH_DESC_ON_STACK(shash, c->hash_tfm);
+ int err;
+
+ shash->tfm = c->hash_tfm;
+ shash->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+
+ err = crypto_shash_digest(shash, node, le32_to_cpu(ch->len), hash);
+ if (err < 0)
+ return err;
+ return 0;
+}
+
+/**
+ * ubifs_hash_calc_hmac - calculate a HMAC from a hash
+ * @c: UBIFS file-system description object
+ * @hash: the node to calculate a HMAC for
+ * @hmac: the returned HMAC
+ *
+ * Returns 0 for success or a negative error code otherwise.
+ */
+static int ubifs_hash_calc_hmac(const struct ubifs_info *c, const u8 *hash,
+ u8 *hmac)
+{
+ SHASH_DESC_ON_STACK(shash, c->hmac_tfm);
+ int err;
+
+ shash->tfm = c->hmac_tfm;
+ shash->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+
+ err = crypto_shash_digest(shash, hash, c->hash_len, hmac);
+ if (err < 0)
+ return err;
+ return 0;
+}
+
+/**
+ * ubifs_prepare_auth_node - Prepare an authentication node
+ * @c: UBIFS file-system description object
+ * @node: the node to calculate a hash for
+ * @hash: input hash of previous nodes
+ *
+ * This function prepares an authentication node for writing onto flash.
+ * It creates a HMAC from the given input hash and writes it to the node.
+ *
+ * Returns 0 for success or a negative error code otherwise.
+ */
+int ubifs_prepare_auth_node(struct ubifs_info *c, void *node,
+ struct shash_desc *inhash)
+{
+ SHASH_DESC_ON_STACK(hash_desc, c->hash_tfm);
+ struct ubifs_auth_node *auth = node;
+ u8 *hash;
+ int err;
+
+ hash = kmalloc(crypto_shash_descsize(c->hash_tfm), GFP_NOFS);
+ if (!hash)
+ return -ENOMEM;
+
+ hash_desc->tfm = c->hash_tfm;
+ hash_desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+ ubifs_shash_copy_state(c, inhash, hash_desc);
+
+ err = crypto_shash_final(hash_desc, hash);
+ if (err)
+ goto out;
+
+ err = ubifs_hash_calc_hmac(c, hash, auth->hmac);
+ if (err)
+ goto out;
+
+ auth->ch.node_type = UBIFS_AUTH_NODE;
+ ubifs_prepare_node(c, auth, ubifs_auth_node_sz(c), 0);
+
+ err = 0;
+out:
+ kfree(hash);
+
+ return err;
+}
+
+static struct shash_desc *ubifs_get_desc(const struct ubifs_info *c,
+ struct crypto_shash *tfm)
+{
+ struct shash_desc *desc;
+ int err;
+
+ if (!ubifs_authenticated(c))
+ return NULL;
+
+ desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(tfm), GFP_KERNEL);
+ if (!desc)
+ return ERR_PTR(-ENOMEM);
+
+ desc->tfm = tfm;
+ desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+
+ err = crypto_shash_init(desc);
+ if (err) {
+ kfree(desc);
+ return ERR_PTR(err);
+ }
+
+ return desc;
+}
+
+/**
+ * __ubifs_hash_get_desc - get a descriptor suitable for hashing a node
+ * @c: UBIFS file-system description object
+ *
+ * This function returns a descriptor suitable for hashing a node. Free after use
+ * with kfree.
+ */
+struct shash_desc *__ubifs_hash_get_desc(const struct ubifs_info *c)
+{
+ return ubifs_get_desc(c, c->hash_tfm);
+}
+
+/**
+ * __ubifs_shash_final - finalize shash
+ * @c: UBIFS file-system description object
+ * @desc: the descriptor
+ * @out: the output hash
+ *
+ * Simple wrapper around crypto_shash_final(), safe to be called with
+ * disabled authentication.
+ */
+int __ubifs_shash_final(const struct ubifs_info *c, struct shash_desc *desc,
+ u8 *out)
+{
+ if (ubifs_authenticated(c))
+ return crypto_shash_final(desc, out);
+
+ return 0;
+}
+
+/**
+ * ubifs_bad_hash - Report hash mismatches
+ * @c: UBIFS file-system description object
+ * @node: the node
+ * @hash: the expected hash
+ * @lnum: the LEB @node was read from
+ * @offs: offset in LEB @node was read from
+ *
+ * This function reports a hash mismatch when a node has a different hash than
+ * expected.
+ */
+void ubifs_bad_hash(const struct ubifs_info *c, const void *node, const u8 *hash,
+ int lnum, int offs)
+{
+ int len = min(c->hash_len, 20);
+ int cropped = len != c->hash_len;
+ const char *cont = cropped ? "..." : "";
+
+ u8 calc[UBIFS_HASH_ARR_SZ];
+
+ __ubifs_node_calc_hash(c, node, calc);
+
+ ubifs_err(c, "hash mismatch on node at LEB %d:%d", lnum, offs);
+ ubifs_err(c, "hash expected: %*ph%s", len, hash, cont);
+ ubifs_err(c, "hash calculated: %*ph%s", len, calc, cont);
+}
+
+/**
+ * __ubifs_node_check_hash - check the hash of a node against given hash
+ * @c: UBIFS file-system description object
+ * @node: the node
+ * @expected: the expected hash
+ *
+ * This function calculates a hash over a node and compares it to the given hash.
+ * Returns 0 if both hashes are equal or authentication is disabled, otherwise a
+ * negative error code is returned.
+ */
+int __ubifs_node_check_hash(const struct ubifs_info *c, const void *node,
+ const u8 *expected)
+{
+ u8 calc[UBIFS_HASH_ARR_SZ];
+ int err;
+
+ err = __ubifs_node_calc_hash(c, node, calc);
+ if (err)
+ return err;
+
+ if (ubifs_check_hash(c, expected, calc))
+ return -EPERM;
+
+ return 0;
+}
+
+/**
+ * ubifs_init_authentication - initialize UBIFS authentication support
+ * @c: UBIFS file-system description object
+ *
+ * This function returns 0 for success or a negative error code otherwise.
+ */
+int ubifs_init_authentication(struct ubifs_info *c)
+{
+ struct key *keyring_key;
+ const struct user_key_payload *ukp;
+ int err;
+ char hmac_name[CRYPTO_MAX_ALG_NAME];
+
+ if (!c->auth_hash_name) {
+ ubifs_err(c, "authentication hash name needed with authentication");
+ return -EINVAL;
+ }
+
+ c->auth_hash_algo = match_string(hash_algo_name, HASH_ALGO__LAST,
+ c->auth_hash_name);
+ if ((int)c->auth_hash_algo < 0) {
+ ubifs_err(c, "Unknown hash algo %s specified",
+ c->auth_hash_name);
+ return -EINVAL;
+ }
+
+ snprintf(hmac_name, CRYPTO_MAX_ALG_NAME, "hmac(%s)",
+ c->auth_hash_name);
+
+ keyring_key = request_key(&key_type_logon, c->auth_key_name, NULL);
+
+ if (IS_ERR(keyring_key)) {
+ ubifs_err(c, "Failed to request key: %ld",
+ PTR_ERR(keyring_key));
+ return PTR_ERR(keyring_key);
+ }
+
+ down_read(&keyring_key->sem);
+
+ if (keyring_key->type != &key_type_logon) {
+ ubifs_err(c, "key type must be logon");
+ err = -ENOKEY;
+ goto out;
+ }
+
+ ukp = user_key_payload_locked(keyring_key);
+ if (!ukp) {
+ /* key was revoked before we acquired its semaphore */
+ err = -EKEYREVOKED;
+ goto out;
+ }
+
+ c->hash_tfm = crypto_alloc_shash(c->auth_hash_name, 0,
+ CRYPTO_ALG_ASYNC);
+ if (IS_ERR(c->hash_tfm)) {
+ err = PTR_ERR(c->hash_tfm);
+ ubifs_err(c, "Can not allocate %s: %d",
+ c->auth_hash_name, err);
+ goto out;
+ }
+
+ c->hash_len = crypto_shash_digestsize(c->hash_tfm);
+ if (c->hash_len > UBIFS_HASH_ARR_SZ) {
+ ubifs_err(c, "hash %s is bigger than maximum allowed hash size (%d > %d)",
+ c->auth_hash_name, c->hash_len, UBIFS_HASH_ARR_SZ);
+ err = -EINVAL;
+ goto out_free_hash;
+ }
+
+ c->hmac_tfm = crypto_alloc_shash(hmac_name, 0, CRYPTO_ALG_ASYNC);
+ if (IS_ERR(c->hmac_tfm)) {
+ err = PTR_ERR(c->hmac_tfm);
+ ubifs_err(c, "Can not allocate %s: %d", hmac_name, err);
+ goto out_free_hash;
+ }
+
+ c->hmac_desc_len = crypto_shash_digestsize(c->hmac_tfm);
+ if (c->hmac_desc_len > UBIFS_HMAC_ARR_SZ) {
+ ubifs_err(c, "hmac %s is bigger than maximum allowed hmac size (%d > %d)",
+ hmac_name, c->hmac_desc_len, UBIFS_HMAC_ARR_SZ);
+ err = -EINVAL;
+ goto out_free_hash;
+ }
+
+ err = crypto_shash_setkey(c->hmac_tfm, ukp->data, ukp->datalen);
+ if (err)
+ goto out_free_hmac;
+
+ c->authenticated = true;
+
+ c->log_hash = ubifs_hash_get_desc(c);
+ if (IS_ERR(c->log_hash))
+ goto out_free_hmac;
+
+ err = 0;
+
+out_free_hmac:
+ if (err)
+ crypto_free_shash(c->hmac_tfm);
+out_free_hash:
+ if (err)
+ crypto_free_shash(c->hash_tfm);
+out:
+ up_read(&keyring_key->sem);
+ key_put(keyring_key);
+
+ return err;
+}
+
+/**
+ * __ubifs_exit_authentication - release resource
+ * @c: UBIFS file-system description object
+ *
+ * This function releases the authentication related resources.
+ */
+void __ubifs_exit_authentication(struct ubifs_info *c)
+{
+ if (!ubifs_authenticated(c))
+ return;
+
+ crypto_free_shash(c->hmac_tfm);
+ crypto_free_shash(c->hash_tfm);
+ kfree(c->log_hash);
+}
+
+/**
+ * ubifs_node_calc_hmac - calculate the HMAC of a UBIFS node
+ * @c: UBIFS file-system description object
+ * @node: the node to insert a HMAC into.
+ * @len: the length of the node
+ * @ofs_hmac: the offset in the node where the HMAC is inserted
+ * @hmac: returned HMAC
+ *
+ * This function calculates a HMAC of a UBIFS node. The HMAC is expected to be
+ * embedded into the node, so this area is not covered by the HMAC. Also not
+ * covered is the UBIFS_NODE_MAGIC and the CRC of the node.
+ */
+static int ubifs_node_calc_hmac(const struct ubifs_info *c, const void *node,
+ int len, int ofs_hmac, void *hmac)
+{
+ SHASH_DESC_ON_STACK(shash, c->hmac_tfm);
+ int hmac_len = c->hmac_desc_len;
+ int err;
+
+ ubifs_assert(c, ofs_hmac > 8);
+ ubifs_assert(c, ofs_hmac + hmac_len < len);
+
+ shash->tfm = c->hmac_tfm;
+ shash->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+
+ err = crypto_shash_init(shash);
+ if (err)
+ return err;
+
+ /* behind common node header CRC up to HMAC begin */
+ err = crypto_shash_update(shash, node + 8, ofs_hmac - 8);
+ if (err < 0)
+ return err;
+
+ /* behind HMAC, if any */
+ if (len - ofs_hmac - hmac_len > 0) {
+ err = crypto_shash_update(shash, node + ofs_hmac + hmac_len,
+ len - ofs_hmac - hmac_len);
+ if (err < 0)
+ return err;
+ }
+
+ return crypto_shash_final(shash, hmac);
+}
+
+/**
+ * __ubifs_node_insert_hmac - insert a HMAC into a UBIFS node
+ * @c: UBIFS file-system description object
+ * @node: the node to insert a HMAC into.
+ * @len: the length of the node
+ * @ofs_hmac: the offset in the node where the HMAC is inserted
+ *
+ * This function inserts a HMAC at offset @ofs_hmac into the node given in
+ * @node.
+ *
+ * This function returns 0 for success or a negative error code otherwise.
+ */
+int __ubifs_node_insert_hmac(const struct ubifs_info *c, void *node, int len,
+ int ofs_hmac)
+{
+ return ubifs_node_calc_hmac(c, node, len, ofs_hmac, node + ofs_hmac);
+}
+
+/**
+ * __ubifs_node_verify_hmac - verify the HMAC of UBIFS node
+ * @c: UBIFS file-system description object
+ * @node: the node to insert a HMAC into.
+ * @len: the length of the node
+ * @ofs_hmac: the offset in the node where the HMAC is inserted
+ *
+ * This function verifies the HMAC at offset @ofs_hmac of the node given in
+ * @node. Returns 0 if successful or a negative error code otherwise.
+ */
+int __ubifs_node_verify_hmac(const struct ubifs_info *c, const void *node,
+ int len, int ofs_hmac)
+{
+ int hmac_len = c->hmac_desc_len;
+ u8 *hmac;
+ int err;
+
+ hmac = kmalloc(hmac_len, GFP_NOFS);
+ if (!hmac)
+ return -ENOMEM;
+
+ err = ubifs_node_calc_hmac(c, node, len, ofs_hmac, hmac);
+ if (err)
+ return err;
+
+ err = crypto_memneq(hmac, node + ofs_hmac, hmac_len);
+
+ kfree(hmac);
+
+ if (!err)
+ return 0;
+
+ return -EPERM;
+}
+
+int __ubifs_shash_copy_state(const struct ubifs_info *c, struct shash_desc *src,
+ struct shash_desc *target)
+{
+ u8 *state;
+ int err;
+
+ state = kmalloc(crypto_shash_descsize(src->tfm), GFP_NOFS);
+ if (!state)
+ return -ENOMEM;
+
+ err = crypto_shash_export(src, state);
+ if (err)
+ goto out;
+
+ err = crypto_shash_import(target, state);
+
+out:
+ kfree(state);
+
+ return err;
+}
+
+/**
+ * ubifs_hmac_wkm - Create a HMAC of the well known message
+ * @c: UBIFS file-system description object
+ * @hmac: The HMAC of the well known message
+ *
+ * This function creates a HMAC of a well known message. This is used
+ * to check if the provided key is suitable to authenticate a UBIFS
+ * image. This is only a convenience to the user to provide a better
+ * error message when the wrong key is provided.
+ *
+ * This function returns 0 for success or a negative error code otherwise.
+ */
+int ubifs_hmac_wkm(struct ubifs_info *c, u8 *hmac)
+{
+ SHASH_DESC_ON_STACK(shash, c->hmac_tfm);
+ int err;
+ const char well_known_message[] = "UBIFS";
+
+ if (!ubifs_authenticated(c))
+ return 0;
+
+ shash->tfm = c->hmac_tfm;
+ shash->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+
+ err = crypto_shash_init(shash);
+ if (err)
+ return err;
+
+ err = crypto_shash_update(shash, well_known_message,
+ sizeof(well_known_message) - 1);
+ if (err < 0)
+ return err;
+
+ err = crypto_shash_final(shash, hmac);
+ if (err)
+ return err;
+ return 0;
+}
return "commit start node";
case UBIFS_ORPH_NODE:
return "orphan node";
+ case UBIFS_AUTH_NODE:
+ return "auth node";
default:
return "unknown node";
}
(unsigned long long)le64_to_cpu(orph->inos[i]));
break;
}
+ case UBIFS_AUTH_NODE:
+ {
+ break;
+ }
default:
pr_err("node type %d was not recognized\n",
(int)ch->node_type);
snod->type == UBIFS_DATA_NODE ||
snod->type == UBIFS_DENT_NODE ||
snod->type == UBIFS_XENT_NODE ||
- snod->type == UBIFS_TRUN_NODE);
+ snod->type == UBIFS_TRUN_NODE ||
+ snod->type == UBIFS_AUTH_NODE);
if (snod->type != UBIFS_INO_NODE &&
snod->type != UBIFS_DATA_NODE &&
/* Write nodes to their new location. Use the first-fit strategy */
while (1) {
- int avail;
+ int avail, moved = 0;
struct ubifs_scan_node *snod, *tmp;
/* Move data nodes */
list_for_each_entry_safe(snod, tmp, &sleb->nodes, list) {
- avail = c->leb_size - wbuf->offs - wbuf->used;
+ avail = c->leb_size - wbuf->offs - wbuf->used -
+ ubifs_auth_node_sz(c);
if (snod->len > avail)
/*
* Do not skip data nodes in order to optimize
*/
break;
+ err = ubifs_shash_update(c, c->jheads[GCHD].log_hash,
+ snod->node, snod->len);
+ if (err)
+ goto out;
+
err = move_node(c, sleb, snod, wbuf);
if (err)
goto out;
+ moved = 1;
}
/* Move non-data nodes */
list_for_each_entry_safe(snod, tmp, &nondata, list) {
- avail = c->leb_size - wbuf->offs - wbuf->used;
+ avail = c->leb_size - wbuf->offs - wbuf->used -
+ ubifs_auth_node_sz(c);
if (avail < min)
break;
continue;
}
+ err = ubifs_shash_update(c, c->jheads[GCHD].log_hash,
+ snod->node, snod->len);
+ if (err)
+ goto out;
+
err = move_node(c, sleb, snod, wbuf);
if (err)
goto out;
+ moved = 1;
+ }
+
+ if (ubifs_authenticated(c) && moved) {
+ struct ubifs_auth_node *auth;
+
+ auth = kmalloc(ubifs_auth_node_sz(c), GFP_NOFS);
+ if (!auth) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ err = ubifs_prepare_auth_node(c, auth,
+ c->jheads[GCHD].log_hash);
+ if (err) {
+ kfree(auth);
+ goto out;
+ }
+
+ err = ubifs_wbuf_write_nolock(wbuf, auth,
+ ubifs_auth_node_sz(c));
+ if (err) {
+ kfree(auth);
+ goto out;
+ }
+
+ ubifs_add_dirt(c, wbuf->lnum, ubifs_auth_node_sz(c));
}
if (list_empty(&sleb->nodes) && list_empty(&nondata))
return sqnum;
}
-/**
- * ubifs_prepare_node - prepare node to be written to flash.
- * @c: UBIFS file-system description object
- * @node: the node to pad
- * @len: node length
- * @pad: if the buffer has to be padded
- *
- * This function prepares node at @node to be written to the media - it
- * calculates node CRC, fills the common header, and adds proper padding up to
- * the next minimum I/O unit if @pad is not zero.
- */
-void ubifs_prepare_node(struct ubifs_info *c, void *node, int len, int pad)
+void ubifs_init_node(struct ubifs_info *c, void *node, int len, int pad)
{
- uint32_t crc;
struct ubifs_ch *ch = node;
unsigned long long sqnum = next_sqnum(c);
ch->group_type = UBIFS_NO_NODE_GROUP;
ch->sqnum = cpu_to_le64(sqnum);
ch->padding[0] = ch->padding[1] = 0;
- crc = crc32(UBIFS_CRC32_INIT, node + 8, len - 8);
- ch->crc = cpu_to_le32(crc);
if (pad) {
len = ALIGN(len, 8);
}
}
+void ubifs_crc_node(struct ubifs_info *c, void *node, int len)
+{
+ struct ubifs_ch *ch = node;
+ uint32_t crc;
+
+ crc = crc32(UBIFS_CRC32_INIT, node + 8, len - 8);
+ ch->crc = cpu_to_le32(crc);
+}
+
+/**
+ * ubifs_prepare_node_hmac - prepare node to be written to flash.
+ * @c: UBIFS file-system description object
+ * @node: the node to pad
+ * @len: node length
+ * @hmac_offs: offset of the HMAC in the node
+ * @pad: if the buffer has to be padded
+ *
+ * This function prepares node at @node to be written to the media - it
+ * calculates node CRC, fills the common header, and adds proper padding up to
+ * the next minimum I/O unit if @pad is not zero. if @hmac_offs is positive then
+ * a HMAC is inserted into the node at the given offset.
+ *
+ * This function returns 0 for success or a negative error code otherwise.
+ */
+int ubifs_prepare_node_hmac(struct ubifs_info *c, void *node, int len,
+ int hmac_offs, int pad)
+{
+ int err;
+
+ ubifs_init_node(c, node, len, pad);
+
+ if (hmac_offs > 0) {
+ err = ubifs_node_insert_hmac(c, node, len, hmac_offs);
+ if (err)
+ return err;
+ }
+
+ ubifs_crc_node(c, node, len);
+
+ return 0;
+}
+
+/**
+ * ubifs_prepare_node - prepare node to be written to flash.
+ * @c: UBIFS file-system description object
+ * @node: the node to pad
+ * @len: node length
+ * @pad: if the buffer has to be padded
+ *
+ * This function prepares node at @node to be written to the media - it
+ * calculates node CRC, fills the common header, and adds proper padding up to
+ * the next minimum I/O unit if @pad is not zero.
+ */
+void ubifs_prepare_node(struct ubifs_info *c, void *node, int len, int pad)
+{
+ /*
+ * Deliberately ignore return value since this function can only fail
+ * when a hmac offset is given.
+ */
+ ubifs_prepare_node_hmac(c, node, len, 0, pad);
+}
+
/**
* ubifs_prep_grp_node - prepare node of a group to be written to flash.
* @c: UBIFS file-system description object
}
/**
- * ubifs_write_node - write node to the media.
+ * ubifs_write_node_hmac - write node to the media.
* @c: UBIFS file-system description object
* @buf: the node to write
* @len: node length
* @lnum: logical eraseblock number
* @offs: offset within the logical eraseblock
+ * @hmac_offs: offset of the HMAC within the node
*
* This function automatically fills node magic number, assigns sequence
* number, and calculates node CRC checksum. The length of the @buf buffer has
* appends padding node and padding bytes if needed. Returns zero in case of
* success and a negative error code in case of failure.
*/
-int ubifs_write_node(struct ubifs_info *c, void *buf, int len, int lnum,
- int offs)
+int ubifs_write_node_hmac(struct ubifs_info *c, void *buf, int len, int lnum,
+ int offs, int hmac_offs)
{
int err, buf_len = ALIGN(len, c->min_io_size);
if (c->ro_error)
return -EROFS;
- ubifs_prepare_node(c, buf, len, 1);
+ err = ubifs_prepare_node_hmac(c, buf, len, hmac_offs, 1);
+ if (err)
+ return err;
+
err = ubifs_leb_write(c, lnum, buf, offs, buf_len);
if (err)
ubifs_dump_node(c, buf);
return err;
}
+/**
+ * ubifs_write_node - write node to the media.
+ * @c: UBIFS file-system description object
+ * @buf: the node to write
+ * @len: node length
+ * @lnum: logical eraseblock number
+ * @offs: offset within the logical eraseblock
+ *
+ * This function automatically fills node magic number, assigns sequence
+ * number, and calculates node CRC checksum. The length of the @buf buffer has
+ * to be aligned to the minimal I/O unit size. This function automatically
+ * appends padding node and padding bytes if needed. Returns zero in case of
+ * success and a negative error code in case of failure.
+ */
+int ubifs_write_node(struct ubifs_info *c, void *buf, int len, int lnum,
+ int offs)
+{
+ return ubifs_write_node_hmac(c, buf, len, lnum, offs, -1);
+}
+
/**
* ubifs_read_node_wbuf - read node from the media or write-buffer.
* @wbuf: wbuf to check for un-written data
memset(trun->padding, 0, 12);
}
+static void ubifs_add_auth_dirt(struct ubifs_info *c, int lnum)
+{
+ if (ubifs_authenticated(c))
+ ubifs_add_dirt(c, lnum, ubifs_auth_node_sz(c));
+}
+
/**
* reserve_space - reserve space in the journal.
* @c: UBIFS file-system description object
return err;
}
-/**
- * write_node - write node to a journal head.
- * @c: UBIFS file-system description object
- * @jhead: journal head
- * @node: node to write
- * @len: node length
- * @lnum: LEB number written is returned here
- * @offs: offset written is returned here
- *
- * This function writes a node to reserved space of journal head @jhead.
- * Returns zero in case of success and a negative error code in case of
- * failure.
- */
-static int write_node(struct ubifs_info *c, int jhead, void *node, int len,
- int *lnum, int *offs)
+static int ubifs_hash_nodes(struct ubifs_info *c, void *node,
+ int len, struct shash_desc *hash)
{
- struct ubifs_wbuf *wbuf = &c->jheads[jhead].wbuf;
+ int auth_node_size = ubifs_auth_node_sz(c);
+ int err;
- ubifs_assert(c, jhead != GCHD);
+ while (1) {
+ const struct ubifs_ch *ch = node;
+ int nodelen = le32_to_cpu(ch->len);
- *lnum = c->jheads[jhead].wbuf.lnum;
- *offs = c->jheads[jhead].wbuf.offs + c->jheads[jhead].wbuf.used;
+ ubifs_assert(c, len >= auth_node_size);
- dbg_jnl("jhead %s, LEB %d:%d, len %d",
- dbg_jhead(jhead), *lnum, *offs, len);
- ubifs_prepare_node(c, node, len, 0);
+ if (len == auth_node_size)
+ break;
+
+ ubifs_assert(c, len > nodelen);
+ ubifs_assert(c, ch->magic == cpu_to_le32(UBIFS_NODE_MAGIC));
- return ubifs_wbuf_write_nolock(wbuf, node, len);
+ err = ubifs_shash_update(c, hash, (void *)node, nodelen);
+ if (err)
+ return err;
+
+ node += ALIGN(nodelen, 8);
+ len -= ALIGN(nodelen, 8);
+ }
+
+ return ubifs_prepare_auth_node(c, node, hash);
}
/**
* @offs: offset written is returned here
* @sync: non-zero if the write-buffer has to by synchronized
*
- * This function is the same as 'write_node()' but it does not assume the
- * buffer it is writing is a node, so it does not prepare it (which means
- * initializing common header and calculating CRC).
+ * This function writes data to the reserved space of journal head @jhead.
+ * Returns zero in case of success and a negative error code in case of
+ * failure.
*/
static int write_head(struct ubifs_info *c, int jhead, void *buf, int len,
int *lnum, int *offs, int sync)
dbg_jnl("jhead %s, LEB %d:%d, len %d",
dbg_jhead(jhead), *lnum, *offs, len);
+ if (ubifs_authenticated(c)) {
+ err = ubifs_hash_nodes(c, buf, len, c->jheads[jhead].log_hash);
+ if (err)
+ return err;
+ }
+
err = ubifs_wbuf_write_nolock(wbuf, buf, len);
if (err)
return err;
struct ubifs_dent_node *dent;
struct ubifs_ino_node *ino;
union ubifs_key dent_key, ino_key;
+ u8 hash_dent[UBIFS_HASH_ARR_SZ];
+ u8 hash_ino[UBIFS_HASH_ARR_SZ];
+ u8 hash_ino_host[UBIFS_HASH_ARR_SZ];
ubifs_assert(c, mutex_is_locked(&host_ui->ui_mutex));
len = aligned_dlen + aligned_ilen + UBIFS_INO_NODE_SZ;
/* Make sure to also account for extended attributes */
- len += host_ui->data_len;
+ if (ubifs_authenticated(c))
+ len += ALIGN(host_ui->data_len, 8) + ubifs_auth_node_sz(c);
+ else
+ len += host_ui->data_len;
dent = kzalloc(len, GFP_NOFS);
if (!dent)
zero_dent_node_unused(dent);
ubifs_prep_grp_node(c, dent, dlen, 0);
+ err = ubifs_node_calc_hash(c, dent, hash_dent);
+ if (err)
+ goto out_release;
ino = (void *)dent + aligned_dlen;
pack_inode(c, ino, inode, 0);
+ err = ubifs_node_calc_hash(c, ino, hash_ino);
+ if (err)
+ goto out_release;
+
ino = (void *)ino + aligned_ilen;
pack_inode(c, ino, dir, 1);
+ err = ubifs_node_calc_hash(c, ino, hash_ino_host);
+ if (err)
+ goto out_release;
if (last_reference) {
err = ubifs_add_orphan(c, inode->i_ino);
}
release_head(c, BASEHD);
kfree(dent);
+ ubifs_add_auth_dirt(c, lnum);
if (deletion) {
if (nm->hash)
goto out_ro;
err = ubifs_add_dirt(c, lnum, dlen);
} else
- err = ubifs_tnc_add_nm(c, &dent_key, lnum, dent_offs, dlen, nm);
+ err = ubifs_tnc_add_nm(c, &dent_key, lnum, dent_offs, dlen,
+ hash_dent, nm);
if (err)
goto out_ro;
*/
ino_key_init(c, &ino_key, inode->i_ino);
ino_offs = dent_offs + aligned_dlen;
- err = ubifs_tnc_add(c, &ino_key, lnum, ino_offs, ilen);
+ err = ubifs_tnc_add(c, &ino_key, lnum, ino_offs, ilen, hash_ino);
if (err)
goto out_ro;
ino_key_init(c, &ino_key, dir->i_ino);
ino_offs += aligned_ilen;
err = ubifs_tnc_add(c, &ino_key, lnum, ino_offs,
- UBIFS_INO_NODE_SZ + host_ui->data_len);
+ UBIFS_INO_NODE_SZ + host_ui->data_len, hash_ino_host);
if (err)
goto out_ro;
const union ubifs_key *key, const void *buf, int len)
{
struct ubifs_data_node *data;
- int err, lnum, offs, compr_type, out_len, compr_len;
+ int err, lnum, offs, compr_type, out_len, compr_len, auth_len;
int dlen = COMPRESSED_DATA_NODE_BUF_SZ, allocated = 1;
+ int write_len;
struct ubifs_inode *ui = ubifs_inode(inode);
bool encrypted = ubifs_crypt_is_encrypted(inode);
+ u8 hash[UBIFS_HASH_ARR_SZ];
dbg_jnlk(key, "ino %lu, blk %u, len %d, key ",
(unsigned long)key_inum(c, key), key_block(c, key), len);
if (encrypted)
dlen += UBIFS_CIPHER_BLOCK_SIZE;
- data = kmalloc(dlen, GFP_NOFS | __GFP_NOWARN);
+ auth_len = ubifs_auth_node_sz(c);
+
+ data = kmalloc(dlen + auth_len, GFP_NOFS | __GFP_NOWARN);
if (!data) {
/*
* Fall-back to the write reserve buffer. Note, we might be
}
dlen = UBIFS_DATA_NODE_SZ + out_len;
+ if (ubifs_authenticated(c))
+ write_len = ALIGN(dlen, 8) + auth_len;
+ else
+ write_len = dlen;
+
data->compr_type = cpu_to_le16(compr_type);
/* Make reservation before allocating sequence numbers */
- err = make_reservation(c, DATAHD, dlen);
+ err = make_reservation(c, DATAHD, write_len);
if (err)
goto out_free;
- err = write_node(c, DATAHD, data, dlen, &lnum, &offs);
+ ubifs_prepare_node(c, data, dlen, 0);
+ err = write_head(c, DATAHD, data, write_len, &lnum, &offs, 0);
+ if (err)
+ goto out_release;
+
+ err = ubifs_node_calc_hash(c, data, hash);
if (err)
goto out_release;
+
ubifs_wbuf_add_ino_nolock(&c->jheads[DATAHD].wbuf, key_inum(c, key));
release_head(c, DATAHD);
- err = ubifs_tnc_add(c, key, lnum, offs, dlen);
+ ubifs_add_auth_dirt(c, lnum);
+
+ err = ubifs_tnc_add(c, key, lnum, offs, dlen, hash);
if (err)
goto out_ro;
int err, lnum, offs;
struct ubifs_ino_node *ino;
struct ubifs_inode *ui = ubifs_inode(inode);
- int sync = 0, len = UBIFS_INO_NODE_SZ, last_reference = !inode->i_nlink;
+ int sync = 0, write_len, ilen = UBIFS_INO_NODE_SZ;
+ int last_reference = !inode->i_nlink;
+ u8 hash[UBIFS_HASH_ARR_SZ];
dbg_jnl("ino %lu, nlink %u", inode->i_ino, inode->i_nlink);
* need to synchronize the write-buffer either.
*/
if (!last_reference) {
- len += ui->data_len;
+ ilen += ui->data_len;
sync = IS_SYNC(inode);
}
- ino = kmalloc(len, GFP_NOFS);
+
+ if (ubifs_authenticated(c))
+ write_len = ALIGN(ilen, 8) + ubifs_auth_node_sz(c);
+ else
+ write_len = ilen;
+
+ ino = kmalloc(write_len, GFP_NOFS);
if (!ino)
return -ENOMEM;
/* Make reservation before allocating sequence numbers */
- err = make_reservation(c, BASEHD, len);
+ err = make_reservation(c, BASEHD, write_len);
if (err)
goto out_free;
pack_inode(c, ino, inode, 1);
- err = write_head(c, BASEHD, ino, len, &lnum, &offs, sync);
+ err = ubifs_node_calc_hash(c, ino, hash);
+ if (err)
+ goto out_release;
+
+ err = write_head(c, BASEHD, ino, write_len, &lnum, &offs, sync);
if (err)
goto out_release;
if (!sync)
inode->i_ino);
release_head(c, BASEHD);
+ ubifs_add_auth_dirt(c, lnum);
+
if (last_reference) {
err = ubifs_tnc_remove_ino(c, inode->i_ino);
if (err)
goto out_ro;
ubifs_delete_orphan(c, inode->i_ino);
- err = ubifs_add_dirt(c, lnum, len);
+ err = ubifs_add_dirt(c, lnum, ilen);
} else {
union ubifs_key key;
ino_key_init(c, &key, inode->i_ino);
- err = ubifs_tnc_add(c, &key, lnum, offs, len);
+ err = ubifs_tnc_add(c, &key, lnum, offs, ilen, hash);
}
if (err)
goto out_ro;
int aligned_dlen1, aligned_dlen2;
int twoparents = (fst_dir != snd_dir);
void *p;
+ u8 hash_dent1[UBIFS_HASH_ARR_SZ];
+ u8 hash_dent2[UBIFS_HASH_ARR_SZ];
+ u8 hash_p1[UBIFS_HASH_ARR_SZ];
+ u8 hash_p2[UBIFS_HASH_ARR_SZ];
ubifs_assert(c, ubifs_inode(fst_dir)->data_len == 0);
ubifs_assert(c, ubifs_inode(snd_dir)->data_len == 0);
if (twoparents)
len += plen;
+ len += ubifs_auth_node_sz(c);
+
dent1 = kzalloc(len, GFP_NOFS);
if (!dent1)
return -ENOMEM;
set_dent_cookie(c, dent1);
zero_dent_node_unused(dent1);
ubifs_prep_grp_node(c, dent1, dlen1, 0);
+ err = ubifs_node_calc_hash(c, dent1, hash_dent1);
+ if (err)
+ goto out_release;
/* Make new dent for 2nd entry */
dent2 = (void *)dent1 + aligned_dlen1;
set_dent_cookie(c, dent2);
zero_dent_node_unused(dent2);
ubifs_prep_grp_node(c, dent2, dlen2, 0);
+ err = ubifs_node_calc_hash(c, dent2, hash_dent2);
+ if (err)
+ goto out_release;
p = (void *)dent2 + aligned_dlen2;
- if (!twoparents)
+ if (!twoparents) {
pack_inode(c, p, fst_dir, 1);
- else {
+ err = ubifs_node_calc_hash(c, p, hash_p1);
+ if (err)
+ goto out_release;
+ } else {
pack_inode(c, p, fst_dir, 0);
+ err = ubifs_node_calc_hash(c, p, hash_p1);
+ if (err)
+ goto out_release;
p += ALIGN(plen, 8);
pack_inode(c, p, snd_dir, 1);
+ err = ubifs_node_calc_hash(c, p, hash_p2);
+ if (err)
+ goto out_release;
}
err = write_head(c, BASEHD, dent1, len, &lnum, &offs, sync);
}
release_head(c, BASEHD);
+ ubifs_add_auth_dirt(c, lnum);
+
dent_key_init(c, &key, snd_dir->i_ino, snd_nm);
- err = ubifs_tnc_add_nm(c, &key, lnum, offs, dlen1, snd_nm);
+ err = ubifs_tnc_add_nm(c, &key, lnum, offs, dlen1, hash_dent1, snd_nm);
if (err)
goto out_ro;
offs += aligned_dlen1;
dent_key_init(c, &key, fst_dir->i_ino, fst_nm);
- err = ubifs_tnc_add_nm(c, &key, lnum, offs, dlen2, fst_nm);
+ err = ubifs_tnc_add_nm(c, &key, lnum, offs, dlen2, hash_dent2, fst_nm);
if (err)
goto out_ro;
offs += aligned_dlen2;
ino_key_init(c, &key, fst_dir->i_ino);
- err = ubifs_tnc_add(c, &key, lnum, offs, plen);
+ err = ubifs_tnc_add(c, &key, lnum, offs, plen, hash_p1);
if (err)
goto out_ro;
if (twoparents) {
offs += ALIGN(plen, 8);
ino_key_init(c, &key, snd_dir->i_ino);
- err = ubifs_tnc_add(c, &key, lnum, offs, plen);
+ err = ubifs_tnc_add(c, &key, lnum, offs, plen, hash_p2);
if (err)
goto out_ro;
}
int last_reference = !!(new_inode && new_inode->i_nlink == 0);
int move = (old_dir != new_dir);
struct ubifs_inode *uninitialized_var(new_ui);
+ u8 hash_old_dir[UBIFS_HASH_ARR_SZ];
+ u8 hash_new_dir[UBIFS_HASH_ARR_SZ];
+ u8 hash_new_inode[UBIFS_HASH_ARR_SZ];
+ u8 hash_dent1[UBIFS_HASH_ARR_SZ];
+ u8 hash_dent2[UBIFS_HASH_ARR_SZ];
ubifs_assert(c, ubifs_inode(old_dir)->data_len == 0);
ubifs_assert(c, ubifs_inode(new_dir)->data_len == 0);
len = aligned_dlen1 + aligned_dlen2 + ALIGN(ilen, 8) + ALIGN(plen, 8);
if (move)
len += plen;
+
+ len += ubifs_auth_node_sz(c);
+
dent = kzalloc(len, GFP_NOFS);
if (!dent)
return -ENOMEM;
set_dent_cookie(c, dent);
zero_dent_node_unused(dent);
ubifs_prep_grp_node(c, dent, dlen1, 0);
+ err = ubifs_node_calc_hash(c, dent, hash_dent1);
+ if (err)
+ goto out_release;
dent2 = (void *)dent + aligned_dlen1;
dent2->ch.node_type = UBIFS_DENT_NODE;
set_dent_cookie(c, dent2);
zero_dent_node_unused(dent2);
ubifs_prep_grp_node(c, dent2, dlen2, 0);
+ err = ubifs_node_calc_hash(c, dent2, hash_dent2);
+ if (err)
+ goto out_release;
p = (void *)dent2 + aligned_dlen2;
if (new_inode) {
pack_inode(c, p, new_inode, 0);
+ err = ubifs_node_calc_hash(c, p, hash_new_inode);
+ if (err)
+ goto out_release;
+
p += ALIGN(ilen, 8);
}
- if (!move)
+ if (!move) {
pack_inode(c, p, old_dir, 1);
- else {
+ err = ubifs_node_calc_hash(c, p, hash_old_dir);
+ if (err)
+ goto out_release;
+ } else {
pack_inode(c, p, old_dir, 0);
+ err = ubifs_node_calc_hash(c, p, hash_old_dir);
+ if (err)
+ goto out_release;
+
p += ALIGN(plen, 8);
pack_inode(c, p, new_dir, 1);
+ err = ubifs_node_calc_hash(c, p, hash_new_dir);
+ if (err)
+ goto out_release;
}
if (last_reference) {
}
release_head(c, BASEHD);
+ ubifs_add_auth_dirt(c, lnum);
+
dent_key_init(c, &key, new_dir->i_ino, new_nm);
- err = ubifs_tnc_add_nm(c, &key, lnum, offs, dlen1, new_nm);
+ err = ubifs_tnc_add_nm(c, &key, lnum, offs, dlen1, hash_dent1, new_nm);
if (err)
goto out_ro;
offs += aligned_dlen1;
if (whiteout) {
dent_key_init(c, &key, old_dir->i_ino, old_nm);
- err = ubifs_tnc_add_nm(c, &key, lnum, offs, dlen2, old_nm);
+ err = ubifs_tnc_add_nm(c, &key, lnum, offs, dlen2, hash_dent2, old_nm);
if (err)
goto out_ro;
offs += aligned_dlen2;
if (new_inode) {
ino_key_init(c, &key, new_inode->i_ino);
- err = ubifs_tnc_add(c, &key, lnum, offs, ilen);
+ err = ubifs_tnc_add(c, &key, lnum, offs, ilen, hash_new_inode);
if (err)
goto out_ro;
offs += ALIGN(ilen, 8);
}
ino_key_init(c, &key, old_dir->i_ino);
- err = ubifs_tnc_add(c, &key, lnum, offs, plen);
+ err = ubifs_tnc_add(c, &key, lnum, offs, plen, hash_old_dir);
if (err)
goto out_ro;
if (move) {
offs += ALIGN(plen, 8);
ino_key_init(c, &key, new_dir->i_ino);
- err = ubifs_tnc_add(c, &key, lnum, offs, plen);
+ err = ubifs_tnc_add(c, &key, lnum, offs, plen, hash_new_dir);
if (err)
goto out_ro;
}
struct ubifs_inode *ui = ubifs_inode(inode);
ino_t inum = inode->i_ino;
unsigned int blk;
+ u8 hash_ino[UBIFS_HASH_ARR_SZ];
+ u8 hash_dn[UBIFS_HASH_ARR_SZ];
dbg_jnl("ino %lu, size %lld -> %lld",
(unsigned long)inum, old_size, new_size);
sz = UBIFS_TRUN_NODE_SZ + UBIFS_INO_NODE_SZ +
UBIFS_MAX_DATA_NODE_SZ * WORST_COMPR_FACTOR;
+
+ sz += ubifs_auth_node_sz(c);
+
ino = kmalloc(sz, GFP_NOFS);
if (!ino)
return -ENOMEM;
/* Must make reservation before allocating sequence numbers */
len = UBIFS_TRUN_NODE_SZ + UBIFS_INO_NODE_SZ;
- if (dlen)
+
+ if (ubifs_authenticated(c))
+ len += ALIGN(dlen, 8) + ubifs_auth_node_sz(c);
+ else
len += dlen;
+
err = make_reservation(c, BASEHD, len);
if (err)
goto out_free;
pack_inode(c, ino, inode, 0);
+ err = ubifs_node_calc_hash(c, ino, hash_ino);
+ if (err)
+ goto out_release;
+
ubifs_prep_grp_node(c, trun, UBIFS_TRUN_NODE_SZ, dlen ? 0 : 1);
- if (dlen)
+ if (dlen) {
ubifs_prep_grp_node(c, dn, dlen, 1);
+ err = ubifs_node_calc_hash(c, dn, hash_dn);
+ if (err)
+ goto out_release;
+ }
err = write_head(c, BASEHD, ino, len, &lnum, &offs, sync);
if (err)
ubifs_wbuf_add_ino_nolock(&c->jheads[BASEHD].wbuf, inum);
release_head(c, BASEHD);
+ ubifs_add_auth_dirt(c, lnum);
+
if (dlen) {
sz = offs + UBIFS_INO_NODE_SZ + UBIFS_TRUN_NODE_SZ;
- err = ubifs_tnc_add(c, &key, lnum, sz, dlen);
+ err = ubifs_tnc_add(c, &key, lnum, sz, dlen, hash_dn);
if (err)
goto out_ro;
}
ino_key_init(c, &key, inum);
- err = ubifs_tnc_add(c, &key, lnum, offs, UBIFS_INO_NODE_SZ);
+ err = ubifs_tnc_add(c, &key, lnum, offs, UBIFS_INO_NODE_SZ, hash_ino);
if (err)
goto out_ro;
const struct inode *inode,
const struct fscrypt_name *nm)
{
- int err, xlen, hlen, len, lnum, xent_offs, aligned_xlen;
+ int err, xlen, hlen, len, lnum, xent_offs, aligned_xlen, write_len;
struct ubifs_dent_node *xent;
struct ubifs_ino_node *ino;
union ubifs_key xent_key, key1, key2;
int sync = IS_DIRSYNC(host);
struct ubifs_inode *host_ui = ubifs_inode(host);
+ u8 hash[UBIFS_HASH_ARR_SZ];
ubifs_assert(c, inode->i_nlink == 0);
ubifs_assert(c, mutex_is_locked(&host_ui->ui_mutex));
hlen = host_ui->data_len + UBIFS_INO_NODE_SZ;
len = aligned_xlen + UBIFS_INO_NODE_SZ + ALIGN(hlen, 8);
- xent = kzalloc(len, GFP_NOFS);
+ write_len = len + ubifs_auth_node_sz(c);
+
+ xent = kzalloc(write_len, GFP_NOFS);
if (!xent)
return -ENOMEM;
/* Make reservation before allocating sequence numbers */
- err = make_reservation(c, BASEHD, len);
+ err = make_reservation(c, BASEHD, write_len);
if (err) {
kfree(xent);
return err;
pack_inode(c, ino, inode, 0);
ino = (void *)ino + UBIFS_INO_NODE_SZ;
pack_inode(c, ino, host, 1);
+ err = ubifs_node_calc_hash(c, ino, hash);
+ if (err)
+ goto out_release;
- err = write_head(c, BASEHD, xent, len, &lnum, &xent_offs, sync);
+ err = write_head(c, BASEHD, xent, write_len, &lnum, &xent_offs, sync);
if (!sync && !err)
ubifs_wbuf_add_ino_nolock(&c->jheads[BASEHD].wbuf, host->i_ino);
release_head(c, BASEHD);
+
+ ubifs_add_auth_dirt(c, lnum);
kfree(xent);
if (err)
goto out_ro;
/* And update TNC with the new host inode position */
ino_key_init(c, &key1, host->i_ino);
- err = ubifs_tnc_add(c, &key1, lnum, xent_offs + len - hlen, hlen);
+ err = ubifs_tnc_add(c, &key1, lnum, xent_offs + len - hlen, hlen, hash);
if (err)
goto out_ro;
mark_inode_clean(c, host_ui);
return 0;
+out_release:
+ kfree(xent);
+ release_head(c, BASEHD);
out_ro:
ubifs_ro_mode(c, err);
finish_reservation(c);
struct ubifs_ino_node *ino;
union ubifs_key key;
int sync = IS_DIRSYNC(host);
+ u8 hash_host[UBIFS_HASH_ARR_SZ];
+ u8 hash[UBIFS_HASH_ARR_SZ];
dbg_jnl("ino %lu, ino %lu", host->i_ino, inode->i_ino);
ubifs_assert(c, host->i_nlink > 0);
aligned_len1 = ALIGN(len1, 8);
aligned_len = aligned_len1 + ALIGN(len2, 8);
+ aligned_len += ubifs_auth_node_sz(c);
+
ino = kzalloc(aligned_len, GFP_NOFS);
if (!ino)
return -ENOMEM;
goto out_free;
pack_inode(c, ino, host, 0);
+ err = ubifs_node_calc_hash(c, ino, hash_host);
+ if (err)
+ goto out_release;
pack_inode(c, (void *)ino + aligned_len1, inode, 1);
+ err = ubifs_node_calc_hash(c, (void *)ino + aligned_len1, hash);
+ if (err)
+ goto out_release;
err = write_head(c, BASEHD, ino, aligned_len, &lnum, &offs, 0);
if (!sync && !err) {
if (err)
goto out_ro;
+ ubifs_add_auth_dirt(c, lnum);
+
ino_key_init(c, &key, host->i_ino);
- err = ubifs_tnc_add(c, &key, lnum, offs, len1);
+ err = ubifs_tnc_add(c, &key, lnum, offs, len1, hash_host);
if (err)
goto out_ro;
ino_key_init(c, &key, inode->i_ino);
- err = ubifs_tnc_add(c, &key, lnum, offs + aligned_len1, len2);
+ err = ubifs_tnc_add(c, &key, lnum, offs + aligned_len1, len2, hash);
if (err)
goto out_ro;
kfree(ino);
return 0;
+out_release:
+ release_head(c, BASEHD);
out_ro:
ubifs_ro_mode(c, err);
finish_reservation(c);
bud->lnum = lnum;
bud->start = offs;
bud->jhead = jhead;
+ bud->log_hash = NULL;
ref->ch.node_type = UBIFS_REF_NODE;
ref->lnum = cpu_to_le32(bud->lnum);
if (err)
goto out_unlock;
+ err = ubifs_shash_update(c, c->log_hash, ref, UBIFS_REF_NODE_SZ);
+ if (err)
+ goto out_unlock;
+
+ err = ubifs_shash_copy_state(c, c->log_hash, c->jheads[jhead].log_hash);
+ if (err)
+ goto out_unlock;
+
c->lhead_offs += c->ref_node_alsz;
ubifs_add_bud(c, bud);
cs->cmt_no = cpu_to_le64(c->cmt_no);
ubifs_prepare_node(c, cs, UBIFS_CS_NODE_SZ, 0);
+ err = ubifs_shash_init(c, c->log_hash);
+ if (err)
+ goto out;
+
+ err = ubifs_shash_update(c, c->log_hash, cs, UBIFS_CS_NODE_SZ);
+ if (err < 0)
+ goto out;
+
/*
* Note, we do not lock 'c->log_mutex' because this is the commit start
* phase and we are exclusively using the log. And we do not lock
ubifs_prepare_node(c, ref, UBIFS_REF_NODE_SZ, 0);
len += UBIFS_REF_NODE_SZ;
+
+ err = ubifs_shash_update(c, c->log_hash, ref,
+ UBIFS_REF_NODE_SZ);
+ if (err)
+ goto out;
+ ubifs_shash_copy_state(c, c->log_hash, c->jheads[i].log_hash);
}
ubifs_pad(c, buf + len, ALIGN(len, c->min_io_size) - len);
if (err)
return err;
list_del(&bud->list);
+ kfree(bud->log_hash);
kfree(bud);
}
mutex_lock(&c->log_mutex);
* @lpt_first: LEB number of first LPT LEB
* @lpt_lebs: number of LEBs for LPT is passed and returned here
* @big_lpt: use big LPT model is passed and returned here
+ * @hash: hash of the LPT is returned here
*
* This function returns %0 on success and a negative error code on failure.
*/
int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
- int *lpt_lebs, int *big_lpt)
+ int *lpt_lebs, int *big_lpt, u8 *hash)
{
int lnum, err = 0, node_sz, iopos, i, j, cnt, len, alen, row;
int blnum, boffs, bsz, bcnt;
void *buf = NULL, *p;
struct ubifs_lpt_lprops *ltab = NULL;
int *lsave = NULL;
+ struct shash_desc *desc;
err = calc_dflt_lpt_geom(c, main_lebs, big_lpt);
if (err)
/* Needed by 'ubifs_pack_lsave()' */
c->main_first = c->leb_cnt - *main_lebs;
+ desc = ubifs_hash_get_desc(c);
+ if (IS_ERR(desc))
+ return PTR_ERR(desc);
+
lsave = kmalloc_array(c->lsave_cnt, sizeof(int), GFP_KERNEL);
pnode = kzalloc(sizeof(struct ubifs_pnode), GFP_KERNEL);
nnode = kzalloc(sizeof(struct ubifs_nnode), GFP_KERNEL);
/* Add first pnode */
ubifs_pack_pnode(c, p, pnode);
+ err = ubifs_shash_update(c, desc, p, c->pnode_sz);
+ if (err)
+ goto out;
+
p += c->pnode_sz;
len = c->pnode_sz;
pnode->num += 1;
len = 0;
}
ubifs_pack_pnode(c, p, pnode);
+ err = ubifs_shash_update(c, desc, p, c->pnode_sz);
+ if (err)
+ goto out;
+
p += c->pnode_sz;
len += c->pnode_sz;
/*
if (err)
goto out;
+ err = ubifs_shash_final(c, desc, hash);
+ if (err)
+ goto out;
+
c->nhead_lnum = lnum;
c->nhead_offs = ALIGN(len, c->min_io_size);
dbg_lp("LPT lsave is at %d:%d", c->lsave_lnum, c->lsave_offs);
out:
c->ltab = NULL;
+ kfree(desc);
kfree(lsave);
vfree(ltab);
vfree(buf);
}
/**
- * ubifs_lpt_lookup - lookup LEB properties in the LPT.
+ * ubifs_pnode_lookup - lookup a pnode in the LPT.
* @c: UBIFS file-system description object
- * @lnum: LEB number to lookup
+ * @i: pnode number (0 to (main_lebs - 1) / UBIFS_LPT_FANOUT)
*
- * This function returns a pointer to the LEB properties on success or a
- * negative error code on failure.
+ * This function returns a pointer to the pnode on success or a negative
+ * error code on failure.
*/
-struct ubifs_lprops *ubifs_lpt_lookup(struct ubifs_info *c, int lnum)
+struct ubifs_pnode *ubifs_pnode_lookup(struct ubifs_info *c, int i)
{
- int err, i, h, iip, shft;
+ int err, h, iip, shft;
struct ubifs_nnode *nnode;
- struct ubifs_pnode *pnode;
if (!c->nroot) {
err = ubifs_read_nnode(c, NULL, 0);
if (err)
return ERR_PTR(err);
}
+ i <<= UBIFS_LPT_FANOUT_SHIFT;
nnode = c->nroot;
- i = lnum - c->main_first;
shft = c->lpt_hght * UBIFS_LPT_FANOUT_SHIFT;
for (h = 1; h < c->lpt_hght; h++) {
iip = ((i >> shft) & (UBIFS_LPT_FANOUT - 1));
return ERR_CAST(nnode);
}
iip = ((i >> shft) & (UBIFS_LPT_FANOUT - 1));
- pnode = ubifs_get_pnode(c, nnode, iip);
+ return ubifs_get_pnode(c, nnode, iip);
+}
+
+/**
+ * ubifs_lpt_lookup - lookup LEB properties in the LPT.
+ * @c: UBIFS file-system description object
+ * @lnum: LEB number to lookup
+ *
+ * This function returns a pointer to the LEB properties on success or a
+ * negative error code on failure.
+ */
+struct ubifs_lprops *ubifs_lpt_lookup(struct ubifs_info *c, int lnum)
+{
+ int i, iip;
+ struct ubifs_pnode *pnode;
+
+ i = lnum - c->main_first;
+ pnode = ubifs_pnode_lookup(c, i >> UBIFS_LPT_FANOUT_SHIFT);
if (IS_ERR(pnode))
return ERR_CAST(pnode);
iip = (i & (UBIFS_LPT_FANOUT - 1));
return &pnode->lprops[iip];
}
+/**
+ * ubifs_lpt_calc_hash - Calculate hash of the LPT pnodes
+ * @c: UBIFS file-system description object
+ * @hash: the returned hash of the LPT pnodes
+ *
+ * This function iterates over the LPT pnodes and creates a hash over them.
+ * Returns 0 for success or a negative error code otherwise.
+ */
+int ubifs_lpt_calc_hash(struct ubifs_info *c, u8 *hash)
+{
+ struct ubifs_nnode *nnode, *nn;
+ struct ubifs_cnode *cnode;
+ struct shash_desc *desc;
+ int iip = 0, i;
+ int bufsiz = max_t(int, c->nnode_sz, c->pnode_sz);
+ void *buf;
+ int err;
+
+ if (!ubifs_authenticated(c))
+ return 0;
+
+ desc = ubifs_hash_get_desc(c);
+ if (IS_ERR(desc))
+ return PTR_ERR(desc);
+
+ buf = kmalloc(bufsiz, GFP_NOFS);
+ if (!buf) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ if (!c->nroot) {
+ err = ubifs_read_nnode(c, NULL, 0);
+ if (err)
+ return err;
+ }
+
+ cnode = (struct ubifs_cnode *)c->nroot;
+
+ while (cnode) {
+ nnode = cnode->parent;
+ nn = (struct ubifs_nnode *)cnode;
+ if (cnode->level > 1) {
+ while (iip < UBIFS_LPT_FANOUT) {
+ if (nn->nbranch[iip].lnum == 0) {
+ /* Go right */
+ iip++;
+ continue;
+ }
+
+ nnode = ubifs_get_nnode(c, nn, iip);
+ if (IS_ERR(nnode)) {
+ err = PTR_ERR(nnode);
+ goto out;
+ }
+
+ /* Go down */
+ iip = 0;
+ cnode = (struct ubifs_cnode *)nnode;
+ break;
+ }
+ if (iip < UBIFS_LPT_FANOUT)
+ continue;
+ } else {
+ struct ubifs_pnode *pnode;
+
+ for (i = 0; i < UBIFS_LPT_FANOUT; i++) {
+ if (nn->nbranch[i].lnum == 0)
+ continue;
+ pnode = ubifs_get_pnode(c, nn, i);
+ if (IS_ERR(pnode)) {
+ err = PTR_ERR(pnode);
+ goto out;
+ }
+
+ ubifs_pack_pnode(c, buf, pnode);
+ err = ubifs_shash_update(c, desc, buf,
+ c->pnode_sz);
+ if (err)
+ goto out;
+ }
+ }
+ /* Go up and to the right */
+ iip = cnode->iip + 1;
+ cnode = (struct ubifs_cnode *)nnode;
+ }
+
+ err = ubifs_shash_final(c, desc, hash);
+out:
+ kfree(desc);
+ kfree(buf);
+
+ return err;
+}
+
+/**
+ * lpt_check_hash - check the hash of the LPT.
+ * @c: UBIFS file-system description object
+ *
+ * This function calculates a hash over all pnodes in the LPT and compares it with
+ * the hash stored in the master node. Returns %0 on success and a negative error
+ * code on failure.
+ */
+static int lpt_check_hash(struct ubifs_info *c)
+{
+ int err;
+ u8 hash[UBIFS_HASH_ARR_SZ];
+
+ if (!ubifs_authenticated(c))
+ return 0;
+
+ err = ubifs_lpt_calc_hash(c, hash);
+ if (err)
+ return err;
+
+ if (ubifs_check_hash(c, c->mst_node->hash_lpt, hash)) {
+ err = -EPERM;
+ ubifs_err(c, "Failed to authenticate LPT");
+ } else {
+ err = 0;
+ }
+
+ return err;
+}
+
/**
* lpt_init_rd - initialize the LPT for reading.
* @c: UBIFS file-system description object
if (err)
return err;
+ err = lpt_check_hash(c);
+ if (err)
+ return err;
+
dbg_lp("space_bits %d", c->space_bits);
dbg_lp("lpt_lnum_bits %d", c->lpt_lnum_bits);
dbg_lp("lpt_offs_bits %d", c->lpt_offs_bits);
return ubifs_get_pnode(c, nnode, iip);
}
-/**
- * pnode_lookup - lookup a pnode in the LPT.
- * @c: UBIFS file-system description object
- * @i: pnode number (0 to (main_lebs - 1) / UBIFS_LPT_FANOUT))
- *
- * This function returns a pointer to the pnode on success or a negative
- * error code on failure.
- */
-static struct ubifs_pnode *pnode_lookup(struct ubifs_info *c, int i)
-{
- int err, h, iip, shft;
- struct ubifs_nnode *nnode;
-
- if (!c->nroot) {
- err = ubifs_read_nnode(c, NULL, 0);
- if (err)
- return ERR_PTR(err);
- }
- i <<= UBIFS_LPT_FANOUT_SHIFT;
- nnode = c->nroot;
- shft = c->lpt_hght * UBIFS_LPT_FANOUT_SHIFT;
- for (h = 1; h < c->lpt_hght; h++) {
- iip = ((i >> shft) & (UBIFS_LPT_FANOUT - 1));
- shft -= UBIFS_LPT_FANOUT_SHIFT;
- nnode = ubifs_get_nnode(c, nnode, iip);
- if (IS_ERR(nnode))
- return ERR_CAST(nnode);
- }
- iip = ((i >> shft) & (UBIFS_LPT_FANOUT - 1));
- return ubifs_get_pnode(c, nnode, iip);
-}
-
/**
* add_pnode_dirt - add dirty space to LPT LEB properties.
* @c: UBIFS file-system description object
{
struct ubifs_pnode *pnode;
- pnode = pnode_lookup(c, 0);
+ pnode = ubifs_pnode_lookup(c, 0);
if (IS_ERR(pnode))
return PTR_ERR(pnode);
struct ubifs_pnode *pnode;
struct ubifs_nbranch *branch;
- pnode = pnode_lookup(c, node_num);
+ pnode = ubifs_pnode_lookup(c, node_num);
if (IS_ERR(pnode))
return PTR_ERR(pnode);
branch = &pnode->parent->nbranch[pnode->iip];
if (err)
goto out;
+ err = ubifs_lpt_calc_hash(c, c->mst_node->hash_lpt);
+ if (err)
+ goto out;
+
/* Copy the LPT's own lprops for end commit to write */
memcpy(c->ltab_cmt, c->ltab,
sizeof(struct ubifs_lpt_lprops) * c->lpt_lebs);
struct ubifs_nbranch *branch;
cond_resched();
- pnode = pnode_lookup(c, i);
+ pnode = ubifs_pnode_lookup(c, i);
if (IS_ERR(pnode))
return PTR_ERR(pnode);
branch = &pnode->parent->nbranch[pnode->iip];
for (i = 0; i < cnt; i++) {
struct ubifs_pnode *pnode;
- pnode = pnode_lookup(c, i);
+ pnode = ubifs_pnode_lookup(c, i);
if (IS_ERR(pnode))
return PTR_ERR(pnode);
cond_resched();
#include "ubifs.h"
+/**
+ * ubifs_compare_master_node - compare two UBIFS master nodes
+ * @c: UBIFS file-system description object
+ * @m1: the first node
+ * @m2: the second node
+ *
+ * This function compares two UBIFS master nodes. Returns 0 if they are equal
+ * and nonzero if not.
+ */
+int ubifs_compare_master_node(struct ubifs_info *c, void *m1, void *m2)
+{
+ int ret;
+ int behind;
+ int hmac_offs = offsetof(struct ubifs_mst_node, hmac);
+
+ /*
+ * Do not compare the common node header since the sequence number and
+ * hence the CRC are different.
+ */
+ ret = memcmp(m1 + UBIFS_CH_SZ, m2 + UBIFS_CH_SZ,
+ hmac_offs - UBIFS_CH_SZ);
+ if (ret)
+ return ret;
+
+ /*
+ * Do not compare the embedded HMAC aswell which also must be different
+ * due to the different common node header.
+ */
+ behind = hmac_offs + UBIFS_MAX_HMAC_LEN;
+
+ if (UBIFS_MST_NODE_SZ > behind)
+ return memcmp(m1 + behind, m2 + behind, UBIFS_MST_NODE_SZ - behind);
+
+ return 0;
+}
+
/**
* scan_for_master - search the valid master node.
* @c: UBIFS file-system description object
{
struct ubifs_scan_leb *sleb;
struct ubifs_scan_node *snod;
- int lnum, offs = 0, nodes_cnt;
+ int lnum, offs = 0, nodes_cnt, err;
lnum = UBIFS_MST_LNUM;
goto out_dump;
if (snod->offs != offs)
goto out;
- if (memcmp((void *)c->mst_node + UBIFS_CH_SZ,
- (void *)snod->node + UBIFS_CH_SZ,
- UBIFS_MST_NODE_SZ - UBIFS_CH_SZ))
+ if (ubifs_compare_master_node(c, c->mst_node, snod->node))
goto out;
+
c->mst_offs = offs;
ubifs_scan_destroy(sleb);
+
+ if (!ubifs_authenticated(c))
+ return 0;
+
+ err = ubifs_node_verify_hmac(c, c->mst_node,
+ sizeof(struct ubifs_mst_node),
+ offsetof(struct ubifs_mst_node, hmac));
+ if (err) {
+ ubifs_err(c, "Failed to verify master node HMAC");
+ return -EPERM;
+ }
+
return 0;
out:
c->lst.total_dead = le64_to_cpu(c->mst_node->total_dead);
c->lst.total_dark = le64_to_cpu(c->mst_node->total_dark);
+ ubifs_copy_hash(c, c->mst_node->hash_root_idx, c->zroot.hash);
+
c->calc_idx_sz = c->bi.old_idx_sz;
if (c->mst_node->flags & cpu_to_le32(UBIFS_MST_NO_ORPHS))
c->mst_offs = offs;
c->mst_node->highest_inum = cpu_to_le64(c->highest_inum);
- err = ubifs_write_node(c, c->mst_node, len, lnum, offs);
+ ubifs_copy_hash(c, c->zroot.hash, c->mst_node->hash_root_idx);
+ err = ubifs_write_node_hmac(c, c->mst_node, len, lnum, offs,
+ offsetof(struct ubifs_mst_node, hmac));
if (err)
return err;
if (err)
return err;
}
- err = ubifs_write_node(c, c->mst_node, len, lnum, offs);
+ err = ubifs_write_node_hmac(c, c->mst_node, len, lnum, offs,
+ offsetof(struct ubifs_mst_node, hmac));
return err;
}
*/
static inline int ubifs_idx_node_sz(const struct ubifs_info *c, int child_cnt)
{
- return UBIFS_IDX_NODE_SZ + (UBIFS_BRANCH_SZ + c->key_len) * child_cnt;
+ return UBIFS_IDX_NODE_SZ + (UBIFS_BRANCH_SZ + c->key_len + c->hash_len)
+ * child_cnt;
}
/**
int bnum)
{
return (struct ubifs_branch *)((void *)idx->branches +
- (UBIFS_BRANCH_SZ + c->key_len) * bnum);
+ (UBIFS_BRANCH_SZ + c->key_len + c->hash_len) * bnum);
}
/**
save_flags = mst->flags;
mst->flags |= cpu_to_le32(UBIFS_MST_RCVRY);
- ubifs_prepare_node(c, mst, UBIFS_MST_NODE_SZ, 1);
+ err = ubifs_prepare_node_hmac(c, mst, UBIFS_MST_NODE_SZ,
+ offsetof(struct ubifs_mst_node, hmac), 1);
+ if (err)
+ goto out;
err = ubifs_leb_change(c, lnum, mst, sz);
if (err)
goto out;
offs2 = (void *)mst2 - buf2;
if (offs1 == offs2) {
/* Same offset, so must be the same */
- if (memcmp((void *)mst1 + UBIFS_CH_SZ,
- (void *)mst2 + UBIFS_CH_SZ,
- UBIFS_MST_NODE_SZ - UBIFS_CH_SZ))
+ if (ubifs_compare_master_node(c, mst1, mst2))
goto out_err;
mst = mst1;
} else if (offs2 + sz == offs1) {
return err;
}
+/**
+ * inode_fix_size - fix inode size
+ * @c: UBIFS file-system description object
+ * @e: inode size information for recovery
+ */
+static int inode_fix_size(struct ubifs_info *c, struct size_entry *e)
+{
+ struct inode *inode;
+ struct ubifs_inode *ui;
+ int err;
+
+ if (c->ro_mount)
+ ubifs_assert(c, !e->inode);
+
+ if (e->inode) {
+ /* Remounting rw, pick up inode we stored earlier */
+ inode = e->inode;
+ } else {
+ inode = ubifs_iget(c->vfs_sb, e->inum);
+ if (IS_ERR(inode))
+ return PTR_ERR(inode);
+
+ if (inode->i_size >= e->d_size) {
+ /*
+ * The original inode in the index already has a size
+ * big enough, nothing to do
+ */
+ iput(inode);
+ return 0;
+ }
+
+ dbg_rcvry("ino %lu size %lld -> %lld",
+ (unsigned long)e->inum,
+ inode->i_size, e->d_size);
+
+ ui = ubifs_inode(inode);
+
+ inode->i_size = e->d_size;
+ ui->ui_size = e->d_size;
+ ui->synced_i_size = e->d_size;
+
+ e->inode = inode;
+ }
+
+ /*
+ * In readonly mode just keep the inode pinned in memory until we go
+ * readwrite. In readwrite mode write the inode to the journal with the
+ * fixed size.
+ */
+ if (c->ro_mount)
+ return 0;
+
+ err = ubifs_jnl_write_inode(c, inode);
+
+ iput(inode);
+
+ if (err)
+ return err;
+
+ rb_erase(&e->rb, &c->size_tree);
+ kfree(e);
+
+ return 0;
+}
+
/**
* ubifs_recover_size - recover inode size.
* @c: UBIFS file-system description object
+ * @in_place: If true, do a in-place size fixup
*
* This function attempts to fix inode size discrepancies identified by the
* 'ubifs_recover_size_accum()' function.
*
* This functions returns %0 on success and a negative error code on failure.
*/
-int ubifs_recover_size(struct ubifs_info *c)
+int ubifs_recover_size(struct ubifs_info *c, bool in_place)
{
struct rb_node *this = rb_first(&c->size_tree);
int err;
e = rb_entry(this, struct size_entry, rb);
+
+ this = rb_next(this);
+
if (!e->exists) {
union ubifs_key key;
}
if (e->exists && e->i_size < e->d_size) {
- if (c->ro_mount) {
- /* Fix the inode size and pin it in memory */
- struct inode *inode;
- struct ubifs_inode *ui;
-
- ubifs_assert(c, !e->inode);
-
- inode = ubifs_iget(c->vfs_sb, e->inum);
- if (IS_ERR(inode))
- return PTR_ERR(inode);
-
- ui = ubifs_inode(inode);
- if (inode->i_size < e->d_size) {
- dbg_rcvry("ino %lu size %lld -> %lld",
- (unsigned long)e->inum,
- inode->i_size, e->d_size);
- inode->i_size = e->d_size;
- ui->ui_size = e->d_size;
- ui->synced_i_size = e->d_size;
- e->inode = inode;
- this = rb_next(this);
- continue;
- }
- iput(inode);
- } else {
- /* Fix the size in place */
+ ubifs_assert(c, !(c->ro_mount && in_place));
+
+ /*
+ * We found data that is outside the found inode size,
+ * fixup the inode size
+ */
+
+ if (in_place) {
err = fix_size_in_place(c, e);
if (err)
return err;
iput(e->inode);
+ } else {
+ err = inode_fix_size(c, e);
+ if (err)
+ return err;
+ continue;
}
}
- this = rb_next(this);
rb_erase(&e->rb, &c->size_tree);
kfree(e);
}
#include "ubifs.h"
#include <linux/list_sort.h>
+#include <crypto/hash.h>
+#include <crypto/algapi.h>
/**
* struct replay_entry - replay list entry.
int lnum;
int offs;
int len;
+ u8 hash[UBIFS_HASH_ARR_SZ];
unsigned int deletion:1;
unsigned long long sqnum;
struct list_head list;
err = ubifs_tnc_remove_nm(c, &r->key, &r->nm);
else
err = ubifs_tnc_add_nm(c, &r->key, r->lnum, r->offs,
- r->len, &r->nm);
+ r->len, r->hash, &r->nm);
} else {
if (r->deletion)
switch (key_type(c, &r->key)) {
}
else
err = ubifs_tnc_add(c, &r->key, r->lnum, r->offs,
- r->len);
+ r->len, r->hash);
if (err)
return err;
* in case of success and a negative error code in case of failure.
*/
static int insert_node(struct ubifs_info *c, int lnum, int offs, int len,
- union ubifs_key *key, unsigned long long sqnum,
- int deletion, int *used, loff_t old_size,
- loff_t new_size)
+ const u8 *hash, union ubifs_key *key,
+ unsigned long long sqnum, int deletion, int *used,
+ loff_t old_size, loff_t new_size)
{
struct replay_entry *r;
r->lnum = lnum;
r->offs = offs;
r->len = len;
+ ubifs_copy_hash(c, hash, r->hash);
r->deletion = !!deletion;
r->sqnum = sqnum;
key_copy(c, key, &r->key);
* negative error code in case of failure.
*/
static int insert_dent(struct ubifs_info *c, int lnum, int offs, int len,
- union ubifs_key *key, const char *name, int nlen,
- unsigned long long sqnum, int deletion, int *used)
+ const u8 *hash, union ubifs_key *key,
+ const char *name, int nlen, unsigned long long sqnum,
+ int deletion, int *used)
{
struct replay_entry *r;
char *nbuf;
r->lnum = lnum;
r->offs = offs;
r->len = len;
+ ubifs_copy_hash(c, hash, r->hash);
r->deletion = !!deletion;
r->sqnum = sqnum;
key_copy(c, key, &r->key);
return data == 0xFFFFFFFF;
}
+/**
+ * authenticate_sleb - authenticate one scan LEB
+ * @c: UBIFS file-system description object
+ * @sleb: the scan LEB to authenticate
+ * @log_hash:
+ * @is_last: if true, this is is the last LEB
+ *
+ * This function iterates over the buds of a single LEB authenticating all buds
+ * with the authentication nodes on this LEB. Authentication nodes are written
+ * after some buds and contain a HMAC covering the authentication node itself
+ * and the buds between the last authentication node and the current
+ * authentication node. It can happen that the last buds cannot be authenticated
+ * because a powercut happened when some nodes were written but not the
+ * corresponding authentication node. This function returns the number of nodes
+ * that could be authenticated or a negative error code.
+ */
+static int authenticate_sleb(struct ubifs_info *c, struct ubifs_scan_leb *sleb,
+ struct shash_desc *log_hash, int is_last)
+{
+ int n_not_auth = 0;
+ struct ubifs_scan_node *snod;
+ int n_nodes = 0;
+ int err;
+ u8 *hash, *hmac;
+
+ if (!ubifs_authenticated(c))
+ return sleb->nodes_cnt;
+
+ hash = kmalloc(crypto_shash_descsize(c->hash_tfm), GFP_NOFS);
+ hmac = kmalloc(c->hmac_desc_len, GFP_NOFS);
+ if (!hash || !hmac) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ list_for_each_entry(snod, &sleb->nodes, list) {
+
+ n_nodes++;
+
+ if (snod->type == UBIFS_AUTH_NODE) {
+ struct ubifs_auth_node *auth = snod->node;
+ SHASH_DESC_ON_STACK(hash_desc, c->hash_tfm);
+ SHASH_DESC_ON_STACK(hmac_desc, c->hmac_tfm);
+
+ hash_desc->tfm = c->hash_tfm;
+ hash_desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+
+ ubifs_shash_copy_state(c, log_hash, hash_desc);
+ err = crypto_shash_final(hash_desc, hash);
+ if (err)
+ goto out;
+
+ hmac_desc->tfm = c->hmac_tfm;
+ hmac_desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+ err = crypto_shash_digest(hmac_desc, hash, c->hash_len,
+ hmac);
+ if (err)
+ goto out;
+
+ err = ubifs_check_hmac(c, auth->hmac, hmac);
+ if (err) {
+ err = -EPERM;
+ goto out;
+ }
+ n_not_auth = 0;
+ } else {
+ err = crypto_shash_update(log_hash, snod->node,
+ snod->len);
+ if (err)
+ goto out;
+ n_not_auth++;
+ }
+ }
+
+ /*
+ * A powercut can happen when some nodes were written, but not yet
+ * the corresponding authentication node. This may only happen on
+ * the last bud though.
+ */
+ if (n_not_auth) {
+ if (is_last) {
+ dbg_mnt("%d unauthenticated nodes found on LEB %d, Ignoring them",
+ n_not_auth, sleb->lnum);
+ err = 0;
+ } else {
+ dbg_mnt("%d unauthenticated nodes found on non-last LEB %d",
+ n_not_auth, sleb->lnum);
+ err = -EPERM;
+ }
+ } else {
+ err = 0;
+ }
+out:
+ kfree(hash);
+ kfree(hmac);
+
+ return err ? err : n_nodes - n_not_auth;
+}
+
/**
* replay_bud - replay a bud logical eraseblock.
* @c: UBIFS file-system description object
{
int is_last = is_last_bud(c, b->bud);
int err = 0, used = 0, lnum = b->bud->lnum, offs = b->bud->start;
+ int n_nodes, n = 0;
struct ubifs_scan_leb *sleb;
struct ubifs_scan_node *snod;
if (IS_ERR(sleb))
return PTR_ERR(sleb);
+ n_nodes = authenticate_sleb(c, sleb, b->bud->log_hash, is_last);
+ if (n_nodes < 0) {
+ err = n_nodes;
+ goto out;
+ }
+
+ ubifs_shash_copy_state(c, b->bud->log_hash,
+ c->jheads[b->bud->jhead].log_hash);
+
/*
* The bud does not have to start from offset zero - the beginning of
* the 'lnum' LEB may contain previously committed data. One of the
*/
list_for_each_entry(snod, &sleb->nodes, list) {
+ u8 hash[UBIFS_HASH_ARR_SZ];
int deletion = 0;
cond_resched();
goto out_dump;
}
+ ubifs_node_calc_hash(c, snod->node, hash);
+
if (snod->sqnum > c->max_sqnum)
c->max_sqnum = snod->sqnum;
if (le32_to_cpu(ino->nlink) == 0)
deletion = 1;
- err = insert_node(c, lnum, snod->offs, snod->len,
+ err = insert_node(c, lnum, snod->offs, snod->len, hash,
&snod->key, snod->sqnum, deletion,
&used, 0, new_size);
break;
key_block(c, &snod->key) *
UBIFS_BLOCK_SIZE;
- err = insert_node(c, lnum, snod->offs, snod->len,
+ err = insert_node(c, lnum, snod->offs, snod->len, hash,
&snod->key, snod->sqnum, deletion,
&used, 0, new_size);
break;
if (err)
goto out_dump;
- err = insert_dent(c, lnum, snod->offs, snod->len,
+ err = insert_dent(c, lnum, snod->offs, snod->len, hash,
&snod->key, dent->name,
le16_to_cpu(dent->nlen), snod->sqnum,
!le64_to_cpu(dent->inum), &used);
* functions which expect nodes to have keys.
*/
trun_key_init(c, &key, le32_to_cpu(trun->inum));
- err = insert_node(c, lnum, snod->offs, snod->len,
+ err = insert_node(c, lnum, snod->offs, snod->len, hash,
&key, snod->sqnum, 1, &used,
old_size, new_size);
break;
}
+ case UBIFS_AUTH_NODE:
+ break;
default:
ubifs_err(c, "unexpected node type %d in bud LEB %d:%d",
snod->type, lnum, snod->offs);
}
if (err)
goto out;
+
+ n++;
+ if (n == n_nodes)
+ break;
}
ubifs_assert(c, ubifs_search_bud(c, lnum));
{
struct ubifs_bud *bud;
struct bud_entry *b;
+ int err;
dbg_mnt("add replay bud LEB %d:%d, head %d", lnum, offs, jhead);
b = kmalloc(sizeof(struct bud_entry), GFP_KERNEL);
if (!b) {
- kfree(bud);
- return -ENOMEM;
+ err = -ENOMEM;
+ goto out;
}
bud->lnum = lnum;
bud->start = offs;
bud->jhead = jhead;
+ bud->log_hash = ubifs_hash_get_desc(c);
+ if (IS_ERR(bud->log_hash)) {
+ err = PTR_ERR(bud->log_hash);
+ goto out;
+ }
+
+ ubifs_shash_copy_state(c, c->log_hash, bud->log_hash);
+
ubifs_add_bud(c, bud);
b->bud = bud;
list_add_tail(&b->list, &c->replay_buds);
return 0;
+out:
+ kfree(bud);
+ kfree(b);
+
+ return err;
}
/**
c->cs_sqnum = le64_to_cpu(node->ch.sqnum);
dbg_mnt("commit start sqnum %llu", c->cs_sqnum);
+
+ err = ubifs_shash_init(c, c->log_hash);
+ if (err)
+ goto out;
+
+ err = ubifs_shash_update(c, c->log_hash, node, UBIFS_CS_NODE_SZ);
+ if (err < 0)
+ goto out;
}
if (snod->sqnum < c->cs_sqnum) {
if (err)
goto out_dump;
+ err = ubifs_shash_update(c, c->log_hash, ref,
+ UBIFS_REF_NODE_SZ);
+ if (err)
+ goto out;
+
err = add_replay_bud(c, le32_to_cpu(ref->lnum),
le32_to_cpu(ref->offs),
le32_to_cpu(ref->jhead),
int err, tmp, jnl_lebs, log_lebs, max_buds, main_lebs, main_first;
int lpt_lebs, lpt_first, orph_lebs, big_lpt, ino_waste, sup_flags = 0;
int min_leb_cnt = UBIFS_MIN_LEB_CNT;
+ int idx_node_size;
long long tmp64, main_bytes;
__le64 tmp_le64;
__le32 tmp_le32;
struct timespec64 ts;
+ u8 hash[UBIFS_HASH_ARR_SZ];
+ u8 hash_lpt[UBIFS_HASH_ARR_SZ];
/* Some functions called from here depend on the @c->key_len filed */
c->key_len = UBIFS_SK_LEN;
c->lsave_cnt = DEFAULT_LSAVE_CNT;
c->max_leb_cnt = c->leb_cnt;
err = ubifs_create_dflt_lpt(c, &main_lebs, lpt_first, &lpt_lebs,
- &big_lpt);
+ &big_lpt, hash_lpt);
if (err)
return err;
main_first = c->leb_cnt - main_lebs;
+ sup = kzalloc(ALIGN(UBIFS_SB_NODE_SZ, c->min_io_size), GFP_KERNEL);
+ mst = kzalloc(c->mst_node_alsz, GFP_KERNEL);
+ idx_node_size = ubifs_idx_node_sz(c, 1);
+ idx = kzalloc(ALIGN(tmp, c->min_io_size), GFP_KERNEL);
+ ino = kzalloc(ALIGN(UBIFS_INO_NODE_SZ, c->min_io_size), GFP_KERNEL);
+ cs = kzalloc(ALIGN(UBIFS_CS_NODE_SZ, c->min_io_size), GFP_KERNEL);
+
+ if (!sup || !mst || !idx || !ino || !cs) {
+ err = -ENOMEM;
+ goto out;
+ }
+
/* Create default superblock */
- tmp = ALIGN(UBIFS_SB_NODE_SZ, c->min_io_size);
- sup = kzalloc(tmp, GFP_KERNEL);
- if (!sup)
- return -ENOMEM;
tmp64 = (long long)max_buds * c->leb_size;
if (big_lpt)
sup_flags |= UBIFS_FLG_BIGLPT;
sup_flags |= UBIFS_FLG_DOUBLE_HASH;
+ if (ubifs_authenticated(c)) {
+ sup_flags |= UBIFS_FLG_AUTHENTICATION;
+ sup->hash_algo = cpu_to_le16(c->auth_hash_algo);
+ err = ubifs_hmac_wkm(c, sup->hmac_wkm);
+ if (err)
+ goto out;
+ } else {
+ sup->hash_algo = 0xffff;
+ }
+
sup->ch.node_type = UBIFS_SB_NODE;
sup->key_hash = UBIFS_KEY_HASH_R5;
sup->flags = cpu_to_le32(sup_flags);
sup->rp_size = cpu_to_le64(tmp64);
sup->ro_compat_version = cpu_to_le32(UBIFS_RO_COMPAT_VERSION);
- err = ubifs_write_node(c, sup, UBIFS_SB_NODE_SZ, 0, 0);
- kfree(sup);
- if (err)
- return err;
-
dbg_gen("default superblock created at LEB 0:0");
/* Create default master node */
- mst = kzalloc(c->mst_node_alsz, GFP_KERNEL);
- if (!mst)
- return -ENOMEM;
mst->ch.node_type = UBIFS_MST_NODE;
mst->log_lnum = cpu_to_le32(UBIFS_LOG_LNUM);
mst->empty_lebs = cpu_to_le32(main_lebs - 2);
mst->idx_lebs = cpu_to_le32(1);
mst->leb_cnt = cpu_to_le32(c->leb_cnt);
+ ubifs_copy_hash(c, hash_lpt, mst->hash_lpt);
/* Calculate lprops statistics */
tmp64 = main_bytes;
mst->total_used = cpu_to_le64(UBIFS_INO_NODE_SZ);
- err = ubifs_write_node(c, mst, UBIFS_MST_NODE_SZ, UBIFS_MST_LNUM, 0);
- if (err) {
- kfree(mst);
- return err;
- }
- err = ubifs_write_node(c, mst, UBIFS_MST_NODE_SZ, UBIFS_MST_LNUM + 1,
- 0);
- kfree(mst);
- if (err)
- return err;
-
dbg_gen("default master node created at LEB %d:0", UBIFS_MST_LNUM);
/* Create the root indexing node */
- tmp = ubifs_idx_node_sz(c, 1);
- idx = kzalloc(ALIGN(tmp, c->min_io_size), GFP_KERNEL);
- if (!idx)
- return -ENOMEM;
c->key_fmt = UBIFS_SIMPLE_KEY_FMT;
c->key_hash = key_r5_hash;
key_write_idx(c, &key, &br->key);
br->lnum = cpu_to_le32(main_first + DEFAULT_DATA_LEB);
br->len = cpu_to_le32(UBIFS_INO_NODE_SZ);
- err = ubifs_write_node(c, idx, tmp, main_first + DEFAULT_IDX_LEB, 0);
- kfree(idx);
- if (err)
- return err;
dbg_gen("default root indexing node created LEB %d:0",
main_first + DEFAULT_IDX_LEB);
/* Create default root inode */
- tmp = ALIGN(UBIFS_INO_NODE_SZ, c->min_io_size);
- ino = kzalloc(tmp, GFP_KERNEL);
- if (!ino)
- return -ENOMEM;
ino_key_init_flash(c, &ino->key, UBIFS_ROOT_INO);
ino->ch.node_type = UBIFS_INO_NODE;
/* Set compression enabled by default */
ino->flags = cpu_to_le32(UBIFS_COMPR_FL);
- err = ubifs_write_node(c, ino, UBIFS_INO_NODE_SZ,
- main_first + DEFAULT_DATA_LEB, 0);
- kfree(ino);
- if (err)
- return err;
-
dbg_gen("root inode created at LEB %d:0",
main_first + DEFAULT_DATA_LEB);
* always the case during normal file-system operation. Write a fake
* commit start node to the log.
*/
- tmp = ALIGN(UBIFS_CS_NODE_SZ, c->min_io_size);
- cs = kzalloc(tmp, GFP_KERNEL);
- if (!cs)
- return -ENOMEM;
cs->ch.node_type = UBIFS_CS_NODE;
+
+ err = ubifs_write_node_hmac(c, sup, UBIFS_SB_NODE_SZ, 0, 0,
+ offsetof(struct ubifs_sb_node, hmac));
+ if (err)
+ goto out;
+
+ err = ubifs_write_node(c, ino, UBIFS_INO_NODE_SZ,
+ main_first + DEFAULT_DATA_LEB, 0);
+ if (err)
+ goto out;
+
+ ubifs_node_calc_hash(c, ino, hash);
+ ubifs_copy_hash(c, hash, ubifs_branch_hash(c, br));
+
+ err = ubifs_write_node(c, idx, idx_node_size, main_first + DEFAULT_IDX_LEB, 0);
+ if (err)
+ goto out;
+
+ ubifs_node_calc_hash(c, idx, hash);
+ ubifs_copy_hash(c, hash, mst->hash_root_idx);
+
+ err = ubifs_write_node_hmac(c, mst, UBIFS_MST_NODE_SZ, UBIFS_MST_LNUM, 0,
+ offsetof(struct ubifs_mst_node, hmac));
+ if (err)
+ goto out;
+
+ err = ubifs_write_node_hmac(c, mst, UBIFS_MST_NODE_SZ, UBIFS_MST_LNUM + 1,
+ 0, offsetof(struct ubifs_mst_node, hmac));
+ if (err)
+ goto out;
+
err = ubifs_write_node(c, cs, UBIFS_CS_NODE_SZ, UBIFS_LOG_LNUM, 0);
- kfree(cs);
if (err)
- return err;
+ goto out;
ubifs_msg(c, "default file-system created");
- return 0;
+
+ err = 0;
+out:
+ kfree(sup);
+ kfree(mst);
+ kfree(idx);
+ kfree(ino);
+ kfree(cs);
+
+ return err;
}
/**
* code. Note, the user of this function is responsible of kfree()'ing the
* returned superblock buffer.
*/
-struct ubifs_sb_node *ubifs_read_sb_node(struct ubifs_info *c)
+static struct ubifs_sb_node *ubifs_read_sb_node(struct ubifs_info *c)
{
struct ubifs_sb_node *sup;
int err;
return sup;
}
+static int authenticate_sb_node(struct ubifs_info *c,
+ const struct ubifs_sb_node *sup)
+{
+ unsigned int sup_flags = le32_to_cpu(sup->flags);
+ u8 hmac_wkm[UBIFS_HMAC_ARR_SZ];
+ int authenticated = !!(sup_flags & UBIFS_FLG_AUTHENTICATION);
+ int hash_algo;
+ int err;
+
+ if (c->authenticated && !authenticated) {
+ ubifs_err(c, "authenticated FS forced, but found FS without authentication");
+ return -EINVAL;
+ }
+
+ if (!c->authenticated && authenticated) {
+ ubifs_err(c, "authenticated FS found, but no key given");
+ return -EINVAL;
+ }
+
+ ubifs_msg(c, "Mounting in %sauthenticated mode",
+ c->authenticated ? "" : "un");
+
+ if (!c->authenticated)
+ return 0;
+
+ if (!IS_ENABLED(CONFIG_UBIFS_FS_AUTHENTICATION))
+ return -EOPNOTSUPP;
+
+ hash_algo = le16_to_cpu(sup->hash_algo);
+ if (hash_algo >= HASH_ALGO__LAST) {
+ ubifs_err(c, "superblock uses unknown hash algo %d",
+ hash_algo);
+ return -EINVAL;
+ }
+
+ if (strcmp(hash_algo_name[hash_algo], c->auth_hash_name)) {
+ ubifs_err(c, "This filesystem uses %s for hashing,"
+ " but %s is specified", hash_algo_name[hash_algo],
+ c->auth_hash_name);
+ return -EINVAL;
+ }
+
+ err = ubifs_hmac_wkm(c, hmac_wkm);
+ if (err)
+ return err;
+
+ if (ubifs_check_hmac(c, hmac_wkm, sup->hmac_wkm)) {
+ ubifs_err(c, "provided key does not fit");
+ return -ENOKEY;
+ }
+
+ err = ubifs_node_verify_hmac(c, sup, sizeof(*sup),
+ offsetof(struct ubifs_sb_node, hmac));
+ if (err)
+ ubifs_err(c, "Failed to authenticate superblock: %d", err);
+
+ return err;
+}
+
/**
* ubifs_write_sb_node - write superblock node.
* @c: UBIFS file-system description object
int ubifs_write_sb_node(struct ubifs_info *c, struct ubifs_sb_node *sup)
{
int len = ALIGN(UBIFS_SB_NODE_SZ, c->min_io_size);
+ int err;
+
+ err = ubifs_prepare_node_hmac(c, sup, UBIFS_SB_NODE_SZ,
+ offsetof(struct ubifs_sb_node, hmac), 1);
+ if (err)
+ return err;
- ubifs_prepare_node(c, sup, UBIFS_SB_NODE_SZ, 1);
return ubifs_leb_change(c, UBIFS_SB_LNUM, sup, len);
}
if (IS_ERR(sup))
return PTR_ERR(sup);
+ c->sup_node = sup;
+
c->fmt_version = le32_to_cpu(sup->fmt_version);
c->ro_compat_version = le32_to_cpu(sup->ro_compat_version);
c->key_hash = key_test_hash;
c->key_hash_type = UBIFS_KEY_HASH_TEST;
break;
- };
+ }
c->key_fmt = sup->key_fmt;
c->double_hash = !!(sup_flags & UBIFS_FLG_DOUBLE_HASH);
c->encrypted = !!(sup_flags & UBIFS_FLG_ENCRYPTION);
+ err = authenticate_sb_node(c, sup);
+ if (err)
+ goto out;
+
if ((sup_flags & ~UBIFS_FLG_MASK) != 0) {
ubifs_err(c, "Unknown feature flags found: %#x",
sup_flags & ~UBIFS_FLG_MASK);
err = validate_sb(c, sup);
out:
- kfree(sup);
return err;
}
int ubifs_fixup_free_space(struct ubifs_info *c)
{
int err;
- struct ubifs_sb_node *sup;
+ struct ubifs_sb_node *sup = c->sup_node;
ubifs_assert(c, c->space_fixup);
ubifs_assert(c, !c->ro_mount);
if (err)
return err;
- sup = ubifs_read_sb_node(c);
- if (IS_ERR(sup))
- return PTR_ERR(sup);
-
/* Free-space fixup is no longer required */
c->space_fixup = 0;
sup->flags &= cpu_to_le32(~UBIFS_FLG_SPACE_FIXUP);
err = ubifs_write_sb_node(c, sup);
- kfree(sup);
if (err)
return err;
int ubifs_enable_encryption(struct ubifs_info *c)
{
int err;
- struct ubifs_sb_node *sup;
+ struct ubifs_sb_node *sup = c->sup_node;
if (c->encrypted)
return 0;
return -EINVAL;
}
- sup = ubifs_read_sb_node(c);
- if (IS_ERR(sup))
- return PTR_ERR(sup);
-
sup->flags |= cpu_to_le32(UBIFS_FLG_ENCRYPTION);
err = ubifs_write_sb_node(c, sup);
if (!err)
c->encrypted = 1;
- kfree(sup);
return err;
}
c->ranges[UBIFS_REF_NODE].len = UBIFS_REF_NODE_SZ;
c->ranges[UBIFS_TRUN_NODE].len = UBIFS_TRUN_NODE_SZ;
c->ranges[UBIFS_CS_NODE].len = UBIFS_CS_NODE_SZ;
+ c->ranges[UBIFS_AUTH_NODE].min_len = UBIFS_AUTH_NODE_SZ;
+ c->ranges[UBIFS_AUTH_NODE].max_len = UBIFS_AUTH_NODE_SZ +
+ UBIFS_MAX_HMAC_LEN;
c->ranges[UBIFS_INO_NODE].min_len = UBIFS_INO_NODE_SZ;
c->ranges[UBIFS_INO_NODE].max_len = UBIFS_MAX_INO_NODE_SZ;
c->jheads[i].wbuf.sync_callback = &bud_wbuf_callback;
c->jheads[i].wbuf.jhead = i;
c->jheads[i].grouped = 1;
+ c->jheads[i].log_hash = ubifs_hash_get_desc(c);
+ if (IS_ERR(c->jheads[i].log_hash))
+ goto out;
}
/*
c->jheads[GCHD].grouped = 0;
return 0;
+
+out:
+ while (i--)
+ kfree(c->jheads[i].log_hash);
+
+ return err;
}
/**
for (i = 0; i < c->jhead_cnt; i++) {
kfree(c->jheads[i].wbuf.buf);
kfree(c->jheads[i].wbuf.inodes);
+ kfree(c->jheads[i].log_hash);
}
kfree(c->jheads);
c->jheads = NULL;
* Opt_no_chk_data_crc: do not check CRCs when reading data nodes
* Opt_override_compr: override default compressor
* Opt_assert: set ubifs_assert() action
+ * Opt_auth_key: The key name used for authentication
+ * Opt_auth_hash_name: The hash type used for authentication
* Opt_err: just end of array marker
*/
enum {
Opt_no_chk_data_crc,
Opt_override_compr,
Opt_assert,
+ Opt_auth_key,
+ Opt_auth_hash_name,
Opt_ignore,
Opt_err,
};
{Opt_chk_data_crc, "chk_data_crc"},
{Opt_no_chk_data_crc, "no_chk_data_crc"},
{Opt_override_compr, "compr=%s"},
+ {Opt_auth_key, "auth_key=%s"},
+ {Opt_auth_hash_name, "auth_hash_name=%s"},
{Opt_ignore, "ubi=%s"},
{Opt_ignore, "vol=%s"},
{Opt_assert, "assert=%s"},
kfree(act);
break;
}
+ case Opt_auth_key:
+ c->auth_key_name = kstrdup(args[0].from, GFP_KERNEL);
+ if (!c->auth_key_name)
+ return -ENOMEM;
+ break;
+ case Opt_auth_hash_name:
+ c->auth_hash_name = kstrdup(args[0].from, GFP_KERNEL);
+ if (!c->auth_hash_name)
+ return -ENOMEM;
+ break;
case Opt_ignore:
break;
default:
c->mounting = 1;
+ if (c->auth_key_name) {
+ if (IS_ENABLED(CONFIG_UBIFS_FS_AUTHENTICATION)) {
+ err = ubifs_init_authentication(c);
+ if (err)
+ goto out_free;
+ } else {
+ ubifs_err(c, "auth_key_name, but UBIFS is built without"
+ " authentication support");
+ err = -EINVAL;
+ goto out_free;
+ }
+ }
+
err = ubifs_read_superblock(c);
if (err)
goto out_free;
}
if (c->need_recovery) {
- err = ubifs_recover_size(c);
- if (err)
- goto out_orphans;
+ if (!ubifs_authenticated(c)) {
+ err = ubifs_recover_size(c, true);
+ if (err)
+ goto out_orphans;
+ }
+
err = ubifs_rcvry_gc_commit(c);
if (err)
goto out_orphans;
+
+ if (ubifs_authenticated(c)) {
+ err = ubifs_recover_size(c, false);
+ if (err)
+ goto out_orphans;
+ }
} else {
err = take_gc_lnum(c);
if (err)
if (err)
goto out_orphans;
} else if (c->need_recovery) {
- err = ubifs_recover_size(c);
+ err = ubifs_recover_size(c, false);
if (err)
goto out_orphans;
} else {
free_wbufs(c);
free_orphans(c);
ubifs_lpt_free(c, 0);
+ ubifs_exit_authentication(c);
+ kfree(c->auth_key_name);
+ kfree(c->auth_hash_name);
kfree(c->cbuf);
kfree(c->rcvrd_mst_node);
kfree(c->mst_node);
goto out;
if (c->old_leb_cnt != c->leb_cnt) {
- struct ubifs_sb_node *sup;
+ struct ubifs_sb_node *sup = c->sup_node;
- sup = ubifs_read_sb_node(c);
- if (IS_ERR(sup)) {
- err = PTR_ERR(sup);
- goto out;
- }
sup->leb_cnt = cpu_to_le32(c->leb_cnt);
err = ubifs_write_sb_node(c, sup);
- kfree(sup);
if (err)
goto out;
}
err = ubifs_write_rcvrd_mst_node(c);
if (err)
goto out;
- err = ubifs_recover_size(c);
- if (err)
- goto out;
+ if (!ubifs_authenticated(c)) {
+ err = ubifs_recover_size(c, true);
+ if (err)
+ goto out;
+ }
err = ubifs_clean_lebs(c, c->sbuf);
if (err)
goto out;
goto out;
}
- if (c->need_recovery)
+ if (c->need_recovery) {
err = ubifs_rcvry_gc_commit(c);
- else
+ if (err)
+ goto out;
+
+ if (ubifs_authenticated(c)) {
+ err = ubifs_recover_size(c, false);
+ if (err)
+ goto out;
+ }
+ } else {
err = ubifs_leb_unmap(c, c->gc_lnum);
+ }
if (err)
goto out;
#include "ubifs.h"
static int try_read_node(const struct ubifs_info *c, void *buf, int type,
- int len, int lnum, int offs);
+ struct ubifs_zbranch *zbr);
static int fallible_read_node(struct ubifs_info *c, const union ubifs_key *key,
struct ubifs_zbranch *zbr, void *node);
* @c: UBIFS file-system description object
* @buf: buffer to read to
* @type: node type
- * @len: node length (not aligned)
- * @lnum: LEB number of node to read
- * @offs: offset of node to read
+ * @zbr: the zbranch describing the node to read
*
* This function tries to read a node of known type and length, checks it and
* stores it in @buf. This function returns %1 if a node is present and %0 if
* journal nodes may potentially be corrupted, so checking is required.
*/
static int try_read_node(const struct ubifs_info *c, void *buf, int type,
- int len, int lnum, int offs)
+ struct ubifs_zbranch *zbr)
{
+ int len = zbr->len;
+ int lnum = zbr->lnum;
+ int offs = zbr->offs;
int err, node_len;
struct ubifs_ch *ch = buf;
uint32_t crc, node_crc;
if (crc != node_crc)
return 0;
+ err = ubifs_node_check_hash(c, buf, zbr->hash);
+ if (err) {
+ ubifs_bad_hash(c, buf, zbr->hash, lnum, offs);
+ return 0;
+ }
+
return 1;
}
dbg_tnck(key, "LEB %d:%d, key ", zbr->lnum, zbr->offs);
- ret = try_read_node(c, node, key_type(c, key), zbr->len, zbr->lnum,
- zbr->offs);
+ ret = try_read_node(c, node, key_type(c, key), zbr);
if (ret == 1) {
union ubifs_key node_key;
struct ubifs_dent_node *dent = node;
goto out;
}
+ err = ubifs_node_check_hash(c, buf, zbr->hash);
+ if (err) {
+ ubifs_bad_hash(c, buf, zbr->hash, zbr->lnum, zbr->offs);
+ return err;
+ }
+
len = le32_to_cpu(ch->len);
if (len != zbr->len) {
ubifs_err(c, "bad node length %d, expected %d", len, zbr->len);
* @lnum: LEB number of node
* @offs: node offset
* @len: node length
+ * @hash: The hash over the node
*
* This function adds a node with key @key to TNC. The node may be new or it may
* obsolete some existing one. Returns %0 on success or negative error code on
* failure.
*/
int ubifs_tnc_add(struct ubifs_info *c, const union ubifs_key *key, int lnum,
- int offs, int len)
+ int offs, int len, const u8 *hash)
{
int found, n, err = 0;
struct ubifs_znode *znode;
zbr.lnum = lnum;
zbr.offs = offs;
zbr.len = len;
+ ubifs_copy_hash(c, hash, zbr.hash);
key_copy(c, key, &zbr.key);
err = tnc_insert(c, znode, &zbr, n + 1);
} else if (found == 1) {
zbr->lnum = lnum;
zbr->offs = offs;
zbr->len = len;
+ ubifs_copy_hash(c, hash, zbr->hash);
} else
err = found;
if (!err)
* @lnum: LEB number of node
* @offs: node offset
* @len: node length
+ * @hash: The hash over the node
* @nm: node name
*
* This is the same as 'ubifs_tnc_add()' but it should be used with keys which
* may have collisions, like directory entry keys.
*/
int ubifs_tnc_add_nm(struct ubifs_info *c, const union ubifs_key *key,
- int lnum, int offs, int len,
+ int lnum, int offs, int len, const u8 *hash,
const struct fscrypt_name *nm)
{
int found, n, err = 0;
zbr->lnum = lnum;
zbr->offs = offs;
zbr->len = len;
+ ubifs_copy_hash(c, hash, zbr->hash);
goto out_unlock;
}
}
zbr.lnum = lnum;
zbr.offs = offs;
zbr.len = len;
+ ubifs_copy_hash(c, hash, zbr.hash);
key_copy(c, key, &zbr.key);
err = tnc_insert(c, znode, &zbr, n + 1);
if (err)
struct ubifs_znode *znode, int lnum, int offs, int len)
{
struct ubifs_znode *zp;
+ u8 hash[UBIFS_HASH_ARR_SZ];
int i, err;
/* Make index node */
br->lnum = cpu_to_le32(zbr->lnum);
br->offs = cpu_to_le32(zbr->offs);
br->len = cpu_to_le32(zbr->len);
+ ubifs_copy_hash(c, zbr->hash, ubifs_branch_hash(c, br));
if (!zbr->lnum || !zbr->len) {
ubifs_err(c, "bad ref in znode");
ubifs_dump_znode(c, znode);
}
}
ubifs_prepare_node(c, idx, len, 0);
+ ubifs_node_calc_hash(c, idx, hash);
znode->lnum = lnum;
znode->offs = offs;
zbr->lnum = lnum;
zbr->offs = offs;
zbr->len = len;
+ ubifs_copy_hash(c, hash, zbr->hash);
} else {
c->zroot.lnum = lnum;
c->zroot.offs = offs;
c->zroot.len = len;
+ ubifs_copy_hash(c, hash, c->zroot.hash);
}
c->calc_idx_sz += ALIGN(len, 8);
znode->cnext = c->cnext;
break;
}
+ znode->cparent = znode->parent;
+ znode->ciip = znode->iip;
znode->cnext = cnext;
znode = cnext;
cnt += 1;
}
while (1) {
+ u8 hash[UBIFS_HASH_ARR_SZ];
+
cond_resched();
znode = cnext;
br->lnum = cpu_to_le32(zbr->lnum);
br->offs = cpu_to_le32(zbr->offs);
br->len = cpu_to_le32(zbr->len);
+ ubifs_copy_hash(c, zbr->hash, ubifs_branch_hash(c, br));
if (!zbr->lnum || !zbr->len) {
ubifs_err(c, "bad ref in znode");
ubifs_dump_znode(c, znode);
}
len = ubifs_idx_node_sz(c, znode->child_cnt);
ubifs_prepare_node(c, idx, len, 0);
+ ubifs_node_calc_hash(c, idx, hash);
+
+ mutex_lock(&c->tnc_mutex);
+
+ if (znode->cparent)
+ ubifs_copy_hash(c, hash,
+ znode->cparent->zbranch[znode->ciip].hash);
+
+ if (znode->parent) {
+ if (!ubifs_zn_obsolete(znode))
+ ubifs_copy_hash(c, hash,
+ znode->parent->zbranch[znode->iip].hash);
+ } else {
+ ubifs_copy_hash(c, hash, c->zroot.hash);
+ }
+
+ mutex_unlock(&c->tnc_mutex);
/* Determine the index node position */
if (lnum == -1) {
/**
* read_znode - read an indexing node from flash and fill znode.
* @c: UBIFS file-system description object
- * @lnum: LEB of the indexing node to read
- * @offs: node offset
- * @len: node length
+ * @zzbr: the zbranch describing the node to read
* @znode: znode to read to
*
* This function reads an indexing node from the flash media and fills znode
* is wrong with it, this function prints complaint messages and returns
* %-EINVAL.
*/
-static int read_znode(struct ubifs_info *c, int lnum, int offs, int len,
+static int read_znode(struct ubifs_info *c, struct ubifs_zbranch *zzbr,
struct ubifs_znode *znode)
{
+ int lnum = zzbr->lnum;
+ int offs = zzbr->offs;
+ int len = zzbr->len;
int i, err, type, cmp;
struct ubifs_idx_node *idx;
return err;
}
+ err = ubifs_node_check_hash(c, idx, zzbr->hash);
+ if (err) {
+ ubifs_bad_hash(c, idx, zzbr->hash, lnum, offs);
+ return err;
+ }
+
znode->child_cnt = le16_to_cpu(idx->child_cnt);
znode->level = le16_to_cpu(idx->level);
}
for (i = 0; i < znode->child_cnt; i++) {
- const struct ubifs_branch *br = ubifs_idx_branch(c, idx, i);
+ struct ubifs_branch *br = ubifs_idx_branch(c, idx, i);
struct ubifs_zbranch *zbr = &znode->zbranch[i];
key_read(c, &br->key, &zbr->key);
zbr->lnum = le32_to_cpu(br->lnum);
zbr->offs = le32_to_cpu(br->offs);
zbr->len = le32_to_cpu(br->len);
+ ubifs_copy_hash(c, ubifs_branch_hash(c, br), zbr->hash);
zbr->znode = NULL;
/* Validate branch */
if (!znode)
return ERR_PTR(-ENOMEM);
- err = read_znode(c, zbr->lnum, zbr->offs, zbr->len, znode);
+ err = read_znode(c, zbr, znode);
if (err)
goto out;
return -EINVAL;
}
+ err = ubifs_node_check_hash(c, node, zbr->hash);
+ if (err) {
+ ubifs_bad_hash(c, node, zbr->hash, zbr->lnum, zbr->offs);
+ return err;
+ }
+
return 0;
}
#define UBIFS_IDX_NODE_SZ sizeof(struct ubifs_idx_node)
#define UBIFS_CS_NODE_SZ sizeof(struct ubifs_cs_node)
#define UBIFS_ORPH_NODE_SZ sizeof(struct ubifs_orph_node)
+#define UBIFS_AUTH_NODE_SZ sizeof(struct ubifs_auth_node)
/* Extended attribute entry nodes are identical to directory entry nodes */
#define UBIFS_XENT_NODE_SZ UBIFS_DENT_NODE_SZ
/* Only this does not have to be multiple of 8 bytes */
/* The largest UBIFS node */
#define UBIFS_MAX_NODE_SZ UBIFS_MAX_INO_NODE_SZ
+/* The maxmimum size of a hash, enough for sha512 */
+#define UBIFS_MAX_HASH_LEN 64
+
+/* The maxmimum size of a hmac, enough for hmac(sha512) */
+#define UBIFS_MAX_HMAC_LEN 64
+
/*
* xattr name of UBIFS encryption context, we don't use a prefix
* nor a long name to not waste space on the flash.
* UBIFS_IDX_NODE: index node
* UBIFS_CS_NODE: commit start node
* UBIFS_ORPH_NODE: orphan node
+ * UBIFS_AUTH_NODE: authentication node
* UBIFS_NODE_TYPES_CNT: count of supported node types
*
* Note, we index arrays by these numbers, so keep them low and contiguous.
UBIFS_IDX_NODE,
UBIFS_CS_NODE,
UBIFS_ORPH_NODE,
+ UBIFS_AUTH_NODE,
UBIFS_NODE_TYPES_CNT,
};
* UBIFS_FLG_DOUBLE_HASH: store a 32bit cookie in directory entry nodes to
* support 64bit cookies for lookups by hash
* UBIFS_FLG_ENCRYPTION: this filesystem contains encrypted files
+ * UBIFS_FLG_AUTHENTICATION: this filesystem contains hashes for authentication
*/
enum {
UBIFS_FLG_BIGLPT = 0x02,
UBIFS_FLG_SPACE_FIXUP = 0x04,
UBIFS_FLG_DOUBLE_HASH = 0x08,
UBIFS_FLG_ENCRYPTION = 0x10,
+ UBIFS_FLG_AUTHENTICATION = 0x20,
};
-#define UBIFS_FLG_MASK (UBIFS_FLG_BIGLPT|UBIFS_FLG_SPACE_FIXUP|UBIFS_FLG_DOUBLE_HASH|UBIFS_FLG_ENCRYPTION)
+#define UBIFS_FLG_MASK (UBIFS_FLG_BIGLPT | UBIFS_FLG_SPACE_FIXUP | \
+ UBIFS_FLG_DOUBLE_HASH | UBIFS_FLG_ENCRYPTION | \
+ UBIFS_FLG_AUTHENTICATION)
/**
* struct ubifs_ch - common header node.
* @time_gran: time granularity in nanoseconds
* @uuid: UUID generated when the file system image was created
* @ro_compat_version: UBIFS R/O compatibility version
+ * @hmac: HMAC to authenticate the superblock node
+ * @hmac_wkm: HMAC of a well known message (the string "UBIFS") as a convenience
+ * to the user to check if the correct key is passed.
+ * @hash_algo: The hash algo used for this filesystem (one of enum hash_algo)
*/
struct ubifs_sb_node {
struct ubifs_ch ch;
__le32 time_gran;
__u8 uuid[16];
__le32 ro_compat_version;
- __u8 padding2[3968];
+ __u8 hmac[UBIFS_MAX_HMAC_LEN];
+ __u8 hmac_wkm[UBIFS_MAX_HMAC_LEN];
+ __le16 hash_algo;
+ __u8 padding2[3838];
} __packed;
/**
* @empty_lebs: number of empty logical eraseblocks
* @idx_lebs: number of indexing logical eraseblocks
* @leb_cnt: count of LEBs used by file-system
+ * @hash_root_idx: the hash of the root index node
+ * @hash_lpt: the hash of the LPT
+ * @hmac: HMAC to authenticate the master node
* @padding: reserved for future, zeroes
*/
struct ubifs_mst_node {
__le32 empty_lebs;
__le32 idx_lebs;
__le32 leb_cnt;
- __u8 padding[344];
+ __u8 hash_root_idx[UBIFS_MAX_HASH_LEN];
+ __u8 hash_lpt[UBIFS_MAX_HASH_LEN];
+ __u8 hmac[UBIFS_MAX_HMAC_LEN];
+ __u8 padding[152];
} __packed;
/**
__u8 padding[28];
} __packed;
+/**
+ * struct ubifs_auth_node - node for authenticating other nodes
+ * @ch: common header
+ * @hmac: The HMAC
+ */
+struct ubifs_auth_node {
+ struct ubifs_ch ch;
+ __u8 hmac[];
+} __packed;
+
/**
* struct ubifs_branch - key/reference/length branch
* @lnum: LEB number of the target node
* @offs: offset within @lnum
* @len: target node length
* @key: key
+ *
+ * In an authenticated UBIFS we have the hash of the referenced node after @key.
+ * This can't be added to the struct type definition because @key is a
+ * dynamically sized element already.
*/
struct ubifs_branch {
__le32 lnum;
#include <linux/security.h>
#include <linux/xattr.h>
#include <linux/random.h>
+#include <crypto/hash_info.h>
+#include <crypto/hash.h>
+#include <crypto/algapi.h>
#define __FS_HAS_ENCRYPTION IS_ENABLED(CONFIG_UBIFS_FS_ENCRYPTION)
#include <linux/fscrypt.h>
/* Maximum number of data nodes to bulk-read */
#define UBIFS_MAX_BULK_READ 32
+#ifdef CONFIG_UBIFS_FS_AUTHENTICATION
+#define UBIFS_HASH_ARR_SZ UBIFS_MAX_HASH_LEN
+#define UBIFS_HMAC_ARR_SZ UBIFS_MAX_HMAC_LEN
+#else
+#define UBIFS_HASH_ARR_SZ 0
+#define UBIFS_HMAC_ARR_SZ 0
+#endif
+
/*
* Lockdep classes for UBIFS inode @ui_mutex.
*/
* @jhead: journal head number this bud belongs to
* @list: link in the list buds belonging to the same journal head
* @rb: link in the tree of all buds
+ * @log_hash: the log hash from the commit start node up to this bud
*/
struct ubifs_bud {
int lnum;
int jhead;
struct list_head list;
struct rb_node rb;
+ struct shash_desc *log_hash;
};
/**
* @wbuf: head's write-buffer
* @buds_list: list of bud LEBs belonging to this journal head
* @grouped: non-zero if UBIFS groups nodes when writing to this journal head
+ * @log_hash: the log hash from the commit start node up to this journal head
*
* Note, the @buds list is protected by the @c->buds_lock.
*/
struct ubifs_wbuf wbuf;
struct list_head buds_list;
unsigned int grouped:1;
+ struct shash_desc *log_hash;
};
/**
* @lnum: LEB number of the target node (indexing node or data node)
* @offs: target node offset within @lnum
* @len: target node length
+ * @hash: the hash of the target node
*/
struct ubifs_zbranch {
union ubifs_key key;
int lnum;
int offs;
int len;
+ u8 hash[UBIFS_HASH_ARR_SZ];
};
/**
* struct ubifs_znode - in-memory representation of an indexing node.
* @parent: parent znode or NULL if it is the root
* @cnext: next znode to commit
+ * @cparent: parent node for this commit
+ * @ciip: index in cparent's zbranch array
* @flags: znode flags (%DIRTY_ZNODE, %COW_ZNODE or %OBSOLETE_ZNODE)
* @time: last access time (seconds)
* @level: level of the entry in the TNC tree
struct ubifs_znode {
struct ubifs_znode *parent;
struct ubifs_znode *cnext;
+ struct ubifs_znode *cparent;
+ int ciip;
unsigned long flags;
time64_t time;
int level;
* struct ubifs_info - UBIFS file-system description data structure
* (per-superblock).
* @vfs_sb: VFS @struct super_block object
+ * @sup_node: The super block node as read from the device
*
* @highest_inum: highest used inode number
* @max_sqnum: current global sequence number
* @default_compr: default compression algorithm (%UBIFS_COMPR_LZO, etc)
* @rw_incompat: the media is not R/W compatible
* @assert_action: action to take when a ubifs_assert() fails
+ * @authenticated: flag indigating the FS is mounted in authenticated mode
*
* @tnc_mutex: protects the Tree Node Cache (TNC), @zroot, @cnext, @enext, and
* @calc_idx_sz
* @key_hash: direntry key hash function
* @key_fmt: key format
* @key_len: key length
+ * @hash_len: The length of the index node hashes
* @fanout: fanout of the index tree (number of links per indexing node)
*
* @min_io_size: minimal input/output unit size
* @rp_uid: reserved pool user ID
* @rp_gid: reserved pool group ID
*
+ * @hash_tfm: the hash transformation used for hashing nodes
+ * @hmac_tfm: the HMAC transformation for this filesystem
+ * @hmac_desc_len: length of the HMAC used for authentication
+ * @auth_key_name: the authentication key name
+ * @auth_hash_name: the name of the hash algorithm used for authentication
+ * @auth_hash_algo: the authentication hash used for this fs
+ * @log_hash: the log hash from the commit start node up to the latest reference
+ * node.
+ *
* @empty: %1 if the UBI device is empty
* @need_recovery: %1 if the file-system needs recovery
* @replaying: %1 during journal replay
*/
struct ubifs_info {
struct super_block *vfs_sb;
+ struct ubifs_sb_node *sup_node;
ino_t highest_inum;
unsigned long long max_sqnum;
unsigned int default_compr:2;
unsigned int rw_incompat:1;
unsigned int assert_action:2;
+ unsigned int authenticated:1;
struct mutex tnc_mutex;
struct ubifs_zbranch zroot;
uint32_t (*key_hash)(const char *str, int len);
int key_fmt;
int key_len;
+ int hash_len;
int fanout;
int min_io_size;
kuid_t rp_uid;
kgid_t rp_gid;
+ struct crypto_shash *hash_tfm;
+ struct crypto_shash *hmac_tfm;
+ int hmac_desc_len;
+ char *auth_key_name;
+ char *auth_hash_name;
+ enum hash_algo auth_hash_algo;
+
+ struct shash_desc *log_hash;
+
/* The below fields are used only during mounting and re-mounting */
unsigned int empty:1;
unsigned int need_recovery:1;
extern const struct inode_operations ubifs_symlink_inode_operations;
extern struct ubifs_compressor *ubifs_compressors[UBIFS_COMPR_TYPES_CNT];
+/* auth.c */
+static inline int ubifs_authenticated(const struct ubifs_info *c)
+{
+ return (IS_ENABLED(CONFIG_UBIFS_FS_AUTHENTICATION)) && c->authenticated;
+}
+
+struct shash_desc *__ubifs_hash_get_desc(const struct ubifs_info *c);
+static inline struct shash_desc *ubifs_hash_get_desc(const struct ubifs_info *c)
+{
+ return ubifs_authenticated(c) ? __ubifs_hash_get_desc(c) : NULL;
+}
+
+static inline int ubifs_shash_init(const struct ubifs_info *c,
+ struct shash_desc *desc)
+{
+ if (ubifs_authenticated(c))
+ return crypto_shash_init(desc);
+ else
+ return 0;
+}
+
+static inline int ubifs_shash_update(const struct ubifs_info *c,
+ struct shash_desc *desc, const void *buf,
+ unsigned int len)
+{
+ int err = 0;
+
+ if (ubifs_authenticated(c)) {
+ err = crypto_shash_update(desc, buf, len);
+ if (err < 0)
+ return err;
+ }
+
+ return 0;
+}
+
+static inline int ubifs_shash_final(const struct ubifs_info *c,
+ struct shash_desc *desc, u8 *out)
+{
+ return ubifs_authenticated(c) ? crypto_shash_final(desc, out) : 0;
+}
+
+int __ubifs_node_calc_hash(const struct ubifs_info *c, const void *buf,
+ u8 *hash);
+static inline int ubifs_node_calc_hash(const struct ubifs_info *c,
+ const void *buf, u8 *hash)
+{
+ if (ubifs_authenticated(c))
+ return __ubifs_node_calc_hash(c, buf, hash);
+ else
+ return 0;
+}
+
+int ubifs_prepare_auth_node(struct ubifs_info *c, void *node,
+ struct shash_desc *inhash);
+
+/**
+ * ubifs_check_hash - compare two hashes
+ * @c: UBIFS file-system description object
+ * @expected: first hash
+ * @got: second hash
+ *
+ * Compare two hashes @expected and @got. Returns 0 when they are equal, a
+ * negative error code otherwise.
+ */
+static inline int ubifs_check_hash(const struct ubifs_info *c,
+ const u8 *expected, const u8 *got)
+{
+ return crypto_memneq(expected, got, c->hash_len);
+}
+
+/**
+ * ubifs_check_hmac - compare two HMACs
+ * @c: UBIFS file-system description object
+ * @expected: first HMAC
+ * @got: second HMAC
+ *
+ * Compare two hashes @expected and @got. Returns 0 when they are equal, a
+ * negative error code otherwise.
+ */
+static inline int ubifs_check_hmac(const struct ubifs_info *c,
+ const u8 *expected, const u8 *got)
+{
+ return crypto_memneq(expected, got, c->hmac_desc_len);
+}
+
+void ubifs_bad_hash(const struct ubifs_info *c, const void *node,
+ const u8 *hash, int lnum, int offs);
+
+int __ubifs_node_check_hash(const struct ubifs_info *c, const void *buf,
+ const u8 *expected);
+static inline int ubifs_node_check_hash(const struct ubifs_info *c,
+ const void *buf, const u8 *expected)
+{
+ if (ubifs_authenticated(c))
+ return __ubifs_node_check_hash(c, buf, expected);
+ else
+ return 0;
+}
+
+int ubifs_init_authentication(struct ubifs_info *c);
+void __ubifs_exit_authentication(struct ubifs_info *c);
+static inline void ubifs_exit_authentication(struct ubifs_info *c)
+{
+ if (ubifs_authenticated(c))
+ __ubifs_exit_authentication(c);
+}
+
+/**
+ * ubifs_branch_hash - returns a pointer to the hash of a branch
+ * @c: UBIFS file-system description object
+ * @br: branch to get the hash from
+ *
+ * This returns a pointer to the hash of a branch. Since the key already is a
+ * dynamically sized object we cannot use a struct member here.
+ */
+static inline u8 *ubifs_branch_hash(struct ubifs_info *c,
+ struct ubifs_branch *br)
+{
+ return (void *)br + sizeof(*br) + c->key_len;
+}
+
+/**
+ * ubifs_copy_hash - copy a hash
+ * @c: UBIFS file-system description object
+ * @from: source hash
+ * @to: destination hash
+ *
+ * With authentication this copies a hash, otherwise does nothing.
+ */
+static inline void ubifs_copy_hash(const struct ubifs_info *c, const u8 *from,
+ u8 *to)
+{
+ if (ubifs_authenticated(c))
+ memcpy(to, from, c->hash_len);
+}
+
+int __ubifs_node_insert_hmac(const struct ubifs_info *c, void *buf,
+ int len, int ofs_hmac);
+static inline int ubifs_node_insert_hmac(const struct ubifs_info *c, void *buf,
+ int len, int ofs_hmac)
+{
+ if (ubifs_authenticated(c))
+ return __ubifs_node_insert_hmac(c, buf, len, ofs_hmac);
+ else
+ return 0;
+}
+
+int __ubifs_node_verify_hmac(const struct ubifs_info *c, const void *buf,
+ int len, int ofs_hmac);
+static inline int ubifs_node_verify_hmac(const struct ubifs_info *c,
+ const void *buf, int len, int ofs_hmac)
+{
+ if (ubifs_authenticated(c))
+ return __ubifs_node_verify_hmac(c, buf, len, ofs_hmac);
+ else
+ return 0;
+}
+
+/**
+ * ubifs_auth_node_sz - returns the size of an authentication node
+ * @c: UBIFS file-system description object
+ *
+ * This function returns the size of an authentication node which can
+ * be 0 for unauthenticated filesystems or the real size of an auth node
+ * authentication is enabled.
+ */
+static inline int ubifs_auth_node_sz(const struct ubifs_info *c)
+{
+ if (ubifs_authenticated(c))
+ return sizeof(struct ubifs_auth_node) + c->hmac_desc_len;
+ else
+ return 0;
+}
+
+int ubifs_hmac_wkm(struct ubifs_info *c, u8 *hmac);
+
+int __ubifs_shash_copy_state(const struct ubifs_info *c, struct shash_desc *src,
+ struct shash_desc *target);
+static inline int ubifs_shash_copy_state(const struct ubifs_info *c,
+ struct shash_desc *src,
+ struct shash_desc *target)
+{
+ if (ubifs_authenticated(c))
+ return __ubifs_shash_copy_state(c, src, target);
+ else
+ return 0;
+}
+
/* io.c */
void ubifs_ro_mode(struct ubifs_info *c, int err);
int ubifs_leb_read(const struct ubifs_info *c, int lnum, void *buf, int offs,
int lnum, int offs);
int ubifs_write_node(struct ubifs_info *c, void *node, int len, int lnum,
int offs);
+int ubifs_write_node_hmac(struct ubifs_info *c, void *buf, int len, int lnum,
+ int offs, int hmac_offs);
int ubifs_check_node(const struct ubifs_info *c, const void *buf, int lnum,
int offs, int quiet, int must_chk_crc);
+void ubifs_init_node(struct ubifs_info *c, void *buf, int len, int pad);
+void ubifs_crc_node(struct ubifs_info *c, void *buf, int len);
void ubifs_prepare_node(struct ubifs_info *c, void *buf, int len, int pad);
+int ubifs_prepare_node_hmac(struct ubifs_info *c, void *node, int len,
+ int hmac_offs, int pad);
void ubifs_prep_grp_node(struct ubifs_info *c, void *node, int len, int last);
int ubifs_io_init(struct ubifs_info *c);
void ubifs_pad(const struct ubifs_info *c, void *buf, int pad);
int ubifs_tnc_locate(struct ubifs_info *c, const union ubifs_key *key,
void *node, int *lnum, int *offs);
int ubifs_tnc_add(struct ubifs_info *c, const union ubifs_key *key, int lnum,
- int offs, int len);
+ int offs, int len, const u8 *hash);
int ubifs_tnc_replace(struct ubifs_info *c, const union ubifs_key *key,
int old_lnum, int old_offs, int lnum, int offs, int len);
int ubifs_tnc_add_nm(struct ubifs_info *c, const union ubifs_key *key,
- int lnum, int offs, int len, const struct fscrypt_name *nm);
+ int lnum, int offs, int len, const u8 *hash,
+ const struct fscrypt_name *nm);
int ubifs_tnc_remove(struct ubifs_info *c, const union ubifs_key *key);
int ubifs_tnc_remove_nm(struct ubifs_info *c, const union ubifs_key *key,
const struct fscrypt_name *nm);
void ubifs_wait_for_commit(struct ubifs_info *c);
/* master.c */
+int ubifs_compare_master_node(struct ubifs_info *c, void *m1, void *m2);
int ubifs_read_master(struct ubifs_info *c);
int ubifs_write_master(struct ubifs_info *c);
/* sb.c */
int ubifs_read_superblock(struct ubifs_info *c);
-struct ubifs_sb_node *ubifs_read_sb_node(struct ubifs_info *c);
int ubifs_write_sb_node(struct ubifs_info *c, struct ubifs_sb_node *sup);
int ubifs_fixup_free_space(struct ubifs_info *c);
int ubifs_enable_encryption(struct ubifs_info *c);
/* lpt.c */
int ubifs_calc_lpt_geom(struct ubifs_info *c);
int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
- int *lpt_lebs, int *big_lpt);
+ int *lpt_lebs, int *big_lpt, u8 *hash);
int ubifs_lpt_init(struct ubifs_info *c, int rd, int wr);
struct ubifs_lprops *ubifs_lpt_lookup(struct ubifs_info *c, int lnum);
struct ubifs_lprops *ubifs_lpt_lookup_dirty(struct ubifs_info *c, int lnum);
struct ubifs_nnode *parent, int iip);
struct ubifs_nnode *ubifs_get_nnode(struct ubifs_info *c,
struct ubifs_nnode *parent, int iip);
+struct ubifs_pnode *ubifs_pnode_lookup(struct ubifs_info *c, int i);
int ubifs_read_nnode(struct ubifs_info *c, struct ubifs_nnode *parent, int iip);
void ubifs_add_lpt_dirt(struct ubifs_info *c, int lnum, int dirty);
void ubifs_add_nnode_dirt(struct ubifs_info *c, struct ubifs_nnode *nnode);
/* Needed only in debugging code in lpt_commit.c */
int ubifs_unpack_nnode(const struct ubifs_info *c, void *buf,
struct ubifs_nnode *nnode);
+int ubifs_lpt_calc_hash(struct ubifs_info *c, u8 *hash);
/* lpt_commit.c */
int ubifs_lpt_start_commit(struct ubifs_info *c);
int ubifs_rcvry_gc_commit(struct ubifs_info *c);
int ubifs_recover_size_accum(struct ubifs_info *c, union ubifs_key *key,
int deletion, loff_t new_size);
-int ubifs_recover_size(struct ubifs_info *c);
+int ubifs_recover_size(struct ubifs_info *c, bool in_place);
void ubifs_destroy_size_tree(struct ubifs_info *c);
/* ioctl.c */
ret = udf_dstrCS0toChar(sb, outstr, 31, pvoldesc->volIdent, 32);
- if (ret < 0)
- goto out_bh;
-
- strncpy(UDF_SB(sb)->s_volume_ident, outstr, ret);
+ if (ret < 0) {
+ strcpy(UDF_SB(sb)->s_volume_ident, "InvalidName");
+ pr_warn("incorrect volume identification, setting to "
+ "'InvalidName'\n");
+ } else {
+ strncpy(UDF_SB(sb)->s_volume_ident, outstr, ret);
+ }
udf_debug("volIdent[] = '%s'\n", UDF_SB(sb)->s_volume_ident);
ret = udf_dstrCS0toChar(sb, outstr, 127, pvoldesc->volSetIdent, 128);
- if (ret < 0)
+ if (ret < 0) {
+ ret = 0;
goto out_bh;
-
+ }
outstr[ret] = 0;
udf_debug("volSetIdent[] = '%s'\n", outstr);
return u_len;
}
+/*
+ * Convert CS0 dstring to output charset. Warning: This function may truncate
+ * input string if it is too long as it is used for informational strings only
+ * and it is better to truncate the string than to refuse mounting a media.
+ */
int udf_dstrCS0toChar(struct super_block *sb, uint8_t *utf_o, int o_len,
const uint8_t *ocu_i, int i_len)
{
if (i_len > 0) {
s_len = ocu_i[i_len - 1];
if (s_len >= i_len) {
- pr_err("incorrect dstring lengths (%d/%d)\n",
- s_len, i_len);
- return -EINVAL;
+ pr_warn("incorrect dstring lengths (%d/%d),"
+ " truncating\n", s_len, i_len);
+ s_len = i_len - 1;
+ /* 2-byte encoding? Need to round properly... */
+ if (ocu_i[0] == 16)
+ s_len -= (s_len - 1) & 2;
}
}
ret = -EINVAL;
if (!vma_can_userfault(cur))
goto out_unlock;
+
+ /*
+ * UFFDIO_COPY will fill file holes even without
+ * PROT_WRITE. This check enforces that if this is a
+ * MAP_SHARED, the process has write permission to the backing
+ * file. If VM_MAYWRITE is set it also enforces that on a
+ * MAP_SHARED vma: there is no F_WRITE_SEAL and no further
+ * F_WRITE_SEAL can be taken until the vma is destroyed.
+ */
+ ret = -EPERM;
+ if (unlikely(!(cur->vm_flags & VM_MAYWRITE)))
+ goto out_unlock;
+
/*
* If this vma contains ending address, and huge pages
* check alignment.
BUG_ON(!vma_can_userfault(vma));
BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
vma->vm_userfaultfd_ctx.ctx != ctx);
+ WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
/*
* Nothing to do: this vma is already registered into this
cond_resched();
BUG_ON(!vma_can_userfault(vma));
+ WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
/*
* Nothing to do: this vma is already registered into this
struct xfs_mount *mp = bp->b_target->bt_mount;
struct xfs_attr_leafblock *leaf = bp->b_addr;
struct xfs_attr_leaf_entry *entries;
- uint16_t end;
+ uint32_t end; /* must be 32bit - see below */
int i;
xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &ichdr, leaf);
/*
* Quickly check the freemap information. Attribute data has to be
* aligned to 4-byte boundaries, and likewise for the free space.
+ *
+ * Note that for 64k block size filesystems, the freemap entries cannot
+ * overflow as they are only be16 fields. However, when checking end
+ * pointer of the freemap, we have to be careful to detect overflows and
+ * so use uint32_t for those checks.
*/
for (i = 0; i < XFS_ATTR_LEAF_MAPSIZE; i++) {
if (ichdr.freemap[i].base > mp->m_attr_geo->blksize)
return __this_address;
if (ichdr.freemap[i].size & 0x3)
return __this_address;
- end = ichdr.freemap[i].base + ichdr.freemap[i].size;
+
+ /* be care of 16 bit overflows here */
+ end = (uint32_t)ichdr.freemap[i].base + ichdr.freemap[i].size;
if (end < ichdr.freemap[i].base)
return __this_address;
if (end > mp->m_attr_geo->blksize)
case BMAP_LEFT_FILLING | BMAP_RIGHT_FILLING | BMAP_RIGHT_CONTIG:
/*
* Filling in all of a previously delayed allocation extent.
- * The right neighbor is contiguous, the left is not.
+ * The right neighbor is contiguous, the left is not. Take care
+ * with delay -> unwritten extent allocation here because the
+ * delalloc record we are overwriting is always written.
*/
PREV.br_startblock = new->br_startblock;
PREV.br_blockcount += RIGHT.br_blockcount;
+ PREV.br_state = new->br_state;
xfs_iext_next(ifp, &bma->icur);
xfs_iext_remove(bma->ip, &bma->icur, state);
if (xfs_sb_version_hascrc(&mp->m_sb)) {
if (!xfs_log_check_lsn(mp, be64_to_cpu(block->bb_u.s.bb_lsn)))
- return __this_address;
+ return false;
return xfs_buf_verify_cksum(bp, XFS_BTREE_SBLOCK_CRC_OFF);
}
static xfs_extlen_t
xfs_inobt_max_size(
- struct xfs_mount *mp)
+ struct xfs_mount *mp,
+ xfs_agnumber_t agno)
{
+ xfs_agblock_t agblocks = xfs_ag_block_count(mp, agno);
+
/* Bail out if we're uninitialized, which can happen in mkfs. */
if (mp->m_inobt_mxr[0] == 0)
return 0;
return xfs_btree_calc_size(mp->m_inobt_mnr,
- (uint64_t)mp->m_sb.sb_agblocks * mp->m_sb.sb_inopblock /
- XFS_INODES_PER_CHUNK);
+ (uint64_t)agblocks * mp->m_sb.sb_inopblock /
+ XFS_INODES_PER_CHUNK);
}
static int
if (error)
return error;
- *ask += xfs_inobt_max_size(mp);
+ *ask += xfs_inobt_max_size(mp, agno);
*used += tree_len;
return 0;
}
goto out_unlock;
}
-static int
+int
xfs_flush_unmap_range(
struct xfs_inode *ip,
xfs_off_t offset,
* page could be mmap'd and iomap_zero_range doesn't do that for us.
* Writeback of the eof page will do this, albeit clumsily.
*/
- if (offset + len >= XFS_ISIZE(ip) && ((offset + len) & PAGE_MASK)) {
+ if (offset + len >= XFS_ISIZE(ip) && offset_in_page(offset + len) > 0) {
error = filemap_write_and_wait_range(VFS_I(ip)->i_mapping,
- (offset + len) & ~PAGE_MASK, LLONG_MAX);
+ round_down(offset + len, PAGE_SIZE), LLONG_MAX);
}
return error;
* Writeback and invalidate cache for the remainder of the file as we're
* about to shift down every extent from offset to EOF.
*/
- error = filemap_write_and_wait_range(VFS_I(ip)->i_mapping, offset, -1);
- if (error)
- return error;
- error = invalidate_inode_pages2_range(VFS_I(ip)->i_mapping,
- offset >> PAGE_SHIFT, -1);
- if (error)
- return error;
+ error = xfs_flush_unmap_range(ip, offset, XFS_ISIZE(ip));
/*
* Clean out anything hanging around in the cow fork now that
int whichfork, xfs_extnum_t *nextents,
xfs_filblks_t *count);
+int xfs_flush_unmap_range(struct xfs_inode *ip, xfs_off_t offset,
+ xfs_off_t len);
+
#endif /* __XFS_BMAP_UTIL_H__ */
}
/*
- * Requeue a failed buffer for writeback
+ * Requeue a failed buffer for writeback.
*
- * Return true if the buffer has been re-queued properly, false otherwise
+ * We clear the log item failed state here as well, but we have to be careful
+ * about reference counts because the only active reference counts on the buffer
+ * may be the failed log items. Hence if we clear the log item failed state
+ * before queuing the buffer for IO we can release all active references to
+ * the buffer and free it, leading to use after free problems in
+ * xfs_buf_delwri_queue. It makes no difference to the buffer or log items which
+ * order we process them in - the buffer is locked, and we own the buffer list
+ * so nothing on them is going to change while we are performing this action.
+ *
+ * Hence we can safely queue the buffer for IO before we clear the failed log
+ * item state, therefore always having an active reference to the buffer and
+ * avoiding the transient zero-reference state that leads to use-after-free.
+ *
+ * Return true if the buffer was added to the buffer list, false if it was
+ * already on the buffer list.
*/
bool
xfs_buf_resubmit_failed_buffers(
struct list_head *buffer_list)
{
struct xfs_log_item *lip;
+ bool ret;
+
+ ret = xfs_buf_delwri_queue(bp, buffer_list);
/*
- * Clear XFS_LI_FAILED flag from all items before resubmit
- *
- * XFS_LI_FAILED set/clear is protected by ail_lock, caller this
+ * XFS_LI_FAILED set/clear is protected by ail_lock, caller of this
* function already have it acquired
*/
list_for_each_entry(lip, &bp->b_li_list, li_bio_list)
xfs_clear_li_failed(lip);
- /* Add this buffer back to the delayed write list */
- return xfs_buf_delwri_queue(bp, buffer_list);
+ return ret;
}
return error;
}
-STATIC int
-xfs_file_clone_range(
- struct file *file_in,
- loff_t pos_in,
- struct file *file_out,
- loff_t pos_out,
- u64 len)
-{
- return xfs_reflink_remap_range(file_in, pos_in, file_out, pos_out,
- len, false);
-}
-STATIC int
-xfs_file_dedupe_range(
- struct file *file_in,
- loff_t pos_in,
- struct file *file_out,
- loff_t pos_out,
- u64 len)
+STATIC loff_t
+xfs_file_remap_range(
+ struct file *file_in,
+ loff_t pos_in,
+ struct file *file_out,
+ loff_t pos_out,
+ loff_t len,
+ unsigned int remap_flags)
{
- return xfs_reflink_remap_range(file_in, pos_in, file_out, pos_out,
- len, true);
+ struct inode *inode_in = file_inode(file_in);
+ struct xfs_inode *src = XFS_I(inode_in);
+ struct inode *inode_out = file_inode(file_out);
+ struct xfs_inode *dest = XFS_I(inode_out);
+ struct xfs_mount *mp = src->i_mount;
+ loff_t remapped = 0;
+ xfs_extlen_t cowextsize;
+ int ret;
+
+ if (remap_flags & ~(REMAP_FILE_DEDUP | REMAP_FILE_ADVISORY))
+ return -EINVAL;
+
+ if (!xfs_sb_version_hasreflink(&mp->m_sb))
+ return -EOPNOTSUPP;
+
+ if (XFS_FORCED_SHUTDOWN(mp))
+ return -EIO;
+
+ /* Prepare and then clone file data. */
+ ret = xfs_reflink_remap_prep(file_in, pos_in, file_out, pos_out,
+ &len, remap_flags);
+ if (ret < 0 || len == 0)
+ return ret;
+
+ trace_xfs_reflink_remap_range(src, pos_in, len, dest, pos_out);
+
+ ret = xfs_reflink_remap_blocks(src, pos_in, dest, pos_out, len,
+ &remapped);
+ if (ret)
+ goto out_unlock;
+
+ /*
+ * Carry the cowextsize hint from src to dest if we're sharing the
+ * entire source file to the entire destination file, the source file
+ * has a cowextsize hint, and the destination file does not.
+ */
+ cowextsize = 0;
+ if (pos_in == 0 && len == i_size_read(inode_in) &&
+ (src->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE) &&
+ pos_out == 0 && len >= i_size_read(inode_out) &&
+ !(dest->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE))
+ cowextsize = src->i_d.di_cowextsize;
+
+ ret = xfs_reflink_update_dest(dest, pos_out + len, cowextsize,
+ remap_flags);
+
+out_unlock:
+ xfs_reflink_remap_unlock(file_in, file_out);
+ if (ret)
+ trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_);
+ return remapped > 0 ? remapped : ret;
}
STATIC int
.fsync = xfs_file_fsync,
.get_unmapped_area = thp_get_unmapped_area,
.fallocate = xfs_file_fallocate,
- .clone_file_range = xfs_file_clone_range,
- .dedupe_file_range = xfs_file_dedupe_range,
+ .remap_file_range = xfs_file_remap_range,
};
const struct file_operations xfs_dir_file_operations = {
error = 0;
out_free_buf:
kmem_free(buf);
- return 0;
+ return error;
}
struct getfsmap_info {
void
xfs_hex_dump(void *p, int length)
{
- print_hex_dump(KERN_ALERT, "", DUMP_PREFIX_ADDRESS, 16, 1, p, length, 1);
+ print_hex_dump(KERN_ALERT, "", DUMP_PREFIX_OFFSET, 16, 1, p, length, 1);
}
statp->f_files = limit;
statp->f_ffree =
(statp->f_files > dqp->q_res_icount) ?
- (statp->f_ffree - dqp->q_res_icount) : 0;
+ (statp->f_files - dqp->q_res_icount) : 0;
}
}
if (error)
return error;
+ xfs_trim_extent(imap, got.br_startoff, got.br_blockcount);
trace_xfs_reflink_cow_alloc(ip, &got);
return 0;
}
/*
* Update destination inode size & cowextsize hint, if necessary.
*/
-STATIC int
+int
xfs_reflink_update_dest(
struct xfs_inode *dest,
xfs_off_t newlen,
xfs_extlen_t cowextsize,
- bool is_dedupe)
+ unsigned int remap_flags)
{
struct xfs_mount *mp = dest->i_mount;
struct xfs_trans *tp;
int error;
- if (is_dedupe && newlen <= i_size_read(VFS_I(dest)) && cowextsize == 0)
+ if (newlen <= i_size_read(VFS_I(dest)) && cowextsize == 0)
return 0;
error = xfs_trans_alloc(mp, &M_RES(mp)->tr_ichange, 0, 0, 0, &tp);
dest->i_d.di_flags2 |= XFS_DIFLAG2_COWEXTSIZE;
}
- if (!is_dedupe) {
- xfs_trans_ichgtime(tp, dest,
- XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
- }
xfs_trans_log_inode(tp, dest, XFS_ILOG_CORE);
error = xfs_trans_commit(tp);
/*
* Iteratively remap one file's extents (and holes) to another's.
*/
-STATIC int
+int
xfs_reflink_remap_blocks(
struct xfs_inode *src,
- xfs_fileoff_t srcoff,
+ loff_t pos_in,
struct xfs_inode *dest,
- xfs_fileoff_t destoff,
- xfs_filblks_t len,
- xfs_off_t new_isize)
+ loff_t pos_out,
+ loff_t remap_len,
+ loff_t *remapped)
{
struct xfs_bmbt_irec imap;
+ xfs_fileoff_t srcoff;
+ xfs_fileoff_t destoff;
+ xfs_filblks_t len;
+ xfs_filblks_t range_len;
+ xfs_filblks_t remapped_len = 0;
+ xfs_off_t new_isize = pos_out + remap_len;
int nimaps;
int error = 0;
- xfs_filblks_t range_len;
+
+ destoff = XFS_B_TO_FSBT(src->i_mount, pos_out);
+ srcoff = XFS_B_TO_FSBT(src->i_mount, pos_in);
+ len = XFS_B_TO_FSB(src->i_mount, remap_len);
/* drange = (destoff, destoff + len); srange = (srcoff, srcoff + len) */
while (len) {
error = xfs_bmapi_read(src, srcoff, len, &imap, &nimaps, 0);
xfs_iunlock(src, lock_mode);
if (error)
- goto err;
+ break;
ASSERT(nimaps == 1);
trace_xfs_reflink_remap_imap(src, srcoff, len, XFS_IO_OVERWRITE,
error = xfs_reflink_remap_extent(dest, &imap, destoff,
new_isize);
if (error)
- goto err;
+ break;
if (fatal_signal_pending(current)) {
error = -EINTR;
- goto err;
+ break;
}
/* Advance drange/srange */
srcoff += range_len;
destoff += range_len;
len -= range_len;
+ remapped_len += range_len;
}
- return 0;
-
-err:
- trace_xfs_reflink_remap_blocks_error(dest, error, _RET_IP_);
+ if (error)
+ trace_xfs_reflink_remap_blocks_error(dest, error, _RET_IP_);
+ *remapped = min_t(loff_t, remap_len,
+ XFS_FSB_TO_B(src->i_mount, remapped_len));
return error;
}
}
/* Unlock both inodes after they've been prepped for a range clone. */
-STATIC void
+void
xfs_reflink_remap_unlock(
struct file *file_in,
struct file *file_out)
* stale data in the destination file. Hence we reject these clone attempts with
* -EINVAL in this case.
*/
-STATIC int
+int
xfs_reflink_remap_prep(
struct file *file_in,
loff_t pos_in,
struct file *file_out,
loff_t pos_out,
- u64 *len,
- bool is_dedupe)
+ loff_t *len,
+ unsigned int remap_flags)
{
struct inode *inode_in = file_inode(file_in);
struct xfs_inode *src = XFS_I(inode_in);
struct inode *inode_out = file_inode(file_out);
struct xfs_inode *dest = XFS_I(inode_out);
bool same_inode = (inode_in == inode_out);
- u64 blkmask = i_blocksize(inode_in) - 1;
ssize_t ret;
/* Lock both files against IO */
if (IS_DAX(inode_in) || IS_DAX(inode_out))
goto out_unlock;
- ret = vfs_clone_file_prep_inodes(inode_in, pos_in, inode_out, pos_out,
- len, is_dedupe);
- if (ret <= 0)
+ ret = generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out,
+ len, remap_flags);
+ if (ret < 0 || *len == 0)
goto out_unlock;
- /*
- * If the dedupe data matches, chop off the partial EOF block
- * from the source file so we don't try to dedupe the partial
- * EOF block.
- */
- if (is_dedupe) {
- *len &= ~blkmask;
- } else if (*len & blkmask) {
- /*
- * The user is attempting to share a partial EOF block,
- * if it's inside the destination EOF then reject it.
- */
- if (pos_out + *len < i_size_read(inode_out)) {
- ret = -EINVAL;
- goto out_unlock;
- }
- }
-
/* Attach dquots to dest inode before changing block map */
ret = xfs_qm_dqattach(dest);
if (ret)
if (ret)
goto out_unlock;
- /* Zap any page cache for the destination file's range. */
- truncate_inode_pages_range(&inode_out->i_data, pos_out,
- PAGE_ALIGN(pos_out + *len) - 1);
-
- /* If we're altering the file contents... */
- if (!is_dedupe) {
- /*
- * ...update the timestamps (which will grab the ilock again
- * from xfs_fs_dirty_inode, so we have to call it before we
- * take the ilock).
- */
- if (!(file_out->f_mode & FMODE_NOCMTIME)) {
- ret = file_update_time(file_out);
- if (ret)
- goto out_unlock;
- }
-
- /*
- * ...clear the security bits if the process is not being run
- * by root. This keeps people from modifying setuid and setgid
- * binaries.
- */
- ret = file_remove_privs(file_out);
- if (ret)
- goto out_unlock;
+ /*
+ * If pos_out > EOF, we may have dirtied blocks between EOF and
+ * pos_out. In that case, we need to extend the flush and unmap to cover
+ * from EOF to the end of the copy length.
+ */
+ if (pos_out > XFS_ISIZE(dest)) {
+ loff_t flen = *len + (pos_out - XFS_ISIZE(dest));
+ ret = xfs_flush_unmap_range(dest, XFS_ISIZE(dest), flen);
+ } else {
+ ret = xfs_flush_unmap_range(dest, pos_out, *len);
}
-
- return 1;
-out_unlock:
- xfs_reflink_remap_unlock(file_in, file_out);
- return ret;
-}
-
-/*
- * Link a range of blocks from one file to another.
- */
-int
-xfs_reflink_remap_range(
- struct file *file_in,
- loff_t pos_in,
- struct file *file_out,
- loff_t pos_out,
- u64 len,
- bool is_dedupe)
-{
- struct inode *inode_in = file_inode(file_in);
- struct xfs_inode *src = XFS_I(inode_in);
- struct inode *inode_out = file_inode(file_out);
- struct xfs_inode *dest = XFS_I(inode_out);
- struct xfs_mount *mp = src->i_mount;
- xfs_fileoff_t sfsbno, dfsbno;
- xfs_filblks_t fsblen;
- xfs_extlen_t cowextsize;
- ssize_t ret;
-
- if (!xfs_sb_version_hasreflink(&mp->m_sb))
- return -EOPNOTSUPP;
-
- if (XFS_FORCED_SHUTDOWN(mp))
- return -EIO;
-
- /* Prepare and then clone file data. */
- ret = xfs_reflink_remap_prep(file_in, pos_in, file_out, pos_out,
- &len, is_dedupe);
- if (ret <= 0)
- return ret;
-
- trace_xfs_reflink_remap_range(src, pos_in, len, dest, pos_out);
-
- dfsbno = XFS_B_TO_FSBT(mp, pos_out);
- sfsbno = XFS_B_TO_FSBT(mp, pos_in);
- fsblen = XFS_B_TO_FSB(mp, len);
- ret = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen,
- pos_out + len);
if (ret)
goto out_unlock;
- /*
- * Carry the cowextsize hint from src to dest if we're sharing the
- * entire source file to the entire destination file, the source file
- * has a cowextsize hint, and the destination file does not.
- */
- cowextsize = 0;
- if (pos_in == 0 && len == i_size_read(inode_in) &&
- (src->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE) &&
- pos_out == 0 && len >= i_size_read(inode_out) &&
- !(dest->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE))
- cowextsize = src->i_d.di_cowextsize;
-
- ret = xfs_reflink_update_dest(dest, pos_out + len, cowextsize,
- is_dedupe);
-
+ return 1;
out_unlock:
xfs_reflink_remap_unlock(file_in, file_out);
- if (ret)
- trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_);
return ret;
}
extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset,
xfs_off_t count);
extern int xfs_reflink_recover_cow(struct xfs_mount *mp);
-extern int xfs_reflink_remap_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe);
+extern loff_t xfs_reflink_remap_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out, loff_t len,
+ unsigned int remap_flags);
extern int xfs_reflink_inode_has_shared_extents(struct xfs_trans *tp,
struct xfs_inode *ip, bool *has_shared);
extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip,
struct xfs_trans **tpp);
extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset,
xfs_off_t len);
+extern int xfs_reflink_remap_prep(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out, loff_t *len,
+ unsigned int remap_flags);
+extern int xfs_reflink_remap_blocks(struct xfs_inode *src, loff_t pos_in,
+ struct xfs_inode *dest, loff_t pos_out, loff_t remap_len,
+ loff_t *remapped);
+extern int xfs_reflink_update_dest(struct xfs_inode *dest, xfs_off_t newlen,
+ xfs_extlen_t cowextsize, unsigned int remap_flags);
+extern void xfs_reflink_remap_unlock(struct file *file_in,
+ struct file *file_out);
#endif /* __XFS_REFLINK_H */
),
TP_fast_assign(
__entry->dev = bp->b_target->bt_dev;
- __entry->bno = bp->b_bn;
+ if (bp->b_bn == XFS_BUF_DADDR_NULL)
+ __entry->bno = bp->b_maps[0].bm_bn;
+ else
+ __entry->bno = bp->b_bn;
__entry->nblks = bp->b_length;
__entry->hold = atomic_read(&bp->b_hold);
__entry->pincount = atomic_read(&bp->b_pin_count);
#define _4LEVEL_FIXUP_H
#define __ARCH_HAS_4LEVEL_HACK
-#define __PAGETABLE_PUD_FOLDED
+#define __PAGETABLE_PUD_FOLDED 1
#define PUD_SHIFT PGDIR_SHIFT
#define PUD_SIZE PGDIR_SIZE
#define _5LEVEL_FIXUP_H
#define __ARCH_HAS_5LEVEL_HACK
-#define __PAGETABLE_P4D_FOLDED
+#define __PAGETABLE_P4D_FOLDED 1
#define P4D_SHIFT PGDIR_SHIFT
#define P4D_SIZE PGDIR_SIZE
#ifndef __ASSEMBLY__
#include <asm-generic/5level-fixup.h>
-#define __PAGETABLE_PUD_FOLDED
+#define __PAGETABLE_PUD_FOLDED 1
/*
* Having the pud type consist of a pgd gets the size right, and allows
#ifndef __ASSEMBLY__
-#define __PAGETABLE_P4D_FOLDED
+#define __PAGETABLE_P4D_FOLDED 1
typedef struct { pgd_t pgd; } p4d_t;
struct mm_struct;
-#define __PAGETABLE_PMD_FOLDED
+#define __PAGETABLE_PMD_FOLDED 1
/*
* Having the pmd type consist of a pud gets the size right, and allows
#else
#include <asm-generic/pgtable-nop4d.h>
-#define __PAGETABLE_PUD_FOLDED
+#define __PAGETABLE_PUD_FOLDED 1
/*
* Having the pud type consist of a p4d gets the size right, and allows
#endif
#endif
+/*
+ * On some architectures it depends on the mm if the p4d/pud or pmd
+ * layer of the page table hierarchy is folded or not.
+ */
+#ifndef mm_p4d_folded
+#define mm_p4d_folded(mm) __is_defined(__PAGETABLE_P4D_FOLDED)
+#endif
+
+#ifndef mm_pud_folded
+#define mm_pud_folded(mm) __is_defined(__PAGETABLE_PUD_FOLDED)
+#endif
+
+#ifndef mm_pmd_folded
+#define mm_pmd_folded(mm) __is_defined(__PAGETABLE_PMD_FOLDED)
+#endif
+
#endif /* _ASM_GENERIC_PGTABLE_H */
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+#ifndef _LINUX_ASYM_TPM_SUBTYPE_H
+#define _LINUX_ASYM_TPM_SUBTYPE_H
+
+#include <linux/keyctl.h>
+
+struct tpm_key {
+ void *blob;
+ u32 blob_len;
+ uint16_t key_len; /* Size in bits of the key */
+ const void *pub_key; /* pointer inside blob to the public key bytes */
+ uint16_t pub_key_len; /* length of the public key */
+};
+
+struct tpm_key *tpm_key_create(const void *blob, uint32_t blob_len);
+
+extern struct asymmetric_key_subtype asym_tpm_subtype;
+
+#endif /* _LINUX_ASYM_TPM_SUBTYPE_H */
#ifndef _LINUX_PUBLIC_KEY_H
#define _LINUX_PUBLIC_KEY_H
+#include <linux/keyctl.h>
+
/*
* Cryptographic data for the public-key subtype of the asymmetric key type.
*
struct public_key {
void *key;
u32 keylen;
+ bool key_is_private;
const char *id_type;
const char *pkey_algo;
};
u8 digest_size; /* Number of bytes in digest */
const char *pkey_algo;
const char *hash_algo;
+ const char *encoding;
};
extern void public_key_signature_free(struct public_key_signature *sig);
const union key_payload *payload,
struct key *trusted);
-extern int verify_signature(const struct key *key,
- const struct public_key_signature *sig);
+extern int query_asymmetric_key(const struct kernel_pkey_params *,
+ struct kernel_pkey_query *);
+
+extern int encrypt_blob(struct kernel_pkey_params *, const void *, void *);
+extern int decrypt_blob(struct kernel_pkey_params *, const void *, void *);
+extern int create_signature(struct kernel_pkey_params *, const void *, void *);
+extern int verify_signature(const struct key *,
+ const struct public_key_signature *);
int public_key_verify_signature(const struct public_key *pkey,
const struct public_key_signature *sig);
connector_status_unknown = 3,
};
+/**
+ * enum drm_connector_registration_status - userspace registration status for
+ * a &drm_connector
+ *
+ * This enum is used to track the status of initializing a connector and
+ * registering it with userspace, so that DRM can prevent bogus modesets on
+ * connectors that no longer exist.
+ */
+enum drm_connector_registration_state {
+ /**
+ * @DRM_CONNECTOR_INITIALIZING: The connector has just been created,
+ * but has yet to be exposed to userspace. There should be no
+ * additional restrictions to how the state of this connector may be
+ * modified.
+ */
+ DRM_CONNECTOR_INITIALIZING = 0,
+
+ /**
+ * @DRM_CONNECTOR_REGISTERED: The connector has been fully initialized
+ * and registered with sysfs, as such it has been exposed to
+ * userspace. There should be no additional restrictions to how the
+ * state of this connector may be modified.
+ */
+ DRM_CONNECTOR_REGISTERED = 1,
+
+ /**
+ * @DRM_CONNECTOR_UNREGISTERED: The connector has either been exposed
+ * to userspace and has since been unregistered and removed from
+ * userspace, or the connector was unregistered before it had a chance
+ * to be exposed to userspace (e.g. still in the
+ * @DRM_CONNECTOR_INITIALIZING state). When a connector is
+ * unregistered, there are additional restrictions to how its state
+ * may be modified:
+ *
+ * - An unregistered connector may only have its DPMS changed from
+ * On->Off. Once DPMS is changed to Off, it may not be switched back
+ * to On.
+ * - Modesets are not allowed on unregistered connectors, unless they
+ * would result in disabling its assigned CRTCs. This means
+ * disabling a CRTC on an unregistered connector is OK, but enabling
+ * one is not.
+ * - Removing a CRTC from an unregistered connector is OK, but new
+ * CRTCs may never be assigned to an unregistered connector.
+ */
+ DRM_CONNECTOR_UNREGISTERED = 2,
+};
+
enum subpixel_order {
SubPixelUnknown = 0,
SubPixelHorizontalRGB,
bool ycbcr_420_allowed;
/**
- * @registered: Is this connector exposed (registered) with userspace?
+ * @registration_state: Is this connector initializing, exposed
+ * (registered) with userspace, or unregistered?
+ *
* Protected by @mutex.
*/
- bool registered;
+ enum drm_connector_registration_state registration_state;
/**
* @modes:
drm_connector_put(connector);
}
+/**
+ * drm_connector_is_unregistered - has the connector been unregistered from
+ * userspace?
+ * @connector: DRM connector
+ *
+ * Checks whether or not @connector has been unregistered from userspace.
+ *
+ * Returns:
+ * True if the connector was unregistered, false if the connector is
+ * registered or has not yet been registered with userspace.
+ */
+static inline bool
+drm_connector_is_unregistered(struct drm_connector *connector)
+{
+ return READ_ONCE(connector->registration_state) ==
+ DRM_CONNECTOR_UNREGISTERED;
+}
+
const char *drm_get_connector_status_name(enum drm_connector_status status);
const char *drm_get_subpixel_order_name(enum subpixel_order order);
const char *drm_get_dpms_name(int val);
#include <linux/seq_file.h>
#include <keys/asymmetric-type.h>
+struct kernel_pkey_query;
+struct kernel_pkey_params;
struct public_key_signature;
/*
/* Destroy a key of this subtype */
void (*destroy)(void *payload_crypto, void *payload_auth);
+ int (*query)(const struct kernel_pkey_params *params,
+ struct kernel_pkey_query *info);
+
+ /* Encrypt/decrypt/sign data */
+ int (*eds_op)(struct kernel_pkey_params *params,
+ const void *in, void *out);
+
/* Verify the signature on a key of this subtype (optional) */
int (*verify_signature)(const struct key *key,
const struct public_key_signature *sig);
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __TRUSTED_KEY_H
+#define __TRUSTED_KEY_H
+
+/* implementation specific TPM constants */
+#define MAX_BUF_SIZE 1024
+#define TPM_GETRANDOM_SIZE 14
+#define TPM_OSAP_SIZE 36
+#define TPM_OIAP_SIZE 10
+#define TPM_SEAL_SIZE 87
+#define TPM_UNSEAL_SIZE 104
+#define TPM_SIZE_OFFSET 2
+#define TPM_RETURN_OFFSET 6
+#define TPM_DATA_OFFSET 10
+
+#define LOAD32(buffer, offset) (ntohl(*(uint32_t *)&buffer[offset]))
+#define LOAD32N(buffer, offset) (*(uint32_t *)&buffer[offset])
+#define LOAD16(buffer, offset) (ntohs(*(uint16_t *)&buffer[offset]))
+
+struct tpm_buf {
+ int len;
+ unsigned char data[MAX_BUF_SIZE];
+};
+
+#define INIT_BUF(tb) (tb->len = 0)
+
+struct osapsess {
+ uint32_t handle;
+ unsigned char secret[SHA1_DIGEST_SIZE];
+ unsigned char enonce[TPM_NONCE_SIZE];
+};
+
+/* discrete values, but have to store in uint16_t for TPM use */
+enum {
+ SEAL_keytype = 1,
+ SRK_keytype = 4
+};
+
+int TSS_authhmac(unsigned char *digest, const unsigned char *key,
+ unsigned int keylen, unsigned char *h1,
+ unsigned char *h2, unsigned char h3, ...);
+int TSS_checkhmac1(unsigned char *buffer,
+ const uint32_t command,
+ const unsigned char *ononce,
+ const unsigned char *key,
+ unsigned int keylen, ...);
+
+int trusted_tpm_send(unsigned char *cmd, size_t buflen);
+int oiap(struct tpm_buf *tb, uint32_t *handle, unsigned char *nonce);
+
+#define TPM_DEBUG 0
+
+#if TPM_DEBUG
+static inline void dump_options(struct trusted_key_options *o)
+{
+ pr_info("trusted_key: sealing key type %d\n", o->keytype);
+ pr_info("trusted_key: sealing key handle %0X\n", o->keyhandle);
+ pr_info("trusted_key: pcrlock %d\n", o->pcrlock);
+ pr_info("trusted_key: pcrinfo %d\n", o->pcrinfo_len);
+ print_hex_dump(KERN_INFO, "pcrinfo ", DUMP_PREFIX_NONE,
+ 16, 1, o->pcrinfo, o->pcrinfo_len, 0);
+}
+
+static inline void dump_payload(struct trusted_key_payload *p)
+{
+ pr_info("trusted_key: key_len %d\n", p->key_len);
+ print_hex_dump(KERN_INFO, "key ", DUMP_PREFIX_NONE,
+ 16, 1, p->key, p->key_len, 0);
+ pr_info("trusted_key: bloblen %d\n", p->blob_len);
+ print_hex_dump(KERN_INFO, "blob ", DUMP_PREFIX_NONE,
+ 16, 1, p->blob, p->blob_len, 0);
+ pr_info("trusted_key: migratable %d\n", p->migratable);
+}
+
+static inline void dump_sess(struct osapsess *s)
+{
+ print_hex_dump(KERN_INFO, "trusted-key: handle ", DUMP_PREFIX_NONE,
+ 16, 1, &s->handle, 4, 0);
+ pr_info("trusted-key: secret:\n");
+ print_hex_dump(KERN_INFO, "", DUMP_PREFIX_NONE,
+ 16, 1, &s->secret, SHA1_DIGEST_SIZE, 0);
+ pr_info("trusted-key: enonce:\n");
+ print_hex_dump(KERN_INFO, "", DUMP_PREFIX_NONE,
+ 16, 1, &s->enonce, SHA1_DIGEST_SIZE, 0);
+}
+
+static inline void dump_tpm_buf(unsigned char *buf)
+{
+ int len;
+
+ pr_info("\ntrusted-key: tpm buffer\n");
+ len = LOAD32(buf, TPM_SIZE_OFFSET);
+ print_hex_dump(KERN_INFO, "", DUMP_PREFIX_NONE, 16, 1, buf, len, 0);
+}
+#else
+static inline void dump_options(struct trusted_key_options *o)
+{
+}
+
+static inline void dump_payload(struct trusted_key_payload *p)
+{
+}
+
+static inline void dump_sess(struct osapsess *s)
+{
+}
+
+static inline void dump_tpm_buf(unsigned char *buf)
+{
+}
+#endif
+
+static inline void store8(struct tpm_buf *buf, const unsigned char value)
+{
+ buf->data[buf->len++] = value;
+}
+
+static inline void store16(struct tpm_buf *buf, const uint16_t value)
+{
+ *(uint16_t *) & buf->data[buf->len] = htons(value);
+ buf->len += sizeof value;
+}
+
+static inline void store32(struct tpm_buf *buf, const uint32_t value)
+{
+ *(uint32_t *) & buf->data[buf->len] = htonl(value);
+ buf->len += sizeof value;
+}
+
+static inline void storebytes(struct tpm_buf *buf, const unsigned char *in,
+ const int len)
+{
+ memcpy(buf->data + buf->len, in, len);
+ buf->len += len;
+}
+#endif
#ifndef _LINUX_ADXL_H
#define _LINUX_ADXL_H
+#ifdef CONFIG_ACPI_ADXL
const char * const *adxl_get_component_names(void);
int adxl_decode(u64 addr, u64 component_values[]);
+#else
+static inline const char * const *adxl_get_component_names(void) { return NULL; }
+static inline int adxl_decode(u64 addr, u64 component_values[]) { return -EOPNOTSUPP; }
+#endif
#endif /* _LINUX_ADXL_H */
/* Error Codes */
enum virtchnl_status_code {
VIRTCHNL_STATUS_SUCCESS = 0,
- VIRTCHNL_ERR_PARAM = -5,
+ VIRTCHNL_STATUS_ERR_PARAM = -5,
+ VIRTCHNL_STATUS_ERR_NO_MEMORY = -18,
VIRTCHNL_STATUS_ERR_OPCODE_MISMATCH = -38,
VIRTCHNL_STATUS_ERR_CQP_COMPL_ERROR = -39,
VIRTCHNL_STATUS_ERR_INVALID_VF_ID = -40,
- VIRTCHNL_STATUS_NOT_SUPPORTED = -64,
+ VIRTCHNL_STATUS_ERR_ADMIN_QUEUE_ERROR = -53,
+ VIRTCHNL_STATUS_ERR_NOT_SUPPORTED = -64,
};
+/* Backward compatibility */
+#define VIRTCHNL_ERR_PARAM VIRTCHNL_STATUS_ERR_PARAM
+#define VIRTCHNL_STATUS_NOT_SUPPORTED VIRTCHNL_STATUS_ERR_NOT_SUPPORTED
+
#define VIRTCHNL_LINK_SPEED_100MB_SHIFT 0x1
#define VIRTCHNL_LINK_SPEED_1000MB_SHIFT 0x2
#define VIRTCHNL_LINK_SPEED_10GB_SHIFT 0x3
case VIRTCHNL_OP_EVENT:
case VIRTCHNL_OP_UNKNOWN:
default:
- return VIRTCHNL_ERR_PARAM;
+ return VIRTCHNL_STATUS_ERR_PARAM;
}
/* few more checks */
if (err_msg_format || valid_len != msglen)
disk_devt((bio)->bi_disk)
#if defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP)
-int bio_associate_blkg_from_page(struct bio *bio, struct page *page);
+int bio_associate_blkcg_from_page(struct bio *bio, struct page *page);
#else
-static inline int bio_associate_blkg_from_page(struct bio *bio,
- struct page *page) { return 0; }
+static inline int bio_associate_blkcg_from_page(struct bio *bio,
+ struct page *page) { return 0; }
#endif
#ifdef CONFIG_BLK_CGROUP
+int bio_associate_blkcg(struct bio *bio, struct cgroup_subsys_state *blkcg_css);
int bio_associate_blkg(struct bio *bio, struct blkcg_gq *blkg);
-int bio_associate_blkg_from_css(struct bio *bio,
- struct cgroup_subsys_state *css);
-int bio_associate_create_blkg(struct request_queue *q, struct bio *bio);
-int bio_reassociate_blkg(struct request_queue *q, struct bio *bio);
void bio_disassociate_task(struct bio *bio);
-void bio_clone_blkg_association(struct bio *dst, struct bio *src);
+void bio_clone_blkcg_association(struct bio *dst, struct bio *src);
#else /* CONFIG_BLK_CGROUP */
-static inline int bio_associate_blkg_from_css(struct bio *bio,
- struct cgroup_subsys_state *css)
-{ return 0; }
-static inline int bio_associate_create_blkg(struct request_queue *q,
- struct bio *bio) { return 0; }
-static inline int bio_reassociate_blkg(struct request_queue *q, struct bio *bio)
-{ return 0; }
+static inline int bio_associate_blkcg(struct bio *bio,
+ struct cgroup_subsys_state *blkcg_css) { return 0; }
static inline void bio_disassociate_task(struct bio *bio) { }
-static inline void bio_clone_blkg_association(struct bio *dst,
- struct bio *src) { }
+static inline void bio_clone_blkcg_association(struct bio *dst,
+ struct bio *src) { }
#endif /* CONFIG_BLK_CGROUP */
#ifdef CONFIG_HIGHMEM
struct request_list rl;
/* reference count */
- struct percpu_ref refcnt;
+ atomic_t refcnt;
/* is this blkg online? protected by both blkcg and q locks */
bool online;
struct blkcg_gq *blkg_lookup_slowpath(struct blkcg *blkcg,
struct request_queue *q, bool update_hint);
-struct blkcg_gq *__blkg_lookup_create(struct blkcg *blkcg,
- struct request_queue *q);
struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
struct request_queue *q);
int blkcg_init_queue(struct request_queue *q);
char *input, struct blkg_conf_ctx *ctx);
void blkg_conf_finish(struct blkg_conf_ctx *ctx);
-/**
- * blkcg_css - find the current css
- *
- * Find the css associated with either the kthread or the current task.
- * This may return a dying css, so it is up to the caller to use tryget logic
- * to confirm it is alive and well.
- */
-static inline struct cgroup_subsys_state *blkcg_css(void)
-{
- struct cgroup_subsys_state *css;
-
- css = kthread_blkcg();
- if (css)
- return css;
- return task_css(current, io_cgrp_id);
-}
static inline struct blkcg *css_to_blkcg(struct cgroup_subsys_state *css)
{
return css ? container_of(css, struct blkcg, css) : NULL;
}
-/**
- * __bio_blkcg - internal version of bio_blkcg for bfq and cfq
- *
- * DO NOT USE.
- * There is a flaw using this version of the function. In particular, this was
- * used in a broken paradigm where association was called on the given css. It
- * is possible though that the returned css from task_css() is in the process
- * of dying due to migration of the current task. So it is improper to assume
- * *_get() is going to succeed. Both BFQ and CFQ rely on this logic and will
- * take additional work to handle more gracefully.
- */
-static inline struct blkcg *__bio_blkcg(struct bio *bio)
-{
- if (bio && bio->bi_blkg)
- return bio->bi_blkg->blkcg;
- return css_to_blkcg(blkcg_css());
-}
-
-/**
- * bio_blkcg - grab the blkcg associated with a bio
- * @bio: target bio
- *
- * This returns the blkcg associated with a bio, NULL if not associated.
- * Callers are expected to either handle NULL or know association has been
- * done prior to calling this.
- */
static inline struct blkcg *bio_blkcg(struct bio *bio)
{
- if (bio && bio->bi_blkg)
- return bio->bi_blkg->blkcg;
- return NULL;
+ struct cgroup_subsys_state *css;
+
+ if (bio && bio->bi_css)
+ return css_to_blkcg(bio->bi_css);
+ css = kthread_blkcg();
+ if (css)
+ return css_to_blkcg(css);
+ return css_to_blkcg(task_css(current, io_cgrp_id));
}
static inline bool blk_cgroup_congested(void)
*/
static inline void blkg_get(struct blkcg_gq *blkg)
{
- percpu_ref_get(&blkg->refcnt);
+ WARN_ON_ONCE(atomic_read(&blkg->refcnt) <= 0);
+ atomic_inc(&blkg->refcnt);
}
/**
- * blkg_tryget - try and get a blkg reference
+ * blkg_try_get - try and get a blkg reference
* @blkg: blkg to get
*
* This is for use when doing an RCU lookup of the blkg. We may be in the midst
* of freeing this blkg, so we can only use it if the refcnt is not zero.
*/
-static inline bool blkg_tryget(struct blkcg_gq *blkg)
+static inline struct blkcg_gq *blkg_try_get(struct blkcg_gq *blkg)
{
- return percpu_ref_tryget(&blkg->refcnt);
+ if (atomic_inc_not_zero(&blkg->refcnt))
+ return blkg;
+ return NULL;
}
-/**
- * blkg_tryget_closest - try and get a blkg ref on the closet blkg
- * @blkg: blkg to get
- *
- * This walks up the blkg tree to find the closest non-dying blkg and returns
- * the blkg that it did association with as it may not be the passed in blkg.
- */
-static inline struct blkcg_gq *blkg_tryget_closest(struct blkcg_gq *blkg)
-{
- while (!percpu_ref_tryget(&blkg->refcnt))
- blkg = blkg->parent;
- return blkg;
-}
+void __blkg_release_rcu(struct rcu_head *rcu);
/**
* blkg_put - put a blkg reference
*/
static inline void blkg_put(struct blkcg_gq *blkg)
{
- percpu_ref_put(&blkg->refcnt);
+ WARN_ON_ONCE(atomic_read(&blkg->refcnt) <= 0);
+ if (atomic_dec_and_test(&blkg->refcnt))
+ call_rcu(&blkg->rcu_head, __blkg_release_rcu);
}
/**
rcu_read_lock();
- if (bio && bio->bi_blkg) {
- blkcg = bio->bi_blkg->blkcg;
- if (blkcg == &blkcg_root)
- goto rl_use_root;
-
- blkg_get(bio->bi_blkg);
- rcu_read_unlock();
- return &bio->bi_blkg->rl;
- }
+ blkcg = bio_blkcg(bio);
- blkcg = css_to_blkcg(blkcg_css());
+ /* bypass blkg lookup and use @q->root_rl directly for root */
if (blkcg == &blkcg_root)
- goto rl_use_root;
+ goto root_rl;
+ /*
+ * Try to use blkg->rl. blkg lookup may fail under memory pressure
+ * or if either the blkcg or queue is going away. Fall back to
+ * root_rl in such cases.
+ */
blkg = blkg_lookup(blkcg, q);
if (unlikely(!blkg))
- blkg = __blkg_lookup_create(blkcg, q);
-
- if (blkg->blkcg == &blkcg_root || !blkg_tryget(blkg))
- goto rl_use_root;
+ goto root_rl;
+ blkg_get(blkg);
rcu_read_unlock();
return &blkg->rl;
-
- /*
- * Each blkg has its own request_list, however, the root blkcg
- * uses the request_queue's root_rl. This is to avoid most
- * overhead for the root blkcg.
- */
-rl_use_root:
+root_rl:
rcu_read_unlock();
return &q->root_rl;
}
struct bio *bio) { return false; }
#endif
-
-static inline void blkcg_bio_issue_init(struct bio *bio)
-{
- bio_issue_init(&bio->bi_issue, bio_sectors(bio));
-}
-
static inline bool blkcg_bio_issue_check(struct request_queue *q,
struct bio *bio)
{
+ struct blkcg *blkcg;
struct blkcg_gq *blkg;
bool throtl = false;
rcu_read_lock();
+ blkcg = bio_blkcg(bio);
+
+ /* associate blkcg if bio hasn't attached one */
+ bio_associate_blkcg(bio, &blkcg->css);
- bio_associate_create_blkg(q, bio);
- blkg = bio->bi_blkg;
+ blkg = blkg_lookup(blkcg, q);
+ if (unlikely(!blkg)) {
+ spin_lock_irq(q->queue_lock);
+ blkg = blkg_lookup_create(blkcg, q);
+ if (IS_ERR(blkg))
+ blkg = NULL;
+ spin_unlock_irq(q->queue_lock);
+ }
throtl = blk_throtl_bio(q, blkg, bio);
if (!throtl) {
+ blkg = blkg ?: q->root_blkg;
/*
* If the bio is flagged with BIO_QUEUE_ENTERED it means this
* is a split bio and we would have already accounted for the
blkg_rwstat_add(&blkg->stat_ios, bio->bi_opf, 1);
}
- blkcg_bio_issue_init(bio);
-
rcu_read_unlock();
return !throtl;
}
static inline void blkcg_deactivate_policy(struct request_queue *q,
const struct blkcg_policy *pol) { }
-static inline struct blkcg *__bio_blkcg(struct bio *bio) { return NULL; }
static inline struct blkcg *bio_blkcg(struct bio *bio) { return NULL; }
static inline struct blkg_policy_data *blkg_to_pd(struct blkcg_gq *blkg,
static inline void blk_rq_set_rl(struct request *rq, struct request_list *rl) { }
static inline struct request_list *blk_rq_rl(struct request *rq) { return &rq->q->root_rl; }
-static inline void blkcg_bio_issue_init(struct bio *bio) { }
static inline bool blkcg_bio_issue_check(struct request_queue *q,
struct bio *bio) { return true; }
* release. Read comment on top of bio_associate_current().
*/
struct io_context *bi_ioc;
+ struct cgroup_subsys_state *bi_css;
struct blkcg_gq *bi_blkg;
struct bio_issue bi_issue;
#endif
* PTR_TO_MAP_VALUE_OR_NULL
*/
struct bpf_map *map_ptr;
+
+ /* Max size from any of the above. */
+ unsigned long raw;
};
/* Fixed part of pointer offset, pointer types only */
s32 off;
void can_put_echo_skb(struct sk_buff *skb, struct net_device *dev,
unsigned int idx);
+struct sk_buff *__can_get_echo_skb(struct net_device *dev, unsigned int idx, u8 *len_ptr);
unsigned int can_get_echo_skb(struct net_device *dev, unsigned int idx);
void can_free_echo_skb(struct net_device *dev, unsigned int idx);
int can_rx_offload_add_fifo(struct net_device *dev, struct can_rx_offload *offload, unsigned int weight);
int can_rx_offload_irq_offload_timestamp(struct can_rx_offload *offload, u64 reg);
int can_rx_offload_irq_offload_fifo(struct can_rx_offload *offload);
-int can_rx_offload_irq_queue_err_skb(struct can_rx_offload *offload, struct sk_buff *skb);
+int can_rx_offload_queue_sorted(struct can_rx_offload *offload,
+ struct sk_buff *skb, u32 timestamp);
+unsigned int can_rx_offload_get_echo_skb(struct can_rx_offload *offload,
+ unsigned int idx, u32 timestamp);
+int can_rx_offload_queue_tail(struct can_rx_offload *offload,
+ struct sk_buff *skb);
void can_rx_offload_reset(struct can_rx_offload *offload);
void can_rx_offload_del(struct can_rx_offload *offload);
void can_rx_offload_enable(struct can_rx_offload *offload);
CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING | \
CEPH_FEATURE_CEPHX_V2)
-#define CEPH_FEATURES_REQUIRED_DEFAULT \
- (CEPH_FEATURE_NOSRCADDR | \
- CEPH_FEATURE_SUBSCRIBE2 | \
- CEPH_FEATURE_RECONNECT_SEQ | \
- CEPH_FEATURE_PGID64 | \
- CEPH_FEATURE_PGPOOL3 | \
- CEPH_FEATURE_OSDENC)
+#define CEPH_FEATURES_REQUIRED_DEFAULT 0
#endif
bool css_has_online_children(struct cgroup_subsys_state *css);
struct cgroup_subsys_state *css_from_id(int id, struct cgroup_subsys *ss);
-struct cgroup_subsys_state *cgroup_e_css(struct cgroup *cgroup,
- struct cgroup_subsys *ss);
struct cgroup_subsys_state *cgroup_get_e_css(struct cgroup *cgroup,
struct cgroup_subsys *ss);
struct cgroup_subsys_state *css_tryget_online_from_dir(struct dentry *dentry,
#else /* !CONFIG_COMPAT */
#define is_compat_task() (0)
-#ifndef in_compat_syscall
+/* Ensure no one redefines in_compat_syscall() under !CONFIG_COMPAT */
+#define in_compat_syscall in_compat_syscall
static inline bool in_compat_syscall(void) { return false; }
-#endif
#endif /* CONFIG_COMPAT */
#define __SANITIZE_ADDRESS__
#endif
-#define __no_sanitize_address __attribute__((no_sanitize("address")))
-
/*
* Not all versions of clang implement the the type-generic versions
* of the builtin overflow checkers. Fortunately, clang implements
* compilers, like ICC.
*/
#define barrier() __asm__ __volatile__("" : : : "memory")
-#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
-#define __assume_aligned(a, ...) \
- __attribute__((__assume_aligned__(a, ## __VA_ARGS__)))
*/
#define uninitialized_var(x) x = x
-#ifdef __CHECKER__
-#define __must_be_array(a) 0
-#else
-/* &a[0] degrades to a pointer: a different type from an array */
-#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
-#endif
-
#ifdef RETPOLINE
-#define __noretpoline __attribute__((indirect_branch("keep")))
+#define __noretpoline __attribute__((__indirect_branch__("keep")))
#endif
#define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)
-#define __optimize(level) __attribute__((__optimize__(level)))
-
#define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
-#ifndef __CHECKER__
-#define __compiletime_warning(message) __attribute__((warning(message)))
-#define __compiletime_error(message) __attribute__((error(message)))
+#define __compiletime_warning(message) __attribute__((__warning__(message)))
+#define __compiletime_error(message) __attribute__((__error__(message)))
-#ifdef LATENT_ENTROPY_PLUGIN
+#if defined(LATENT_ENTROPY_PLUGIN) && !defined(__CHECKER__)
#define __latent_entropy __attribute__((latent_entropy))
#endif
-#endif /* __CHECKER__ */
/*
* calling noreturn functions, __builtin_unreachable() and __builtin_trap()
* Mark a position in code as unreachable. This can be used to
* suppress control flow warnings after asm blocks that transfer
* control elsewhere.
- *
- * Early snapshots of gcc 4.5 don't support this and we can't detect
- * this in the preprocessor, but we can live with this because they're
- * unreleased. Really, we need to have autoconf for the kernel.
*/
#define unreachable() \
do { \
__builtin_unreachable(); \
} while (0)
-/* Mark a function definition as prohibited from being cloned. */
-#define __noclone __attribute__((__noclone__, __optimize__("no-tracer")))
-
#if defined(RANDSTRUCT_PLUGIN) && !defined(__CHECKER__)
#define __randomize_layout __attribute__((randomize_layout))
#define __no_randomize_layout __attribute__((no_randomize_layout))
#define randomized_struct_fields_end } __randomize_layout;
#endif
-/*
- * When used with Link Time Optimization, gcc can optimize away C functions or
- * variables which are referenced only from assembly code. __visible tells the
- * optimizer that something else uses this function or variable, thus preventing
- * this.
- */
-#define __visible __attribute__((externally_visible))
-
-/* gcc version specific checks */
-
-#if GCC_VERSION >= 40900 && !defined(__CHECKER__)
-/*
- * __assume_aligned(n, k): Tell the optimizer that the returned
- * pointer can be assumed to be k modulo n. The second argument is
- * optional (default 0), so we use a variadic macro to make the
- * shorthand.
- *
- * Beware: Do not apply this to functions which may return
- * ERR_PTRs. Also, it is probably unwise to apply it to functions
- * returning extra information in the low bits (but in that case the
- * compiler should see some alignment anyway, when the return value is
- * massaged by 'flags = ptr & 3; ptr &= ~3;').
- */
-#define __assume_aligned(a, ...) __attribute__((__assume_aligned__(a, ## __VA_ARGS__)))
-#endif
-
/*
* GCC 'asm goto' miscompiles certain code sequences:
*
#define KASAN_ABI_VERSION 3
#endif
-#if GCC_VERSION >= 40902
-/*
- * Tell the compiler that address safety instrumentation (KASAN)
- * should not be applied to that function.
- * Conflicts with inlining: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
- */
-#define __no_sanitize_address __attribute__((no_sanitize_address))
-#ifdef CONFIG_KASAN
-#define __no_sanitize_address_or_inline \
- __no_sanitize_address __maybe_unused notrace
-#else
-#define __no_sanitize_address_or_inline inline
-#endif
-#endif
-
#if GCC_VERSION >= 50100
-/*
- * Mark structures as requiring designated initializers.
- * https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
- */
-#define __designated_init __attribute__((designated_init))
#define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
#endif
-#if !defined(__noclone)
-#define __noclone /* not needed */
-#endif
-
-#if !defined(__no_sanitize_address)
-#define __no_sanitize_address
-#define __no_sanitize_address_or_inline inline
-#endif
-
/*
* Turn individual warnings and errors on and off locally, depending
* on version.
*/
#define OPTIMIZER_HIDE_VAR(var) barrier()
-/* Intel ECC compiler doesn't support __builtin_types_compatible_p() */
-#define __must_be_array(a) 0
-
#endif
/* icc has this, but it's called _bswap16 */
#define __HAVE_BUILTIN_BSWAP16__
#define __builtin_bswap16 _bswap16
-
-/* The following are for compatibility with GCC, from compiler-gcc.h,
- * and may be redefined here because they should not be shared with other
- * compilers, like clang.
- */
-#define __visible __attribute__((externally_visible))
#define __branch_check__(x, expect, is_constant) ({ \
long ______r; \
static struct ftrace_likely_data \
- __attribute__((__aligned__(4))) \
- __attribute__((section("_ftrace_annotated_branch"))) \
+ __aligned(4) \
+ __section("_ftrace_annotated_branch") \
______f = { \
.data.func = __func__, \
.data.file = __FILE__, \
({ \
int ______r; \
static struct ftrace_branch_data \
- __attribute__((__aligned__(4))) \
- __attribute__((section("_ftrace_branch"))) \
+ __aligned(4) \
+ __section("_ftrace_branch") \
______f = { \
.func = __func__, \
.file = __FILE__, \
# define ASM_UNREACHABLE
#endif
#ifndef unreachable
-# define unreachable() do { annotate_reachable(); do { } while (1); } while (0)
+# define unreachable() do { \
+ annotate_unreachable(); \
+ __builtin_unreachable(); \
+} while (0)
#endif
/*
extern typeof(sym) sym; \
static const unsigned long __kentry_##sym \
__used \
- __attribute__((section("___kentry" "+" #sym ), used)) \
+ __section("___kentry" "+" #sym ) \
= (unsigned long)&sym;
#endif
* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
* '__maybe_unused' allows us to avoid defined-but-not-used warnings.
*/
-# define __no_kasan_or_inline __no_sanitize_address __maybe_unused
+# define __no_kasan_or_inline __no_sanitize_address notrace __maybe_unused
#else
# define __no_kasan_or_inline __always_inline
#endif
* visible to the compiler.
*/
#define __ADDRESSABLE(sym) \
- static void * __attribute__((section(".discard.addressable"), used)) \
+ static void * __section(".discard.addressable") __used \
__PASTE(__addressable_##sym, __LINE__) = (void *)&sym;
/**
#endif /* __KERNEL__ */
#endif /* __ASSEMBLY__ */
-#ifndef __optimize
-# define __optimize(level)
-#endif
-
/* Compile time object size, -1 for unknown */
#ifndef __compiletime_object_size
# define __compiletime_object_size(obj) -1
compiletime_assert(__native_word(t), \
"Need native word sized stores/loads for atomicity.")
+/* &a[0] degrades to a pointer: a different type from an array */
+#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
+
#endif /* __LINUX_COMPILER_H */
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LINUX_COMPILER_ATTRIBUTES_H
+#define __LINUX_COMPILER_ATTRIBUTES_H
+
+/*
+ * The attributes in this file are unconditionally defined and they directly
+ * map to compiler attribute(s), unless one of the compilers does not support
+ * the attribute. In that case, __has_attribute is used to check for support
+ * and the reason is stated in its comment ("Optional: ...").
+ *
+ * Any other "attributes" (i.e. those that depend on a configuration option,
+ * on a compiler, on an architecture, on plugins, on other attributes...)
+ * should be defined elsewhere (e.g. compiler_types.h or compiler-*.h).
+ * The intention is to keep this file as simple as possible, as well as
+ * compiler- and version-agnostic (e.g. avoiding GCC_VERSION checks).
+ *
+ * This file is meant to be sorted (by actual attribute name,
+ * not by #define identifier). Use the __attribute__((__name__)) syntax
+ * (i.e. with underscores) to avoid future collisions with other macros.
+ * Provide links to the documentation of each supported compiler, if it exists.
+ */
+
+/*
+ * __has_attribute is supported on gcc >= 5, clang >= 2.9 and icc >= 17.
+ * In the meantime, to support 4.6 <= gcc < 5, we implement __has_attribute
+ * by hand.
+ *
+ * sparse does not support __has_attribute (yet) and defines __GNUC_MINOR__
+ * depending on the compiler used to build it; however, these attributes have
+ * no semantic effects for sparse, so it does not matter. Also note that,
+ * in order to avoid sparse's warnings, even the unsupported ones must be
+ * defined to 0.
+ */
+#ifndef __has_attribute
+# define __has_attribute(x) __GCC4_has_attribute_##x
+# define __GCC4_has_attribute___assume_aligned__ (__GNUC_MINOR__ >= 9)
+# define __GCC4_has_attribute___designated_init__ 0
+# define __GCC4_has_attribute___externally_visible__ 1
+# define __GCC4_has_attribute___noclone__ 1
+# define __GCC4_has_attribute___optimize__ 1
+# define __GCC4_has_attribute___nonstring__ 0
+# define __GCC4_has_attribute___no_sanitize_address__ (__GNUC_MINOR__ >= 8)
+#endif
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alias-function-attribute
+ */
+#define __alias(symbol) __attribute__((__alias__(#symbol)))
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-aligned-function-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-aligned-type-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-aligned-variable-attribute
+ */
+#define __aligned(x) __attribute__((__aligned__(x)))
+#define __aligned_largest __attribute__((__aligned__))
+
+/*
+ * Note: users of __always_inline currently do not write "inline" themselves,
+ * which seems to be required by gcc to apply the attribute according
+ * to its docs (and also "warning: always_inline function might not be
+ * inlinable [-Wattributes]" is emitted).
+ *
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-always_005finline-function-attribute
+ * clang: mentioned
+ */
+#define __always_inline inline __attribute__((__always_inline__))
+
+/*
+ * The second argument is optional (default 0), so we use a variadic macro
+ * to make the shorthand.
+ *
+ * Beware: Do not apply this to functions which may return
+ * ERR_PTRs. Also, it is probably unwise to apply it to functions
+ * returning extra information in the low bits (but in that case the
+ * compiler should see some alignment anyway, when the return value is
+ * massaged by 'flags = ptr & 3; ptr &= ~3;').
+ *
+ * Optional: only supported since gcc >= 4.9
+ * Optional: not supported by icc
+ *
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-assume_005faligned-function-attribute
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#assume-aligned
+ */
+#if __has_attribute(__assume_aligned__)
+# define __assume_aligned(a, ...) __attribute__((__assume_aligned__(a, ## __VA_ARGS__)))
+#else
+# define __assume_aligned(a, ...)
+#endif
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-cold-function-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Label-Attributes.html#index-cold-label-attribute
+ */
+#define __cold __attribute__((__cold__))
+
+/*
+ * Note the long name.
+ *
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-const-function-attribute
+ */
+#define __attribute_const__ __attribute__((__const__))
+
+/*
+ * Don't. Just don't. See commit 771c035372a0 ("deprecate the '__deprecated'
+ * attribute warnings entirely and for good") for more information.
+ *
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-deprecated-function-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-deprecated-type-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-deprecated-variable-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Enumerator-Attributes.html#index-deprecated-enumerator-attribute
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#deprecated
+ */
+#define __deprecated
+
+/*
+ * Optional: only supported since gcc >= 5.1
+ * Optional: not supported by clang
+ * Optional: not supported by icc
+ *
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-designated_005finit-type-attribute
+ */
+#if __has_attribute(__designated_init__)
+# define __designated_init __attribute__((__designated_init__))
+#else
+# define __designated_init
+#endif
+
+/*
+ * Optional: not supported by clang
+ *
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-externally_005fvisible-function-attribute
+ */
+#if __has_attribute(__externally_visible__)
+# define __visible __attribute__((__externally_visible__))
+#else
+# define __visible
+#endif
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-format-function-attribute
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#format
+ */
+#define __printf(a, b) __attribute__((__format__(printf, a, b)))
+#define __scanf(a, b) __attribute__((__format__(scanf, a, b)))
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-gnu_005finline-function-attribute
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#gnu-inline
+ */
+#define __gnu_inline __attribute__((__gnu_inline__))
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-malloc-function-attribute
+ */
+#define __malloc __attribute__((__malloc__))
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-mode-type-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-mode-variable-attribute
+ */
+#define __mode(x) __attribute__((__mode__(x)))
+
+/*
+ * Optional: not supported by clang
+ * Note: icc does not recognize gcc's no-tracer
+ *
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-noclone-function-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-optimize-function-attribute
+ */
+#if __has_attribute(__noclone__)
+# if __has_attribute(__optimize__)
+# define __noclone __attribute__((__noclone__, __optimize__("no-tracer")))
+# else
+# define __noclone __attribute__((__noclone__))
+# endif
+#else
+# define __noclone
+#endif
+
+/*
+ * Note the missing underscores.
+ *
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-noinline-function-attribute
+ * clang: mentioned
+ */
+#define noinline __attribute__((__noinline__))
+
+/*
+ * Optional: only supported since gcc >= 8
+ * Optional: not supported by clang
+ * Optional: not supported by icc
+ *
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-nonstring-variable-attribute
+ */
+#if __has_attribute(__nonstring__)
+# define __nonstring __attribute__((__nonstring__))
+#else
+# define __nonstring
+#endif
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-noreturn-function-attribute
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#noreturn
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#id1
+ */
+#define __noreturn __attribute__((__noreturn__))
+
+/*
+ * Optional: only supported since gcc >= 4.8
+ * Optional: not supported by icc
+ *
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-no_005fsanitize_005faddress-function-attribute
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#no-sanitize-address-no-address-safety-analysis
+ */
+#if __has_attribute(__no_sanitize_address__)
+# define __no_sanitize_address __attribute__((__no_sanitize_address__))
+#else
+# define __no_sanitize_address
+#endif
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-packed-type-attribute
+ * clang: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-packed-variable-attribute
+ */
+#define __packed __attribute__((__packed__))
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-pure-function-attribute
+ */
+#define __pure __attribute__((__pure__))
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-section-function-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-section-variable-attribute
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#section-declspec-allocate
+ */
+#define __section(S) __attribute__((__section__(#S)))
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-unused-function-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-unused-type-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-unused-variable-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Label-Attributes.html#index-unused-label-attribute
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#maybe-unused-unused
+ */
+#define __always_unused __attribute__((__unused__))
+#define __maybe_unused __attribute__((__unused__))
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-used-function-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-used-variable-attribute
+ */
+#define __used __attribute__((__used__))
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-weak-function-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-weak-variable-attribute
+ */
+#define __weak __attribute__((__weak__))
+
+#endif /* __LINUX_COMPILER_ATTRIBUTES_H */
+/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __LINUX_COMPILER_TYPES_H
#define __LINUX_COMPILER_TYPES_H
#ifdef __KERNEL__
+/* Attributes */
+#include <linux/compiler_attributes.h>
+
/* Compiler specific macros. */
#ifdef __clang__
#include <linux/compiler-clang.h>
#include <asm/compiler.h>
#endif
-/*
- * Generic compiler-independent macros required for kernel
- * build go below this comment. Actual compiler/compiler version
- * specific implementations come from the above header files
- */
-
struct ftrace_branch_data {
const char *func;
const char *file;
unsigned long constant;
};
-/* Don't. Just don't. */
-#define __deprecated
-#define __deprecated_for_modules
-
#endif /* __KERNEL__ */
#endif /* __ASSEMBLY__ */
* compilers. We don't consider that to be an error, so set them to nothing.
* For example, some of them are for compiler specific plugins.
*/
-#ifndef __designated_init
-# define __designated_init
-#endif
-
#ifndef __latent_entropy
# define __latent_entropy
#endif
# define randomized_struct_fields_end
#endif
-#ifndef __visible
-#define __visible
-#endif
-
-/*
- * Assume alignment of return value.
- */
-#ifndef __assume_aligned
-#define __assume_aligned(a, ...)
+#ifndef asm_volatile_goto
+#define asm_volatile_goto(x...) asm goto(x)
#endif
/* Are two types/vars the same type (ignoring qualifiers)? */
(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
-#ifndef __attribute_const__
-#define __attribute_const__ __attribute__((__const__))
-#endif
-
-#ifndef __noclone
-#define __noclone
-#endif
-
/* Helpers for emitting diagnostics in pragmas. */
#ifndef __diag
#define __diag(string)
#define __diag_error(compiler, version, option, comment) \
__diag_ ## compiler(version, error, option)
-/*
- * From the GCC manual:
- *
- * Many functions have no effects except the return value and their
- * return value depends only on the parameters and/or global
- * variables. Such a function can be subject to common subexpression
- * elimination and loop optimization just as an arithmetic operator
- * would be.
- * [...]
- */
-#define __pure __attribute__((pure))
-#define __aligned(x) __attribute__((aligned(x)))
-#define __printf(a, b) __attribute__((format(printf, a, b)))
-#define __scanf(a, b) __attribute__((format(scanf, a, b)))
-#define __maybe_unused __attribute__((unused))
-#define __always_unused __attribute__((unused))
-#define __mode(x) __attribute__((mode(x)))
-#define __malloc __attribute__((__malloc__))
-#define __used __attribute__((__used__))
-#define __noreturn __attribute__((noreturn))
-#define __packed __attribute__((packed))
-#define __weak __attribute__((weak))
-#define __alias(symbol) __attribute__((alias(#symbol)))
-#define __cold __attribute__((cold))
-#define __section(S) __attribute__((__section__(#S)))
-
-
#ifdef CONFIG_ENABLE_MUST_CHECK
-#define __must_check __attribute__((warn_unused_result))
+#define __must_check __attribute__((__warn_unused_result__))
#else
#define __must_check
#endif
-#if defined(CC_USING_HOTPATCH) && !defined(__CHECKER__)
+#if defined(CC_USING_HOTPATCH)
#define notrace __attribute__((hotpatch(0, 0)))
#else
-#define notrace __attribute__((no_instrument_function))
+#define notrace __attribute__((__no_instrument_function__))
#endif
/*
* stack and frame pointer being set up and there is no chance to
* restore the lr register to the value before mcount was called.
*/
-#define __naked __attribute__((naked)) notrace
+#define __naked __attribute__((__naked__)) notrace
#define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
-/*
- * Feature detection for gnu_inline (gnu89 extern inline semantics). Either
- * __GNUC_STDC_INLINE__ is defined (not using gnu89 extern inline semantics,
- * and we opt in to the gnu89 semantics), or __GNUC_STDC_INLINE__ is not
- * defined so the gnu89 semantics are the default.
- */
-#ifdef __GNUC_STDC_INLINE__
-# define __gnu_inline __attribute__((gnu_inline))
-#else
-# define __gnu_inline
-#endif
-
/*
* Force always-inline if the user requests it so via the .config.
* GCC does not warn about unused static inline functions for
* semantics rather than c99. This prevents multiple symbol definition errors
* of extern inline functions at link time.
* A lot of inline functions can cause havoc with function tracing.
+ * Do not use __always_inline here, since currently it expands to inline again
+ * (which would break users of __always_inline).
*/
#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
!defined(CONFIG_OPTIMIZE_INLINING)
-#define inline \
- inline __attribute__((always_inline, unused)) notrace __gnu_inline
+#define inline inline __attribute__((__always_inline__)) __gnu_inline \
+ __maybe_unused notrace
#else
-#define inline inline __attribute__((unused)) notrace __gnu_inline
+#define inline inline __gnu_inline \
+ __maybe_unused notrace
#endif
#define __inline__ inline
-#define __inline inline
-#define noinline __attribute__((noinline))
-
-#ifndef __always_inline
-#define __always_inline inline __attribute__((always_inline))
-#endif
+#define __inline inline
/*
* Rather then using noinline to prevent stack consumption, use
CPUHP_AP_MIPS_GIC_TIMER_STARTING,
CPUHP_AP_ARC_TIMER_STARTING,
CPUHP_AP_RISCV_TIMER_STARTING,
+ CPUHP_AP_CSKY_TIMER_STARTING,
CPUHP_AP_KVM_STARTING,
CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
CPUHP_AP_KVM_ARM_VGIC_STARTING,
#include <linux/radix-tree.h>
#include <asm/pgtable.h>
+typedef unsigned long dax_entry_t;
+
struct iomap_ops;
struct dax_device;
struct dax_operations {
struct block_device *bdev, struct writeback_control *wbc);
struct page *dax_layout_busy_page(struct address_space *mapping);
-bool dax_lock_mapping_entry(struct page *page);
-void dax_unlock_mapping_entry(struct page *page);
+dax_entry_t dax_lock_page(struct page *page);
+void dax_unlock_page(struct page *page, dax_entry_t cookie);
#else
static inline bool bdev_dax_supported(struct block_device *bdev,
int blocksize)
return -EOPNOTSUPP;
}
-static inline bool dax_lock_mapping_entry(struct page *page)
+static inline dax_entry_t dax_lock_page(struct page *page)
{
if (IS_DAX(page->mapping->host))
- return true;
- return false;
+ return ~0UL;
+ return 0;
}
-static inline void dax_unlock_mapping_entry(struct page *page)
+static inline void dax_unlock_page(struct page *page, dax_entry_t cookie)
{
}
#endif
#include <linux/dma-mapping.h>
#include <linux/mem_encrypt.h>
-#define DIRECT_MAPPING_ERROR 0
+#define DIRECT_MAPPING_ERROR (~(dma_addr_t)0)
#ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
#include <asm/dma-direct.h>
extern void efi_reboot(enum reboot_mode reboot_mode, const char *__unused);
extern bool efi_is_table_address(unsigned long phys_addr);
+
+extern int efi_apply_persistent_mem_reservations(void);
#else
static inline bool efi_enabled(int feature)
{
{
return false;
}
+
+static inline int efi_apply_persistent_mem_reservations(void)
+{
+ return 0;
+}
#endif
extern int efi_status_to_err(efi_status_t status);
offsetof(TYPE, MEMBER) ... offsetofend(TYPE, MEMBER) - 1
#define bpf_ctx_range_till(TYPE, MEMBER1, MEMBER2) \
offsetof(TYPE, MEMBER1) ... offsetofend(TYPE, MEMBER2) - 1
+#if BITS_PER_LONG == 64
+# define bpf_ctx_range_ptr(TYPE, MEMBER) \
+ offsetof(TYPE, MEMBER) ... offsetofend(TYPE, MEMBER) - 1
+#else
+# define bpf_ctx_range_ptr(TYPE, MEMBER) \
+ offsetof(TYPE, MEMBER) ... offsetof(TYPE, MEMBER) + 8 - 1
+#endif /* BITS_PER_LONG == 64 */
#define bpf_target_off(TYPE, MEMBER, SIZE, PTR_SIZE) \
({ \
void bpf_jit_free(struct bpf_prog *fp);
+int bpf_jit_get_func_addr(const struct bpf_prog *prog,
+ const struct bpf_insn *insn, bool extra_pass,
+ u64 *func_addr, bool *func_addr_fixed);
+
struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *fp);
void bpf_jit_prog_release_other(struct bpf_prog *fp, struct bpf_prog *fp_other);
#define NOMMU_VMFLAGS \
(NOMMU_MAP_READ | NOMMU_MAP_WRITE | NOMMU_MAP_EXEC)
+/*
+ * These flags control the behavior of the remap_file_range function pointer.
+ * If it is called with len == 0 that means "remap to end of source file".
+ * See Documentation/filesystems/vfs.txt for more details about this call.
+ *
+ * REMAP_FILE_DEDUP: only remap if contents identical (i.e. deduplicate)
+ * REMAP_FILE_CAN_SHORTEN: caller can handle a shortened request
+ */
+#define REMAP_FILE_DEDUP (1 << 0)
+#define REMAP_FILE_CAN_SHORTEN (1 << 1)
+
+/*
+ * These flags signal that the caller is ok with altering various aspects of
+ * the behavior of the remap operation. The changes must be made by the
+ * implementation; the vfs remap helper functions can take advantage of them.
+ * Flags in this category exist to preserve the quirky behavior of the hoisted
+ * btrfs clone/dedupe ioctls.
+ */
+#define REMAP_FILE_ADVISORY (REMAP_FILE_CAN_SHORTEN)
struct iov_iter;
#endif
ssize_t (*copy_file_range)(struct file *, loff_t, struct file *,
loff_t, size_t, unsigned int);
- int (*clone_file_range)(struct file *, loff_t, struct file *, loff_t,
- u64);
- int (*dedupe_file_range)(struct file *, loff_t, struct file *, loff_t,
- u64);
+ loff_t (*remap_file_range)(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t len, unsigned int remap_flags);
int (*fadvise)(struct file *, loff_t, loff_t, int);
} __randomize_layout;
unsigned long, loff_t *, rwf_t);
extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
loff_t, size_t, unsigned int);
-extern int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in,
- struct inode *inode_out, loff_t pos_out,
- u64 *len, bool is_dedupe);
-extern int do_clone_file_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out, u64 len);
-extern int vfs_clone_file_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out, u64 len);
-extern int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
- struct inode *dest, loff_t destoff,
- loff_t len, bool *is_same);
+extern int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t *count,
+ unsigned int remap_flags);
+extern loff_t do_clone_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t len, unsigned int remap_flags);
+extern loff_t vfs_clone_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t len, unsigned int remap_flags);
extern int vfs_dedupe_file_range(struct file *file,
struct file_dedupe_range *same);
-extern int vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
- struct file *dst_file, loff_t dst_pos,
- u64 len);
+extern loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
+ struct file *dst_file, loff_t dst_pos,
+ loff_t len, unsigned int remap_flags);
struct super_operations {
extern int generic_file_mmap(struct file *, struct vm_area_struct *);
extern int generic_file_readonly_mmap(struct file *, struct vm_area_struct *);
extern ssize_t generic_write_checks(struct kiocb *, struct iov_iter *);
+extern int generic_remap_checks(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t *count, unsigned int remap_flags);
extern ssize_t generic_file_read_iter(struct kiocb *, struct iov_iter *);
extern ssize_t __generic_file_write_iter(struct kiocb *, struct iov_iter *);
extern ssize_t generic_file_write_iter(struct kiocb *, struct iov_iter *);
static inline void fscache_retrieval_complete(struct fscache_retrieval *op,
int n_pages)
{
- atomic_sub(n_pages, &op->n_pages);
- if (atomic_read(&op->n_pages) <= 0)
+ if (atomic_sub_return_relaxed(n_pages, &op->n_pages) <= 0)
fscache_op_complete(&op->op, false);
}
extern void return_to_handler(void);
extern int
-ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth,
- unsigned long frame_pointer, unsigned long *retp);
+function_graph_enter(unsigned long ret, unsigned long func,
+ unsigned long frame_pointer, unsigned long *retp);
unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx,
unsigned long ret, unsigned long *retp);
extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order,
struct vm_area_struct *vma, unsigned long addr,
int node, bool hugepage);
-#define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
+#define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
alloc_pages_vma(gfp_mask, order, vma, addr, numa_node_id(), true)
#else
#define alloc_pages(gfp_mask, order) \
alloc_pages_node(numa_node_id(), gfp_mask, order)
#define alloc_pages_vma(gfp_mask, order, vma, addr, node, false)\
alloc_pages(gfp_mask, order)
-#define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
+#define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
alloc_pages(gfp_mask, order)
#endif
#define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0)
* @attr_usage_id: Attribute usage id as per spec
* @report_id: Report id to look for
* @flag: Synchronous or asynchronous read
+* @is_signed: If true then fields < 32 bits will be sign-extended
*
* Issues a synchronous or asynchronous read request for an input attribute.
* Returns data upto 32 bits.
int sensor_hub_input_attr_get_raw_value(struct hid_sensor_hub_device *hsdev,
u32 usage_id,
u32 attr_usage_id, u32 report_id,
- enum sensor_hub_read_flags flag
+ enum sensor_hub_read_flags flag,
+ bool is_signed
);
/**
#define HID_GD_VBRZ 0x00010045
#define HID_GD_VNO 0x00010046
#define HID_GD_FEATURE 0x00010047
+#define HID_GD_RESOLUTION_MULTIPLIER 0x00010048
#define HID_GD_SYSTEM_CONTROL 0x00010080
#define HID_GD_UP 0x00010090
#define HID_GD_DOWN 0x00010091
#define HID_DC_BATTERYSTRENGTH 0x00060020
#define HID_CP_CONSUMER_CONTROL 0x000c0001
+#define HID_CP_AC_PAN 0x000c0238
#define HID_DG_DIGITIZER 0x000d0001
#define HID_DG_PEN 0x000d0002
#define HID_DG_LIGHTPEN 0x000d0003
#define HID_DG_TOUCHSCREEN 0x000d0004
#define HID_DG_TOUCHPAD 0x000d0005
+#define HID_DG_WHITEBOARD 0x000d0006
#define HID_DG_STYLUS 0x000d0020
#define HID_DG_PUCK 0x000d0021
#define HID_DG_FINGER 0x000d0022
*/
struct hid_collection {
+ struct hid_collection *parent;
unsigned type;
unsigned usage;
unsigned level;
unsigned hid; /* hid usage code */
unsigned collection_index; /* index into collection array */
unsigned usage_index; /* index into usage array */
+ __s8 resolution_multiplier;/* Effective Resolution Multiplier
+ (HUT v1.12, 4.3.1), default: 1 */
/* hidinput data */
+ __s8 wheel_factor; /* 120/resolution_multiplier */
__u16 code; /* input driver code */
__u8 type; /* input driver type */
__s8 hat_min; /* hat switch fun */
__s8 hat_max; /* ditto */
__s8 hat_dir; /* ditto */
+ __s16 wheel_accumulated; /* hi-res wheel */
};
struct hid_input;
unsigned int *collection_stack;
unsigned int collection_stack_ptr;
unsigned int collection_stack_size;
+ struct hid_collection *active_collection;
struct hid_device *device;
unsigned int scan_flags;
};
/* Applications from HID Usage Tables 4/8/99 Version 1.1 */
/* We ignore a few input applications that are not widely used */
-#define IS_INPUT_APPLICATION(a) (((a >= 0x00010000) && (a <= 0x00010008)) || (a == 0x00010080) || (a == 0x000c0001) || ((a >= 0x000d0002) && (a <= 0x000d0006)))
+#define IS_INPUT_APPLICATION(a) \
+ (((a >= HID_UP_GENDESK) && (a <= HID_GD_MULTIAXIS)) \
+ || ((a >= HID_DG_PEN) && (a <= HID_DG_WHITEBOARD)) \
+ || (a == HID_GD_SYSTEM_CONTROL) || (a == HID_CP_CONSUMER_CONTROL) \
+ || (a == HID_GD_WIRELESS_RADIO_CTLS))
/* HID core API */
unsigned int type, unsigned int id,
unsigned int field_index,
unsigned int report_counts);
+
+void hid_setup_resolution_multiplier(struct hid_device *hid);
int hid_open_report(struct hid_device *device);
int hid_check_keys_pressed(struct hid_device *hid);
int hid_connect(struct hid_device *hid, unsigned int connect_mask);
bool probe_done;
+ /*
+ * We must offload the handling of the primary/sub channels
+ * from the single-threaded vmbus_connection.work_queue to
+ * two different workqueue, otherwise we can block
+ * vmbus_connection.work_queue and hang: see vmbus_process_offer().
+ */
+ struct work_struct add_channel_work;
};
static inline bool is_hvsock_channel(const struct vmbus_channel *c)
#define PIT_LATCH ((PIT_TICK_RATE + HZ/2) / HZ)
extern raw_spinlock_t i8253_lock;
+extern bool i8253_clear_counter_on_shutdown;
extern struct clock_event_device i8253_clockevent;
extern void clockevent_i8253_init(bool oneshot);
unsigned long mr_v1_seen;
unsigned long mr_v2_seen;
unsigned long mr_maxdelay;
- unsigned char mr_qrv;
+ unsigned long mr_qi; /* Query Interval */
+ unsigned long mr_qri; /* Query Response Interval */
+ unsigned char mr_qrv; /* Query Robustness Variable */
unsigned char mr_gq_running;
unsigned char mr_ifc_count;
struct timer_list mr_gq_timer; /* general query timer */
#ifdef CONFIG_KEYS
+struct kernel_pkey_query;
+struct kernel_pkey_params;
+
/*
* key under-construction record
* - passed to the request_key actor if supplied
*/
struct key_restriction *(*lookup_restriction)(const char *params);
+ /* Asymmetric key accessor functions. */
+ int (*asym_query)(const struct kernel_pkey_params *params,
+ struct kernel_pkey_query *info);
+ int (*asym_eds_op)(struct kernel_pkey_params *params,
+ const void *in, void *out);
+ int (*asym_verify_signature)(struct kernel_pkey_params *params,
+ const void *in, const void *in2);
+
/* internal fields */
struct list_head link; /* link in types list */
struct lock_class_key lock_class; /* key->sem lock class */
--- /dev/null
+/* keyctl kernel bits
+ *
+ * Copyright (C) 2016 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#ifndef __LINUX_KEYCTL_H
+#define __LINUX_KEYCTL_H
+
+#include <uapi/linux/keyctl.h>
+
+struct kernel_pkey_query {
+ __u32 supported_ops; /* Which ops are supported */
+ __u32 key_size; /* Size of the key in bits */
+ __u16 max_data_size; /* Maximum size of raw data to sign in bytes */
+ __u16 max_sig_size; /* Maximum size of signature in bytes */
+ __u16 max_enc_size; /* Maximum size of encrypted blob in bytes */
+ __u16 max_dec_size; /* Maximum size of decrypted blob in bytes */
+};
+
+enum kernel_pkey_operation {
+ kernel_pkey_encrypt,
+ kernel_pkey_decrypt,
+ kernel_pkey_sign,
+ kernel_pkey_verify,
+};
+
+struct kernel_pkey_params {
+ struct key *key;
+ const char *encoding; /* Encoding (eg. "oaep" or "raw" for none) */
+ const char *hash_algo; /* Digest algorithm used (eg. "sha1") or NULL if N/A */
+ char *info; /* Modified info string to be released later */
+ __u32 in_len; /* Input data size */
+ union {
+ __u32 out_len; /* Output buffer size (enc/dec/sign) */
+ __u32 in2_len; /* 2nd input data size (verify) */
+ };
+ enum kernel_pkey_operation op : 8;
+};
+
+#endif /* __LINUX_KEYCTL_H */
u8 wq_signature[0x1];
u8 cont_srq[0x1];
- u8 dbr_umem_valid[0x1];
+ u8 reserved_at_22[0x1];
u8 rlky[0x1];
u8 basic_cyclic_rcv_wqe[0x1];
u8 log_rq_stride[0x3];
u8 xrcd[0x18];
u8 page_offset[0x6];
- u8 reserved_at_46[0x2];
+ u8 reserved_at_46[0x1];
+ u8 dbr_umem_valid[0x1];
u8 cqn[0x18];
u8 reserved_at_60[0x20];
struct mlx5_ifc_xrc_srqc_bits xrc_srq_context_entry;
- u8 reserved_at_280[0x40];
+ u8 reserved_at_280[0x60];
+
u8 xrc_srq_umem_valid[0x1];
- u8 reserved_at_2c1[0x5bf];
+ u8 reserved_at_2e1[0x1f];
+
+ u8 reserved_at_300[0x580];
u8 pas[0][0x40];
};
static inline void mm_inc_nr_puds(struct mm_struct *mm)
{
+ if (mm_pud_folded(mm))
+ return;
atomic_long_add(PTRS_PER_PUD * sizeof(pud_t), &mm->pgtables_bytes);
}
static inline void mm_dec_nr_puds(struct mm_struct *mm)
{
+ if (mm_pud_folded(mm))
+ return;
atomic_long_sub(PTRS_PER_PUD * sizeof(pud_t), &mm->pgtables_bytes);
}
#endif
static inline void mm_inc_nr_pmds(struct mm_struct *mm)
{
+ if (mm_pmd_folded(mm))
+ return;
atomic_long_add(PTRS_PER_PMD * sizeof(pmd_t), &mm->pgtables_bytes);
}
static inline void mm_dec_nr_pmds(struct mm_struct *mm)
{
+ if (mm_pmd_folded(mm))
+ return;
atomic_long_sub(PTRS_PER_PMD * sizeof(pmd_t), &mm->pgtables_bytes);
}
#endif
*/
static inline unsigned int nanddev_neraseblocks(const struct nand_device *nand)
{
- return (u64)nand->memorg.luns_per_target *
- nand->memorg.eraseblocks_per_lun *
- nand->memorg.pages_per_eraseblock;
+ return nand->memorg.ntargets * nand->memorg.luns_per_target *
+ nand->memorg.eraseblocks_per_lun;
}
/**
}
/**
- * nanddev_pos_next_eraseblock() - Move a position to the next page
+ * nanddev_pos_next_page() - Move a position to the next page
* @nand: NAND device
* @pos: the position to update
*
}
/* fall through */
case NET_DIM_START_MEASURE:
+ net_dim_sample(end_sample.event_ctr, end_sample.pkt_ctr, end_sample.byte_ctr,
+ &dim->start_sample);
dim->state = NET_DIM_MEASURE_IN_PROGRESS;
break;
case NET_DIM_APPLY_NEW_PROFILE:
#endif
}
+/* Variant of netdev_tx_sent_queue() for drivers that are aware
+ * that they should not test BQL status themselves.
+ * We do want to change __QUEUE_STATE_STACK_XOFF only for the last
+ * skb of a batch.
+ * Returns true if the doorbell must be used to kick the NIC.
+ */
+static inline bool __netdev_tx_sent_queue(struct netdev_queue *dev_queue,
+ unsigned int bytes,
+ bool xmit_more)
+{
+ if (xmit_more) {
+#ifdef CONFIG_BQL
+ dql_queued(&dev_queue->dql, bytes);
+#endif
+ return netif_tx_queue_stopped(dev_queue);
+ }
+ netdev_tx_sent_queue(dev_queue, bytes);
+ return true;
+}
+
/**
* netdev_sent_queue - report the number of bytes queued to hardware
* @dev: network device
extern ip_set_id_t ip_set_get_byname(struct net *net,
const char *name, struct ip_set **set);
extern void ip_set_put_byindex(struct net *net, ip_set_id_t index);
-extern const char *ip_set_name_byindex(struct net *net, ip_set_id_t index);
+extern void ip_set_name_byindex(struct net *net, ip_set_id_t index, char *name);
extern ip_set_id_t ip_set_nfnl_get_byindex(struct net *net, ip_set_id_t index);
extern void ip_set_nfnl_put(struct net *net, ip_set_id_t index);
rcu_assign_pointer(comment->c, c);
}
-/* Used only when dumping a set, protected by rcu_read_lock_bh() */
+/* Used only when dumping a set, protected by rcu_read_lock() */
static inline int
ip_set_put_comment(struct sk_buff *skb, const struct ip_set_comment *comment)
{
- struct ip_set_comment_rcu *c = rcu_dereference_bh(comment->c);
+ struct ip_set_comment_rcu *c = rcu_dereference(comment->c);
if (!c)
return 0;
struct nf_conntrack_tuple tuple;
};
+enum grep_conntrack {
+ GRE_CT_UNREPLIED,
+ GRE_CT_REPLIED,
+ GRE_CT_MAX
+};
+
+struct netns_proto_gre {
+ struct nf_proto_net nf;
+ rwlock_t keymap_lock;
+ struct list_head keymap_list;
+ unsigned int gre_timeouts[GRE_CT_MAX];
+};
+
/* add new tuple->key_reply pair to keymap */
int nf_ct_gre_keymap_add(struct nf_conn *ct, enum ip_conntrack_dir dir,
struct nf_conntrack_tuple *t);
void watchdog_nmi_stop(void);
void watchdog_nmi_start(void);
int watchdog_nmi_probe(void);
+int watchdog_nmi_enable(unsigned int cpu);
+void watchdog_nmi_disable(unsigned int cpu);
/**
* touch_nmi_watchdog - restart NMI watchdog timeout.
#ifdef CONFIG_TREE_SRCU
#define _SRCU_NOTIFIER_HEAD(name, mod) \
- static DEFINE_PER_CPU(struct srcu_data, \
- name##_head_srcu_data); \
+ static DEFINE_PER_CPU(struct srcu_data, name##_head_srcu_data); \
mod struct srcu_notifier_head name = \
SRCU_NOTIFIER_INIT(name, name##_head_srcu_data)
#define __DAVINCI_GPIO_PLATFORM_H
struct davinci_gpio_platform_data {
+ bool no_auto_base;
+ u32 base;
u32 ngpio;
u32 gpio_unbanked;
};
#ifndef _LINUX_PSI_H
#define _LINUX_PSI_H
+#include <linux/jump_label.h>
#include <linux/psi_types.h>
#include <linux/sched.h>
#ifdef CONFIG_PSI
-extern bool psi_disabled;
+extern struct static_key_false psi_disabled;
void psi_init(void);
*
* @buf_lock: spinlock to serialize access to @buf
* @buf: preallocated crash dump buffer
- * @bufsize: size of @buf available for crash dump writes
+ * @bufsize: size of @buf available for crash dump bytes (must match
+ * smallest number of bytes available for writing to a
+ * backend entry, since compressed bytes don't take kindly
+ * to being truncated)
*
* @read_mutex: serializes @open, @read, @close, and @erase callbacks
* @flags: bitfield of frontends the backend can accept writes for
#define PTRACE_MODE_NOAUDIT 0x04
#define PTRACE_MODE_FSCREDS 0x08
#define PTRACE_MODE_REALCREDS 0x10
-#define PTRACE_MODE_SCHED 0x20
-#define PTRACE_MODE_IBPB 0x40
/* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */
#define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS)
#define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS)
#define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS)
#define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS)
-#define PTRACE_MODE_SPEC_IBPB (PTRACE_MODE_ATTACH_REALCREDS | PTRACE_MODE_IBPB)
/**
* ptrace_may_access - check whether the caller is permitted to access
*/
extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
-/**
- * ptrace_may_access - check whether the caller is permitted to access
- * a target task.
- * @task: target task
- * @mode: selects type of access and caller credentials
- *
- * Returns true on success, false on denial.
- *
- * Similar to ptrace_may_access(). Only to be called from context switch
- * code. Does not call into audit and the regular LSM hooks due to locking
- * constraints.
- */
-extern bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode);
-
static inline int ptrace_reparented(struct task_struct *child)
{
return !same_thread_group(child->real_parent, child->parent);
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
/* Index of current stored address in ret_stack: */
int curr_ret_stack;
+ int curr_ret_depth;
/* Stack of return addresses for return function tracing: */
struct ftrace_ret_stack *ret_stack;
void *security;
#endif
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+ unsigned long lowest_stack;
+ unsigned long prev_lowest_stack;
+#endif
+
/*
* New fields for task_struct should be added above here, so that
* they are included in the randomized portion of task_struct.
#define PFA_SPREAD_SLAB 2 /* Spread some slab caches over cpuset */
#define PFA_SPEC_SSB_DISABLE 3 /* Speculative Store Bypass disabled */
#define PFA_SPEC_SSB_FORCE_DISABLE 4 /* Speculative Store Bypass force disabled*/
+#define PFA_SPEC_IB_DISABLE 5 /* Indirect branch speculation restricted */
+#define PFA_SPEC_IB_FORCE_DISABLE 6 /* Indirect branch speculation permanently restricted */
#define TASK_PFA_TEST(name, func) \
static inline bool task_##func(struct task_struct *p) \
TASK_PFA_TEST(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable)
TASK_PFA_SET(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable)
+TASK_PFA_TEST(SPEC_IB_DISABLE, spec_ib_disable)
+TASK_PFA_SET(SPEC_IB_DISABLE, spec_ib_disable)
+TASK_PFA_CLEAR(SPEC_IB_DISABLE, spec_ib_disable)
+
+TASK_PFA_TEST(SPEC_IB_FORCE_DISABLE, spec_ib_force_disable)
+TASK_PFA_SET(SPEC_IB_FORCE_DISABLE, spec_ib_force_disable)
+
static inline void
current_restore_flags(unsigned long orig_flags, unsigned long flags)
{
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_SCHED_SMT_H
+#define _LINUX_SCHED_SMT_H
+
+#include <linux/static_key.h>
+
+#ifdef CONFIG_SCHED_SMT
+extern struct static_key_false sched_smt_present;
+
+static __always_inline bool sched_smt_active(void)
+{
+ return static_branch_likely(&sched_smt_present);
+}
+#else
+static inline bool sched_smt_active(void) { return false; }
+#endif
+
+void arch_smt_update(void);
+
+#endif
*
* See the SFF-8472 specification and related documents for the definition
* of these structure members. This can be obtained from
- * ftp://ftp.seagate.com/sff
+ * https://www.snia.org/technology-communities/sff/specifications
*/
struct sfp_eeprom_id {
struct sfp_eeprom_base base;
}
}
+static inline void skb_zcopy_set_nouarg(struct sk_buff *skb, void *val)
+{
+ skb_shinfo(skb)->destructor_arg = (void *)((uintptr_t) val | 0x1UL);
+ skb_shinfo(skb)->tx_flags |= SKBTX_ZEROCOPY_FRAG;
+}
+
+static inline bool skb_zcopy_is_nouarg(struct sk_buff *skb)
+{
+ return (uintptr_t) skb_shinfo(skb)->destructor_arg & 0x1UL;
+}
+
+static inline void *skb_zcopy_get_nouarg(struct sk_buff *skb)
+{
+ return (void *)((uintptr_t) skb_shinfo(skb)->destructor_arg & ~0x1UL);
+}
+
/* Release a reference on a zerocopy structure */
static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy)
{
if (uarg->callback == sock_zerocopy_callback) {
uarg->zerocopy = uarg->zerocopy && zerocopy;
sock_zerocopy_put(uarg);
- } else {
+ } else if (!skb_zcopy_is_nouarg(skb)) {
uarg->callback(uarg, zerocopy);
}
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_STACKLEAK_H
+#define _LINUX_STACKLEAK_H
+
+#include <linux/sched.h>
+#include <linux/sched/task_stack.h>
+
+/*
+ * Check that the poison value points to the unused hole in the
+ * virtual memory map for your platform.
+ */
+#define STACKLEAK_POISON -0xBEEF
+#define STACKLEAK_SEARCH_DEPTH 128
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+#include <asm/stacktrace.h>
+
+static inline void stackleak_task_init(struct task_struct *t)
+{
+ t->lowest_stack = (unsigned long)end_of_stack(t) + sizeof(unsigned long);
+# ifdef CONFIG_STACKLEAK_METRICS
+ t->prev_lowest_stack = t->lowest_stack;
+# endif
+}
+
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+int stack_erasing_sysctl(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos);
+#endif
+
+#else /* !CONFIG_GCC_PLUGIN_STACKLEAK */
+static inline void stackleak_task_init(struct task_struct *t) { }
+#endif
+
+#endif
u8 Ksess[GSS_KRB5_MAX_KEYLEN]; /* session key */
u8 cksum[GSS_KRB5_MAX_KEYLEN];
s32 endtime;
- u32 seq_send;
- u64 seq_send64;
+ atomic_t seq_send;
+ atomic64_t seq_send64;
struct xdr_netobj mech_used;
u8 initiator_sign[GSS_KRB5_MAX_KEYLEN];
u8 acceptor_sign[GSS_KRB5_MAX_KEYLEN];
u8 acceptor_integ[GSS_KRB5_MAX_KEYLEN];
};
-extern u32 gss_seq_send_fetch_and_inc(struct krb5_ctx *ctx);
-extern u64 gss_seq_send64_fetch_and_inc(struct krb5_ctx *ctx);
-
/* The length of the Kerberos GSS token header */
#define GSS_KRB5_TOK_HDR_LEN (16)
buf->head[0].iov_base = start;
buf->head[0].iov_len = len;
buf->tail[0].iov_len = 0;
- buf->bvec = NULL;
buf->pages = NULL;
buf->page_len = 0;
buf->flags = 0;
u32 rcv_tstamp; /* timestamp of last received ACK (for keepalives) */
u32 lsndtime; /* timestamp of last sent data packet (for restart window) */
u32 last_oow_ack_time; /* timestamp of last out-of-window ACK */
+ u32 compressed_ack_rcv_nxt;
u32 tsoffset; /* timestamp offset */
* tracehook_report_syscall_entry - task is about to attempt a system call
* @regs: user register state of current task
*
- * This will be called if %TIF_SYSCALL_TRACE has been set, when the
- * current task has just entered the kernel for a system call.
+ * This will be called if %TIF_SYSCALL_TRACE or %TIF_SYSCALL_EMU have been set,
+ * when the current task has just entered the kernel for a system call.
* Full user register state is available here. Changing the values
* in @regs can affect the system call number and arguments to be tried.
* It is safe to block here, preventing the system call from beginning.
struct tracepoint_func *it_func_ptr; \
void *it_func; \
void *__data; \
- int __maybe_unused idx = 0; \
+ int __maybe_unused __idx = 0; \
\
if (!(cond)) \
return; \
* doesn't work from the idle path. \
*/ \
if (rcuidle) { \
- idx = srcu_read_lock_notrace(&tracepoint_srcu); \
+ __idx = srcu_read_lock_notrace(&tracepoint_srcu);\
rcu_irq_enter_irqson(); \
} \
\
\
if (rcuidle) { \
rcu_irq_exit_irqson(); \
- srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
+ srcu_read_unlock_notrace(&tracepoint_srcu, __idx);\
} \
\
preempt_enable_notrace(); \
extern void tty_release_struct(struct tty_struct *tty, int idx);
extern int tty_release(struct inode *inode, struct file *filp);
extern void tty_init_termios(struct tty_struct *tty);
+extern void tty_save_termios(struct tty_struct *tty);
extern int tty_standard_install(struct tty_driver *driver,
struct tty_struct *tty);
size_t iov_len;
};
-enum {
+enum iter_type {
ITER_IOVEC = 0,
ITER_KVEC = 2,
ITER_BVEC = 4,
ITER_PIPE = 8,
+ ITER_DISCARD = 16,
};
struct iov_iter {
- int type;
+ unsigned int type;
size_t iov_offset;
size_t count;
union {
};
};
+static inline enum iter_type iov_iter_type(const struct iov_iter *i)
+{
+ return i->type & ~(READ | WRITE);
+}
+
+static inline bool iter_is_iovec(const struct iov_iter *i)
+{
+ return iov_iter_type(i) == ITER_IOVEC;
+}
+
+static inline bool iov_iter_is_kvec(const struct iov_iter *i)
+{
+ return iov_iter_type(i) == ITER_KVEC;
+}
+
+static inline bool iov_iter_is_bvec(const struct iov_iter *i)
+{
+ return iov_iter_type(i) == ITER_BVEC;
+}
+
+static inline bool iov_iter_is_pipe(const struct iov_iter *i)
+{
+ return iov_iter_type(i) == ITER_PIPE;
+}
+
+static inline bool iov_iter_is_discard(const struct iov_iter *i)
+{
+ return iov_iter_type(i) == ITER_DISCARD;
+}
+
+static inline unsigned char iov_iter_rw(const struct iov_iter *i)
+{
+ return i->type & (READ | WRITE);
+}
+
/*
* Total number of bytes covered by an iovec.
*
}
#define iov_for_each(iov, iter, start) \
- if (!((start).type & (ITER_BVEC | ITER_PIPE))) \
+ if (iov_iter_type(start) == ITER_IOVEC || \
+ iov_iter_type(start) == ITER_KVEC) \
for (iter = (start); \
(iter).count && \
((iov = iov_iter_iovec(&(iter))), 1); \
size_t iov_iter_zero(size_t bytes, struct iov_iter *);
unsigned long iov_iter_alignment(const struct iov_iter *i);
unsigned long iov_iter_gap_alignment(const struct iov_iter *i);
-void iov_iter_init(struct iov_iter *i, int direction, const struct iovec *iov,
+void iov_iter_init(struct iov_iter *i, unsigned int direction, const struct iovec *iov,
unsigned long nr_segs, size_t count);
-void iov_iter_kvec(struct iov_iter *i, int direction, const struct kvec *kvec,
+void iov_iter_kvec(struct iov_iter *i, unsigned int direction, const struct kvec *kvec,
unsigned long nr_segs, size_t count);
-void iov_iter_bvec(struct iov_iter *i, int direction, const struct bio_vec *bvec,
+void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_vec *bvec,
unsigned long nr_segs, size_t count);
-void iov_iter_pipe(struct iov_iter *i, int direction, struct pipe_inode_info *pipe,
+void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe,
size_t count);
+void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count);
ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages,
size_t maxsize, unsigned maxpages, size_t *start);
ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages,
return i->count;
}
-static inline bool iter_is_iovec(const struct iov_iter *i)
-{
- return !(i->type & (ITER_BVEC | ITER_KVEC | ITER_PIPE));
-}
-
-/*
- * Get one of READ or WRITE out of iter->type without any other flags OR'd in
- * with it.
- *
- * The ?: is just for type safety.
- */
-#define iov_iter_rw(i) ((0 ? (struct iov_iter *)0 : (i))->type & (READ | WRITE))
-
/*
* Cap the iov_iter by given limit; note that the second argument is
* *not* the new size - it's upper limit for such. Passing it a value
};
int __usb_get_extra_descriptor(char *buffer, unsigned size,
- unsigned char type, void **ptr);
+ unsigned char type, void **ptr, size_t min);
#define usb_get_extra_descriptor(ifpoint, type, ptr) \
__usb_get_extra_descriptor((ifpoint)->extra, \
(ifpoint)->extralen, \
- type, (void **)ptr)
+ type, (void **)ptr, sizeof(**(ptr)))
/* ----------------------------------------------------------------------- */
/* Device needs a pause after every control message. */
#define USB_QUIRK_DELAY_CTRL_MSG BIT(13)
+/* Hub needs extra delay after resetting its port. */
+#define USB_QUIRK_HUB_SLOW_RESET BIT(14)
+
#endif /* __LINUX_USB_QUIRKS_H */
*
* @bio is a part of the writeback in progress controlled by @wbc. Perform
* writeback specific initialization. This is used to apply the cgroup
- * writeback context. Must be called after the bio has been associated with
- * a device.
+ * writeback context.
*/
static inline void wbc_init_bio(struct writeback_control *wbc, struct bio *bio)
{
* regular writeback instead of writing things out itself.
*/
if (wbc->wb)
- bio_associate_blkg_from_css(bio, wbc->wb->blkcg_css);
+ bio_associate_blkcg(bio, wbc->wb->blkcg_css);
}
#else /* CONFIG_CGROUP_WRITEBACK */
void xa_init_flags(struct xarray *, gfp_t flags);
void *xa_load(struct xarray *, unsigned long index);
void *xa_store(struct xarray *, unsigned long index, void *entry, gfp_t);
-void *xa_cmpxchg(struct xarray *, unsigned long index,
- void *old, void *entry, gfp_t);
-int xa_reserve(struct xarray *, unsigned long index, gfp_t);
+void *xa_erase(struct xarray *, unsigned long index);
void *xa_store_range(struct xarray *, unsigned long first, unsigned long last,
void *entry, gfp_t);
bool xa_get_mark(struct xarray *, unsigned long index, xa_mark_t);
return xa->xa_flags & XA_FLAGS_MARK(mark);
}
-/**
- * xa_erase() - Erase this entry from the XArray.
- * @xa: XArray.
- * @index: Index of entry.
- *
- * This function is the equivalent of calling xa_store() with %NULL as
- * the third argument. The XArray does not need to allocate memory, so
- * the user does not need to provide GFP flags.
- *
- * Context: Process context. Takes and releases the xa_lock.
- * Return: The entry which used to be at this index.
- */
-static inline void *xa_erase(struct xarray *xa, unsigned long index)
-{
- return xa_store(xa, index, NULL, 0);
-}
-
-/**
- * xa_insert() - Store this entry in the XArray unless another entry is
- * already present.
- * @xa: XArray.
- * @index: Index into array.
- * @entry: New entry.
- * @gfp: Memory allocation flags.
- *
- * If you would rather see the existing entry in the array, use xa_cmpxchg().
- * This function is for users who don't care what the entry is, only that
- * one is present.
- *
- * Context: Process context. Takes and releases the xa_lock.
- * May sleep if the @gfp flags permit.
- * Return: 0 if the store succeeded. -EEXIST if another entry was present.
- * -ENOMEM if memory could not be allocated.
- */
-static inline int xa_insert(struct xarray *xa, unsigned long index,
- void *entry, gfp_t gfp)
-{
- void *curr = xa_cmpxchg(xa, index, NULL, entry, gfp);
- if (!curr)
- return 0;
- if (xa_is_err(curr))
- return xa_err(curr);
- return -EEXIST;
-}
-
-/**
- * xa_release() - Release a reserved entry.
- * @xa: XArray.
- * @index: Index of entry.
- *
- * After calling xa_reserve(), you can call this function to release the
- * reservation. If the entry at @index has been stored to, this function
- * will do nothing.
- */
-static inline void xa_release(struct xarray *xa, unsigned long index)
-{
- xa_cmpxchg(xa, index, NULL, NULL, 0);
-}
-
/**
* xa_for_each() - Iterate over a portion of an XArray.
* @xa: XArray.
void *__xa_cmpxchg(struct xarray *, unsigned long index, void *old,
void *entry, gfp_t);
int __xa_alloc(struct xarray *, u32 *id, u32 max, void *entry, gfp_t);
+int __xa_reserve(struct xarray *, unsigned long index, gfp_t);
void __xa_set_mark(struct xarray *, unsigned long index, xa_mark_t);
void __xa_clear_mark(struct xarray *, unsigned long index, xa_mark_t);
return -EEXIST;
}
+/**
+ * xa_store_bh() - Store this entry in the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @entry: New entry.
+ * @gfp: Memory allocation flags.
+ *
+ * This function is like calling xa_store() except it disables softirqs
+ * while holding the array lock.
+ *
+ * Context: Any context. Takes and releases the xa_lock while
+ * disabling softirqs.
+ * Return: The entry which used to be at this index.
+ */
+static inline void *xa_store_bh(struct xarray *xa, unsigned long index,
+ void *entry, gfp_t gfp)
+{
+ void *curr;
+
+ xa_lock_bh(xa);
+ curr = __xa_store(xa, index, entry, gfp);
+ xa_unlock_bh(xa);
+
+ return curr;
+}
+
+/**
+ * xa_store_irq() - Erase this entry from the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @entry: New entry.
+ * @gfp: Memory allocation flags.
+ *
+ * This function is like calling xa_store() except it disables interrupts
+ * while holding the array lock.
+ *
+ * Context: Process context. Takes and releases the xa_lock while
+ * disabling interrupts.
+ * Return: The entry which used to be at this index.
+ */
+static inline void *xa_store_irq(struct xarray *xa, unsigned long index,
+ void *entry, gfp_t gfp)
+{
+ void *curr;
+
+ xa_lock_irq(xa);
+ curr = __xa_store(xa, index, entry, gfp);
+ xa_unlock_irq(xa);
+
+ return curr;
+}
+
/**
* xa_erase_bh() - Erase this entry from the XArray.
* @xa: XArray.
* the third argument. The XArray does not need to allocate memory, so
* the user does not need to provide GFP flags.
*
- * Context: Process context. Takes and releases the xa_lock while
+ * Context: Any context. Takes and releases the xa_lock while
* disabling softirqs.
* Return: The entry which used to be at this index.
*/
return entry;
}
+/**
+ * xa_cmpxchg() - Conditionally replace an entry in the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @old: Old value to test against.
+ * @entry: New value to place in array.
+ * @gfp: Memory allocation flags.
+ *
+ * If the entry at @index is the same as @old, replace it with @entry.
+ * If the return value is equal to @old, then the exchange was successful.
+ *
+ * Context: Any context. Takes and releases the xa_lock. May sleep
+ * if the @gfp flags permit.
+ * Return: The old value at this index or xa_err() if an error happened.
+ */
+static inline void *xa_cmpxchg(struct xarray *xa, unsigned long index,
+ void *old, void *entry, gfp_t gfp)
+{
+ void *curr;
+
+ xa_lock(xa);
+ curr = __xa_cmpxchg(xa, index, old, entry, gfp);
+ xa_unlock(xa);
+
+ return curr;
+}
+
+/**
+ * xa_insert() - Store this entry in the XArray unless another entry is
+ * already present.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @entry: New entry.
+ * @gfp: Memory allocation flags.
+ *
+ * If you would rather see the existing entry in the array, use xa_cmpxchg().
+ * This function is for users who don't care what the entry is, only that
+ * one is present.
+ *
+ * Context: Process context. Takes and releases the xa_lock.
+ * May sleep if the @gfp flags permit.
+ * Return: 0 if the store succeeded. -EEXIST if another entry was present.
+ * -ENOMEM if memory could not be allocated.
+ */
+static inline int xa_insert(struct xarray *xa, unsigned long index,
+ void *entry, gfp_t gfp)
+{
+ void *curr = xa_cmpxchg(xa, index, NULL, entry, gfp);
+ if (!curr)
+ return 0;
+ if (xa_is_err(curr))
+ return xa_err(curr);
+ return -EEXIST;
+}
+
/**
* xa_alloc() - Find somewhere to store this entry in the XArray.
* @xa: XArray.
* Updates the @id pointer with the index, then stores the entry at that
* index. A concurrent lookup will not see an uninitialised @id.
*
- * Context: Process context. Takes and releases the xa_lock while
+ * Context: Any context. Takes and releases the xa_lock while
* disabling softirqs. May sleep if the @gfp flags permit.
* Return: 0 on success, -ENOMEM if memory allocation fails or -ENOSPC if
* there is no more space in the XArray.
return err;
}
+/**
+ * xa_reserve() - Reserve this index in the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @gfp: Memory allocation flags.
+ *
+ * Ensures there is somewhere to store an entry at @index in the array.
+ * If there is already something stored at @index, this function does
+ * nothing. If there was nothing there, the entry is marked as reserved.
+ * Loading from a reserved entry returns a %NULL pointer.
+ *
+ * If you do not use the entry that you have reserved, call xa_release()
+ * or xa_erase() to free any unnecessary memory.
+ *
+ * Context: Any context. Takes and releases the xa_lock.
+ * May sleep if the @gfp flags permit.
+ * Return: 0 if the reservation succeeded or -ENOMEM if it failed.
+ */
+static inline
+int xa_reserve(struct xarray *xa, unsigned long index, gfp_t gfp)
+{
+ int ret;
+
+ xa_lock(xa);
+ ret = __xa_reserve(xa, index, gfp);
+ xa_unlock(xa);
+
+ return ret;
+}
+
+/**
+ * xa_reserve_bh() - Reserve this index in the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @gfp: Memory allocation flags.
+ *
+ * A softirq-disabling version of xa_reserve().
+ *
+ * Context: Any context. Takes and releases the xa_lock while
+ * disabling softirqs.
+ * Return: 0 if the reservation succeeded or -ENOMEM if it failed.
+ */
+static inline
+int xa_reserve_bh(struct xarray *xa, unsigned long index, gfp_t gfp)
+{
+ int ret;
+
+ xa_lock_bh(xa);
+ ret = __xa_reserve(xa, index, gfp);
+ xa_unlock_bh(xa);
+
+ return ret;
+}
+
+/**
+ * xa_reserve_irq() - Reserve this index in the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @gfp: Memory allocation flags.
+ *
+ * An interrupt-disabling version of xa_reserve().
+ *
+ * Context: Process context. Takes and releases the xa_lock while
+ * disabling interrupts.
+ * Return: 0 if the reservation succeeded or -ENOMEM if it failed.
+ */
+static inline
+int xa_reserve_irq(struct xarray *xa, unsigned long index, gfp_t gfp)
+{
+ int ret;
+
+ xa_lock_irq(xa);
+ ret = __xa_reserve(xa, index, gfp);
+ xa_unlock_irq(xa);
+
+ return ret;
+}
+
+/**
+ * xa_release() - Release a reserved entry.
+ * @xa: XArray.
+ * @index: Index of entry.
+ *
+ * After calling xa_reserve(), you can call this function to release the
+ * reservation. If the entry at @index has been stored to, this function
+ * will do nothing.
+ */
+static inline void xa_release(struct xarray *xa, unsigned long index)
+{
+ xa_cmpxchg(xa, index, NULL, NULL, 0);
+}
+
/* Everything below here is the Advanced API. Proceed with caution. */
/*
unsigned int access_count;
struct list_head objects;
unsigned int num_incomplete_objects;
- struct wait_queue_head poll_wait;
+ wait_queue_head_t poll_wait;
spinlock_t lock;
};
/* v4l2 request helper */
-void vb2_m2m_request_queue(struct media_request *req);
+void v4l2_m2m_request_queue(struct media_request *req);
/* v4l2 ioctl helpers */
const struct in6_addr *addr);
bool ipv6_chk_acast_addr_src(struct net *net, struct net_device *dev,
const struct in6_addr *addr);
+int ipv6_anycast_init(void);
+void ipv6_anycast_cleanup(void);
/* Device notifier */
int register_inet6addr_notifier(struct notifier_block *nb);
struct sockaddr_rxrpc *, struct key *);
int rxrpc_kernel_check_call(struct socket *, struct rxrpc_call *,
enum rxrpc_call_completion *, u32 *);
-u32 rxrpc_kernel_check_life(struct socket *, struct rxrpc_call *);
+u32 rxrpc_kernel_check_life(const struct socket *, const struct rxrpc_call *);
+void rxrpc_kernel_probe_life(struct socket *, struct rxrpc_call *);
u32 rxrpc_kernel_get_epoch(struct socket *, struct rxrpc_call *);
bool rxrpc_kernel_get_reply_time(struct socket *, struct rxrpc_call *,
ktime_t *);
void unix_gc(void);
void wait_for_unix_gc(void);
struct sock *unix_get_socket(struct file *filp);
-struct sock *unix_peer_get(struct sock *);
+struct sock *unix_peer_get(struct sock *sk);
#define UNIX_HASH_SIZE 256
#define UNIX_HASH_BITS 8
u32 consumed;
} __randomize_layout;
-#define UNIXCB(skb) (*(struct unix_skb_parms *)&((skb)->cb))
+#define UNIXCB(skb) (*(struct unix_skb_parms *)&((skb)->cb))
#define unix_state_lock(s) spin_lock(&unix_sk(s)->lock)
#define unix_state_unlock(s) spin_unlock(&unix_sk(s)->lock)
struct in6_addr aca_addr;
struct fib6_info *aca_rt;
struct ifacaddr6 *aca_next;
+ struct hlist_node aca_addr_lst;
int aca_users;
refcount_t aca_refcnt;
unsigned long aca_cstamp;
unsigned long aca_tstamp;
+ struct rcu_head rcu;
};
#define IFA_HOST IPV6_ADDR_LOOPBACK
static inline int neigh_hh_output(const struct hh_cache *hh, struct sk_buff *skb)
{
+ unsigned int hh_alen = 0;
unsigned int seq;
unsigned int hh_len;
seq = read_seqbegin(&hh->hh_lock);
hh_len = hh->hh_len;
if (likely(hh_len <= HH_DATA_MOD)) {
- /* this is inlined by gcc */
- memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);
+ hh_alen = HH_DATA_MOD;
+
+ /* skb_push() would proceed silently if we have room for
+ * the unaligned size but not for the aligned size:
+ * check headroom explicitly.
+ */
+ if (likely(skb_headroom(skb) >= HH_DATA_MOD)) {
+ /* this is inlined by gcc */
+ memcpy(skb->data - HH_DATA_MOD, hh->hh_data,
+ HH_DATA_MOD);
+ }
} else {
- unsigned int hh_alen = HH_DATA_ALIGN(hh_len);
+ hh_alen = HH_DATA_ALIGN(hh_len);
- memcpy(skb->data - hh_alen, hh->hh_data, hh_alen);
+ if (likely(skb_headroom(skb) >= hh_alen)) {
+ memcpy(skb->data - hh_alen, hh->hh_data,
+ hh_alen);
+ }
}
} while (read_seqretry(&hh->hh_lock, seq));
- skb_push(skb, hh_len);
+ if (WARN_ON_ONCE(skb_headroom(skb) < hh_alen)) {
+ kfree_skb(skb);
+ return NET_XMIT_DROP;
+ }
+
+ __skb_push(skb, hh_len);
return dev_queue_xmit(skb);
}
const struct nf_nat_range2 *range,
const struct net_device *out);
-void nf_nat_masquerade_ipv4_register_notifier(void);
+int nf_nat_masquerade_ipv4_register_notifier(void);
void nf_nat_masquerade_ipv4_unregister_notifier(void);
#endif /*_NF_NAT_MASQUERADE_IPV4_H_ */
unsigned int
nf_nat_masquerade_ipv6(struct sk_buff *skb, const struct nf_nat_range2 *range,
const struct net_device *out);
-void nf_nat_masquerade_ipv6_register_notifier(void);
+int nf_nat_masquerade_ipv6_register_notifier(void);
void nf_nat_masquerade_ipv6_unregister_notifier(void);
#endif /* _NF_NAT_MASQUERADE_IPV6_H_ */
const char *fmt, ...) { }
#endif /* CONFIG_SYSCTL */
+static inline struct nf_generic_net *nf_generic_pernet(struct net *net)
+{
+ return &net->ct.nf_ct_proto.generic;
+}
+
+static inline struct nf_tcp_net *nf_tcp_pernet(struct net *net)
+{
+ return &net->ct.nf_ct_proto.tcp;
+}
+
+static inline struct nf_udp_net *nf_udp_pernet(struct net *net)
+{
+ return &net->ct.nf_ct_proto.udp;
+}
+
+static inline struct nf_icmp_net *nf_icmp_pernet(struct net *net)
+{
+ return &net->ct.nf_ct_proto.icmp;
+}
+
+static inline struct nf_icmp_net *nf_icmpv6_pernet(struct net *net)
+{
+ return &net->ct.nf_ct_proto.icmpv6;
+}
+
+#ifdef CONFIG_NF_CT_PROTO_DCCP
+static inline struct nf_dccp_net *nf_dccp_pernet(struct net *net)
+{
+ return &net->ct.nf_ct_proto.dccp;
+}
+#endif
+
+#ifdef CONFIG_NF_CT_PROTO_SCTP
+static inline struct nf_sctp_net *nf_sctp_pernet(struct net *net)
+{
+ return &net->ct.nf_ct_proto.sctp;
+}
+#endif
+
#endif /*_NF_CONNTRACK_PROTOCOL_H*/
SCTP_DEFAULT_MINSEGMENT));
}
+static inline bool sctp_transport_pmtu_check(struct sctp_transport *t)
+{
+ __u32 pmtu = sctp_dst_mtu(t->dst);
+
+ if (t->pathmtu == pmtu)
+ return true;
+
+ t->pathmtu = pmtu;
+
+ return false;
+}
+
+static inline __u32 sctp_min_frag_point(struct sctp_sock *sp, __u16 datasize)
+{
+ return sctp_mtu_payload(sp, SCTP_DEFAULT_MINSEGMENT, datasize);
+}
+
#endif /* __net_sctp_h__ */
__u64 abandoned_unsent[SCTP_PR_INDEX(MAX) + 1];
__u64 abandoned_sent[SCTP_PR_INDEX(MAX) + 1];
+
+ struct rcu_head rcu;
};
static inline int snd_interval_single(const struct snd_interval *i)
{
return (i->min == i->max ||
- (i->min + 1 == i->max && i->openmax));
+ (i->min + 1 == i->max && (i->openmin || i->openmax)));
}
static inline int snd_interval_value(const struct snd_interval *i)
{
+ if (i->openmin && !i->openmax)
+ return i->max;
return i->min;
}
((i) < rtd->num_codecs) && ((dai) = rtd->codec_dais[i]); \
(i)++)
#define for_each_rtd_codec_dai_rollback(rtd, i, dai) \
- for (; ((i--) >= 0) && ((dai) = rtd->codec_dais[i]);)
+ for (; ((--i) >= 0) && ((dai) = rtd->codec_dais[i]);)
/* mixer control */
afs_FS_StoreData64 = 65538, /* AFS Store file data */
afs_FS_GiveUpAllCallBacks = 65539, /* AFS Give up all our callbacks on a server */
afs_FS_GetCapabilities = 65540, /* AFS Get FS server capabilities */
+
+ yfs_FS_FetchData = 130, /* YFS Fetch file data */
+ yfs_FS_FetchACL = 64131, /* YFS Fetch file ACL */
+ yfs_FS_FetchStatus = 64132, /* YFS Fetch file status */
+ yfs_FS_StoreACL = 64134, /* YFS Store file ACL */
+ yfs_FS_StoreStatus = 64135, /* YFS Store file status */
+ yfs_FS_RemoveFile = 64136, /* YFS Remove a file */
+ yfs_FS_CreateFile = 64137, /* YFS Create a file */
+ yfs_FS_Rename = 64138, /* YFS Rename or move a file or directory */
+ yfs_FS_Symlink = 64139, /* YFS Create a symbolic link */
+ yfs_FS_Link = 64140, /* YFS Create a hard link */
+ yfs_FS_MakeDir = 64141, /* YFS Create a directory */
+ yfs_FS_RemoveDir = 64142, /* YFS Remove a directory */
+ yfs_FS_GetVolumeStatus = 64149, /* YFS Get volume status information */
+ yfs_FS_SetVolumeStatus = 64150, /* YFS Set volume status information */
+ yfs_FS_SetLock = 64156, /* YFS Request a file lock */
+ yfs_FS_ExtendLock = 64157, /* YFS Extend a file lock */
+ yfs_FS_ReleaseLock = 64158, /* YFS Release a file lock */
+ yfs_FS_Lookup = 64161, /* YFS lookup file in directory */
+ yfs_FS_FlushCPS = 64165,
+ yfs_FS_FetchOpaqueACL = 64168,
+ yfs_FS_WhoAmI = 64170,
+ yfs_FS_RemoveACL = 64171,
+ yfs_FS_RemoveFile2 = 64173,
+ yfs_FS_StoreOpaqueACL2 = 64174,
+ yfs_FS_InlineBulkStatus = 64536, /* YFS Fetch multiple file statuses with errors */
+ yfs_FS_FetchData64 = 64537, /* YFS Fetch file data */
+ yfs_FS_StoreData64 = 64538, /* YFS Store file data */
+ yfs_FS_UpdateSymlink = 64540,
};
enum afs_vl_operation {
afs_edit_dir_for_unlink,
};
+enum afs_eproto_cause {
+ afs_eproto_bad_status,
+ afs_eproto_cb_count,
+ afs_eproto_cb_fid_count,
+ afs_eproto_file_type,
+ afs_eproto_ibulkst_cb_count,
+ afs_eproto_ibulkst_count,
+ afs_eproto_motd_len,
+ afs_eproto_offline_msg_len,
+ afs_eproto_volname_len,
+ afs_eproto_yvl_fsendpt4_len,
+ afs_eproto_yvl_fsendpt6_len,
+ afs_eproto_yvl_fsendpt_num,
+ afs_eproto_yvl_fsendpt_type,
+ afs_eproto_yvl_vlendpt4_len,
+ afs_eproto_yvl_vlendpt6_len,
+ afs_eproto_yvl_vlendpt_type,
+};
+
+enum afs_io_error {
+ afs_io_error_cm_reply,
+ afs_io_error_extract,
+ afs_io_error_fs_probe_fail,
+ afs_io_error_vl_lookup_fail,
+ afs_io_error_vl_probe_fail,
+};
+
+enum afs_file_error {
+ afs_file_error_dir_bad_magic,
+ afs_file_error_dir_big,
+ afs_file_error_dir_missing_page,
+ afs_file_error_dir_over_end,
+ afs_file_error_dir_small,
+ afs_file_error_dir_unmarked_ext,
+ afs_file_error_mntpt,
+ afs_file_error_writeback_fail,
+};
+
#endif /* end __AFS_DECLARE_TRACE_ENUMS_ONCE_ONLY */
/*
EM(afs_FS_FetchData64, "FS.FetchData64") \
EM(afs_FS_StoreData64, "FS.StoreData64") \
EM(afs_FS_GiveUpAllCallBacks, "FS.GiveUpAllCallBacks") \
- E_(afs_FS_GetCapabilities, "FS.GetCapabilities")
+ EM(afs_FS_GetCapabilities, "FS.GetCapabilities") \
+ EM(yfs_FS_FetchACL, "YFS.FetchACL") \
+ EM(yfs_FS_FetchStatus, "YFS.FetchStatus") \
+ EM(yfs_FS_StoreACL, "YFS.StoreACL") \
+ EM(yfs_FS_StoreStatus, "YFS.StoreStatus") \
+ EM(yfs_FS_RemoveFile, "YFS.RemoveFile") \
+ EM(yfs_FS_CreateFile, "YFS.CreateFile") \
+ EM(yfs_FS_Rename, "YFS.Rename") \
+ EM(yfs_FS_Symlink, "YFS.Symlink") \
+ EM(yfs_FS_Link, "YFS.Link") \
+ EM(yfs_FS_MakeDir, "YFS.MakeDir") \
+ EM(yfs_FS_RemoveDir, "YFS.RemoveDir") \
+ EM(yfs_FS_GetVolumeStatus, "YFS.GetVolumeStatus") \
+ EM(yfs_FS_SetVolumeStatus, "YFS.SetVolumeStatus") \
+ EM(yfs_FS_SetLock, "YFS.SetLock") \
+ EM(yfs_FS_ExtendLock, "YFS.ExtendLock") \
+ EM(yfs_FS_ReleaseLock, "YFS.ReleaseLock") \
+ EM(yfs_FS_Lookup, "YFS.Lookup") \
+ EM(yfs_FS_FlushCPS, "YFS.FlushCPS") \
+ EM(yfs_FS_FetchOpaqueACL, "YFS.FetchOpaqueACL") \
+ EM(yfs_FS_WhoAmI, "YFS.WhoAmI") \
+ EM(yfs_FS_RemoveACL, "YFS.RemoveACL") \
+ EM(yfs_FS_RemoveFile2, "YFS.RemoveFile2") \
+ EM(yfs_FS_StoreOpaqueACL2, "YFS.StoreOpaqueACL2") \
+ EM(yfs_FS_InlineBulkStatus, "YFS.InlineBulkStatus") \
+ EM(yfs_FS_FetchData64, "YFS.FetchData64") \
+ EM(yfs_FS_StoreData64, "YFS.StoreData64") \
+ E_(yfs_FS_UpdateSymlink, "YFS.UpdateSymlink")
#define afs_vl_operations \
EM(afs_VL_GetEntryByNameU, "VL.GetEntryByNameU") \
EM(afs_edit_dir_for_symlink, "Symlnk") \
E_(afs_edit_dir_for_unlink, "Unlink")
+#define afs_eproto_causes \
+ EM(afs_eproto_bad_status, "BadStatus") \
+ EM(afs_eproto_cb_count, "CbCount") \
+ EM(afs_eproto_cb_fid_count, "CbFidCount") \
+ EM(afs_eproto_file_type, "FileTYpe") \
+ EM(afs_eproto_ibulkst_cb_count, "IBS.CbCount") \
+ EM(afs_eproto_ibulkst_count, "IBS.FidCount") \
+ EM(afs_eproto_motd_len, "MotdLen") \
+ EM(afs_eproto_offline_msg_len, "OfflineMsgLen") \
+ EM(afs_eproto_volname_len, "VolNameLen") \
+ EM(afs_eproto_yvl_fsendpt4_len, "YVL.FsEnd4Len") \
+ EM(afs_eproto_yvl_fsendpt6_len, "YVL.FsEnd6Len") \
+ EM(afs_eproto_yvl_fsendpt_num, "YVL.FsEndCount") \
+ EM(afs_eproto_yvl_fsendpt_type, "YVL.FsEndType") \
+ EM(afs_eproto_yvl_vlendpt4_len, "YVL.VlEnd4Len") \
+ EM(afs_eproto_yvl_vlendpt6_len, "YVL.VlEnd6Len") \
+ E_(afs_eproto_yvl_vlendpt_type, "YVL.VlEndType")
+
+#define afs_io_errors \
+ EM(afs_io_error_cm_reply, "CM_REPLY") \
+ EM(afs_io_error_extract, "EXTRACT") \
+ EM(afs_io_error_fs_probe_fail, "FS_PROBE_FAIL") \
+ EM(afs_io_error_vl_lookup_fail, "VL_LOOKUP_FAIL") \
+ E_(afs_io_error_vl_probe_fail, "VL_PROBE_FAIL")
+
+#define afs_file_errors \
+ EM(afs_file_error_dir_bad_magic, "DIR_BAD_MAGIC") \
+ EM(afs_file_error_dir_big, "DIR_BIG") \
+ EM(afs_file_error_dir_missing_page, "DIR_MISSING_PAGE") \
+ EM(afs_file_error_dir_over_end, "DIR_ENT_OVER_END") \
+ EM(afs_file_error_dir_small, "DIR_SMALL") \
+ EM(afs_file_error_dir_unmarked_ext, "DIR_UNMARKED_EXT") \
+ EM(afs_file_error_mntpt, "MNTPT_READ_FAILED") \
+ E_(afs_file_error_writeback_fail, "WRITEBACK_FAILED")
/*
* Export enum symbols via userspace.
afs_vl_operations;
afs_edit_dir_ops;
afs_edit_dir_reasons;
+afs_eproto_causes;
+afs_io_errors;
+afs_file_errors;
/*
* Now redefine the EM() and E_() macros to map the enums to the strings that
#define EM(a, b) { a, b },
#define E_(a, b) { a, b }
-TRACE_EVENT(afs_recv_data,
- TP_PROTO(struct afs_call *call, unsigned count, unsigned offset,
+TRACE_EVENT(afs_receive_data,
+ TP_PROTO(struct afs_call *call, struct iov_iter *iter,
bool want_more, int ret),
- TP_ARGS(call, count, offset, want_more, ret),
+ TP_ARGS(call, iter, want_more, ret),
TP_STRUCT__entry(
+ __field(loff_t, remain )
__field(unsigned int, call )
__field(enum afs_call_state, state )
- __field(unsigned int, count )
- __field(unsigned int, offset )
__field(unsigned short, unmarshall )
__field(bool, want_more )
__field(int, ret )
__entry->call = call->debug_id;
__entry->state = call->state;
__entry->unmarshall = call->unmarshall;
- __entry->count = count;
- __entry->offset = offset;
+ __entry->remain = iov_iter_count(iter);
__entry->want_more = want_more;
__entry->ret = ret;
),
- TP_printk("c=%08x s=%u u=%u %u/%u wm=%u ret=%d",
+ TP_printk("c=%08x r=%llu u=%u w=%u s=%u ret=%d",
__entry->call,
- __entry->state, __entry->unmarshall,
- __entry->offset, __entry->count,
- __entry->want_more, __entry->ret)
+ __entry->remain,
+ __entry->unmarshall,
+ __entry->want_more,
+ __entry->state,
+ __entry->ret)
);
TRACE_EVENT(afs_notify_call,
}
),
- TP_printk("c=%08x %06x:%06x:%06x %s",
+ TP_printk("c=%08x %06llx:%06llx:%06x %s",
__entry->call,
__entry->fid.vid,
__entry->fid.vnode,
);
TRACE_EVENT(afs_protocol_error,
- TP_PROTO(struct afs_call *call, int error, const void *where),
+ TP_PROTO(struct afs_call *call, int error, enum afs_eproto_cause cause),
+
+ TP_ARGS(call, error, cause),
+
+ TP_STRUCT__entry(
+ __field(unsigned int, call )
+ __field(int, error )
+ __field(enum afs_eproto_cause, cause )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call ? call->debug_id : 0;
+ __entry->error = error;
+ __entry->cause = cause;
+ ),
+
+ TP_printk("c=%08x r=%d %s",
+ __entry->call, __entry->error,
+ __print_symbolic(__entry->cause, afs_eproto_causes))
+ );
+
+TRACE_EVENT(afs_io_error,
+ TP_PROTO(unsigned int call, int error, enum afs_io_error where),
TP_ARGS(call, error, where),
TP_STRUCT__entry(
__field(unsigned int, call )
__field(int, error )
- __field(const void *, where )
+ __field(enum afs_io_error, where )
),
TP_fast_assign(
- __entry->call = call ? call->debug_id : 0;
+ __entry->call = call;
+ __entry->error = error;
+ __entry->where = where;
+ ),
+
+ TP_printk("c=%08x r=%d %s",
+ __entry->call, __entry->error,
+ __print_symbolic(__entry->where, afs_io_errors))
+ );
+
+TRACE_EVENT(afs_file_error,
+ TP_PROTO(struct afs_vnode *vnode, int error, enum afs_file_error where),
+
+ TP_ARGS(vnode, error, where),
+
+ TP_STRUCT__entry(
+ __field_struct(struct afs_fid, fid )
+ __field(int, error )
+ __field(enum afs_file_error, where )
+ ),
+
+ TP_fast_assign(
+ __entry->fid = vnode->fid;
__entry->error = error;
__entry->where = where;
),
- TP_printk("c=%08x r=%d sp=%pSR",
- __entry->call, __entry->error, __entry->where)
+ TP_printk("%llx:%llx:%x r=%d %s",
+ __entry->fid.vid, __entry->fid.vnode, __entry->fid.unique,
+ __entry->error,
+ __print_symbolic(__entry->where, afs_file_errors))
);
TRACE_EVENT(afs_cm_no_server,
TP_fast_assign(
__entry->dev = disk_devt(dev_to_disk(kobj_to_dev(q->kobj.parent)));
- strlcpy(__entry->domain, domain, DOMAIN_LEN);
- strlcpy(__entry->type, type, DOMAIN_LEN);
+ strlcpy(__entry->domain, domain, sizeof(__entry->domain));
+ strlcpy(__entry->type, type, sizeof(__entry->type));
__entry->percentile = percentile;
__entry->numerator = numerator;
__entry->denominator = denominator;
TP_fast_assign(
__entry->dev = disk_devt(dev_to_disk(kobj_to_dev(q->kobj.parent)));
- strlcpy(__entry->domain, domain, DOMAIN_LEN);
+ strlcpy(__entry->domain, domain, sizeof(__entry->domain));
__entry->depth = depth;
),
TP_fast_assign(
__entry->dev = disk_devt(dev_to_disk(kobj_to_dev(q->kobj.parent)));
- strlcpy(__entry->domain, domain, DOMAIN_LEN);
+ strlcpy(__entry->domain, domain, sizeof(__entry->domain));
),
TP_printk("%d,%d %s", MAJOR(__entry->dev), MINOR(__entry->dev),
enum rxrpc_propose_ack_trace {
rxrpc_propose_ack_client_tx_end,
rxrpc_propose_ack_input_data,
+ rxrpc_propose_ack_ping_for_check_life,
rxrpc_propose_ack_ping_for_keepalive,
rxrpc_propose_ack_ping_for_lost_ack,
rxrpc_propose_ack_ping_for_lost_reply,
#define rxrpc_propose_ack_traces \
EM(rxrpc_propose_ack_client_tx_end, "ClTxEnd") \
EM(rxrpc_propose_ack_input_data, "DataIn ") \
+ EM(rxrpc_propose_ack_ping_for_check_life, "ChkLife") \
EM(rxrpc_propose_ack_ping_for_keepalive, "KeepAlv") \
EM(rxrpc_propose_ack_ping_for_lost_ack, "LostAck") \
EM(rxrpc_propose_ack_ping_for_lost_reply, "LostRpl") \
#ifdef CREATE_TRACE_POINTS
static inline long __trace_sched_switch_state(bool preempt, struct task_struct *p)
{
+ unsigned int state;
+
#ifdef CONFIG_SCHED_DEBUG
BUG_ON(p != current);
#endif /* CONFIG_SCHED_DEBUG */
if (preempt)
return TASK_REPORT_MAX;
- return 1 << task_state_index(p);
+ /*
+ * task_state_index() uses fls() and returns a value from 0-8 range.
+ * Decrement it by 1 (except TASK_RUNNING state i.e 0) before using
+ * it for left shift operation to get the correct task->state
+ * mapping.
+ */
+ state = task_state_index(p);
+
+ return state ? (1 << (state - 1)) : state;
}
#endif /* CREATE_TRACE_POINTS */
#define __NR_ftruncate __NR3264_ftruncate
#define __NR_lseek __NR3264_lseek
#define __NR_sendfile __NR3264_sendfile
+#if defined(__ARCH_WANT_NEW_STAT) || defined(__ARCH_WANT_STAT64)
#define __NR_newfstatat __NR3264_fstatat
#define __NR_fstat __NR3264_fstat
+#endif
#define __NR_mmap __NR3264_mmap
#define __NR_fadvise64 __NR3264_fadvise64
#ifdef __NR3264_stat
#define __NR_ftruncate64 __NR3264_ftruncate
#define __NR_llseek __NR3264_lseek
#define __NR_sendfile64 __NR3264_sendfile
+#if defined(__ARCH_WANT_NEW_STAT) || defined(__ARCH_WANT_STAT64)
#define __NR_fstatat64 __NR3264_fstatat
#define __NR_fstat64 __NR3264_fstat
+#endif
#define __NR_mmap2 __NR3264_mmap
#define __NR_fadvise64_64 __NR3264_fadvise64
#ifdef __NR3264_stat
* Return
* 0 on success, or a negative error in case of failure.
*
- * struct bpf_sock *bpf_sk_lookup_tcp(void *ctx, struct bpf_sock_tuple *tuple, u32 tuple_size, u32 netns, u64 flags)
+ * struct bpf_sock *bpf_sk_lookup_tcp(void *ctx, struct bpf_sock_tuple *tuple, u32 tuple_size, u64 netns, u64 flags)
* Description
* Look for TCP socket matching *tuple*, optionally in a child
* network namespace *netns*. The return value must be checked,
* **sizeof**\ (*tuple*\ **->ipv6**)
* Look for an IPv6 socket.
*
- * If the *netns* is zero, then the socket lookup table in the
- * netns associated with the *ctx* will be used. For the TC hooks,
- * this in the netns of the device in the skb. For socket hooks,
- * this in the netns of the socket. If *netns* is non-zero, then
- * it specifies the ID of the netns relative to the netns
- * associated with the *ctx*.
+ * If the *netns* is a negative signed 32-bit integer, then the
+ * socket lookup table in the netns associated with the *ctx* will
+ * will be used. For the TC hooks, this is the netns of the device
+ * in the skb. For socket hooks, this is the netns of the socket.
+ * If *netns* is any other signed 32-bit value greater than or
+ * equal to zero then it specifies the ID of the netns relative to
+ * the netns associated with the *ctx*. *netns* values beyond the
+ * range of 32-bit integers are reserved for future use.
*
* All values for *flags* are reserved for future usage, and must
* be left at zero.
* **CONFIG_NET** configuration option.
* Return
* Pointer to *struct bpf_sock*, or NULL in case of failure.
+ * For sockets with reuseport option, the *struct bpf_sock*
+ * result is from reuse->socks[] using the hash of the tuple.
*
- * struct bpf_sock *bpf_sk_lookup_udp(void *ctx, struct bpf_sock_tuple *tuple, u32 tuple_size, u32 netns, u64 flags)
+ * struct bpf_sock *bpf_sk_lookup_udp(void *ctx, struct bpf_sock_tuple *tuple, u32 tuple_size, u64 netns, u64 flags)
* Description
* Look for UDP socket matching *tuple*, optionally in a child
* network namespace *netns*. The return value must be checked,
* **sizeof**\ (*tuple*\ **->ipv6**)
* Look for an IPv6 socket.
*
- * If the *netns* is zero, then the socket lookup table in the
- * netns associated with the *ctx* will be used. For the TC hooks,
- * this in the netns of the device in the skb. For socket hooks,
- * this in the netns of the socket. If *netns* is non-zero, then
- * it specifies the ID of the netns relative to the netns
- * associated with the *ctx*.
+ * If the *netns* is a negative signed 32-bit integer, then the
+ * socket lookup table in the netns associated with the *ctx* will
+ * will be used. For the TC hooks, this is the netns of the device
+ * in the skb. For socket hooks, this is the netns of the socket.
+ * If *netns* is any other signed 32-bit value greater than or
+ * equal to zero then it specifies the ID of the netns relative to
+ * the netns associated with the *ctx*. *netns* values beyond the
+ * range of 32-bit integers are reserved for future use.
*
* All values for *flags* are reserved for future usage, and must
* be left at zero.
* **CONFIG_NET** configuration option.
* Return
* Pointer to *struct bpf_sock*, or NULL in case of failure.
+ * For sockets with reuseport option, the *struct bpf_sock*
+ * result is from reuse->socks[] using the hash of the tuple.
*
* int bpf_sk_release(struct bpf_sock *sk)
* Description
/* BPF_FUNC_perf_event_output for sk_buff input context. */
#define BPF_F_CTXLEN_MASK (0xfffffULL << 32)
+/* Current network namespace */
+#define BPF_F_CURRENT_NETNS (-1L)
+
/* Mode for BPF_FUNC_skb_adjust_room helper. */
enum bpf_adj_room_mode {
BPF_ADJ_ROOM_NET,
BPF_LWT_ENCAP_SEG6_INLINE
};
+#define __bpf_md_ptr(type, name) \
+union { \
+ type name; \
+ __u64 :64; \
+} __attribute__((aligned(8)))
+
/* user accessible mirror of in-kernel sk_buff.
* new fields can only be added to the end of this structure
*/
/* ... here. */
__u32 data_meta;
- struct bpf_flow_keys *flow_keys;
+ __bpf_md_ptr(struct bpf_flow_keys *, flow_keys);
};
struct bpf_tunnel_key {
* be added to the end of this structure
*/
struct sk_msg_md {
- void *data;
- void *data_end;
+ __bpf_md_ptr(void *, data);
+ __bpf_md_ptr(void *, data_end);
__u32 family;
__u32 remote_ip4; /* Stored in network byte order */
* Start of directly accessible data. It begins from
* the tcp/udp header.
*/
- void *data;
- void *data_end; /* End of directly accessible data */
+ __bpf_md_ptr(void *, data);
+ /* End of directly accessible data */
+ __bpf_md_ptr(void *, data_end);
/*
* Total length of packet (starting from the tcp/udp header).
* Note that the directly accessible bytes (data_end - data)
* the situation described above.
*/
#define REL_RESERVED 0x0a
+#define REL_WHEEL_HI_RES 0x0b
+#define REL_HWHEEL_HI_RES 0x0c
#define REL_MAX 0x0f
#define REL_CNT (REL_MAX+1)
#define KEYCTL_INVALIDATE 21 /* invalidate a key */
#define KEYCTL_GET_PERSISTENT 22 /* get a user's persistent keyring */
#define KEYCTL_DH_COMPUTE 23 /* Compute Diffie-Hellman values */
+#define KEYCTL_PKEY_QUERY 24 /* Query public key parameters */
+#define KEYCTL_PKEY_ENCRYPT 25 /* Encrypt a blob using a public key */
+#define KEYCTL_PKEY_DECRYPT 26 /* Decrypt a blob using a public key */
+#define KEYCTL_PKEY_SIGN 27 /* Create a public key signature */
+#define KEYCTL_PKEY_VERIFY 28 /* Verify a public key signature */
#define KEYCTL_RESTRICT_KEYRING 29 /* Restrict keys allowed to link to a keyring */
/* keyctl structures */
__u32 __spare[8];
};
+#define KEYCTL_SUPPORTS_ENCRYPT 0x01
+#define KEYCTL_SUPPORTS_DECRYPT 0x02
+#define KEYCTL_SUPPORTS_SIGN 0x04
+#define KEYCTL_SUPPORTS_VERIFY 0x08
+
+struct keyctl_pkey_query {
+ __u32 supported_ops; /* Which ops are supported */
+ __u32 key_size; /* Size of the key in bits */
+ __u16 max_data_size; /* Maximum size of raw data to sign in bytes */
+ __u16 max_sig_size; /* Maximum size of signature in bytes */
+ __u16 max_enc_size; /* Maximum size of encrypted blob in bytes */
+ __u16 max_dec_size; /* Maximum size of decrypted blob in bytes */
+ __u32 __spare[10];
+};
+
+struct keyctl_pkey_params {
+ __s32 key_id; /* Serial no. of public key to use */
+ __u32 in_len; /* Input data size */
+ union {
+ __u32 out_len; /* Output buffer size (encrypt/decrypt/sign) */
+ __u32 in2_len; /* 2nd input data size (verify) */
+ };
+ __u32 __spare[7];
+};
+
#endif /* _LINUX_KEYCTL_H */
};
struct kfd_ioctl_get_queue_wave_state_args {
- uint64_t ctl_stack_address; /* to KFD */
- uint32_t ctl_stack_used_size; /* from KFD */
- uint32_t save_area_used_size; /* from KFD */
- uint32_t queue_id; /* to KFD */
- uint32_t pad;
+ __u64 ctl_stack_address; /* to KFD */
+ __u32 ctl_stack_used_size; /* from KFD */
+ __u32 save_area_used_size; /* from KFD */
+ __u32 queue_id; /* to KFD */
+ __u32 pad;
};
/* For kfd_ioctl_set_memory_policy_args.default_policy and alternate_policy */
/* hw exception data */
struct kfd_hsa_hw_exception_data {
- uint32_t reset_type;
- uint32_t reset_cause;
- uint32_t memory_lost;
- uint32_t gpu_id;
+ __u32 reset_type;
+ __u32 reset_cause;
+ __u32 memory_lost;
+ __u32 gpu_id;
};
/* Event data */
NFTA_NG_MODULUS,
NFTA_NG_TYPE,
NFTA_NG_OFFSET,
- NFTA_NG_SET_NAME,
- NFTA_NG_SET_ID,
+ NFTA_NG_SET_NAME, /* deprecated */
+ NFTA_NG_SET_ID, /* deprecated */
__NFTA_NG_MAX
};
#define NFTA_NG_MAX (__NFTA_NG_MAX - 1)
#include <linux/if_vlan.h>
#include <linux/if_pppox.h>
+#ifndef __KERNEL__
+#include <limits.h> /* for INT_MIN, INT_MAX */
+#endif
+
/* Bridge Hooks */
/* After promisc drops, checksum checks. */
#define NF_BR_PRE_ROUTING 0
*
* PERF_RECORD_MISC_MMAP_DATA - PERF_RECORD_MMAP* events
* PERF_RECORD_MISC_COMM_EXEC - PERF_RECORD_COMM event
+ * PERF_RECORD_MISC_FORK_EXEC - PERF_RECORD_FORK event (perf internal)
* PERF_RECORD_MISC_SWITCH_OUT - PERF_RECORD_SWITCH* events
*/
#define PERF_RECORD_MISC_MMAP_DATA (1 << 13)
#define PERF_RECORD_MISC_COMM_EXEC (1 << 13)
+#define PERF_RECORD_MISC_FORK_EXEC (1 << 13)
#define PERF_RECORD_MISC_SWITCH_OUT (1 << 13)
/*
* These PERF_RECORD_MISC_* flags below are safely reused
#define PR_SET_SPECULATION_CTRL 53
/* Speculation control variants */
# define PR_SPEC_STORE_BYPASS 0
+# define PR_SPEC_INDIRECT_BRANCH 1
/* Return and control values for PR_SET/GET_SPECULATION_CTRL */
# define PR_SPEC_NOT_AFFECTED 0
# define PR_SPEC_PRCTL (1UL << 0)
#define SCTP_ASSOC_CHANGE_DENIED 0x0004
#define SCTP_ASSOC_CHANGE_FAILED 0x0008
+#define SCTP_STREAM_CHANGE_DENIED SCTP_ASSOC_CHANGE_DENIED
+#define SCTP_STREAM_CHANGE_FAILED SCTP_ASSOC_CHANGE_FAILED
struct sctp_stream_change_event {
__u16 strchange_type;
__u16 strchange_flags;
/* SCTP Stream schedulers */
enum sctp_sched_type {
SCTP_SS_FCFS,
+ SCTP_SS_DEFAULT = SCTP_SS_FCFS,
SCTP_SS_PRIO,
SCTP_SS_RR,
SCTP_SS_MAX = SCTP_SS_RR
#ifndef __LINUX_V4L2_CONTROLS_H
#define __LINUX_V4L2_CONTROLS_H
+#include <linux/types.h>
+
/* Control classes */
#define V4L2_CTRL_CLASS_USER 0x00980000 /* Old-style 'user' controls */
#define V4L2_CTRL_CLASS_MPEG 0x00990000 /* MPEG-compression controls */
__u8 profile_and_level_indication;
__u8 progressive_sequence;
__u8 chroma_format;
+ __u8 pad;
};
struct v4l2_mpeg2_picture {
__u8 alternate_scan;
__u8 repeat_first_field;
__u8 progressive_frame;
+ __u8 pad;
};
struct v4l2_ctrl_mpeg2_slice_params {
__u8 backward_ref_index;
__u8 forward_ref_index;
+ __u8 pad;
};
struct v4l2_ctrl_mpeg2_quantization {
#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
#define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */
#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
+#define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */
/* Size of a PFN in the balloon interface. */
#define VIRTIO_BALLOON_PFN_SHIFT 12
+#define VIRTIO_BALLOON_CMD_ID_STOP 0
+#define VIRTIO_BALLOON_CMD_ID_DONE 1
struct virtio_balloon_config {
/* Number of pages host wants Guest to give up. */
__u32 num_pages;
/* Number of pages we've actually got in balloon. */
__u32 actual;
+ /* Free page report command id, readonly by guest */
+ __u32 free_page_report_cmd_id;
+ /* Stores PAGE_POISON if page poisoning is in use */
+ __u32 poison_val;
};
#define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */
{
}
#endif
-
-#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
-struct resource;
-void arch_xen_balloon_init(struct resource *hostmem_resource);
-#endif
extern unsigned long *xen_contiguous_bitmap;
-#ifdef CONFIG_XEN_PV
+#if defined(CONFIG_XEN_PV) || defined(CONFIG_ARM) || defined(CONFIG_ARM64)
int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
unsigned int address_bits,
dma_addr_t *dma_handle);
void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order);
-
-int xen_remap_pfn(struct vm_area_struct *vma, unsigned long addr,
- xen_pfn_t *pfn, int nr, int *err_ptr, pgprot_t prot,
- unsigned int domid, bool no_translate, struct page **pages);
#else
static inline int xen_create_contiguous_region(phys_addr_t pstart,
unsigned int order,
static inline void xen_destroy_contiguous_region(phys_addr_t pstart,
unsigned int order) { }
+#endif
+#if defined(CONFIG_XEN_PV)
+int xen_remap_pfn(struct vm_area_struct *vma, unsigned long addr,
+ xen_pfn_t *pfn, int nr, int *err_ptr, pgprot_t prot,
+ unsigned int domid, bool no_translate, struct page **pages);
+#else
static inline int xen_remap_pfn(struct vm_area_struct *vma, unsigned long addr,
xen_pfn_t *pfn, int nr, int *err_ptr,
pgprot_t prot, unsigned int domid,
Say N if unsure.
+config PSI_DEFAULT_DISABLED
+ bool "Require boot parameter to enable pressure stall information tracking"
+ default n
+ depends on PSI
+ help
+ If set, pressure stall information tracking will be disabled
+ per default but can be enabled through passing psi_enable=1
+ on the kernel commandline during boot.
+
endmenu # "CPU/Task time and stats accounting"
config CPU_ISOLATION
return 1;
}
-static int __init maybe_link(void)
-{
- if (nlink >= 2) {
- char *old = find_link(major, minor, ino, mode, collected);
- if (old)
- return (ksys_link(old, collected) < 0) ? -1 : 1;
- }
- return 0;
-}
-
static void __init clean_path(char *path, umode_t fmode)
{
struct kstat st;
}
}
+static int __init maybe_link(void)
+{
+ if (nlink >= 2) {
+ char *old = find_link(major, minor, ino, mode, collected);
+ if (old) {
+ clean_path(collected, 0);
+ return (ksys_link(old, collected) < 0) ? -1 : 1;
+ }
+ }
+ return 0;
+}
+
static __initdata int wfd;
static int __init do_name(void)
obj-$(CONFIG_ZONE_DEVICE) += memremap.o
obj-$(CONFIG_RSEQ) += rseq.o
+obj-$(CONFIG_GCC_PLUGIN_STACKLEAK) += stackleak.o
+KASAN_SANITIZE_stackleak.o := n
+KCOV_INSTRUMENT_stackleak.o := n
+
$(obj)/configs.o: $(obj)/config_data.h
targets += config_data.gz
#include <uapi/linux/types.h>
#include <linux/seq_file.h>
#include <linux/compiler.h>
+#include <linux/ctype.h>
#include <linux/errno.h>
#include <linux/slab.h>
#include <linux/anon_inodes.h>
offset < btf->hdr.str_len;
}
+/* Only C-style identifier is permitted. This can be relaxed if
+ * necessary.
+ */
+static bool btf_name_valid_identifier(const struct btf *btf, u32 offset)
+{
+ /* offset must be valid */
+ const char *src = &btf->strings[offset];
+ const char *src_limit;
+
+ if (!isalpha(*src) && *src != '_')
+ return false;
+
+ /* set a limit on identifier length */
+ src_limit = src + KSYM_NAME_LEN;
+ src++;
+ while (*src && src < src_limit) {
+ if (!isalnum(*src) && *src != '_')
+ return false;
+ src++;
+ }
+
+ return !*src;
+}
+
static const char *btf_name_by_offset(const struct btf *btf, u32 offset)
{
if (!offset)
return -EINVAL;
}
+ /* typedef type must have a valid name, and other ref types,
+ * volatile, const, restrict, should have a null name.
+ */
+ if (BTF_INFO_KIND(t->info) == BTF_KIND_TYPEDEF) {
+ if (!t->name_off ||
+ !btf_name_valid_identifier(env->btf, t->name_off)) {
+ btf_verifier_log_type(env, t, "Invalid name");
+ return -EINVAL;
+ }
+ } else {
+ if (t->name_off) {
+ btf_verifier_log_type(env, t, "Invalid name");
+ return -EINVAL;
+ }
+ }
+
btf_verifier_log_type(env, t, NULL);
return 0;
return -EINVAL;
}
+ /* fwd type must have a valid name */
+ if (!t->name_off ||
+ !btf_name_valid_identifier(env->btf, t->name_off)) {
+ btf_verifier_log_type(env, t, "Invalid name");
+ return -EINVAL;
+ }
+
btf_verifier_log_type(env, t, NULL);
return 0;
return -EINVAL;
}
+ /* array type should not have a name */
+ if (t->name_off) {
+ btf_verifier_log_type(env, t, "Invalid name");
+ return -EINVAL;
+ }
+
if (btf_type_vlen(t)) {
btf_verifier_log_type(env, t, "vlen != 0");
return -EINVAL;
return -EINVAL;
}
+ /* struct type either no name or a valid one */
+ if (t->name_off &&
+ !btf_name_valid_identifier(env->btf, t->name_off)) {
+ btf_verifier_log_type(env, t, "Invalid name");
+ return -EINVAL;
+ }
+
btf_verifier_log_type(env, t, NULL);
last_offset = 0;
return -EINVAL;
}
+ /* struct member either no name or a valid one */
+ if (member->name_off &&
+ !btf_name_valid_identifier(btf, member->name_off)) {
+ btf_verifier_log_member(env, t, member, "Invalid name");
+ return -EINVAL;
+ }
/* A member cannot be in type void */
if (!member->type || !BTF_TYPE_ID_VALID(member->type)) {
btf_verifier_log_member(env, t, member,
return -EINVAL;
}
+ /* enum type either no name or a valid one */
+ if (t->name_off &&
+ !btf_name_valid_identifier(env->btf, t->name_off)) {
+ btf_verifier_log_type(env, t, "Invalid name");
+ return -EINVAL;
+ }
+
btf_verifier_log_type(env, t, NULL);
for (i = 0; i < nr_enums; i++) {
return -EINVAL;
}
+ /* enum member must have a valid name */
+ if (!enums[i].name_off ||
+ !btf_name_valid_identifier(btf, enums[i].name_off)) {
+ btf_verifier_log_type(env, t, "Invalid name");
+ return -EINVAL;
+ }
+
+
btf_verifier_log(env, "\t%s val=%d\n",
btf_name_by_offset(btf, enums[i].name_off),
enums[i].val);
int bpf_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
char *sym)
{
- unsigned long symbol_start, symbol_end;
struct bpf_prog_aux *aux;
unsigned int it = 0;
int ret = -ERANGE;
if (it++ != symnum)
continue;
- bpf_get_prog_addr_region(aux->prog, &symbol_start, &symbol_end);
bpf_get_prog_name(aux->prog, sym);
- *value = symbol_start;
+ *value = (unsigned long)aux->prog->bpf_func;
*type = BPF_SYM_ELF_TYPE;
ret = 0;
bpf_prog_unlock_free(fp);
}
+int bpf_jit_get_func_addr(const struct bpf_prog *prog,
+ const struct bpf_insn *insn, bool extra_pass,
+ u64 *func_addr, bool *func_addr_fixed)
+{
+ s16 off = insn->off;
+ s32 imm = insn->imm;
+ u8 *addr;
+
+ *func_addr_fixed = insn->src_reg != BPF_PSEUDO_CALL;
+ if (!*func_addr_fixed) {
+ /* Place-holder address till the last pass has collected
+ * all addresses for JITed subprograms in which case we
+ * can pick them up from prog->aux.
+ */
+ if (!extra_pass)
+ addr = NULL;
+ else if (prog->aux->func &&
+ off >= 0 && off < prog->aux->func_cnt)
+ addr = (u8 *)prog->aux->func[off]->bpf_func;
+ else
+ return -EINVAL;
+ } else {
+ /* Address of a BPF helper call. Since part of the core
+ * kernel, it's always at a fixed location. __bpf_call_base
+ * and the helper with imm relative to it are both in core
+ * kernel.
+ */
+ addr = (u8 *)__bpf_call_base + imm;
+ }
+
+ *func_addr = (unsigned long)addr;
+ return 0;
+}
+
static int bpf_jit_blind_insn(const struct bpf_insn *from,
const struct bpf_insn *aux,
struct bpf_insn *to_buff)
return -ENOENT;
new = kmalloc_node(sizeof(struct bpf_storage_buffer) +
- map->value_size, __GFP_ZERO | GFP_USER,
+ map->value_size,
+ __GFP_ZERO | GFP_ATOMIC | __GFP_NOWARN,
map->numa_node);
if (!new)
return -ENOMEM;
#include <linux/bpf.h>
#include <linux/list.h>
#include <linux/slab.h>
+#include <linux/capability.h>
#include "percpu_freelist.h"
#define QUEUE_STACK_CREATE_FLAG_MASK \
/* Called from syscall */
static int queue_stack_map_alloc_check(union bpf_attr *attr)
{
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
/* check sanity of attributes */
if (attr->max_entries == 0 || attr->key_size != 0 ||
+ attr->value_size == 0 ||
attr->map_flags & ~QUEUE_STACK_CREATE_FLAG_MASK)
return -EINVAL;
{
int ret, numa_node = bpf_map_attr_numa_node(attr);
struct bpf_queue_stack *qs;
- u32 size, value_size;
- u64 queue_size, cost;
-
- size = attr->max_entries + 1;
- value_size = attr->value_size;
-
- queue_size = sizeof(*qs) + (u64) value_size * size;
+ u64 size, queue_size, cost;
- cost = queue_size;
+ size = (u64) attr->max_entries + 1;
+ cost = queue_size = sizeof(*qs) + size * attr->value_size;
if (cost >= U32_MAX - PAGE_SIZE)
return ERR_PTR(-E2BIG);
info.jited_prog_len = 0;
info.xlated_prog_len = 0;
info.nr_jited_ksyms = 0;
+ info.nr_jited_func_lens = 0;
goto done;
}
}
ulen = info.nr_jited_ksyms;
- info.nr_jited_ksyms = prog->aux->func_cnt;
+ info.nr_jited_ksyms = prog->aux->func_cnt ? : 1;
if (info.nr_jited_ksyms && ulen) {
if (bpf_dump_raw_ok()) {
+ unsigned long ksym_addr;
u64 __user *user_ksyms;
- ulong ksym_addr;
u32 i;
/* copy the address of the kernel symbol
*/
ulen = min_t(u32, info.nr_jited_ksyms, ulen);
user_ksyms = u64_to_user_ptr(info.jited_ksyms);
- for (i = 0; i < ulen; i++) {
- ksym_addr = (ulong) prog->aux->func[i]->bpf_func;
- ksym_addr &= PAGE_MASK;
- if (put_user((u64) ksym_addr, &user_ksyms[i]))
+ if (prog->aux->func_cnt) {
+ for (i = 0; i < ulen; i++) {
+ ksym_addr = (unsigned long)
+ prog->aux->func[i]->bpf_func;
+ if (put_user((u64) ksym_addr,
+ &user_ksyms[i]))
+ return -EFAULT;
+ }
+ } else {
+ ksym_addr = (unsigned long) prog->bpf_func;
+ if (put_user((u64) ksym_addr, &user_ksyms[0]))
return -EFAULT;
}
} else {
}
ulen = info.nr_jited_func_lens;
- info.nr_jited_func_lens = prog->aux->func_cnt;
+ info.nr_jited_func_lens = prog->aux->func_cnt ? : 1;
if (info.nr_jited_func_lens && ulen) {
if (bpf_dump_raw_ok()) {
u32 __user *user_lens;
/* copy the JITed image lengths for each function */
ulen = min_t(u32, info.nr_jited_func_lens, ulen);
user_lens = u64_to_user_ptr(info.jited_func_lens);
- for (i = 0; i < ulen; i++) {
- func_len = prog->aux->func[i]->jited_len;
- if (put_user(func_len, &user_lens[i]))
+ if (prog->aux->func_cnt) {
+ for (i = 0; i < ulen; i++) {
+ func_len =
+ prog->aux->func[i]->jited_len;
+ if (put_user(func_len, &user_lens[i]))
+ return -EFAULT;
+ }
+ } else {
+ func_len = prog->jited_len;
+ if (put_user(func_len, &user_lens[0]))
return -EFAULT;
}
} else {
#define BPF_COMPLEXITY_LIMIT_INSNS 131072
#define BPF_COMPLEXITY_LIMIT_STACK 1024
+#define BPF_COMPLEXITY_LIMIT_STATES 64
#define BPF_MAP_PTR_UNPRIV 1UL
#define BPF_MAP_PTR_POISON ((void *)((0xeB9FUL << 1) + \
regs[BPF_REG_0].type = NOT_INIT;
} else if (fn->ret_type == RET_PTR_TO_MAP_VALUE_OR_NULL ||
fn->ret_type == RET_PTR_TO_MAP_VALUE) {
- if (fn->ret_type == RET_PTR_TO_MAP_VALUE)
- regs[BPF_REG_0].type = PTR_TO_MAP_VALUE;
- else
- regs[BPF_REG_0].type = PTR_TO_MAP_VALUE_OR_NULL;
/* There is no offset yet applied, variable or fixed */
mark_reg_known_zero(env, regs, BPF_REG_0);
/* remember map_ptr, so that check_map_access()
return -EINVAL;
}
regs[BPF_REG_0].map_ptr = meta.map_ptr;
- regs[BPF_REG_0].id = ++env->id_gen;
+ if (fn->ret_type == RET_PTR_TO_MAP_VALUE) {
+ regs[BPF_REG_0].type = PTR_TO_MAP_VALUE;
+ } else {
+ regs[BPF_REG_0].type = PTR_TO_MAP_VALUE_OR_NULL;
+ regs[BPF_REG_0].id = ++env->id_gen;
+ }
} else if (fn->ret_type == RET_PTR_TO_SOCKET_OR_NULL) {
int id = acquire_reference_state(env, insn_idx);
if (id < 0)
dst_reg->umax_value = umax_ptr;
dst_reg->var_off = ptr_reg->var_off;
dst_reg->off = ptr_reg->off + smin_val;
- dst_reg->range = ptr_reg->range;
+ dst_reg->raw = ptr_reg->raw;
break;
}
/* A new variable offset is created. Note that off_reg->off
}
dst_reg->var_off = tnum_add(ptr_reg->var_off, off_reg->var_off);
dst_reg->off = ptr_reg->off;
+ dst_reg->raw = ptr_reg->raw;
if (reg_is_pkt_pointer(ptr_reg)) {
dst_reg->id = ++env->id_gen;
/* something was added to pkt_ptr, set range to zero */
- dst_reg->range = 0;
+ dst_reg->raw = 0;
}
break;
case BPF_SUB:
dst_reg->var_off = ptr_reg->var_off;
dst_reg->id = ptr_reg->id;
dst_reg->off = ptr_reg->off - smin_val;
- dst_reg->range = ptr_reg->range;
+ dst_reg->raw = ptr_reg->raw;
break;
}
/* A new variable offset is created. If the subtrahend is known
}
dst_reg->var_off = tnum_sub(ptr_reg->var_off, off_reg->var_off);
dst_reg->off = ptr_reg->off;
+ dst_reg->raw = ptr_reg->raw;
if (reg_is_pkt_pointer(ptr_reg)) {
dst_reg->id = ++env->id_gen;
/* something was added to pkt_ptr, set range to zero */
if (smin_val < 0)
- dst_reg->range = 0;
+ dst_reg->raw = 0;
}
break;
case BPF_AND:
}
}
+/* compute branch direction of the expression "if (reg opcode val) goto target;"
+ * and return:
+ * 1 - branch will be taken and "goto target" will be executed
+ * 0 - branch will not be taken and fall-through to next insn
+ * -1 - unknown. Example: "if (reg < 5)" is unknown when register value range [0,10]
+ */
+static int is_branch_taken(struct bpf_reg_state *reg, u64 val, u8 opcode)
+{
+ if (__is_pointer_value(false, reg))
+ return -1;
+
+ switch (opcode) {
+ case BPF_JEQ:
+ if (tnum_is_const(reg->var_off))
+ return !!tnum_equals_const(reg->var_off, val);
+ break;
+ case BPF_JNE:
+ if (tnum_is_const(reg->var_off))
+ return !tnum_equals_const(reg->var_off, val);
+ break;
+ case BPF_JGT:
+ if (reg->umin_value > val)
+ return 1;
+ else if (reg->umax_value <= val)
+ return 0;
+ break;
+ case BPF_JSGT:
+ if (reg->smin_value > (s64)val)
+ return 1;
+ else if (reg->smax_value < (s64)val)
+ return 0;
+ break;
+ case BPF_JLT:
+ if (reg->umax_value < val)
+ return 1;
+ else if (reg->umin_value >= val)
+ return 0;
+ break;
+ case BPF_JSLT:
+ if (reg->smax_value < (s64)val)
+ return 1;
+ else if (reg->smin_value >= (s64)val)
+ return 0;
+ break;
+ case BPF_JGE:
+ if (reg->umin_value >= val)
+ return 1;
+ else if (reg->umax_value < val)
+ return 0;
+ break;
+ case BPF_JSGE:
+ if (reg->smin_value >= (s64)val)
+ return 1;
+ else if (reg->smax_value < (s64)val)
+ return 0;
+ break;
+ case BPF_JLE:
+ if (reg->umax_value <= val)
+ return 1;
+ else if (reg->umin_value > val)
+ return 0;
+ break;
+ case BPF_JSLE:
+ if (reg->smax_value <= (s64)val)
+ return 1;
+ else if (reg->smin_value > (s64)val)
+ return 0;
+ break;
+ }
+
+ return -1;
+}
+
/* Adjusts the register min/max values in the case that the dst_reg is the
* variable register that we are working on, and src_reg is a constant or we're
* simply doing a BPF_K check.
dst_reg = ®s[insn->dst_reg];
- /* detect if R == 0 where R was initialized to zero earlier */
- if (BPF_SRC(insn->code) == BPF_K &&
- (opcode == BPF_JEQ || opcode == BPF_JNE) &&
- dst_reg->type == SCALAR_VALUE &&
- tnum_is_const(dst_reg->var_off)) {
- if ((opcode == BPF_JEQ && dst_reg->var_off.value == insn->imm) ||
- (opcode == BPF_JNE && dst_reg->var_off.value != insn->imm)) {
- /* if (imm == imm) goto pc+off;
- * only follow the goto, ignore fall-through
- */
+ if (BPF_SRC(insn->code) == BPF_K) {
+ int pred = is_branch_taken(dst_reg, insn->imm, opcode);
+
+ if (pred == 1) {
+ /* only follow the goto, ignore fall-through */
*insn_idx += insn->off;
return 0;
- } else {
- /* if (imm != imm) goto pc+off;
- * only follow fall-through branch, since
+ } else if (pred == 0) {
+ /* only follow fall-through branch, since
* that's where the program will go
*/
return 0;
struct bpf_verifier_state_list *new_sl;
struct bpf_verifier_state_list *sl;
struct bpf_verifier_state *cur = env->cur_state, *new;
- int i, j, err;
+ int i, j, err, states_cnt = 0;
sl = env->explored_states[insn_idx];
if (!sl)
return 1;
}
sl = sl->next;
+ states_cnt++;
}
+ if (!env->allow_ptr_leaks && states_cnt > BPF_COMPLEXITY_LIMIT_STATES)
+ return 0;
+
/* there were no equivalent states, remember current one.
* technically the current state is not proven to be safe yet,
* but it will either reach outer most bpf_exit (which means it's safe)
goto process_bpf_exit;
}
+ if (signal_pending(current))
+ return -EAGAIN;
+
if (need_resched())
cond_resched();
return;
/* NOTE: fake 'exit' subprog should be updated as well. */
for (i = 0; i <= env->subprog_cnt; i++) {
- if (env->subprog_info[i].start < off)
+ if (env->subprog_info[i].start <= off)
continue;
env->subprog_info[i].start += len - 1;
}
}
/**
- * cgroup_e_css_by_mask - obtain a cgroup's effective css for the specified ss
+ * cgroup_e_css - obtain a cgroup's effective css for the specified subsystem
* @cgrp: the cgroup of interest
* @ss: the subsystem of interest (%NULL returns @cgrp->self)
*
* enabled. If @ss is associated with the hierarchy @cgrp is on, this
* function is guaranteed to return non-NULL css.
*/
-static struct cgroup_subsys_state *cgroup_e_css_by_mask(struct cgroup *cgrp,
- struct cgroup_subsys *ss)
+static struct cgroup_subsys_state *cgroup_e_css(struct cgroup *cgrp,
+ struct cgroup_subsys *ss)
{
lockdep_assert_held(&cgroup_mutex);
return cgroup_css(cgrp, ss);
}
-/**
- * cgroup_e_css - obtain a cgroup's effective css for the specified subsystem
- * @cgrp: the cgroup of interest
- * @ss: the subsystem of interest
- *
- * Find and get the effective css of @cgrp for @ss. The effective css is
- * defined as the matching css of the nearest ancestor including self which
- * has @ss enabled. If @ss is not mounted on the hierarchy @cgrp is on,
- * the root css is returned, so this function always returns a valid css.
- *
- * The returned css is not guaranteed to be online, and therefore it is the
- * callers responsiblity to tryget a reference for it.
- */
-struct cgroup_subsys_state *cgroup_e_css(struct cgroup *cgrp,
- struct cgroup_subsys *ss)
-{
- struct cgroup_subsys_state *css;
-
- do {
- css = cgroup_css(cgrp, ss);
-
- if (css)
- return css;
- cgrp = cgroup_parent(cgrp);
- } while (cgrp);
-
- return init_css_set.subsys[ss->id];
-}
-
/**
* cgroup_get_e_css - get a cgroup's effective css for the specified subsystem
* @cgrp: the cgroup of interest
*
* Should be called under cgroup_[tree_]mutex.
*/
-#define for_each_e_css(css, ssid, cgrp) \
- for ((ssid) = 0; (ssid) < CGROUP_SUBSYS_COUNT; (ssid)++) \
- if (!((css) = cgroup_e_css_by_mask(cgrp, \
- cgroup_subsys[(ssid)]))) \
- ; \
+#define for_each_e_css(css, ssid, cgrp) \
+ for ((ssid) = 0; (ssid) < CGROUP_SUBSYS_COUNT; (ssid)++) \
+ if (!((css) = cgroup_e_css(cgrp, cgroup_subsys[(ssid)]))) \
+ ; \
else
/**
* @ss is in this hierarchy, so we want the
* effective css from @cgrp.
*/
- template[i] = cgroup_e_css_by_mask(cgrp, ss);
+ template[i] = cgroup_e_css(cgrp, ss);
} else {
/*
* @ss is not in this hierarchy, so we don't want
return ret;
/*
- * At this point, cgroup_e_css_by_mask() results reflect the new csses
+ * At this point, cgroup_e_css() results reflect the new csses
* making the following cgroup_update_dfl_csses() properly update
* css associations of all tasks in the subtree.
*/
CONFIG_KVM_GUEST=y
CONFIG_S390_GUEST=y
CONFIG_VIRTIO=y
+CONFIG_VIRTIO_MENU=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BLK=y
CONFIG_VIRTIO_CONSOLE=y
#include <linux/sched/signal.h>
#include <linux/sched/hotplug.h>
#include <linux/sched/task.h>
+#include <linux/sched/smt.h>
#include <linux/unistd.h>
#include <linux/cpu.h>
#include <linux/oom.h>
#endif /* CONFIG_HOTPLUG_CPU */
+/*
+ * Architectures that need SMT-specific errata handling during SMT hotplug
+ * should override this.
+ */
+void __weak arch_smt_update(void) { }
+
#ifdef CONFIG_HOTPLUG_SMT
enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED;
EXPORT_SYMBOL_GPL(cpu_smt_control);
* concurrent CPU hotplug via cpu_add_remove_lock.
*/
lockup_detector_cleanup();
+ arch_smt_update();
return ret;
}
ret = cpuhp_up_callbacks(cpu, st, target);
out:
cpus_write_unlock();
+ arch_smt_update();
return ret;
}
kobject_uevent(&dev->kobj, KOBJ_ONLINE);
}
-/*
- * Architectures that need SMT-specific errata handling during SMT hotplug
- * should override this.
- */
-void __weak arch_smt_update(void) { };
-
static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
{
int cpu, ret = 0;
kdb_printf("no process for cpu %ld\n", cpu);
return 0;
}
- sprintf(buf, "btt 0x%p\n", KDB_TSK(cpu));
+ sprintf(buf, "btt 0x%px\n", KDB_TSK(cpu));
kdb_parse(buf);
return 0;
}
kdb_printf("btc: cpu status: ");
kdb_parse("cpu\n");
for_each_online_cpu(cpu) {
- sprintf(buf, "btt 0x%p\n", KDB_TSK(cpu));
+ sprintf(buf, "btt 0x%px\n", KDB_TSK(cpu));
kdb_parse(buf);
touch_nmi_watchdog();
}
int count;
int i;
int diag, dtab_count;
- int key;
+ int key, buf_size, ret;
diag = kdbgetintenv("DTABCOUNT", &dtab_count);
else
p_tmp = tmpbuffer;
len = strlen(p_tmp);
- count = kallsyms_symbol_complete(p_tmp,
- sizeof(tmpbuffer) -
- (p_tmp - tmpbuffer));
+ buf_size = sizeof(tmpbuffer) - (p_tmp - tmpbuffer);
+ count = kallsyms_symbol_complete(p_tmp, buf_size);
if (tab == 2 && count > 0) {
kdb_printf("\n%d symbols are found.", count);
if (count > dtab_count) {
}
kdb_printf("\n");
for (i = 0; i < count; i++) {
- if (WARN_ON(!kallsyms_symbol_next(p_tmp, i)))
+ ret = kallsyms_symbol_next(p_tmp, i, buf_size);
+ if (WARN_ON(!ret))
break;
- kdb_printf("%s ", p_tmp);
+ if (ret != -E2BIG)
+ kdb_printf("%s ", p_tmp);
+ else
+ kdb_printf("%s... ", p_tmp);
*(p_tmp + len) = '\0';
}
if (i >= dtab_count)
case KT_LATIN:
if (isprint(keychar))
break; /* printable characters */
- /* drop through */
+ /* fall through */
case KT_SPEC:
if (keychar == K_ENTER)
break;
- /* drop through */
+ /* fall through */
default:
return -1; /* ignore unprintables */
}
if (reason == KDB_REASON_DEBUG) {
/* special case below */
} else {
- kdb_printf("\nEntering kdb (current=0x%p, pid %d) ",
+ kdb_printf("\nEntering kdb (current=0x%px, pid %d) ",
kdb_current, kdb_current ? kdb_current->pid : 0);
#if defined(CONFIG_SMP)
kdb_printf("on processor %d ", raw_smp_processor_id());
*/
switch (db_result) {
case KDB_DB_BPT:
- kdb_printf("\nEntering kdb (0x%p, pid %d) ",
+ kdb_printf("\nEntering kdb (0x%px, pid %d) ",
kdb_current, kdb_current->pid);
#if defined(CONFIG_SMP)
kdb_printf("on processor %d ", raw_smp_processor_id());
char cbuf[32];
char *c = cbuf;
int i;
+ int j;
unsigned long word;
memset(cbuf, '\0', sizeof(cbuf));
wc.word = word;
#define printable_char(c) \
({unsigned char __c = c; isascii(__c) && isprint(__c) ? __c : '.'; })
- switch (bytesperword) {
- case 8:
+ for (j = 0; j < bytesperword; j++)
*c++ = printable_char(*cp++);
- *c++ = printable_char(*cp++);
- *c++ = printable_char(*cp++);
- *c++ = printable_char(*cp++);
- addr += 4;
- case 4:
- *c++ = printable_char(*cp++);
- *c++ = printable_char(*cp++);
- addr += 2;
- case 2:
- *c++ = printable_char(*cp++);
- addr++;
- case 1:
- *c++ = printable_char(*cp++);
- addr++;
- break;
- }
+ addr += bytesperword;
#undef printable_char
}
}
if (mod->state == MODULE_STATE_UNFORMED)
continue;
- kdb_printf("%-20s%8u 0x%p ", mod->name,
+ kdb_printf("%-20s%8u 0x%px ", mod->name,
mod->core_layout.size, (void *)mod);
#ifdef CONFIG_MODULE_UNLOAD
kdb_printf("%4d ", module_refcount(mod));
kdb_printf(" (Loading)");
else
kdb_printf(" (Live)");
- kdb_printf(" 0x%p", mod->core_layout.base);
+ kdb_printf(" 0x%px", mod->core_layout.base);
#ifdef CONFIG_MODULE_UNLOAD
{
return;
cpu = kdb_process_cpu(p);
- kdb_printf("0x%p %8d %8d %d %4d %c 0x%p %c%s\n",
+ kdb_printf("0x%px %8d %8d %d %4d %c 0x%px %c%s\n",
(void *)p, p->pid, p->parent->pid,
kdb_task_has_cpu(p), kdb_process_cpu(p),
kdb_task_state_char(p),
} else {
if (KDB_TSK(cpu) != p)
kdb_printf(" Error: does not match running "
- "process table (0x%p)\n", KDB_TSK(cpu));
+ "process table (0x%px)\n", KDB_TSK(cpu));
}
}
}
for_each_kdbcmd(kp, i) {
if (kp->cmd_name && (strcmp(kp->cmd_name, cmd) == 0)) {
kdb_printf("Duplicate kdb command registered: "
- "%s, func %p help %s\n", cmd, func, help);
+ "%s, func %px help %s\n", cmd, func, help);
return 1;
}
}
unsigned long sym_start;
unsigned long sym_end;
} kdb_symtab_t;
-extern int kallsyms_symbol_next(char *prefix_name, int flag);
+extern int kallsyms_symbol_next(char *prefix_name, int flag, int buf_size);
extern int kallsyms_symbol_complete(char *prefix_name, int max_len);
/* Exported Symbols for kernel loadable modules to use. */
int kdbgetsymval(const char *symname, kdb_symtab_t *symtab)
{
if (KDB_DEBUG(AR))
- kdb_printf("kdbgetsymval: symname=%s, symtab=%p\n", symname,
+ kdb_printf("kdbgetsymval: symname=%s, symtab=%px\n", symname,
symtab);
memset(symtab, 0, sizeof(*symtab));
symtab->sym_start = kallsyms_lookup_name(symname);
char *knt1 = NULL;
if (KDB_DEBUG(AR))
- kdb_printf("kdbnearsym: addr=0x%lx, symtab=%p\n", addr, symtab);
+ kdb_printf("kdbnearsym: addr=0x%lx, symtab=%px\n", addr, symtab);
memset(symtab, 0, sizeof(*symtab));
if (addr < 4096)
symtab->mod_name = "kernel";
if (KDB_DEBUG(AR))
kdb_printf("kdbnearsym: returns %d symtab->sym_start=0x%lx, "
- "symtab->mod_name=%p, symtab->sym_name=%p (%s)\n", ret,
+ "symtab->mod_name=%px, symtab->sym_name=%px (%s)\n", ret,
symtab->sym_start, symtab->mod_name, symtab->sym_name,
symtab->sym_name);
* Parameters:
* prefix_name prefix of a symbol name to lookup
* flag 0 means search from the head, 1 means continue search.
+ * buf_size maximum length that can be written to prefix_name
+ * buffer
* Returns:
* 1 if a symbol matches the given prefix.
* 0 if no string found
*/
-int kallsyms_symbol_next(char *prefix_name, int flag)
+int kallsyms_symbol_next(char *prefix_name, int flag, int buf_size)
{
int prefix_len = strlen(prefix_name);
static loff_t pos;
pos = 0;
while ((name = kdb_walk_kallsyms(&pos))) {
- if (strncmp(name, prefix_name, prefix_len) == 0) {
- strncpy(prefix_name, name, strlen(name)+1);
- return 1;
- }
+ if (!strncmp(name, prefix_name, prefix_len))
+ return strscpy(prefix_name, name, buf_size);
}
return 0;
}
*word = w8;
break;
}
- /* drop through */
+ /* fall through */
default:
diag = KDB_BADWIDTH;
kdb_printf("kdb_getphysword: bad width %ld\n", (long) size);
*word = w8;
break;
}
- /* drop through */
+ /* fall through */
default:
diag = KDB_BADWIDTH;
kdb_printf("kdb_getword: bad width %ld\n", (long) size);
diag = kdb_putarea(addr, w8);
break;
}
- /* drop through */
+ /* fall through */
default:
diag = KDB_BADWIDTH;
kdb_printf("kdb_putword: bad width %ld\n", (long) size);
__func__, dah_first);
if (dah_first) {
h_used = (struct debug_alloc_header *)debug_alloc_pool;
- kdb_printf("%s: h_used %p size %d\n", __func__, h_used,
+ kdb_printf("%s: h_used %px size %d\n", __func__, h_used,
h_used->size);
}
do {
h_used = (struct debug_alloc_header *)
((char *)h_free + dah_overhead + h_free->size);
- kdb_printf("%s: h_used %p size %d caller %p\n",
+ kdb_printf("%s: h_used %px size %d caller %px\n",
__func__, h_used, h_used->size, h_used->caller);
h_free = (struct debug_alloc_header *)
(debug_alloc_pool + h_free->next);
((char *)h_free + dah_overhead + h_free->size);
if ((char *)h_used - debug_alloc_pool !=
sizeof(debug_alloc_pool_aligned))
- kdb_printf("%s: h_used %p size %d caller %p\n",
+ kdb_printf("%s: h_used %px size %d caller %px\n",
__func__, h_used, h_used->size, h_used->caller);
out:
spin_unlock(&dap_lock);
}
if (!dev_is_dma_coherent(dev) &&
- (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
+ (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0 &&
+ dev_addr != DIRECT_MAPPING_ERROR)
arch_sync_dma_for_device(dev, phys, size, dir);
return dev_addr;
/*
* Do not update time when cgroup is not active
*/
- if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup))
+ if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup))
__update_cgrp_time(event->cgrp);
}
* gets called, we don't get a chance to remove uprobe from
* delayed_uprobe_list from remove_breakpoint(). Do it here.
*/
+ mutex_lock(&delayed_uprobe_lock);
delayed_uprobe_remove(uprobe, NULL);
+ mutex_unlock(&delayed_uprobe_lock);
kfree(uprobe);
}
}
BUG_ON((uprobe->offset & ~PAGE_MASK) +
UPROBE_SWBP_INSN_SIZE > PAGE_SIZE);
- smp_wmb(); /* pairs with rmb() in find_active_uprobe() */
+ smp_wmb(); /* pairs with the smp_rmb() in handle_swbp() */
set_bit(UPROBE_COPY_INSN, &uprobe->flags);
out:
* After we hit the bp, _unregister + _register can install the
* new and not-yet-analyzed uprobe at the same address, restart.
*/
- smp_rmb(); /* pairs with wmb() in install_breakpoint() */
if (unlikely(!test_bit(UPROBE_COPY_INSN, &uprobe->flags)))
goto out;
+ /*
+ * Pairs with the smp_wmb() in prepare_uprobe().
+ *
+ * Guarantees that if we see the UPROBE_COPY_INSN bit set, then
+ * we must also see the stores to &uprobe->arch performed by the
+ * prepare_uprobe() call.
+ */
+ smp_rmb();
+
/* Tracing handlers use ->utask to communicate with fetch methods */
if (!get_utask())
goto out;
#include <linux/kcov.h>
#include <linux/livepatch.h>
#include <linux/thread_info.h>
+#include <linux/stackleak.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
if (retval)
goto bad_fork_cleanup_io;
+ stackleak_task_init(p);
+
if (pid != &init_struct_pid) {
pid = alloc_pid(p->nsproxy->pid_ns_for_children);
if (IS_ERR(pid)) {
#include <linux/cpu.h>
#include <linux/irq.h>
-#define IRQ_MATRIX_SIZE (BITS_TO_LONGS(IRQ_MATRIX_BITS) * sizeof(unsigned long))
+#define IRQ_MATRIX_SIZE (BITS_TO_LONGS(IRQ_MATRIX_BITS))
struct cpumap {
unsigned int available;
struct task_struct *t;
};
-static bool check_kcov_mode(enum kcov_mode needed_mode, struct task_struct *t)
+static notrace bool check_kcov_mode(enum kcov_mode needed_mode, struct task_struct *t)
{
unsigned int mode;
return mode == needed_mode;
}
-static unsigned long canonicalize_ip(unsigned long ip)
+static notrace unsigned long canonicalize_ip(unsigned long ip)
{
#ifdef CONFIG_RANDOMIZE_BASE
ip -= kaslr_offset();
#include <linux/elf.h>
#include <linux/elfcore.h>
#include <linux/kernel.h>
-#include <linux/kexec.h>
-#include <linux/slab.h>
#include <linux/syscalls.h>
#include <linux/vmalloc.h>
#include "kexec_internal.h"
static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
{
- if (mode & PTRACE_MODE_SCHED)
- return false;
-
if (mode & PTRACE_MODE_NOAUDIT)
return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
else
!ptrace_has_cap(mm->user_ns, mode)))
return -EPERM;
- if (mode & PTRACE_MODE_SCHED)
- return 0;
return security_ptrace_access_check(task, mode);
}
-bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode)
-{
- return __ptrace_may_access(task, mode | PTRACE_MODE_SCHED);
-}
-
bool ptrace_may_access(struct task_struct *task, unsigned int mode)
{
int err;
EXPORT_SYMBOL(release_resource);
/**
- * Finds the lowest iomem resource that covers part of [start..end]. The
- * caller must specify start, end, flags, and desc (which may be
+ * Finds the lowest iomem resource that covers part of [@start..@end]. The
+ * caller must specify @start, @end, @flags, and @desc (which may be
* IORES_DESC_NONE).
*
- * If a resource is found, returns 0 and *res is overwritten with the part
- * of the resource that's within [start..end]; if none is found, returns
- * -1.
+ * If a resource is found, returns 0 and @*res is overwritten with the part
+ * of the resource that's within [@start..@end]; if none is found, returns
+ * -1 or -EINVAL for other invalid parameters.
*
* This function walks the whole tree and not just first level children
* unless @first_lvl is true.
+ *
+ * @start: start address of the resource searched for
+ * @end: end address of same resource
+ * @flags: flags which the resource must have
+ * @desc: descriptor the resource must have
+ * @first_lvl: walk only the first level children, if set
+ * @res: return ptr, if resource found
*/
static int find_next_iomem_res(resource_size_t start, resource_size_t end,
unsigned long flags, unsigned long desc,
* @flags: I/O resource flags
* @start: start addr
* @end: end addr
+ * @arg: function argument for the callback @func
+ * @func: callback function that is called for each qualifying resource area
*
* NOTE: For a new descriptor search, define a new IORES_DESC in
* <linux/ioport.h> and set it in 'desc' of a target resource entry.
#ifdef CONFIG_SCHED_SMT
/*
- * The sched_smt_present static key needs to be evaluated on every
- * hotplug event because at boot time SMT might be disabled when
- * the number of booted CPUs is limited.
- *
- * If then later a sibling gets hotplugged, then the key would stay
- * off and SMT scheduling would never be functional.
+ * When going up, increment the number of cores with SMT present.
*/
- if (cpumask_weight(cpu_smt_mask(cpu)) > 1)
- static_branch_enable_cpuslocked(&sched_smt_present);
+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+ static_branch_inc_cpuslocked(&sched_smt_present);
#endif
set_cpu_active(cpu, true);
*/
synchronize_rcu_mult(call_rcu, call_rcu_sched);
+#ifdef CONFIG_SCHED_SMT
+ /*
+ * When going down, decrement the number of cores with SMT present.
+ */
+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+ static_branch_dec_cpuslocked(&sched_smt_present);
+#endif
+
if (!sched_smp_initialized)
return 0;
/*
* There's no userspace yet to cause hotplug operations; hence all the
* CPU masks are stable and all blatant races in the below code cannot
- * happen.
+ * happen. The hotplug lock is nevertheless taken to satisfy lockdep,
+ * but there won't be any contention on it.
*/
+ cpus_read_lock();
mutex_lock(&sched_domains_mutex);
sched_init_domains(cpu_active_mask);
mutex_unlock(&sched_domains_mutex);
+ cpus_read_unlock();
/* Move init over to a non-isolated CPU */
if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_FLAG_DOMAIN)) < 0)
local = 1;
/*
- * Retry task to preferred node migration periodically, in case it
- * case it previously failed, or the scheduler moved us.
+ * Retry to migrate task to preferred node periodically, in case it
+ * previously failed, or the scheduler moved us.
*/
if (time_after(jiffies, p->numa_migrate_retry)) {
task_numa_placement(p);
return target;
}
-static unsigned long cpu_util_wake(int cpu, struct task_struct *p);
+static unsigned long cpu_util_without(int cpu, struct task_struct *p);
-static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
+static unsigned long capacity_spare_without(int cpu, struct task_struct *p)
{
- return max_t(long, capacity_of(cpu) - cpu_util_wake(cpu, p), 0);
+ return max_t(long, capacity_of(cpu) - cpu_util_without(cpu, p), 0);
}
/*
avg_load += cfs_rq_load_avg(&cpu_rq(i)->cfs);
- spare_cap = capacity_spare_wake(i, p);
+ spare_cap = capacity_spare_without(i, p);
if (spare_cap > max_spare_cap)
max_spare_cap = spare_cap;
return prev_cpu;
/*
- * We need task's util for capacity_spare_wake, sync it up to prev_cpu's
- * last_update_time.
+ * We need task's util for capacity_spare_without, sync it up to
+ * prev_cpu's last_update_time.
*/
if (!(sd_flag & SD_BALANCE_FORK))
sync_entity_load_avg(&p->se);
}
/*
- * cpu_util_wake: Compute CPU utilization with any contributions from
- * the waking task p removed.
+ * cpu_util_without: compute cpu utilization without any contributions from *p
+ * @cpu: the CPU which utilization is requested
+ * @p: the task which utilization should be discounted
+ *
+ * The utilization of a CPU is defined by the utilization of tasks currently
+ * enqueued on that CPU as well as tasks which are currently sleeping after an
+ * execution on that CPU.
+ *
+ * This method returns the utilization of the specified CPU by discounting the
+ * utilization of the specified task, whenever the task is currently
+ * contributing to the CPU utilization.
*/
-static unsigned long cpu_util_wake(int cpu, struct task_struct *p)
+static unsigned long cpu_util_without(int cpu, struct task_struct *p)
{
struct cfs_rq *cfs_rq;
unsigned int util;
cfs_rq = &cpu_rq(cpu)->cfs;
util = READ_ONCE(cfs_rq->avg.util_avg);
- /* Discount task's blocked util from CPU's util */
+ /* Discount task's util from CPU's util */
util -= min_t(unsigned int, util, task_util(p));
/*
* a) if *p is the only task sleeping on this CPU, then:
* cpu_util (== task_util) > util_est (== 0)
* and thus we return:
- * cpu_util_wake = (cpu_util - task_util) = 0
+ * cpu_util_without = (cpu_util - task_util) = 0
*
* b) if other tasks are SLEEPING on this CPU, which is now exiting
* IDLE, then:
* cpu_util >= task_util
* cpu_util > util_est (== 0)
* and thus we discount *p's blocked utilization to return:
- * cpu_util_wake = (cpu_util - task_util) >= 0
+ * cpu_util_without = (cpu_util - task_util) >= 0
*
* c) if other tasks are RUNNABLE on that CPU and
* util_est > cpu_util
* covered by the following code when estimated utilization is
* enabled.
*/
- if (sched_feat(UTIL_EST))
- util = max(util, READ_ONCE(cfs_rq->avg.util_est.enqueued));
+ if (sched_feat(UTIL_EST)) {
+ unsigned int estimated =
+ READ_ONCE(cfs_rq->avg.util_est.enqueued);
+
+ /*
+ * Despite the following checks we still have a small window
+ * for a possible race, when an execl's select_task_rq_fair()
+ * races with LB's detach_task():
+ *
+ * detach_task()
+ * p->on_rq = TASK_ON_RQ_MIGRATING;
+ * ---------------------------------- A
+ * deactivate_task() \
+ * dequeue_task() + RaceTime
+ * util_est_dequeue() /
+ * ---------------------------------- B
+ *
+ * The additional check on "current == p" it's required to
+ * properly fix the execl regression and it helps in further
+ * reducing the chances for the above race.
+ */
+ if (unlikely(task_on_rq_queued(p) || current == p)) {
+ estimated -= min_t(unsigned int, estimated,
+ (_task_util_est(p) | UTIL_AVG_UNCHANGED));
+ }
+ util = max(util, estimated);
+ }
/*
* Utilization (estimated) can exceed the CPU capacity, thus let's
static int psi_bug __read_mostly;
-bool psi_disabled __read_mostly;
-core_param(psi_disabled, psi_disabled, bool, 0644);
+DEFINE_STATIC_KEY_FALSE(psi_disabled);
+
+#ifdef CONFIG_PSI_DEFAULT_DISABLED
+bool psi_enable;
+#else
+bool psi_enable = true;
+#endif
+static int __init setup_psi(char *str)
+{
+ return kstrtobool(str, &psi_enable) == 0;
+}
+__setup("psi=", setup_psi);
/* Running averages - we need to be higher-res than loadavg */
#define PSI_FREQ (2*HZ+1) /* 2 sec intervals */
void __init psi_init(void)
{
- if (psi_disabled)
+ if (!psi_enable) {
+ static_branch_enable(&psi_disabled);
return;
+ }
psi_period = jiffies_to_nsecs(PSI_FREQ);
group_init(&psi_system);
struct rq_flags rf;
struct rq *rq;
- if (psi_disabled)
+ if (static_branch_likely(&psi_disabled))
return;
*flags = current->flags & PF_MEMSTALL;
struct rq_flags rf;
struct rq *rq;
- if (psi_disabled)
+ if (static_branch_likely(&psi_disabled))
return;
if (*flags)
#ifdef CONFIG_CGROUPS
int psi_cgroup_alloc(struct cgroup *cgroup)
{
- if (psi_disabled)
+ if (static_branch_likely(&psi_disabled))
return 0;
cgroup->psi.pcpu = alloc_percpu(struct psi_group_cpu);
void psi_cgroup_free(struct cgroup *cgroup)
{
- if (psi_disabled)
+ if (static_branch_likely(&psi_disabled))
return;
cancel_delayed_work_sync(&cgroup->psi.clock_work);
*/
void cgroup_move_task(struct task_struct *task, struct css_set *to)
{
- bool move_psi = !psi_disabled;
unsigned int task_flags = 0;
struct rq_flags rf;
struct rq *rq;
- if (move_psi) {
- rq = task_rq_lock(task, &rf);
+ if (static_branch_likely(&psi_disabled)) {
+ /*
+ * Lame to do this here, but the scheduler cannot be locked
+ * from the outside, so we move cgroups from inside sched/.
+ */
+ rcu_assign_pointer(task->cgroups, to);
+ return;
+ }
- if (task_on_rq_queued(task))
- task_flags = TSK_RUNNING;
- else if (task->in_iowait)
- task_flags = TSK_IOWAIT;
+ rq = task_rq_lock(task, &rf);
- if (task->flags & PF_MEMSTALL)
- task_flags |= TSK_MEMSTALL;
+ if (task_on_rq_queued(task))
+ task_flags = TSK_RUNNING;
+ else if (task->in_iowait)
+ task_flags = TSK_IOWAIT;
- if (task_flags)
- psi_task_change(task, task_flags, 0);
- }
+ if (task->flags & PF_MEMSTALL)
+ task_flags |= TSK_MEMSTALL;
- /*
- * Lame to do this here, but the scheduler cannot be locked
- * from the outside, so we move cgroups from inside sched/.
- */
+ if (task_flags)
+ psi_task_change(task, task_flags, 0);
+
+ /* See comment above */
rcu_assign_pointer(task->cgroups, to);
- if (move_psi) {
- if (task_flags)
- psi_task_change(task, 0, task_flags);
+ if (task_flags)
+ psi_task_change(task, 0, task_flags);
- task_rq_unlock(rq, task, &rf);
- }
+ task_rq_unlock(rq, task, &rf);
}
#endif /* CONFIG_CGROUPS */
{
int full;
- if (psi_disabled)
+ if (static_branch_likely(&psi_disabled))
return -EOPNOTSUPP;
update_stats(group);
/*
* We may dequeue prev's rt_rq in put_prev_task().
- * So, we update time before rt_nr_running check.
+ * So, we update time before rt_queued check.
*/
if (prev->sched_class == &rt_sched_class)
update_curr_rt(rq);
#include <linux/sched/prio.h>
#include <linux/sched/rt.h>
#include <linux/sched/signal.h>
+#include <linux/sched/smt.h>
#include <linux/sched/stat.h>
#include <linux/sched/sysctl.h>
#include <linux/sched/task.h>
#ifdef CONFIG_SCHED_SMT
-
-extern struct static_key_false sched_smt_present;
-
extern void __update_idle_core(struct rq *rq);
static inline void update_idle_core(struct rq *rq)
{
int clear = 0, set = TSK_RUNNING;
- if (psi_disabled)
+ if (static_branch_likely(&psi_disabled))
return;
if (!wakeup || p->sched_psi_wake_requeue) {
{
int clear = TSK_RUNNING, set = 0;
- if (psi_disabled)
+ if (static_branch_likely(&psi_disabled))
return;
if (!sleep) {
static inline void psi_ttwu_dequeue(struct task_struct *p)
{
- if (psi_disabled)
+ if (static_branch_likely(&psi_disabled))
return;
/*
* Is the task being migrated during a wakeup? Make sure to
static inline void psi_task_tick(struct rq *rq)
{
- if (psi_disabled)
+ if (static_branch_likely(&psi_disabled))
return;
if (unlikely(rq->curr->flags & PF_MEMSTALL))
int level = 0;
int i, j, k;
- sched_domains_numa_distance = kzalloc(sizeof(int) * nr_node_ids, GFP_KERNEL);
+ sched_domains_numa_distance = kzalloc(sizeof(int) * (nr_node_ids + 1), GFP_KERNEL);
if (!sched_domains_numa_distance)
return;
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This code fills the used part of the kernel stack with a poison value
+ * before returning to userspace. It's part of the STACKLEAK feature
+ * ported from grsecurity/PaX.
+ *
+ * Author: Alexander Popov <alex.popov@linux.com>
+ *
+ * STACKLEAK reduces the information which kernel stack leak bugs can
+ * reveal and blocks some uninitialized stack variable attacks.
+ */
+
+#include <linux/stackleak.h>
+#include <linux/kprobes.h>
+
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+#include <linux/jump_label.h>
+#include <linux/sysctl.h>
+
+static DEFINE_STATIC_KEY_FALSE(stack_erasing_bypass);
+
+int stack_erasing_sysctl(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ int ret = 0;
+ int state = !static_branch_unlikely(&stack_erasing_bypass);
+ int prev_state = state;
+
+ table->data = &state;
+ table->maxlen = sizeof(int);
+ ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+ state = !!state;
+ if (ret || !write || state == prev_state)
+ return ret;
+
+ if (state)
+ static_branch_disable(&stack_erasing_bypass);
+ else
+ static_branch_enable(&stack_erasing_bypass);
+
+ pr_warn("stackleak: kernel stack erasing is %s\n",
+ state ? "enabled" : "disabled");
+ return ret;
+}
+
+#define skip_erasing() static_branch_unlikely(&stack_erasing_bypass)
+#else
+#define skip_erasing() false
+#endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
+
+asmlinkage void notrace stackleak_erase(void)
+{
+ /* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
+ unsigned long kstack_ptr = current->lowest_stack;
+ unsigned long boundary = (unsigned long)end_of_stack(current);
+ unsigned int poison_count = 0;
+ const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
+
+ if (skip_erasing())
+ return;
+
+ /* Check that 'lowest_stack' value is sane */
+ if (unlikely(kstack_ptr - boundary >= THREAD_SIZE))
+ kstack_ptr = boundary;
+
+ /* Search for the poison value in the kernel stack */
+ while (kstack_ptr > boundary && poison_count <= depth) {
+ if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
+ poison_count++;
+ else
+ poison_count = 0;
+
+ kstack_ptr -= sizeof(unsigned long);
+ }
+
+ /*
+ * One 'long int' at the bottom of the thread stack is reserved and
+ * should not be poisoned (see CONFIG_SCHED_STACK_END_CHECK=y).
+ */
+ if (kstack_ptr == boundary)
+ kstack_ptr += sizeof(unsigned long);
+
+#ifdef CONFIG_STACKLEAK_METRICS
+ current->prev_lowest_stack = kstack_ptr;
+#endif
+
+ /*
+ * Now write the poison value to the kernel stack. Start from
+ * 'kstack_ptr' and move up till the new 'boundary'. We assume that
+ * the stack pointer doesn't change when we write poison.
+ */
+ if (on_thread_stack())
+ boundary = current_stack_pointer;
+ else
+ boundary = current_top_of_stack();
+
+ while (kstack_ptr < boundary) {
+ *(unsigned long *)kstack_ptr = STACKLEAK_POISON;
+ kstack_ptr += sizeof(unsigned long);
+ }
+
+ /* Reset the 'lowest_stack' value for the next syscall */
+ current->lowest_stack = current_top_of_stack() - THREAD_SIZE/64;
+}
+NOKPROBE_SYMBOL(stackleak_erase);
+
+void __used notrace stackleak_track_stack(void)
+{
+ /*
+ * N.B. stackleak_erase() fills the kernel stack with the poison value,
+ * which has the register width. That code assumes that the value
+ * of 'lowest_stack' is aligned on the register width boundary.
+ *
+ * That is true for x86 and x86_64 because of the kernel stack
+ * alignment on these platforms (for details, see 'cc_stack_align' in
+ * arch/x86/Makefile). Take care of that when you port STACKLEAK to
+ * new platforms.
+ */
+ unsigned long sp = (unsigned long)&sp;
+
+ /*
+ * Having CONFIG_STACKLEAK_TRACK_MIN_SIZE larger than
+ * STACKLEAK_SEARCH_DEPTH makes the poison search in
+ * stackleak_erase() unreliable. Let's prevent that.
+ */
+ BUILD_BUG_ON(CONFIG_STACKLEAK_TRACK_MIN_SIZE > STACKLEAK_SEARCH_DEPTH);
+
+ if (sp < current->lowest_stack &&
+ sp >= (unsigned long)task_stack_page(current) +
+ sizeof(unsigned long)) {
+ current->lowest_stack = sp;
+ }
+}
+EXPORT_SYMBOL(stackleak_track_stack);
#include <linux/kexec.h>
#include <linux/bpf.h>
#include <linux/mount.h>
-#include <linux/pipe_fs_i.h>
#include <linux/uaccess.h>
#include <asm/processor.h>
#ifdef CONFIG_CHR_DEV_SG
#include <scsi/sg.h>
#endif
-
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+#include <linux/stackleak.h>
+#endif
#ifdef CONFIG_LOCKUP_DETECTOR
#include <linux/nmi.h>
#endif
.extra1 = &zero,
.extra2 = &one,
},
+#endif
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+ {
+ .procname = "stack_erasing",
+ .data = NULL,
+ .maxlen = sizeof(int),
+ .mode = 0600,
+ .proc_handler = stack_erasing_sysctl,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
#endif
{ }
};
struct task_cputime cputime;
unsigned long soft;
- if (dl_task(tsk))
- check_dl_overrun(tsk);
-
/*
* If cputimer is not running, then there are no active
* process wide timers (POSIX 1.b, itimers, RLIMIT_CPU).
ts->tv_sec = kts.tv_sec;
/* Zero out the padding for 32 bit systems or in compat mode */
- if (IS_ENABLED(CONFIG_64BIT_TIME) && (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall()))
+ if (IS_ENABLED(CONFIG_64BIT_TIME) && in_compat_syscall())
kts.tv_nsec &= 0xFFFFFFFFUL;
ts->tv_nsec = kts.tv_nsec;
if (!bt || !(blk_tracer_flags.val & TRACE_BLK_OPT_CGROUP))
return NULL;
- if (!bio->bi_blkg)
+ if (!bio->bi_css)
return NULL;
- return cgroup_get_kernfs_id(bio_blkcg(bio)->css.cgroup);
+ return cgroup_get_kernfs_id(bio->bi_css->cgroup);
}
#else
static union kernfs_node_id *
i++;
} else if (fmt[i] == 'p' || fmt[i] == 's') {
mod[fmt_cnt]++;
- i++;
- if (!isspace(fmt[i]) && !ispunct(fmt[i]) && fmt[i] != 0)
+ /* disallow any further format extensions */
+ if (fmt[i + 1] != 0 &&
+ !isspace(fmt[i + 1]) &&
+ !ispunct(fmt[i + 1]))
return -EINVAL;
fmt_cnt++;
- if (fmt[i - 1] == 's') {
+ if (fmt[i] == 's') {
if (str_seen)
/* allow only one '%s' per fmt string */
return -EINVAL;
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
static int profile_graph_entry(struct ftrace_graph_ent *trace)
{
- int index = trace->depth;
+ int index = current->curr_ret_stack;
function_profile_call(trace->func, 0, NULL, NULL);
if (!fgraph_graph_time) {
int index;
- index = trace->depth;
+ index = current->curr_ret_stack;
/* Append this call time to the parent time to subtract */
if (index)
atomic_set(&t->tracing_graph_pause, 0);
atomic_set(&t->trace_overrun, 0);
t->curr_ret_stack = -1;
+ t->curr_ret_depth = -1;
/* Make sure the tasks see the -1 first: */
smp_wmb();
t->ret_stack = ret_stack_list[start++];
void ftrace_graph_init_idle_task(struct task_struct *t, int cpu)
{
t->curr_ret_stack = -1;
+ t->curr_ret_depth = -1;
/*
* The idle task has no parent, it either has its own
* stack or no stack at all.
/* Make sure we do not use the parent ret_stack */
t->ret_stack = NULL;
t->curr_ret_stack = -1;
+ t->curr_ret_depth = -1;
if (ftrace_graph_active) {
struct ftrace_ret_stack *ret_stack;
* can only be modified by current, we can reuse trace_recursion.
*/
TRACE_IRQ_BIT,
+
+ /* Set if the function is in the set_graph_function file */
+ TRACE_GRAPH_BIT,
+
+ /*
+ * In the very unlikely case that an interrupt came in
+ * at a start of graph tracing, and we want to trace
+ * the function in that interrupt, the depth can be greater
+ * than zero, because of the preempted start of a previous
+ * trace. In an even more unlikely case, depth could be 2
+ * if a softirq interrupted the start of graph tracing,
+ * followed by an interrupt preempting a start of graph
+ * tracing in the softirq, and depth can even be 3
+ * if an NMI came in at the start of an interrupt function
+ * that preempted a softirq start of a function that
+ * preempted normal context!!!! Luckily, it can't be
+ * greater than 3, so the next two bits are a mask
+ * of what the depth is when we set TRACE_GRAPH_BIT
+ */
+
+ TRACE_GRAPH_DEPTH_START_BIT,
+ TRACE_GRAPH_DEPTH_END_BIT,
};
#define trace_recursion_set(bit) do { (current)->trace_recursion |= (1<<(bit)); } while (0)
#define trace_recursion_clear(bit) do { (current)->trace_recursion &= ~(1<<(bit)); } while (0)
#define trace_recursion_test(bit) ((current)->trace_recursion & (1<<(bit)))
+#define trace_recursion_depth() \
+ (((current)->trace_recursion >> TRACE_GRAPH_DEPTH_START_BIT) & 3)
+#define trace_recursion_set_depth(depth) \
+ do { \
+ current->trace_recursion &= \
+ ~(3 << TRACE_GRAPH_DEPTH_START_BIT); \
+ current->trace_recursion |= \
+ ((depth) & 3) << TRACE_GRAPH_DEPTH_START_BIT; \
+ } while (0)
+
#define TRACE_CONTEXT_BITS 4
#define TRACE_FTRACE_START TRACE_FTRACE_BIT
extern struct ftrace_hash *ftrace_graph_hash;
extern struct ftrace_hash *ftrace_graph_notrace_hash;
-static inline int ftrace_graph_addr(unsigned long addr)
+static inline int ftrace_graph_addr(struct ftrace_graph_ent *trace)
{
+ unsigned long addr = trace->func;
int ret = 0;
preempt_disable_notrace();
}
if (ftrace_lookup_ip(ftrace_graph_hash, addr)) {
+
+ /*
+ * This needs to be cleared on the return functions
+ * when the depth is zero.
+ */
+ trace_recursion_set(TRACE_GRAPH_BIT);
+ trace_recursion_set_depth(trace->depth);
+
/*
* If no irqs are to be traced, but a set_graph_function
* is set, and called by an interrupt handler, we still
return ret;
}
+static inline void ftrace_graph_addr_finish(struct ftrace_graph_ret *trace)
+{
+ if (trace_recursion_test(TRACE_GRAPH_BIT) &&
+ trace->depth == trace_recursion_depth())
+ trace_recursion_clear(TRACE_GRAPH_BIT);
+}
+
static inline int ftrace_graph_notrace_addr(unsigned long addr)
{
int ret = 0;
return ret;
}
#else
-static inline int ftrace_graph_addr(unsigned long addr)
+static inline int ftrace_graph_addr(struct ftrace_graph_ent *trace)
{
return 1;
}
{
return 0;
}
+static inline void ftrace_graph_addr_finish(struct ftrace_graph_ret *trace)
+{ }
#endif /* CONFIG_DYNAMIC_FTRACE */
extern unsigned int fgraph_max_depth;
static inline bool ftrace_graph_ignore_func(struct ftrace_graph_ent *trace)
{
/* trace it when it is-nested-in or is a function enabled. */
- return !(trace->depth || ftrace_graph_addr(trace->func)) ||
+ return !(trace_recursion_test(TRACE_GRAPH_BIT) ||
+ ftrace_graph_addr(trace)) ||
(trace->depth < 0) ||
(fgraph_max_depth && trace->depth >= fgraph_max_depth);
}
struct trace_seq *s, u32 flags);
/* Add a function return address to the trace stack on thread info.*/
-int
-ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth,
+static int
+ftrace_push_return_trace(unsigned long ret, unsigned long func,
unsigned long frame_pointer, unsigned long *retp)
{
unsigned long long calltime;
#ifdef HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
current->ret_stack[index].retp = retp;
#endif
- *depth = current->curr_ret_stack;
+ return 0;
+}
+
+int function_graph_enter(unsigned long ret, unsigned long func,
+ unsigned long frame_pointer, unsigned long *retp)
+{
+ struct ftrace_graph_ent trace;
+
+ trace.func = func;
+ trace.depth = ++current->curr_ret_depth;
+
+ if (ftrace_push_return_trace(ret, func,
+ frame_pointer, retp))
+ goto out;
+
+ /* Only trace if the calling function expects to */
+ if (!ftrace_graph_entry(&trace))
+ goto out_ret;
return 0;
+ out_ret:
+ current->curr_ret_stack--;
+ out:
+ current->curr_ret_depth--;
+ return -EBUSY;
}
/* Retrieve a function return address to the trace stack on thread info.*/
trace->func = current->ret_stack[index].func;
trace->calltime = current->ret_stack[index].calltime;
trace->overrun = atomic_read(¤t->trace_overrun);
- trace->depth = index;
+ trace->depth = current->curr_ret_depth--;
+ /*
+ * We still want to trace interrupts coming in if
+ * max_depth is set to 1. Make sure the decrement is
+ * seen before ftrace_graph_return.
+ */
+ barrier();
}
/*
ftrace_pop_return_trace(&trace, &ret, frame_pointer);
trace.rettime = trace_clock_local();
+ ftrace_graph_return(&trace);
+ /*
+ * The ftrace_graph_return() may still access the current
+ * ret_stack structure, we need to make sure the update of
+ * curr_ret_stack is after that.
+ */
barrier();
current->curr_ret_stack--;
/*
return ret;
}
- /*
- * The trace should run after decrementing the ret counter
- * in case an interrupt were to come in. We don't want to
- * lose the interrupt if max_depth is set.
- */
- ftrace_graph_return(&trace);
-
if (unlikely(!ret)) {
ftrace_graph_stop();
WARN_ON(1);
int cpu;
int pc;
+ ftrace_graph_addr_finish(trace);
+
local_irq_save(flags);
cpu = raw_smp_processor_id();
data = per_cpu_ptr(tr->trace_buffer.data, cpu);
static void trace_graph_thresh_return(struct ftrace_graph_ret *trace)
{
+ ftrace_graph_addr_finish(trace);
+
if (tracing_thresh &&
(trace->rettime - trace->calltime < tracing_thresh))
return;
unsigned long flags;
int pc;
+ ftrace_graph_addr_finish(trace);
+
if (!func_prolog_dec(tr, &data, &flags))
return;
if (code[1].op != FETCH_OP_IMM)
return -EINVAL;
- tmp = strpbrk("+-", code->data);
+ tmp = strpbrk(code->data, "+-");
if (tmp)
c = *tmp;
ret = traceprobe_split_symbol_offset(code->data,
unsigned long flags;
int pc;
+ ftrace_graph_addr_finish(trace);
+
if (!func_prolog_preempt_disable(tr, &data, &pc))
return;
if (!new_idmap_permitted(file, ns, cap_setid, &new_map))
goto out;
- ret = sort_idmaps(&new_map);
- if (ret < 0)
- goto out;
-
ret = -EPERM;
/* Map the lower ids from the parent user namespace to the
* kernel global id space.
e->lower_first = lower_first;
}
+ /*
+ * If we want to use binary search for lookup, this clones the extent
+ * array and sorts both copies.
+ */
+ ret = sort_idmaps(&new_map);
+ if (ret < 0)
+ goto out;
+
/* Install the map */
if (new_map.nr_extents <= UID_GID_MAP_MAX_BASE_EXTENTS) {
memcpy(map->extent, new_map.extent,
if (!new)
return;
- kmemleak_ignore(new);
raw_spin_lock_irqsave(&pool_lock, flags);
hlist_add_head(&new->node, &obj_pool);
debug_objects_allocated++;
obj = kmem_cache_zalloc(obj_cache, GFP_KERNEL);
if (!obj)
goto free;
- kmemleak_ignore(obj);
hlist_add_head(&obj->node, &objects);
}
obj_cache = kmem_cache_create("debug_objects_cache",
sizeof (struct debug_obj), 0,
- SLAB_DEBUG_OBJECTS, NULL);
+ SLAB_DEBUG_OBJECTS | SLAB_NOLEAKTRACE,
+ NULL);
if (!obj_cache || debug_objects_replace_static_objects()) {
debug_objects_enabled = 0;
const struct kvec *kvec; \
struct kvec v; \
iterate_kvec(i, n, v, kvec, skip, (K)) \
+ } else if (unlikely(i->type & ITER_DISCARD)) { \
} else { \
const struct iovec *iov; \
struct iovec v; \
} \
i->nr_segs -= kvec - i->kvec; \
i->kvec = kvec; \
+ } else if (unlikely(i->type & ITER_DISCARD)) { \
+ skip += n; \
} else { \
const struct iovec *iov; \
struct iovec v; \
}
EXPORT_SYMBOL(iov_iter_fault_in_readable);
-void iov_iter_init(struct iov_iter *i, int direction,
+void iov_iter_init(struct iov_iter *i, unsigned int direction,
const struct iovec *iov, unsigned long nr_segs,
size_t count)
{
+ WARN_ON(direction & ~(READ | WRITE));
+ direction &= READ | WRITE;
+
/* It will get better. Eventually... */
if (uaccess_kernel()) {
- direction |= ITER_KVEC;
- i->type = direction;
+ i->type = ITER_KVEC | direction;
i->kvec = (struct kvec *)iov;
} else {
- i->type = direction;
+ i->type = ITER_IOVEC | direction;
i->iov = iov;
}
i->nr_segs = nr_segs;
return bytes;
}
+static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes,
+ __wsum *csum, struct iov_iter *i)
+{
+ struct pipe_inode_info *pipe = i->pipe;
+ size_t n, r;
+ size_t off = 0;
+ __wsum sum = *csum, next;
+ int idx;
+
+ if (!sanity(i))
+ return 0;
+
+ bytes = n = push_pipe(i, bytes, &idx, &r);
+ if (unlikely(!n))
+ return 0;
+ for ( ; n; idx = next_idx(idx, pipe), r = 0) {
+ size_t chunk = min_t(size_t, n, PAGE_SIZE - r);
+ char *p = kmap_atomic(pipe->bufs[idx].page);
+ next = csum_partial_copy_nocheck(addr, p + r, chunk, 0);
+ sum = csum_block_add(sum, next, off);
+ kunmap_atomic(p);
+ i->idx = idx;
+ i->iov_offset = r + chunk;
+ n -= chunk;
+ off += chunk;
+ addr += chunk;
+ }
+ i->count -= bytes;
+ *csum = sum;
+ return bytes;
+}
+
size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
{
const char *from = addr;
- if (unlikely(i->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(i)))
return copy_pipe_to_iter(addr, bytes, i);
if (iter_is_iovec(i))
might_fault();
const char *from = addr;
unsigned long rem, curr_addr, s_addr = (unsigned long) addr;
- if (unlikely(i->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(i)))
return copy_pipe_to_iter_mcsafe(addr, bytes, i);
if (iter_is_iovec(i))
might_fault();
size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
{
char *to = addr;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return 0;
}
bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
{
char *to = addr;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return false;
}
size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
{
char *to = addr;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return 0;
}
size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
{
char *to = addr;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return 0;
}
bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
{
char *to = addr;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return false;
}
size_t wanted = copy_to_iter(kaddr + offset, bytes, i);
kunmap_atomic(kaddr);
return wanted;
- } else if (likely(!(i->type & ITER_PIPE)))
+ } else if (unlikely(iov_iter_is_discard(i)))
+ return bytes;
+ else if (likely(!iov_iter_is_pipe(i)))
return copy_page_to_iter_iovec(page, offset, bytes, i);
else
return copy_page_to_iter_pipe(page, offset, bytes, i);
{
if (unlikely(!page_copy_sane(page, offset, bytes)))
return 0;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
WARN_ON(1);
return 0;
}
size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
{
- if (unlikely(i->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(i)))
return pipe_zero(bytes, i);
iterate_and_advance(i, bytes, v,
clear_user(v.iov_base, v.iov_len),
kunmap_atomic(kaddr);
return 0;
}
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
kunmap_atomic(kaddr);
WARN_ON(1);
return 0;
void iov_iter_advance(struct iov_iter *i, size_t size)
{
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
pipe_advance(i, size);
return;
}
+ if (unlikely(iov_iter_is_discard(i))) {
+ i->count -= size;
+ return;
+ }
iterate_and_advance(i, size, v, 0, 0, 0)
}
EXPORT_SYMBOL(iov_iter_advance);
if (WARN_ON(unroll > MAX_RW_COUNT))
return;
i->count += unroll;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
struct pipe_inode_info *pipe = i->pipe;
int idx = i->idx;
size_t off = i->iov_offset;
pipe_truncate(i);
return;
}
+ if (unlikely(iov_iter_is_discard(i)))
+ return;
if (unroll <= i->iov_offset) {
i->iov_offset -= unroll;
return;
}
unroll -= i->iov_offset;
- if (i->type & ITER_BVEC) {
+ if (iov_iter_is_bvec(i)) {
const struct bio_vec *bvec = i->bvec;
while (1) {
size_t n = (--bvec)->bv_len;
*/
size_t iov_iter_single_seg_count(const struct iov_iter *i)
{
- if (unlikely(i->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(i)))
return i->count; // it is a silly place, anyway
if (i->nr_segs == 1)
return i->count;
- else if (i->type & ITER_BVEC)
+ if (unlikely(iov_iter_is_discard(i)))
+ return i->count;
+ else if (iov_iter_is_bvec(i))
return min(i->count, i->bvec->bv_len - i->iov_offset);
else
return min(i->count, i->iov->iov_len - i->iov_offset);
}
EXPORT_SYMBOL(iov_iter_single_seg_count);
-void iov_iter_kvec(struct iov_iter *i, int direction,
+void iov_iter_kvec(struct iov_iter *i, unsigned int direction,
const struct kvec *kvec, unsigned long nr_segs,
size_t count)
{
- BUG_ON(!(direction & ITER_KVEC));
- i->type = direction;
+ WARN_ON(direction & ~(READ | WRITE));
+ i->type = ITER_KVEC | (direction & (READ | WRITE));
i->kvec = kvec;
i->nr_segs = nr_segs;
i->iov_offset = 0;
}
EXPORT_SYMBOL(iov_iter_kvec);
-void iov_iter_bvec(struct iov_iter *i, int direction,
+void iov_iter_bvec(struct iov_iter *i, unsigned int direction,
const struct bio_vec *bvec, unsigned long nr_segs,
size_t count)
{
- BUG_ON(!(direction & ITER_BVEC));
- i->type = direction;
+ WARN_ON(direction & ~(READ | WRITE));
+ i->type = ITER_BVEC | (direction & (READ | WRITE));
i->bvec = bvec;
i->nr_segs = nr_segs;
i->iov_offset = 0;
}
EXPORT_SYMBOL(iov_iter_bvec);
-void iov_iter_pipe(struct iov_iter *i, int direction,
+void iov_iter_pipe(struct iov_iter *i, unsigned int direction,
struct pipe_inode_info *pipe,
size_t count)
{
- BUG_ON(direction != ITER_PIPE);
+ BUG_ON(direction != READ);
WARN_ON(pipe->nrbufs == pipe->buffers);
- i->type = direction;
+ i->type = ITER_PIPE | READ;
i->pipe = pipe;
i->idx = (pipe->curbuf + pipe->nrbufs) & (pipe->buffers - 1);
i->iov_offset = 0;
}
EXPORT_SYMBOL(iov_iter_pipe);
+/**
+ * iov_iter_discard - Initialise an I/O iterator that discards data
+ * @i: The iterator to initialise.
+ * @direction: The direction of the transfer.
+ * @count: The size of the I/O buffer in bytes.
+ *
+ * Set up an I/O iterator that just discards everything that's written to it.
+ * It's only available as a READ iterator.
+ */
+void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count)
+{
+ BUG_ON(direction != READ);
+ i->type = ITER_DISCARD | READ;
+ i->count = count;
+ i->iov_offset = 0;
+}
+EXPORT_SYMBOL(iov_iter_discard);
+
unsigned long iov_iter_alignment(const struct iov_iter *i)
{
unsigned long res = 0;
size_t size = i->count;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
if (size && i->iov_offset && allocated(&i->pipe->bufs[i->idx]))
return size | i->iov_offset;
return size;
unsigned long res = 0;
size_t size = i->count;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
WARN_ON(1);
return ~0U;
}
if (maxsize > i->count)
maxsize = i->count;
- if (unlikely(i->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(i)))
return pipe_get_pages(i, pages, maxsize, maxpages, start);
+ if (unlikely(iov_iter_is_discard(i)))
+ return -EFAULT;
+
iterate_all_kinds(i, maxsize, v, ({
unsigned long addr = (unsigned long)v.iov_base;
size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
len = maxpages * PAGE_SIZE;
addr &= ~(PAGE_SIZE - 1);
n = DIV_ROUND_UP(len, PAGE_SIZE);
- res = get_user_pages_fast(addr, n, (i->type & WRITE) != WRITE, pages);
+ res = get_user_pages_fast(addr, n, iov_iter_rw(i) != WRITE, pages);
if (unlikely(res < 0))
return res;
return (res == n ? len : res * PAGE_SIZE) - *start;
if (maxsize > i->count)
maxsize = i->count;
- if (unlikely(i->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(i)))
return pipe_get_pages_alloc(i, pages, maxsize, start);
+ if (unlikely(iov_iter_is_discard(i)))
+ return -EFAULT;
+
iterate_all_kinds(i, maxsize, v, ({
unsigned long addr = (unsigned long)v.iov_base;
size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
p = get_pages_array(n);
if (!p)
return -ENOMEM;
- res = get_user_pages_fast(addr, n, (i->type & WRITE) != WRITE, p);
+ res = get_user_pages_fast(addr, n, iov_iter_rw(i) != WRITE, p);
if (unlikely(res < 0)) {
kvfree(p);
return res;
__wsum sum, next;
size_t off = 0;
sum = *csum;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
WARN_ON(1);
return 0;
}
__wsum sum, next;
size_t off = 0;
sum = *csum;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
WARN_ON(1);
return false;
}
const char *from = addr;
__wsum sum, next;
size_t off = 0;
+
+ if (unlikely(iov_iter_is_pipe(i)))
+ return csum_and_copy_to_pipe_iter(addr, bytes, csum, i);
+
sum = *csum;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_discard(i))) {
WARN_ON(1); /* for now */
return 0;
}
if (!size)
return 0;
+ if (unlikely(iov_iter_is_discard(i)))
+ return 0;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
struct pipe_inode_info *pipe = i->pipe;
size_t off;
int idx;
const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
{
*new = *old;
- if (unlikely(new->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(new))) {
WARN_ON(1);
return NULL;
}
- if (new->type & ITER_BVEC)
+ if (unlikely(iov_iter_is_discard(new)))
+ return NULL;
+ if (iov_iter_is_bvec(new))
return new->bvec = kmemdup(new->bvec,
new->nr_segs * sizeof(struct bio_vec),
flags);
CFLAGS += -I../../../arch/arm/include -mfpu=neon
HAS_NEON = yes
endif
-ifeq ($(ARCH),arm64)
+ifeq ($(ARCH),aarch64)
CFLAGS += -I../../../arch/arm64/include
HAS_NEON = yes
endif
gcc -c -x assembler - >&/dev/null && \
rm ./-.o && echo -DCONFIG_AS_AVX512=1)
else ifeq ($(HAS_NEON),yes)
- OBJS += neon.o neon1.o neon2.o neon4.o neon8.o
+ OBJS += neon.o neon1.o neon2.o neon4.o neon8.o recov_neon.o recov_neon_inner.o
CFLAGS += -DCONFIG_KERNEL_MODE_NEON=1
else
HAS_ALTIVEC := $(shell printf '\#include <altivec.h>\nvector int a;\n' |\
if (req->fw->size > PAGE_SIZE) {
pr_err("Testing interface must use PAGE_SIZE firmware for now\n");
rc = -EINVAL;
+ goto out;
}
memcpy(buf, req->fw->data, req->fw->size);
const char *q = *result++;
size_t amount = strlen(q);
- strncpy(p, q, amount);
+ memcpy(p, q, amount);
p += amount;
*p++ = ' ';
dev_info(test_dev->dev, "removing interface\n");
misc_deregister(&test_dev->misc_dev);
- kfree(&test_dev->misc_dev.name);
mutex_unlock(&test_dev->config_mutex);
mutex_unlock(&test_dev->trigger_mutex);
XA_BUG_ON(xa, xa_get_mark(xa, i, XA_MARK_2));
/* We should see two elements in the array */
+ rcu_read_lock();
xas_for_each(&xas, entry, ULONG_MAX)
seen++;
+ rcu_read_unlock();
XA_BUG_ON(xa, seen != 2);
/* One of which is marked */
xas_set(&xas, 0);
seen = 0;
+ rcu_read_lock();
xas_for_each_marked(&xas, entry, ULONG_MAX, XA_MARK_0)
seen++;
+ rcu_read_unlock();
XA_BUG_ON(xa, seen != 1);
}
XA_BUG_ON(xa, xa_get_mark(xa, next, XA_MARK_0));
xa_erase_index(xa, 12345678);
XA_BUG_ON(xa, !xa_empty(xa));
+ /* And so does xa_insert */
+ xa_reserve(xa, 12345678, GFP_KERNEL);
+ XA_BUG_ON(xa, xa_insert(xa, 12345678, xa_mk_value(12345678), 0) != 0);
+ xa_erase_index(xa, 12345678);
+ XA_BUG_ON(xa, !xa_empty(xa));
+
/* Can iterate through a reserved entry */
xa_store_index(xa, 5, GFP_KERNEL);
xa_reserve(xa, 6, GFP_KERNEL);
XA_BUG_ON(xa, xa_load(xa, max) != NULL);
XA_BUG_ON(xa, xa_load(xa, min - 1) != NULL);
+ xas_lock(&xas);
XA_BUG_ON(xa, xas_store(&xas, xa_mk_value(min)) != xa_mk_value(index));
+ xas_unlock(&xas);
XA_BUG_ON(xa, xa_load(xa, min) != xa_mk_value(min));
XA_BUG_ON(xa, xa_load(xa, max - 1) != xa_mk_value(min));
XA_BUG_ON(xa, xa_load(xa, max) != NULL);
XA_STATE(xas, xa, index);
xa_store_order(xa, index, order, xa_mk_value(0), GFP_KERNEL);
+ xas_lock(&xas);
XA_BUG_ON(xa, xas_store(&xas, xa_mk_value(1)) != xa_mk_value(0));
XA_BUG_ON(xa, xas.xa_index != index);
XA_BUG_ON(xa, xas_store(&xas, NULL) != xa_mk_value(1));
+ xas_unlock(&xas);
XA_BUG_ON(xa, !xa_empty(xa));
}
#endif
rcu_read_unlock();
/* We can erase multiple values with a single store */
- xa_store_order(xa, 0, 63, NULL, GFP_KERNEL);
+ xa_store_order(xa, 0, BITS_PER_LONG - 1, NULL, GFP_KERNEL);
XA_BUG_ON(xa, !xa_empty(xa));
/* Even when the first slot is empty but the others aren't */
}
}
-static noinline void check_find(struct xarray *xa)
+static noinline void check_find_1(struct xarray *xa)
{
unsigned long i, j, k;
XA_BUG_ON(xa, xa_get_mark(xa, i, XA_MARK_0));
}
XA_BUG_ON(xa, !xa_empty(xa));
+}
+
+static noinline void check_find_2(struct xarray *xa)
+{
+ void *entry;
+ unsigned long i, j, index = 0;
+
+ xa_for_each(xa, entry, index, ULONG_MAX, XA_PRESENT) {
+ XA_BUG_ON(xa, true);
+ }
+
+ for (i = 0; i < 1024; i++) {
+ xa_store_index(xa, index, GFP_KERNEL);
+ j = 0;
+ index = 0;
+ xa_for_each(xa, entry, index, ULONG_MAX, XA_PRESENT) {
+ XA_BUG_ON(xa, xa_mk_value(index) != entry);
+ XA_BUG_ON(xa, index != j++);
+ }
+ }
+
+ xa_destroy(xa);
+}
+
+static noinline void check_find(struct xarray *xa)
+{
+ check_find_1(xa);
+ check_find_2(xa);
check_multi_find(xa);
check_multi_find_2(xa);
}
__check_store_range(xa, 4095 + i, 4095 + j);
__check_store_range(xa, 4096 + i, 4096 + j);
__check_store_range(xa, 123456 + i, 123456 + j);
- __check_store_range(xa, UINT_MAX + i, UINT_MAX + j);
+ __check_store_range(xa, (1 << 24) + i, (1 << 24) + j);
}
}
}
XA_STATE(xas, xa, 1 << order);
xa_store_order(xa, 0, order, xa, GFP_KERNEL);
+ rcu_read_lock();
xas_load(&xas);
XA_BUG_ON(xa, xas.xa_node->count == 0);
XA_BUG_ON(xa, xas.xa_node->count > (1 << order));
XA_BUG_ON(xa, xas.xa_node->nr_values != 0);
+ rcu_read_unlock();
xa_store_order(xa, 1 << order, order, xa_mk_value(1 << order),
GFP_KERNEL);
EXPORT_SYMBOL(__ubsan_handle_shift_out_of_bounds);
-void __noreturn
-__ubsan_handle_builtin_unreachable(struct unreachable_data *data)
+void __ubsan_handle_builtin_unreachable(struct unreachable_data *data)
{
unsigned long flags;
* (see the xa_cmpxchg() implementation for an example).
*
* Return: If the slot already existed, returns the contents of this slot.
- * If the slot was newly created, returns NULL. If it failed to create the
- * slot, returns NULL and indicates the error in @xas.
+ * If the slot was newly created, returns %NULL. If it failed to create the
+ * slot, returns %NULL and indicates the error in @xas.
*/
static void *xas_create(struct xa_state *xas)
{
XA_STATE(xas, xa, index);
return xas_result(&xas, xas_store(&xas, NULL));
}
-EXPORT_SYMBOL_GPL(__xa_erase);
+EXPORT_SYMBOL(__xa_erase);
/**
- * xa_store() - Store this entry in the XArray.
+ * xa_erase() - Erase this entry from the XArray.
* @xa: XArray.
- * @index: Index into array.
- * @entry: New entry.
- * @gfp: Memory allocation flags.
+ * @index: Index of entry.
*
- * After this function returns, loads from this index will return @entry.
- * Storing into an existing multislot entry updates the entry of every index.
- * The marks associated with @index are unaffected unless @entry is %NULL.
+ * This function is the equivalent of calling xa_store() with %NULL as
+ * the third argument. The XArray does not need to allocate memory, so
+ * the user does not need to provide GFP flags.
*
- * Context: Process context. Takes and releases the xa_lock. May sleep
- * if the @gfp flags permit.
- * Return: The old entry at this index on success, xa_err(-EINVAL) if @entry
- * cannot be stored in an XArray, or xa_err(-ENOMEM) if memory allocation
- * failed.
+ * Context: Any context. Takes and releases the xa_lock.
+ * Return: The entry which used to be at this index.
*/
-void *xa_store(struct xarray *xa, unsigned long index, void *entry, gfp_t gfp)
+void *xa_erase(struct xarray *xa, unsigned long index)
{
- XA_STATE(xas, xa, index);
- void *curr;
-
- if (WARN_ON_ONCE(xa_is_internal(entry)))
- return XA_ERROR(-EINVAL);
+ void *entry;
- do {
- xas_lock(&xas);
- curr = xas_store(&xas, entry);
- if (xa_track_free(xa) && entry)
- xas_clear_mark(&xas, XA_FREE_MARK);
- xas_unlock(&xas);
- } while (xas_nomem(&xas, gfp));
+ xa_lock(xa);
+ entry = __xa_erase(xa, index);
+ xa_unlock(xa);
- return xas_result(&xas, curr);
+ return entry;
}
-EXPORT_SYMBOL(xa_store);
+EXPORT_SYMBOL(xa_erase);
/**
* __xa_store() - Store this entry in the XArray.
if (WARN_ON_ONCE(xa_is_internal(entry)))
return XA_ERROR(-EINVAL);
+ if (xa_track_free(xa) && !entry)
+ entry = XA_ZERO_ENTRY;
do {
curr = xas_store(&xas, entry);
- if (xa_track_free(xa) && entry)
+ if (xa_track_free(xa))
xas_clear_mark(&xas, XA_FREE_MARK);
} while (__xas_nomem(&xas, gfp));
EXPORT_SYMBOL(__xa_store);
/**
- * xa_cmpxchg() - Conditionally replace an entry in the XArray.
+ * xa_store() - Store this entry in the XArray.
* @xa: XArray.
* @index: Index into array.
- * @old: Old value to test against.
- * @entry: New value to place in array.
+ * @entry: New entry.
* @gfp: Memory allocation flags.
*
- * If the entry at @index is the same as @old, replace it with @entry.
- * If the return value is equal to @old, then the exchange was successful.
+ * After this function returns, loads from this index will return @entry.
+ * Storing into an existing multislot entry updates the entry of every index.
+ * The marks associated with @index are unaffected unless @entry is %NULL.
*
- * Context: Process context. Takes and releases the xa_lock. May sleep
- * if the @gfp flags permit.
- * Return: The old value at this index or xa_err() if an error happened.
+ * Context: Any context. Takes and releases the xa_lock.
+ * May sleep if the @gfp flags permit.
+ * Return: The old entry at this index on success, xa_err(-EINVAL) if @entry
+ * cannot be stored in an XArray, or xa_err(-ENOMEM) if memory allocation
+ * failed.
*/
-void *xa_cmpxchg(struct xarray *xa, unsigned long index,
- void *old, void *entry, gfp_t gfp)
+void *xa_store(struct xarray *xa, unsigned long index, void *entry, gfp_t gfp)
{
- XA_STATE(xas, xa, index);
void *curr;
- if (WARN_ON_ONCE(xa_is_internal(entry)))
- return XA_ERROR(-EINVAL);
-
- do {
- xas_lock(&xas);
- curr = xas_load(&xas);
- if (curr == XA_ZERO_ENTRY)
- curr = NULL;
- if (curr == old) {
- xas_store(&xas, entry);
- if (xa_track_free(xa) && entry)
- xas_clear_mark(&xas, XA_FREE_MARK);
- }
- xas_unlock(&xas);
- } while (xas_nomem(&xas, gfp));
+ xa_lock(xa);
+ curr = __xa_store(xa, index, entry, gfp);
+ xa_unlock(xa);
- return xas_result(&xas, curr);
+ return curr;
}
-EXPORT_SYMBOL(xa_cmpxchg);
+EXPORT_SYMBOL(xa_store);
/**
* __xa_cmpxchg() - Store this entry in the XArray.
if (WARN_ON_ONCE(xa_is_internal(entry)))
return XA_ERROR(-EINVAL);
+ if (xa_track_free(xa) && !entry)
+ entry = XA_ZERO_ENTRY;
do {
curr = xas_load(&xas);
curr = NULL;
if (curr == old) {
xas_store(&xas, entry);
- if (xa_track_free(xa) && entry)
+ if (xa_track_free(xa))
xas_clear_mark(&xas, XA_FREE_MARK);
}
} while (__xas_nomem(&xas, gfp));
EXPORT_SYMBOL(__xa_cmpxchg);
/**
- * xa_reserve() - Reserve this index in the XArray.
+ * __xa_reserve() - Reserve this index in the XArray.
* @xa: XArray.
* @index: Index into array.
* @gfp: Memory allocation flags.
* Ensures there is somewhere to store an entry at @index in the array.
* If there is already something stored at @index, this function does
* nothing. If there was nothing there, the entry is marked as reserved.
- * Loads from @index will continue to see a %NULL pointer until a
- * subsequent store to @index.
+ * Loading from a reserved entry returns a %NULL pointer.
*
* If you do not use the entry that you have reserved, call xa_release()
* or xa_erase() to free any unnecessary memory.
*
- * Context: Process context. Takes and releases the xa_lock, IRQ or BH safe
- * if specified in XArray flags. May sleep if the @gfp flags permit.
+ * Context: Any context. Expects the xa_lock to be held on entry. May
+ * release the lock, sleep and reacquire the lock if the @gfp flags permit.
* Return: 0 if the reservation succeeded or -ENOMEM if it failed.
*/
-int xa_reserve(struct xarray *xa, unsigned long index, gfp_t gfp)
+int __xa_reserve(struct xarray *xa, unsigned long index, gfp_t gfp)
{
XA_STATE(xas, xa, index);
- unsigned int lock_type = xa_lock_type(xa);
void *curr;
do {
- xas_lock_type(&xas, lock_type);
curr = xas_load(&xas);
- if (!curr)
+ if (!curr) {
xas_store(&xas, XA_ZERO_ENTRY);
- xas_unlock_type(&xas, lock_type);
- } while (xas_nomem(&xas, gfp));
+ if (xa_track_free(xa))
+ xas_clear_mark(&xas, XA_FREE_MARK);
+ }
+ } while (__xas_nomem(&xas, gfp));
return xas_error(&xas);
}
-EXPORT_SYMBOL(xa_reserve);
+EXPORT_SYMBOL(__xa_reserve);
#ifdef CONFIG_XARRAY_MULTI
static void xas_set_range(struct xa_state *xas, unsigned long first,
do {
xas_lock(&xas);
if (entry) {
- unsigned int order = (last == ~0UL) ? 64 :
- ilog2(last + 1);
+ unsigned int order = BITS_PER_LONG;
+ if (last + 1)
+ order = __ffs(last + 1);
xas_set_order(&xas, last, order);
xas_create(&xas);
if (xas_error(&xas))
* @index: Index of entry.
* @mark: Mark number.
*
- * Attempting to set a mark on a NULL entry does not succeed.
+ * Attempting to set a mark on a %NULL entry does not succeed.
*
* Context: Any context. Expects xa_lock to be held on entry.
*/
if (entry)
xas_set_mark(&xas, mark);
}
-EXPORT_SYMBOL_GPL(__xa_set_mark);
+EXPORT_SYMBOL(__xa_set_mark);
/**
* __xa_clear_mark() - Clear this mark on this entry while locked.
if (entry)
xas_clear_mark(&xas, mark);
}
-EXPORT_SYMBOL_GPL(__xa_clear_mark);
+EXPORT_SYMBOL(__xa_clear_mark);
/**
* xa_get_mark() - Inquire whether this mark is set on this entry.
* @index: Index of entry.
* @mark: Mark number.
*
- * Attempting to set a mark on a NULL entry does not succeed.
+ * Attempting to set a mark on a %NULL entry does not succeed.
*
* Context: Process context. Takes and releases the xa_lock.
*/
entry = xas_find_marked(&xas, max, filter);
else
entry = xas_find(&xas, max);
+ if (xas.xa_node == XAS_BOUNDS)
+ break;
if (xas.xa_shift) {
if (xas.xa_index & ((1UL << xas.xa_shift) - 1))
continue;
*
* The @filter may be an XArray mark value, in which case entries which are
* marked with that mark will be copied. It may also be %XA_PRESENT, in
- * which case all entries which are not NULL will be copied.
+ * which case all entries which are not %NULL will be copied.
*
* The entries returned may not represent a snapshot of the XArray at a
* moment in time. For example, if another thread stores to index 5, then
!mapping->a_ops->is_partially_uptodate)
goto page_not_up_to_date;
/* pipes can't handle partially uptodate pages */
- if (unlikely(iter->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(iter)))
goto page_not_up_to_date;
if (!trylock_page(page))
goto page_not_up_to_date;
}
EXPORT_SYMBOL(read_cache_page_gfp);
+/*
+ * Don't operate on ranges the page cache doesn't support, and don't exceed the
+ * LFS limits. If pos is under the limit it becomes a short access. If it
+ * exceeds the limit we return -EFBIG.
+ */
+static int generic_access_check_limits(struct file *file, loff_t pos,
+ loff_t *count)
+{
+ struct inode *inode = file->f_mapping->host;
+ loff_t max_size = inode->i_sb->s_maxbytes;
+
+ if (!(file->f_flags & O_LARGEFILE))
+ max_size = MAX_NON_LFS;
+
+ if (unlikely(pos >= max_size))
+ return -EFBIG;
+ *count = min(*count, max_size - pos);
+ return 0;
+}
+
+static int generic_write_check_limits(struct file *file, loff_t pos,
+ loff_t *count)
+{
+ loff_t limit = rlimit(RLIMIT_FSIZE);
+
+ if (limit != RLIM_INFINITY) {
+ if (pos >= limit) {
+ send_sig(SIGXFSZ, current, 0);
+ return -EFBIG;
+ }
+ *count = min(*count, limit - pos);
+ }
+
+ return generic_access_check_limits(file, pos, count);
+}
+
/*
* Performs necessary checks before doing a write
*
{
struct file *file = iocb->ki_filp;
struct inode *inode = file->f_mapping->host;
- unsigned long limit = rlimit(RLIMIT_FSIZE);
- loff_t pos;
+ loff_t count;
+ int ret;
if (!iov_iter_count(from))
return 0;
if (iocb->ki_flags & IOCB_APPEND)
iocb->ki_pos = i_size_read(inode);
- pos = iocb->ki_pos;
-
if ((iocb->ki_flags & IOCB_NOWAIT) && !(iocb->ki_flags & IOCB_DIRECT))
return -EINVAL;
- if (limit != RLIM_INFINITY) {
- if (iocb->ki_pos >= limit) {
- send_sig(SIGXFSZ, current, 0);
- return -EFBIG;
- }
- iov_iter_truncate(from, limit - (unsigned long)pos);
- }
+ count = iov_iter_count(from);
+ ret = generic_write_check_limits(file, iocb->ki_pos, &count);
+ if (ret)
+ return ret;
+
+ iov_iter_truncate(from, count);
+ return iov_iter_count(from);
+}
+EXPORT_SYMBOL(generic_write_checks);
+
+/*
+ * Performs necessary checks before doing a clone.
+ *
+ * Can adjust amount of bytes to clone.
+ * Returns appropriate error code that caller should return or
+ * zero in case the clone should be allowed.
+ */
+int generic_remap_checks(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t *req_count, unsigned int remap_flags)
+{
+ struct inode *inode_in = file_in->f_mapping->host;
+ struct inode *inode_out = file_out->f_mapping->host;
+ uint64_t count = *req_count;
+ uint64_t bcount;
+ loff_t size_in, size_out;
+ loff_t bs = inode_out->i_sb->s_blocksize;
+ int ret;
+
+ /* The start of both ranges must be aligned to an fs block. */
+ if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_out, bs))
+ return -EINVAL;
+
+ /* Ensure offsets don't wrap. */
+ if (pos_in + count < pos_in || pos_out + count < pos_out)
+ return -EINVAL;
+
+ size_in = i_size_read(inode_in);
+ size_out = i_size_read(inode_out);
+
+ /* Dedupe requires both ranges to be within EOF. */
+ if ((remap_flags & REMAP_FILE_DEDUP) &&
+ (pos_in >= size_in || pos_in + count > size_in ||
+ pos_out >= size_out || pos_out + count > size_out))
+ return -EINVAL;
+
+ /* Ensure the infile range is within the infile. */
+ if (pos_in >= size_in)
+ return -EINVAL;
+ count = min(count, size_in - (uint64_t)pos_in);
+
+ ret = generic_access_check_limits(file_in, pos_in, &count);
+ if (ret)
+ return ret;
+
+ ret = generic_write_check_limits(file_out, pos_out, &count);
+ if (ret)
+ return ret;
/*
- * LFS rule
+ * If the user wanted us to link to the infile's EOF, round up to the
+ * next block boundary for this check.
+ *
+ * Otherwise, make sure the count is also block-aligned, having
+ * already confirmed the starting offsets' block alignment.
*/
- if (unlikely(pos + iov_iter_count(from) > MAX_NON_LFS &&
- !(file->f_flags & O_LARGEFILE))) {
- if (pos >= MAX_NON_LFS)
- return -EFBIG;
- iov_iter_truncate(from, MAX_NON_LFS - (unsigned long)pos);
+ if (pos_in + count == size_in) {
+ bcount = ALIGN(size_in, bs) - pos_in;
+ } else {
+ if (!IS_ALIGNED(count, bs))
+ count = ALIGN_DOWN(count, bs);
+ bcount = count;
}
+ /* Don't allow overlapped cloning within the same file. */
+ if (inode_in == inode_out &&
+ pos_out + bcount > pos_in &&
+ pos_out < pos_in + bcount)
+ return -EINVAL;
+
/*
- * Are we about to exceed the fs block limit ?
- *
- * If we have written data it becomes a short write. If we have
- * exceeded without writing data we send a signal and return EFBIG.
- * Linus frestrict idea will clean these up nicely..
+ * We shortened the request but the caller can't deal with that, so
+ * bounce the request back to userspace.
*/
- if (unlikely(pos >= inode->i_sb->s_maxbytes))
- return -EFBIG;
+ if (*req_count != count && !(remap_flags & REMAP_FILE_CAN_SHORTEN))
+ return -EINVAL;
- iov_iter_truncate(from, inode->i_sb->s_maxbytes - pos);
- return iov_iter_count(from);
+ *req_count = count;
+ return 0;
}
-EXPORT_SYMBOL(generic_write_checks);
int pagecache_write_begin(struct file *file, struct address_space *mapping,
loff_t pos, unsigned len, unsigned flags,
* @vma: vm_area_struct mapping @address
* @address: virtual address to look up
* @flags: flags modifying lookup behaviour
- * @page_mask: on output, *page_mask is set according to the size of the page
+ * @ctx: contains dev_pagemap for %ZONE_DEVICE memory pinning and a
+ * pointer to output page_mask
*
* @flags can have FOLL_ flags set, defined in <linux/mm.h>
*
- * Returns the mapped (struct page *), %NULL if no mapping exists, or
+ * When getting pages from ZONE_DEVICE memory, the @ctx->pgmap caches
+ * the device's dev_pagemap metadata to avoid repeating expensive lookups.
+ *
+ * On output, the @ctx->page_mask is set according to the size of the page.
+ *
+ * Return: the mapped (struct page *), %NULL if no mapping exists, or
* an error pointer if there is a mapping to something not represented
* by a page descriptor (see also vm_normal_page()).
*/
if (!vma || start >= vma->vm_end) {
vma = find_extend_vma(mm, start);
if (!vma && in_gate_area(mm, start)) {
- int ret;
ret = get_gate_page(mm, start & PAGE_MASK,
gup_flags, &vma,
pages ? &pages[i] : NULL);
if (ret)
- return i ? : ret;
+ goto out;
ctx.page_mask = 0;
goto next_page;
}
{
const bool vma_madvised = !!(vma->vm_flags & VM_HUGEPAGE);
+ /* Always do synchronous compaction */
if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags))
return GFP_TRANSHUGE | (vma_madvised ? 0 : __GFP_NORETRY);
+
+ /* Kick kcompactd and fail quickly */
if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags))
return GFP_TRANSHUGE_LIGHT | __GFP_KSWAPD_RECLAIM;
+
+ /* Synchronous compaction if madvised, otherwise kick kcompactd */
if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags))
- return GFP_TRANSHUGE_LIGHT | (vma_madvised ? __GFP_DIRECT_RECLAIM :
- __GFP_KSWAPD_RECLAIM);
+ return GFP_TRANSHUGE_LIGHT |
+ (vma_madvised ? __GFP_DIRECT_RECLAIM :
+ __GFP_KSWAPD_RECLAIM);
+
+ /* Only do synchronous compaction if madvised */
if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags))
- return GFP_TRANSHUGE_LIGHT | (vma_madvised ? __GFP_DIRECT_RECLAIM :
- 0);
+ return GFP_TRANSHUGE_LIGHT |
+ (vma_madvised ? __GFP_DIRECT_RECLAIM : 0);
+
return GFP_TRANSHUGE_LIGHT;
}
}
}
-static void freeze_page(struct page *page)
+static void unmap_page(struct page *page)
{
enum ttu_flags ttu_flags = TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS |
TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD;
VM_BUG_ON_PAGE(!unmap_success, page);
}
-static void unfreeze_page(struct page *page)
+static void remap_page(struct page *page)
{
int i;
if (PageTransHuge(page)) {
(1L << PG_unevictable) |
(1L << PG_dirty)));
+ /* ->mapping in first tail page is compound_mapcount */
+ VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
+ page_tail);
+ page_tail->mapping = head->mapping;
+ page_tail->index = head->index + tail;
+
/* Page flags must be visible before we make the page non-compound. */
smp_wmb();
if (page_is_idle(head))
set_page_idle(page_tail);
- /* ->mapping in first tail page is compound_mapcount */
- VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
- page_tail);
- page_tail->mapping = head->mapping;
-
- page_tail->index = head->index + tail;
page_cpupid_xchg_last(page_tail, page_cpupid_last(head));
/*
}
static void __split_huge_page(struct page *page, struct list_head *list,
- unsigned long flags)
+ pgoff_t end, unsigned long flags)
{
struct page *head = compound_head(page);
struct zone *zone = page_zone(head);
struct lruvec *lruvec;
- pgoff_t end = -1;
int i;
lruvec = mem_cgroup_page_lruvec(head, zone->zone_pgdat);
/* complete memcg works before add pages to LRU */
mem_cgroup_split_huge_fixup(head);
- if (!PageAnon(page))
- end = DIV_ROUND_UP(i_size_read(head->mapping->host), PAGE_SIZE);
-
for (i = HPAGE_PMD_NR - 1; i >= 1; i--) {
__split_huge_page_tail(head, i, lruvec, list);
/* Some pages can be beyond i_size: drop them from page cache */
spin_unlock_irqrestore(zone_lru_lock(page_zone(head)), flags);
- unfreeze_page(head);
+ remap_page(head);
for (i = 0; i < HPAGE_PMD_NR; i++) {
struct page *subpage = head + i;
int count, mapcount, extra_pins, ret;
bool mlocked;
unsigned long flags;
+ pgoff_t end;
VM_BUG_ON_PAGE(is_huge_zero_page(page), page);
VM_BUG_ON_PAGE(!PageLocked(page), page);
ret = -EBUSY;
goto out;
}
+ end = -1;
mapping = NULL;
anon_vma_lock_write(anon_vma);
} else {
anon_vma = NULL;
i_mmap_lock_read(mapping);
+
+ /*
+ *__split_huge_page() may need to trim off pages beyond EOF:
+ * but on 32-bit, i_size_read() takes an irq-unsafe seqlock,
+ * which cannot be nested inside the page tree lock. So note
+ * end now: i_size itself may be changed at any moment, but
+ * head page lock is good enough to serialize the trimming.
+ */
+ end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
}
/*
- * Racy check if we can split the page, before freeze_page() will
+ * Racy check if we can split the page, before unmap_page() will
* split PMDs
*/
if (!can_split_huge_page(head, &extra_pins)) {
}
mlocked = PageMlocked(page);
- freeze_page(head);
+ unmap_page(head);
VM_BUG_ON_PAGE(compound_mapcount(head), head);
/* Make sure the page is not on per-CPU pagevec as it takes pin */
if (mapping)
__dec_node_page_state(page, NR_SHMEM_THPS);
spin_unlock(&pgdata->split_queue_lock);
- __split_huge_page(page, list, flags);
+ __split_huge_page(page, list, end, flags);
if (PageSwapCache(head)) {
swp_entry_t entry = { .val = page_private(head) };
fail: if (mapping)
xa_unlock(&mapping->i_pages);
spin_unlock_irqrestore(zone_lru_lock(page_zone(head)), flags);
- unfreeze_page(head);
+ remap_page(head);
ret = -EBUSY;
}
int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
struct vm_area_struct *vma)
{
- pte_t *src_pte, *dst_pte, entry;
+ pte_t *src_pte, *dst_pte, entry, dst_entry;
struct page *ptepage;
unsigned long addr;
int cow;
break;
}
- /* If the pagetables are shared don't copy or take references */
- if (dst_pte == src_pte)
+ /*
+ * If the pagetables are shared don't copy or take references.
+ * dst_pte == src_pte is the common case of src/dest sharing.
+ *
+ * However, src could have 'unshared' and dst shares with
+ * another vma. If dst_pte !none, this implies sharing.
+ * Check here before taking page table lock, and once again
+ * after taking the lock below.
+ */
+ dst_entry = huge_ptep_get(dst_pte);
+ if ((dst_pte == src_pte) || !huge_pte_none(dst_entry))
continue;
dst_ptl = huge_pte_lock(h, dst, dst_pte);
src_ptl = huge_pte_lockptr(h, src, src_pte);
spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
entry = huge_ptep_get(src_pte);
- if (huge_pte_none(entry)) { /* skip none entry */
+ dst_entry = huge_ptep_get(dst_pte);
+ if (huge_pte_none(entry) || !huge_pte_none(dst_entry)) {
+ /*
+ * Skip if src entry none. Also, skip in the
+ * unlikely case dst entry !none as this implies
+ * sharing with another vma.
+ */
;
} else if (unlikely(is_hugetlb_entry_migration(entry) ||
is_hugetlb_entry_hwpoisoned(entry))) {
/* fallback to copy_from_user outside mmap_sem */
if (unlikely(ret)) {
- ret = -EFAULT;
+ ret = -ENOENT;
*pagep = page;
/* don't free the page */
goto out;
* collapse_shmem - collapse small tmpfs/shmem pages into huge one.
*
* Basic scheme is simple, details are more complex:
- * - allocate and freeze a new huge page;
+ * - allocate and lock a new huge page;
* - scan page cache replacing old pages with the new one
* + swap in pages if necessary;
* + fill in gaps;
* - if replacing succeeds:
* + copy data over;
* + free old pages;
- * + unfreeze huge page;
+ * + unlock huge page;
* - if replacing failed;
* + put all pages back and unfreeze them;
* + restore gaps in the page cache;
- * + free huge page;
+ * + unlock and free huge page;
*/
static void collapse_shmem(struct mm_struct *mm,
struct address_space *mapping, pgoff_t start,
goto out;
}
- new_page->index = start;
- new_page->mapping = mapping;
- __SetPageSwapBacked(new_page);
- __SetPageLocked(new_page);
- BUG_ON(!page_ref_freeze(new_page, 1));
-
- /*
- * At this point the new_page is 'frozen' (page_count() is zero),
- * locked and not up-to-date. It's safe to insert it into the page
- * cache, because nobody would be able to map it or use it in other
- * way until we unfreeze it.
- */
-
/* This will be less messy when we use multi-index entries */
do {
xas_lock_irq(&xas);
if (!xas_error(&xas))
break;
xas_unlock_irq(&xas);
- if (!xas_nomem(&xas, GFP_KERNEL))
+ if (!xas_nomem(&xas, GFP_KERNEL)) {
+ mem_cgroup_cancel_charge(new_page, memcg, true);
+ result = SCAN_FAIL;
goto out;
+ }
} while (1);
+ __SetPageLocked(new_page);
+ __SetPageSwapBacked(new_page);
+ new_page->index = start;
+ new_page->mapping = mapping;
+
+ /*
+ * At this point the new_page is locked and not up-to-date.
+ * It's safe to insert it into the page cache, because nobody would
+ * be able to map it or use it in another way until we unlock it.
+ */
+
xas_set(&xas, start);
for (index = start; index < end; index++) {
struct page *page = xas_next(&xas);
VM_BUG_ON(index != xas.xa_index);
if (!page) {
+ /*
+ * Stop if extent has been truncated or hole-punched,
+ * and is now completely empty.
+ */
+ if (index == start) {
+ if (!xas_next_entry(&xas, end - 1)) {
+ result = SCAN_TRUNCATED;
+ goto xa_locked;
+ }
+ xas_set(&xas, index);
+ }
if (!shmem_charge(mapping->host, 1)) {
result = SCAN_FAIL;
- break;
+ goto xa_locked;
}
xas_store(&xas, new_page + (index % HPAGE_PMD_NR));
nr_none++;
result = SCAN_FAIL;
goto xa_unlocked;
}
- xas_lock_irq(&xas);
- xas_set(&xas, index);
} else if (trylock_page(page)) {
get_page(page);
+ xas_unlock_irq(&xas);
} else {
result = SCAN_PAGE_LOCK;
- break;
+ goto xa_locked;
}
/*
*/
VM_BUG_ON_PAGE(!PageLocked(page), page);
VM_BUG_ON_PAGE(!PageUptodate(page), page);
- VM_BUG_ON_PAGE(PageTransCompound(page), page);
+
+ /*
+ * If file was truncated then extended, or hole-punched, before
+ * we locked the first page, then a THP might be there already.
+ */
+ if (PageTransCompound(page)) {
+ result = SCAN_PAGE_COMPOUND;
+ goto out_unlock;
+ }
if (page_mapping(page) != mapping) {
result = SCAN_TRUNCATED;
goto out_unlock;
}
- xas_unlock_irq(&xas);
if (isolate_lru_page(page)) {
result = SCAN_DEL_PAGE_LRU;
- goto out_isolate_failed;
+ goto out_unlock;
}
if (page_mapped(page))
*/
if (!page_ref_freeze(page, 3)) {
result = SCAN_PAGE_COUNT;
- goto out_lru;
+ xas_unlock_irq(&xas);
+ putback_lru_page(page);
+ goto out_unlock;
}
/*
/* Finally, replace with the new page. */
xas_store(&xas, new_page + (index % HPAGE_PMD_NR));
continue;
-out_lru:
- xas_unlock_irq(&xas);
- putback_lru_page(page);
-out_isolate_failed:
- unlock_page(page);
- put_page(page);
- goto xa_unlocked;
out_unlock:
unlock_page(page);
put_page(page);
- break;
+ goto xa_unlocked;
}
- xas_unlock_irq(&xas);
+ __inc_node_page_state(new_page, NR_SHMEM_THPS);
+ if (nr_none) {
+ struct zone *zone = page_zone(new_page);
+
+ __mod_node_page_state(zone->zone_pgdat, NR_FILE_PAGES, nr_none);
+ __mod_node_page_state(zone->zone_pgdat, NR_SHMEM, nr_none);
+ }
+
+xa_locked:
+ xas_unlock_irq(&xas);
xa_unlocked:
+
if (result == SCAN_SUCCEED) {
struct page *page, *tmp;
- struct zone *zone = page_zone(new_page);
/*
* Replacing old pages with new one has succeeded, now we
* need to copy the content and free the old pages.
*/
+ index = start;
list_for_each_entry_safe(page, tmp, &pagelist, lru) {
+ while (index < page->index) {
+ clear_highpage(new_page + (index % HPAGE_PMD_NR));
+ index++;
+ }
copy_highpage(new_page + (page->index % HPAGE_PMD_NR),
page);
list_del(&page->lru);
- unlock_page(page);
- page_ref_unfreeze(page, 1);
page->mapping = NULL;
+ page_ref_unfreeze(page, 1);
ClearPageActive(page);
ClearPageUnevictable(page);
+ unlock_page(page);
put_page(page);
+ index++;
}
-
- local_irq_disable();
- __inc_node_page_state(new_page, NR_SHMEM_THPS);
- if (nr_none) {
- __mod_node_page_state(zone->zone_pgdat, NR_FILE_PAGES, nr_none);
- __mod_node_page_state(zone->zone_pgdat, NR_SHMEM, nr_none);
+ while (index < end) {
+ clear_highpage(new_page + (index % HPAGE_PMD_NR));
+ index++;
}
- local_irq_enable();
- /*
- * Remove pte page tables, so we can re-fault
- * the page as huge.
- */
- retract_page_tables(mapping, start);
-
- /* Everything is ready, let's unfreeze the new_page */
- set_page_dirty(new_page);
SetPageUptodate(new_page);
- page_ref_unfreeze(new_page, HPAGE_PMD_NR);
+ page_ref_add(new_page, HPAGE_PMD_NR - 1);
+ set_page_dirty(new_page);
mem_cgroup_commit_charge(new_page, memcg, false, true);
lru_cache_add_anon(new_page);
- unlock_page(new_page);
+ /*
+ * Remove pte page tables, so we can re-fault the page as huge.
+ */
+ retract_page_tables(mapping, start);
*hpage = NULL;
khugepaged_pages_collapsed++;
} else {
struct page *page;
+
/* Something went wrong: roll back page cache changes */
- shmem_uncharge(mapping->host, nr_none);
xas_lock_irq(&xas);
+ mapping->nrpages -= nr_none;
+ shmem_uncharge(mapping->host, nr_none);
+
xas_set(&xas, start);
xas_for_each(&xas, page, end - 1) {
page = list_first_entry_or_null(&pagelist,
xas_store(&xas, page);
xas_pause(&xas);
xas_unlock_irq(&xas);
- putback_lru_page(page);
unlock_page(page);
+ putback_lru_page(page);
xas_lock_irq(&xas);
}
VM_BUG_ON(nr_none);
xas_unlock_irq(&xas);
- /* Unfreeze new_page, caller would take care about freeing it */
- page_ref_unfreeze(new_page, 1);
mem_cgroup_cancel_charge(new_page, memcg, true);
- unlock_page(new_page);
new_page->mapping = NULL;
}
+
+ unlock_page(new_page);
out:
VM_BUG_ON(!list_empty(&pagelist));
/* TODO: tracepoints */
#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
/*
- * Common iterator interface used to define for_each_mem_range().
+ * Common iterator interface used to define for_each_mem_pfn_range().
*/
void __init_memblock __next_mem_pfn_range(int *idx, int nid,
unsigned long *out_start_pfn,
struct mem_cgroup *memcg;
int ret = 0;
- if (memcg_kmem_bypass())
+ if (mem_cgroup_disabled() || memcg_kmem_bypass())
return 0;
memcg = get_mem_cgroup_from_current();
LIST_HEAD(tokill);
int rc = -EBUSY;
loff_t start;
+ dax_entry_t cookie;
/*
* Prevent the inode from being freed while we are interrogating
* also prevents changes to the mapping of this pfn until
* poison signaling is complete.
*/
- if (!dax_lock_mapping_entry(page))
+ cookie = dax_lock_page(page);
+ if (!cookie)
goto out;
if (hwpoison_filter(page)) {
kill_procs(&tokill, flags & MF_MUST_KILL, !unmap_success, pfn, flags);
rc = 0;
unlock:
- dax_unlock_mapping_entry(page);
+ dax_unlock_page(page, cookie);
out:
/* drop pgmap ref acquired in caller */
put_dev_pagemap(pgmap);
for (i = 0; i < sections_to_remove; i++) {
unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION;
+ cond_resched();
ret = __remove_section(zone, __pfn_to_section(pfn), map_offset,
altmap);
map_offset = 0;
* If the policy is interleave, or does not allow the current
* node in its nodemask, we allocate the standard way.
*/
- if (pol->mode == MPOL_PREFERRED &&
- !(pol->flags & MPOL_F_LOCAL))
+ if (pol->mode == MPOL_PREFERRED && !(pol->flags & MPOL_F_LOCAL))
hpage_node = pol->v.preferred_node;
nmask = policy_nodemask(gfp, pol);
unsigned int cpuset_mems_cookie;
int reserve_flags;
- /*
- * In the slowpath, we sanity check order to avoid ever trying to
- * reclaim >= MAX_ORDER areas which will never succeed. Callers may
- * be using allocators in order of preference for an area that is
- * too large.
- */
- if (order >= MAX_ORDER) {
- WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
- return NULL;
- }
-
/*
* We also sanity check to catch abuse of atomic reserves being used by
* callers that are not in atomic context.
gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
struct alloc_context ac = { };
+ /*
+ * There are several places where we assume that the order value is sane
+ * so bail out early if the request is out of bound.
+ */
+ if (unlikely(order >= MAX_ORDER)) {
+ WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
+ return NULL;
+ }
+
gfp_mask &= gfp_allowed_mask;
alloc_mask = gfp_mask;
if (!prepare_alloc_pages(gfp_mask, order, preferred_nid, nodemask, &ac, &alloc_mask, &alloc_flags))
unsigned long size)
{
struct pglist_data *pgdat = zone->zone_pgdat;
+ int zone_idx = zone_idx(zone) + 1;
- pgdat->nr_zones = zone_idx(zone) + 1;
+ if (zone_idx > pgdat->nr_zones)
+ pgdat->nr_zones = zone_idx;
zone->zone_start_pfn = zone_start_pfn;
if (PageReserved(page))
goto unmovable;
+ /*
+ * If the zone is movable and we have ruled out all reserved
+ * pages then it should be reasonably safe to assume the rest
+ * is movable.
+ */
+ if (zone_idx(zone) == ZONE_MOVABLE)
+ continue;
+
/*
* Hugepages are not in LRU lists, but they're movable.
* We need not scan over tail pages bacause we don't
};
struct iov_iter from;
- iov_iter_bvec(&from, ITER_BVEC | WRITE, &bv, 1, PAGE_SIZE);
+ iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE);
init_sync_kiocb(&kiocb, swap_file);
kiocb.ki_pos = page_file_offset(page);
goto out;
}
bio->bi_opf = REQ_OP_WRITE | REQ_SWAP | wbc_to_write_flags(wbc);
- bio_associate_blkg_from_page(bio, page);
+ bio_associate_blkcg_from_page(bio, page);
count_swpout_vm_event(page);
set_page_writeback(page);
unlock_page(page);
}
early_param("page_poison", early_page_poison_param);
+/**
+ * page_poisoning_enabled - check if page poisoning is enabled
+ *
+ * Return true if page poisoning is enabled, or false if not.
+ */
bool page_poisoning_enabled(void)
{
/*
(!IS_ENABLED(CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC) &&
debug_pagealloc_enabled()));
}
+EXPORT_SYMBOL_GPL(page_poisoning_enabled);
static void poison_page(struct page *page)
{
BUG_ON(ai->nr_groups != 1);
upa = ai->alloc_size/ai->unit_size;
nr_g0_units = roundup(num_possible_cpus(), upa);
- if (unlikely(WARN_ON(ai->groups[0].nr_units != nr_g0_units))) {
+ if (WARN_ON(ai->groups[0].nr_units != nr_g0_units)) {
pcpu_free_alloc_info(ai);
return -EINVAL;
}
address + PAGE_SIZE);
} else {
/*
- * We should not need to notify here as we reach this
- * case only from freeze_page() itself only call from
- * split_huge_page_to_list() so everything below must
- * be true:
- * - page is not anonymous
- * - page is locked
- *
- * So as it is a locked file back page thus it can not
- * be remove from the page cache and replace by a new
- * page before mmu_notifier_invalidate_range_end so no
+ * This is a locked file-backed page, thus it cannot
+ * be removed from the page cache and replaced by a new
+ * page before mmu_notifier_invalidate_range_end, so no
* concurrent thread might update its page table to
* point at new page while a device still is using this
* page.
if (!shmem_inode_acct_block(inode, pages))
return false;
+ /* nrpages adjustment first, then shmem_recalc_inode() when balanced */
+ inode->i_mapping->nrpages += pages;
+
spin_lock_irqsave(&info->lock, flags);
info->alloced += pages;
inode->i_blocks += pages * BLOCKS_PER_PAGE;
shmem_recalc_inode(inode);
spin_unlock_irqrestore(&info->lock, flags);
- inode->i_mapping->nrpages += pages;
return true;
}
struct shmem_inode_info *info = SHMEM_I(inode);
unsigned long flags;
+ /* nrpages adjustment done by __delete_from_page_cache() or caller */
+
spin_lock_irqsave(&info->lock, flags);
info->alloced -= pages;
inode->i_blocks -= pages * BLOCKS_PER_PAGE;
{
struct page *oldpage, *newpage;
struct address_space *swap_mapping;
+ swp_entry_t entry;
pgoff_t swap_index;
int error;
oldpage = *pagep;
- swap_index = page_private(oldpage);
+ entry.val = page_private(oldpage);
+ swap_index = swp_offset(entry);
swap_mapping = page_mapping(oldpage);
/*
__SetPageLocked(newpage);
__SetPageSwapBacked(newpage);
SetPageUptodate(newpage);
- set_page_private(newpage, swap_index);
+ set_page_private(newpage, entry.val);
SetPageSwapCache(newpage);
/*
struct page *page;
pte_t _dst_pte, *dst_pte;
int ret;
+ pgoff_t offset, max_off;
ret = -ENOMEM;
if (!shmem_inode_acct_block(inode, 1))
*pagep = page;
shmem_inode_unacct_blocks(inode, 1);
/* don't free the page */
- return -EFAULT;
+ return -ENOENT;
}
} else { /* mfill_zeropage_atomic */
clear_highpage(page);
__SetPageSwapBacked(page);
__SetPageUptodate(page);
+ ret = -EFAULT;
+ offset = linear_page_index(dst_vma, dst_addr);
+ max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+ if (unlikely(offset >= max_off))
+ goto out_release;
+
ret = mem_cgroup_try_charge_delay(page, dst_mm, gfp, &memcg, false);
if (ret)
goto out_release;
_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
if (dst_vma->vm_flags & VM_WRITE)
_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
+ else {
+ /*
+ * We don't set the pte dirty if the vma has no
+ * VM_WRITE permission, so mark the page dirty or it
+ * could be freed from under us. We could do it
+ * unconditionally before unlock_page(), but doing it
+ * only if VM_WRITE is not set is faster.
+ */
+ set_page_dirty(page);
+ }
- ret = -EEXIST;
dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
+
+ ret = -EFAULT;
+ max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+ if (unlikely(offset >= max_off))
+ goto out_release_uncharge_unlock;
+
+ ret = -EEXIST;
if (!pte_none(*dst_pte))
goto out_release_uncharge_unlock;
/* No need to invalidate - it was non-present before */
update_mmu_cache(dst_vma, dst_addr, dst_pte);
- unlock_page(page);
pte_unmap_unlock(dst_pte, ptl);
+ unlock_page(page);
ret = 0;
out:
return ret;
out_release_uncharge_unlock:
pte_unmap_unlock(dst_pte, ptl);
+ ClearPageDirty(page);
+ delete_from_page_cache(page);
out_release_uncharge:
mem_cgroup_cancel_charge(page, memcg, false);
out_release:
inode_lock(inode);
/* We're holding i_mutex so we can access i_size directly */
- if (offset < 0)
- offset = -EINVAL;
- else if (offset >= inode->i_size)
+ if (offset < 0 || offset >= inode->i_size)
offset = -ENXIO;
else {
start = offset >> PAGE_SHIFT;
unsigned int type;
int i;
- p = kzalloc(sizeof(*p), GFP_KERNEL);
+ p = kvzalloc(sizeof(*p), GFP_KERNEL);
if (!p)
return ERR_PTR(-ENOMEM);
}
if (type >= MAX_SWAPFILES) {
spin_unlock(&swap_lock);
- kfree(p);
+ kvfree(p);
return ERR_PTR(-EPERM);
}
if (type >= nr_swapfiles) {
smp_wmb();
nr_swapfiles++;
} else {
- kfree(p);
+ kvfree(p);
p = swap_info[type];
/*
* Do not memset this entry: a racing procfs swap_next()
*/
xa_lock_irq(&mapping->i_pages);
xa_unlock_irq(&mapping->i_pages);
-
- truncate_inode_pages(mapping, 0);
}
+
+ /*
+ * Cleancache needs notification even if there are no pages or shadow
+ * entries.
+ */
+ truncate_inode_pages(mapping, 0);
}
EXPORT_SYMBOL(truncate_inode_pages_final);
void *page_kaddr;
int ret;
struct page *page;
+ pgoff_t offset, max_off;
+ struct inode *inode;
if (!*pagep) {
ret = -ENOMEM;
/* fallback to copy_from_user outside mmap_sem */
if (unlikely(ret)) {
- ret = -EFAULT;
+ ret = -ENOENT;
*pagep = page;
/* don't free the page */
goto out;
if (dst_vma->vm_flags & VM_WRITE)
_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
- ret = -EEXIST;
dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
+ if (dst_vma->vm_file) {
+ /* the shmem MAP_PRIVATE case requires checking the i_size */
+ inode = dst_vma->vm_file->f_inode;
+ offset = linear_page_index(dst_vma, dst_addr);
+ max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+ ret = -EFAULT;
+ if (unlikely(offset >= max_off))
+ goto out_release_uncharge_unlock;
+ }
+ ret = -EEXIST;
if (!pte_none(*dst_pte))
goto out_release_uncharge_unlock;
pte_t _dst_pte, *dst_pte;
spinlock_t *ptl;
int ret;
+ pgoff_t offset, max_off;
+ struct inode *inode;
_dst_pte = pte_mkspecial(pfn_pte(my_zero_pfn(dst_addr),
dst_vma->vm_page_prot));
- ret = -EEXIST;
dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
+ if (dst_vma->vm_file) {
+ /* the shmem MAP_PRIVATE case requires checking the i_size */
+ inode = dst_vma->vm_file->f_inode;
+ offset = linear_page_index(dst_vma, dst_addr);
+ max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+ ret = -EFAULT;
+ if (unlikely(offset >= max_off))
+ goto out_unlock;
+ }
+ ret = -EEXIST;
if (!pte_none(*dst_pte))
goto out_unlock;
set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
if (!dst_vma || !is_vm_hugetlb_page(dst_vma))
goto out_unlock;
/*
- * Only allow __mcopy_atomic_hugetlb on userfaultfd
- * registered ranges.
+ * Check the vma is registered in uffd, this is
+ * required to enforce the VM_MAYWRITE check done at
+ * uffd registration time.
*/
if (!dst_vma->vm_userfaultfd_ctx.ctx)
goto out_unlock;
cond_resched();
- if (unlikely(err == -EFAULT)) {
+ if (unlikely(err == -ENOENT)) {
up_read(&dst_mm->mmap_sem);
BUG_ON(!page);
{
ssize_t err;
- if (vma_is_anonymous(dst_vma)) {
+ /*
+ * The normal page fault path for a shmem will invoke the
+ * fault, fill the hole in the file and COW it right away. The
+ * result generates plain anonymous memory. So when we are
+ * asked to fill an hole in a MAP_PRIVATE shmem mapping, we'll
+ * generate anonymous memory directly without actually filling
+ * the hole. For the MAP_PRIVATE case the robustness check
+ * only happens in the pagetable (to verify it's still none)
+ * and not in the radix tree.
+ */
+ if (!(dst_vma->vm_flags & VM_SHARED)) {
if (!zeropage)
err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
dst_addr, src_addr, page);
if (!dst_vma)
goto out_unlock;
/*
- * Be strict and only allow __mcopy_atomic on userfaultfd
- * registered ranges to prevent userland errors going
- * unnoticed. As far as the VM consistency is concerned, it
- * would be perfectly safe to remove this check, but there's
- * no useful usage for __mcopy_atomic ouside of userfaultfd
- * registered ranges. This is after all why these are ioctls
- * belonging to the userfaultfd and not syscalls.
+ * Check the vma is registered in uffd, this is required to
+ * enforce the VM_MAYWRITE check done at uffd registration
+ * time.
*/
if (!dst_vma->vm_userfaultfd_ctx.ctx)
goto out_unlock;
* dst_vma.
*/
err = -ENOMEM;
- if (vma_is_anonymous(dst_vma) && unlikely(anon_vma_prepare(dst_vma)))
+ if (!(dst_vma->vm_flags & VM_SHARED) &&
+ unlikely(anon_vma_prepare(dst_vma)))
goto out_unlock;
while (src_addr < src_start + len) {
src_addr, &page, zeropage);
cond_resched();
- if (unlikely(err == -EFAULT)) {
+ if (unlikely(err == -ENOENT)) {
void *page_kaddr;
up_read(&dst_mm->mmap_sem);
/*
* The fast way of checking if there are any vmstat diffs.
- * This works because the diffs are byte sized items.
*/
- if (memchr_inv(p->vm_stat_diff, 0, NR_VM_ZONE_STAT_ITEMS))
+ if (memchr_inv(p->vm_stat_diff, 0, NR_VM_ZONE_STAT_ITEMS *
+ sizeof(p->vm_stat_diff[0])))
return true;
#ifdef CONFIG_NUMA
- if (memchr_inv(p->vm_numa_stat_diff, 0, NR_VM_NUMA_STAT_ITEMS))
+ if (memchr_inv(p->vm_numa_stat_diff, 0, NR_VM_NUMA_STAT_ITEMS *
+ sizeof(p->vm_numa_stat_diff[0])))
return true;
#endif
}
#define NCHUNKS ((PAGE_SIZE - ZHDR_SIZE_ALIGNED) >> CHUNK_SHIFT)
#define BUDDY_MASK (0x3)
+#define BUDDY_SHIFT 2
/**
* struct z3fold_pool - stores metadata for each z3fold pool
MIDDLE_CHUNK_MAPPED,
NEEDS_COMPACTING,
PAGE_STALE,
- UNDER_RECLAIM
+ PAGE_CLAIMED, /* by either reclaim or free */
};
/*****************
clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
clear_bit(NEEDS_COMPACTING, &page->private);
clear_bit(PAGE_STALE, &page->private);
- clear_bit(UNDER_RECLAIM, &page->private);
+ clear_bit(PAGE_CLAIMED, &page->private);
spin_lock_init(&zhdr->page_lock);
kref_init(&zhdr->refcount);
unsigned long handle;
handle = (unsigned long)zhdr;
- if (bud != HEADLESS)
- handle += (bud + zhdr->first_num) & BUDDY_MASK;
+ if (bud != HEADLESS) {
+ handle |= (bud + zhdr->first_num) & BUDDY_MASK;
+ if (bud == LAST)
+ handle |= (zhdr->last_chunks << BUDDY_SHIFT);
+ }
return handle;
}
return (struct z3fold_header *)(handle & PAGE_MASK);
}
+/* only for LAST bud, returns zero otherwise */
+static unsigned short handle_to_chunks(unsigned long handle)
+{
+ return (handle & ~PAGE_MASK) >> BUDDY_SHIFT;
+}
+
/*
* (handle & BUDDY_MASK) < zhdr->first_num is possible in encode_handle
* but that doesn't matter. because the masking will result in the
page = virt_to_page(zhdr);
if (test_bit(PAGE_HEADLESS, &page->private)) {
- /* HEADLESS page stored */
- bud = HEADLESS;
- } else {
- z3fold_page_lock(zhdr);
- bud = handle_to_buddy(handle);
-
- switch (bud) {
- case FIRST:
- zhdr->first_chunks = 0;
- break;
- case MIDDLE:
- zhdr->middle_chunks = 0;
- zhdr->start_middle = 0;
- break;
- case LAST:
- zhdr->last_chunks = 0;
- break;
- default:
- pr_err("%s: unknown bud %d\n", __func__, bud);
- WARN_ON(1);
- z3fold_page_unlock(zhdr);
- return;
+ /* if a headless page is under reclaim, just leave.
+ * NB: we use test_and_set_bit for a reason: if the bit
+ * has not been set before, we release this page
+ * immediately so we don't care about its value any more.
+ */
+ if (!test_and_set_bit(PAGE_CLAIMED, &page->private)) {
+ spin_lock(&pool->lock);
+ list_del(&page->lru);
+ spin_unlock(&pool->lock);
+ free_z3fold_page(page);
+ atomic64_dec(&pool->pages_nr);
}
+ return;
}
- if (bud == HEADLESS) {
- spin_lock(&pool->lock);
- list_del(&page->lru);
- spin_unlock(&pool->lock);
- free_z3fold_page(page);
- atomic64_dec(&pool->pages_nr);
+ /* Non-headless case */
+ z3fold_page_lock(zhdr);
+ bud = handle_to_buddy(handle);
+
+ switch (bud) {
+ case FIRST:
+ zhdr->first_chunks = 0;
+ break;
+ case MIDDLE:
+ zhdr->middle_chunks = 0;
+ break;
+ case LAST:
+ zhdr->last_chunks = 0;
+ break;
+ default:
+ pr_err("%s: unknown bud %d\n", __func__, bud);
+ WARN_ON(1);
+ z3fold_page_unlock(zhdr);
return;
}
atomic64_dec(&pool->pages_nr);
return;
}
- if (test_bit(UNDER_RECLAIM, &page->private)) {
+ if (test_bit(PAGE_CLAIMED, &page->private)) {
z3fold_page_unlock(zhdr);
return;
}
}
list_for_each_prev(pos, &pool->lru) {
page = list_entry(pos, struct page, lru);
+
+ /* this bit could have been set by free, in which case
+ * we pass over to the next page in the pool.
+ */
+ if (test_and_set_bit(PAGE_CLAIMED, &page->private))
+ continue;
+
+ zhdr = page_address(page);
if (test_bit(PAGE_HEADLESS, &page->private))
- /* candidate found */
break;
- zhdr = page_address(page);
- if (!z3fold_page_trylock(zhdr))
+ if (!z3fold_page_trylock(zhdr)) {
+ zhdr = NULL;
continue; /* can't evict at this point */
+ }
kref_get(&zhdr->refcount);
list_del_init(&zhdr->buddy);
zhdr->cpu = -1;
- set_bit(UNDER_RECLAIM, &page->private);
break;
}
+ if (!zhdr)
+ break;
+
list_del_init(&page->lru);
spin_unlock(&pool->lock);
if (test_bit(PAGE_HEADLESS, &page->private)) {
if (ret == 0) {
free_z3fold_page(page);
+ atomic64_dec(&pool->pages_nr);
return 0;
}
spin_lock(&pool->lock);
spin_unlock(&pool->lock);
} else {
z3fold_page_lock(zhdr);
- clear_bit(UNDER_RECLAIM, &page->private);
+ clear_bit(PAGE_CLAIMED, &page->private);
if (kref_put(&zhdr->refcount,
release_z3fold_page_locked)) {
atomic64_dec(&pool->pages_nr);
set_bit(MIDDLE_CHUNK_MAPPED, &page->private);
break;
case LAST:
- addr += PAGE_SIZE - (zhdr->last_chunks << CHUNK_SHIFT);
+ addr += PAGE_SIZE - (handle_to_chunks(handle) << CHUNK_SHIFT);
break;
default:
pr_err("unknown buddy id %d\n", buddy);
struct kvec kv = {.iov_base = data, .iov_len = count};
struct iov_iter to;
- iov_iter_kvec(&to, READ | ITER_KVEC, &kv, 1, count);
+ iov_iter_kvec(&to, READ, &kv, 1, count);
p9_debug(P9_DEBUG_9P, ">>> TREADDIR fid %d offset %llu count %d\n",
fid->fid, (unsigned long long) offset, count);
if (!iov_iter_count(data))
return 0;
- if (!(data->type & ITER_KVEC)) {
+ if (!iov_iter_is_kvec(data)) {
int n;
/*
* We allow only p9_max_pages pinned. We wait for the
*/
int batadv_v_elp_iface_enable(struct batadv_hard_iface *hard_iface)
{
+ static const size_t tvlv_padding = sizeof(__be32);
struct batadv_elp_packet *elp_packet;
unsigned char *elp_buff;
u32 random_seqno;
size_t size;
int res = -ENOMEM;
- size = ETH_HLEN + NET_IP_ALIGN + BATADV_ELP_HLEN;
+ size = ETH_HLEN + NET_IP_ALIGN + BATADV_ELP_HLEN + tvlv_padding;
hard_iface->bat_v.elp_skb = dev_alloc_skb(size);
if (!hard_iface->bat_v.elp_skb)
goto out;
skb_reserve(hard_iface->bat_v.elp_skb, ETH_HLEN + NET_IP_ALIGN);
- elp_buff = skb_put_zero(hard_iface->bat_v.elp_skb, BATADV_ELP_HLEN);
+ elp_buff = skb_put_zero(hard_iface->bat_v.elp_skb,
+ BATADV_ELP_HLEN + tvlv_padding);
elp_packet = (struct batadv_elp_packet *)elp_buff;
elp_packet->packet_type = BATADV_ELP;
kfree(entry);
packet = (struct batadv_frag_packet *)skb_out->data;
- size = ntohs(packet->total_size);
+ size = ntohs(packet->total_size) + hdr_size;
/* Make room for the rest of the fragments. */
if (pskb_expand_head(skb_out, 0, size - skb_out->len, GFP_ATOMIC) < 0) {
iv.iov_len = skb->len;
memset(&msg, 0, sizeof(msg));
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, &iv, 1, skb->len);
+ iov_iter_kvec(&msg.msg_iter, WRITE, &iv, 1, skb->len);
err = l2cap_chan_send(chan, &msg, skb->len);
if (err > 0) {
memset(&msg, 0, sizeof(msg));
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, &iv, 1, total_len);
+ iov_iter_kvec(&msg.msg_iter, WRITE, &iv, 1, total_len);
l2cap_chan_send(chan, &msg, total_len);
memset(&msg, 0, sizeof(msg));
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, iv, 2, 1 + len);
+ iov_iter_kvec(&msg.msg_iter, WRITE, iv, 2, 1 + len);
l2cap_chan_send(chan, &msg, 1 + len);
return ret;
}
-static u32 bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, u32 *time)
+static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, u32 *ret,
+ u32 *time)
{
struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE] = { 0 };
enum bpf_cgroup_storage_type stype;
u64 time_start, time_spent = 0;
- u32 ret = 0, i;
+ u32 i;
for_each_cgroup_storage_type(stype) {
storage[stype] = bpf_cgroup_storage_alloc(prog, stype);
repeat = 1;
time_start = ktime_get_ns();
for (i = 0; i < repeat; i++) {
- ret = bpf_test_run_one(prog, ctx, storage);
+ *ret = bpf_test_run_one(prog, ctx, storage);
if (need_resched()) {
if (signal_pending(current))
break;
for_each_cgroup_storage_type(stype)
bpf_cgroup_storage_free(storage[stype]);
- return ret;
+ return 0;
}
static int bpf_test_finish(const union bpf_attr *kattr,
__skb_push(skb, hh_len);
if (is_direct_pkt_access)
bpf_compute_data_pointers(skb);
- retval = bpf_test_run(prog, skb, repeat, &duration);
+ ret = bpf_test_run(prog, skb, repeat, &retval, &duration);
+ if (ret) {
+ kfree_skb(skb);
+ kfree(sk);
+ return ret;
+ }
if (!is_l2) {
if (skb_headroom(skb) < hh_len) {
int nhead = HH_DATA_ALIGN(hh_len - skb_headroom(skb));
rxqueue = __netif_get_rx_queue(current->nsproxy->net_ns->loopback_dev, 0);
xdp.rxq = &rxqueue->xdp_rxq;
- retval = bpf_test_run(prog, &xdp, repeat, &duration);
+ ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration);
+ if (ret)
+ goto out;
if (xdp.data != data + XDP_PACKET_HEADROOM + NET_IP_ALIGN ||
xdp.data_end != xdp.data + size)
size = xdp.data_end - xdp.data;
ret = bpf_test_finish(kattr, uattr, xdp.data, size, retval, duration);
+out:
kfree(data);
return ret;
}
struct metadata_dst *tunnel_dst;
};
+/* private vlan flags */
+enum {
+ BR_VLFLAG_PER_PORT_STATS = BIT(0),
+};
+
/**
* struct net_bridge_vlan - per-vlan entry
*
* @vnode: rhashtable member
* @vid: VLAN id
* @flags: bridge vlan flags
+ * @priv_flags: private (in-kernel) bridge vlan flags
* @stats: per-cpu VLAN statistics
* @br: if MASTER flag set, this points to a bridge struct
* @port: if MASTER flag unset, this points to a port struct
struct rhash_head tnode;
u16 vid;
u16 flags;
+ u16 priv_flags;
struct br_vlan_stats __percpu *stats;
union {
struct net_bridge *br;
v = container_of(rcu, struct net_bridge_vlan, rcu);
WARN_ON(br_vlan_is_master(v));
/* if we had per-port stats configured then free them here */
- if (v->brvlan->stats != v->stats)
+ if (v->priv_flags & BR_VLFLAG_PER_PORT_STATS)
free_percpu(v->stats);
v->stats = NULL;
kfree(v);
err = -ENOMEM;
goto out_filt;
}
+ v->priv_flags |= BR_VLFLAG_PER_PORT_STATS;
} else {
v->stats = masterv->stats;
}
} else
ifindex = ro->ifindex;
- if (ro->fd_frames) {
+ dev = dev_get_by_index(sock_net(sk), ifindex);
+ if (!dev)
+ return -ENXIO;
+
+ err = -EINVAL;
+ if (ro->fd_frames && dev->mtu == CANFD_MTU) {
if (unlikely(size != CANFD_MTU && size != CAN_MTU))
- return -EINVAL;
+ goto put_dev;
} else {
if (unlikely(size != CAN_MTU))
- return -EINVAL;
+ goto put_dev;
}
- dev = dev_get_by_index(sock_net(sk), ifindex);
- if (!dev)
- return -ENXIO;
-
skb = sock_alloc_send_skb(sk, size + sizeof(struct can_skb_priv),
msg->msg_flags & MSG_DONTWAIT, &err);
if (!skb)
if (!buf)
msg.msg_flags |= MSG_TRUNC;
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &iov, 1, len);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, len);
r = sock_recvmsg(sock, &msg, msg.msg_flags);
if (r == -EAGAIN)
r = 0;
int r;
BUG_ON(page_offset + length > PAGE_SIZE);
- iov_iter_bvec(&msg.msg_iter, READ | ITER_BVEC, &bvec, 1, length);
+ iov_iter_bvec(&msg.msg_iter, READ, &bvec, 1, length);
r = sock_recvmsg(sock, &msg, msg.msg_flags);
if (r == -EAGAIN)
r = 0;
struct bio_vec bvec;
int ret;
- /* sendpage cannot properly handle pages with page_count == 0,
- * we need to fallback to sendmsg if that's the case */
- if (page_count(page) >= 1)
+ /*
+ * sendpage cannot properly handle pages with page_count == 0,
+ * we need to fall back to sendmsg if that's the case.
+ *
+ * Same goes for slab pages: skb_can_coalesce() allows
+ * coalescing neighboring slab objects into a single frag which
+ * triggers one of hardened usercopy checks.
+ */
+ if (page_count(page) >= 1 && !PageSlab(page))
return __ceph_tcp_sendpage(sock, page, offset, size, more);
bvec.bv_page = page;
else
msg.msg_flags |= MSG_EOR; /* superfluous, but what the hell */
- iov_iter_bvec(&msg.msg_iter, WRITE | ITER_BVEC, &bvec, 1, size);
+ iov_iter_bvec(&msg.msg_iter, WRITE, &bvec, 1, size);
ret = sock_sendmsg(sock, &msg);
if (ret == -EAGAIN)
ret = 0;
return active;
}
+static void reset_xps_maps(struct net_device *dev,
+ struct xps_dev_maps *dev_maps,
+ bool is_rxqs_map)
+{
+ if (is_rxqs_map) {
+ static_key_slow_dec_cpuslocked(&xps_rxqs_needed);
+ RCU_INIT_POINTER(dev->xps_rxqs_map, NULL);
+ } else {
+ RCU_INIT_POINTER(dev->xps_cpus_map, NULL);
+ }
+ static_key_slow_dec_cpuslocked(&xps_needed);
+ kfree_rcu(dev_maps, rcu);
+}
+
static void clean_xps_maps(struct net_device *dev, const unsigned long *mask,
struct xps_dev_maps *dev_maps, unsigned int nr_ids,
u16 offset, u16 count, bool is_rxqs_map)
j < nr_ids;)
active |= remove_xps_queue_cpu(dev, dev_maps, j, offset,
count);
- if (!active) {
- if (is_rxqs_map) {
- RCU_INIT_POINTER(dev->xps_rxqs_map, NULL);
- } else {
- RCU_INIT_POINTER(dev->xps_cpus_map, NULL);
+ if (!active)
+ reset_xps_maps(dev, dev_maps, is_rxqs_map);
- for (i = offset + (count - 1); count--; i--)
- netdev_queue_numa_node_write(
- netdev_get_tx_queue(dev, i),
- NUMA_NO_NODE);
+ if (!is_rxqs_map) {
+ for (i = offset + (count - 1); count--; i--) {
+ netdev_queue_numa_node_write(
+ netdev_get_tx_queue(dev, i),
+ NUMA_NO_NODE);
}
- kfree_rcu(dev_maps, rcu);
}
}
false);
out_no_maps:
- if (static_key_enabled(&xps_rxqs_needed))
- static_key_slow_dec_cpuslocked(&xps_rxqs_needed);
-
- static_key_slow_dec_cpuslocked(&xps_needed);
mutex_unlock(&xps_map_mutex);
cpus_read_unlock();
}
if (!new_dev_maps)
goto out_no_new_maps;
- static_key_slow_inc_cpuslocked(&xps_needed);
- if (is_rxqs_map)
- static_key_slow_inc_cpuslocked(&xps_rxqs_needed);
+ if (!dev_maps) {
+ /* Increment static keys at most once per type */
+ static_key_slow_inc_cpuslocked(&xps_needed);
+ if (is_rxqs_map)
+ static_key_slow_inc_cpuslocked(&xps_rxqs_needed);
+ }
for (j = -1; j = netif_attrmask_next(j, possible_mask, nr_ids),
j < nr_ids;) {
}
/* free map if not active */
- if (!active) {
- if (is_rxqs_map)
- RCU_INIT_POINTER(dev->xps_rxqs_map, NULL);
- else
- RCU_INIT_POINTER(dev->xps_cpus_map, NULL);
- kfree_rcu(dev_maps, rcu);
- }
+ if (!active)
+ reset_xps_maps(dev, dev_maps, is_rxqs_map);
out_no_maps:
mutex_unlock(&xps_map_mutex);
}
skb = next;
- if (netif_xmit_stopped(txq) && skb) {
+ if (netif_tx_queue_stopped(txq) && skb) {
rc = NETDEV_TX_BUSY;
break;
}
struct net_device *orig_dev = skb->dev;
struct packet_type *pt_prev = NULL;
- list_del(&skb->list);
+ skb_list_del_init(skb);
__netif_receive_skb_core(skb, pfmemalloc, &pt_prev);
if (!pt_prev)
continue;
INIT_LIST_HEAD(&sublist);
list_for_each_entry_safe(skb, next, head, list) {
net_timestamp_check(netdev_tstamp_prequeue, skb);
- list_del(&skb->list);
+ skb_list_del_init(skb);
if (!skb_defer_rx_timestamp(skb))
list_add_tail(&skb->list, &sublist);
}
rcu_read_lock();
list_for_each_entry_safe(skb, next, head, list) {
xdp_prog = rcu_dereference(skb->dev->xdp_prog);
- list_del(&skb->list);
+ skb_list_del_init(skb);
if (do_xdp_generic(xdp_prog, skb) == XDP_PASS)
list_add_tail(&skb->list, &sublist);
}
if (cpu >= 0) {
/* Will be handled, remove from list */
- list_del(&skb->list);
+ skb_list_del_init(skb);
enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
}
}
skb->vlan_tci = 0;
skb->dev = napi->dev;
skb->skb_iif = 0;
+
+ /* eth_type_trans() assumes pkt_type is PACKET_HOST */
+ skb->pkt_type = PACKET_HOST;
+
skb->encapsulation = 0;
skb_shinfo(skb)->gso_type = 0;
skb->truesize = SKB_TRUESIZE(skb_end_offset(skb));
if (work_done)
timeout = n->dev->gro_flush_timeout;
+ /* When the NAPI instance uses a timeout and keeps postponing
+ * it, we need to bound somehow the time packets are kept in
+ * the GRO layer
+ */
+ napi_gro_flush(n, !!timeout);
if (timeout)
hrtimer_start(&n->timer, ns_to_ktime(timeout),
HRTIMER_MODE_REL_PINNED);
- else
- napi_gro_flush(n, false);
}
if (unlikely(!list_empty(&n->poll_list))) {
/* If n->poll_list is not empty, we need to mask irqs */
napi->skb = NULL;
napi->poll = poll;
if (weight > NAPI_POLL_WEIGHT)
- pr_err_once("netif_napi_add() called with weight %d on device %s\n",
- weight, dev->name);
+ netdev_err_once(dev, "%s() called with weight %d\n", __func__,
+ weight);
napi->weight = weight;
list_add(&napi->dev_list, &dev->napi_list);
napi->dev = dev;
} else {
struct in6_addr *src6 = (struct in6_addr *)&tuple->ipv6.saddr;
struct in6_addr *dst6 = (struct in6_addr *)&tuple->ipv6.daddr;
- u16 hnum = ntohs(tuple->ipv6.dport);
int sdif = inet6_sdif(skb);
if (proto == IPPROTO_TCP)
sk = __inet6_lookup(net, &tcp_hashinfo, skb, 0,
src6, tuple->ipv6.sport,
- dst6, hnum,
+ dst6, ntohs(tuple->ipv6.dport),
dif, sdif, &refcounted);
else if (likely(ipv6_bpf_stub))
sk = ipv6_bpf_stub->udp6_lib_lookup(net,
src6, tuple->ipv6.sport,
- dst6, hnum,
+ dst6, tuple->ipv6.dport,
dif, sdif,
&udp_table, skb);
#endif
struct net *net;
family = len == sizeof(tuple->ipv4) ? AF_INET : AF_INET6;
- if (unlikely(family == AF_UNSPEC || netns_id > U32_MAX || flags))
+ if (unlikely(family == AF_UNSPEC || flags ||
+ !((s32)netns_id < 0 || netns_id <= S32_MAX)))
goto out;
if (skb->dev)
caller_net = dev_net(skb->dev);
else
caller_net = sock_net(skb->sk);
- if (netns_id) {
+ if ((s32)netns_id < 0) {
+ net = caller_net;
+ sk = sk_lookup(net, tuple, skb, family, proto);
+ } else {
net = get_net_ns_by_id(caller_net, netns_id);
if (unlikely(!net))
goto out;
sk = sk_lookup(net, tuple, skb, family, proto);
put_net(net);
- } else {
- net = caller_net;
- sk = sk_lookup(net, tuple, skb, family, proto);
}
if (sk)
if (size != size_default)
return false;
break;
- case bpf_ctx_range(struct __sk_buff, flow_keys):
- if (size != sizeof(struct bpf_flow_keys *))
+ case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
+ if (size != sizeof(__u64))
return false;
break;
default:
case bpf_ctx_range(struct __sk_buff, data):
case bpf_ctx_range(struct __sk_buff, data_meta):
case bpf_ctx_range(struct __sk_buff, data_end):
- case bpf_ctx_range(struct __sk_buff, flow_keys):
+ case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
case bpf_ctx_range_till(struct __sk_buff, family, local_port):
return false;
}
switch (off) {
case bpf_ctx_range(struct __sk_buff, tc_classid):
case bpf_ctx_range(struct __sk_buff, data_meta):
- case bpf_ctx_range(struct __sk_buff, flow_keys):
+ case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
return false;
case bpf_ctx_range(struct __sk_buff, data):
case bpf_ctx_range(struct __sk_buff, data_end):
case bpf_ctx_range(struct __sk_buff, tc_classid):
case bpf_ctx_range_till(struct __sk_buff, family, local_port):
case bpf_ctx_range(struct __sk_buff, data_meta):
- case bpf_ctx_range(struct __sk_buff, flow_keys):
+ case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
return false;
}
case bpf_ctx_range(struct __sk_buff, data_end):
info->reg_type = PTR_TO_PACKET_END;
break;
- case bpf_ctx_range(struct __sk_buff, flow_keys):
+ case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
case bpf_ctx_range_till(struct __sk_buff, family, local_port):
return false;
}
switch (off) {
case bpf_ctx_range(struct __sk_buff, tc_classid):
case bpf_ctx_range(struct __sk_buff, data_meta):
- case bpf_ctx_range(struct __sk_buff, flow_keys):
+ case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
return false;
}
case bpf_ctx_range(struct __sk_buff, data_end):
info->reg_type = PTR_TO_PACKET_END;
break;
- case bpf_ctx_range(struct __sk_buff, flow_keys):
+ case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
info->reg_type = PTR_TO_FLOW_KEYS;
break;
case bpf_ctx_range(struct __sk_buff, tc_classid):
break;
}
- if (dissector_uses_key(flow_dissector,
- FLOW_DISSECTOR_KEY_PORTS)) {
+ if (dissector_uses_key(flow_dissector, FLOW_DISSECTOR_KEY_PORTS) &&
+ !(key_control->flags & FLOW_DIS_IS_FRAGMENT)) {
key_ports = skb_flow_dissector_target(flow_dissector,
FLOW_DISSECTOR_KEY_PORTS,
target_container);
read_lock_bh(&idev->lock);
list_for_each_entry(ifp, &idev->addr_list, if_list) {
- if (ipv6_addr_type(&ifp->addr) & IPV6_ADDR_LINKLOCAL)
+ if (!!(ipv6_addr_type(&ifp->addr) & IPV6_ADDR_LINKLOCAL) !=
+ !!(ipv6_addr_type(&np->remote_ip.in6) & IPV6_ADDR_LINKLOCAL))
continue;
np->local_ip.in6 = ifp->addr;
err = 0;
cb->seq = 0;
}
ret = dumpit(skb, cb);
- if (ret < 0)
+ if (ret)
break;
}
cb->family = idx;
return -EINVAL;
}
+ if (dev->type != ARPHRD_ETHER) {
+ NL_SET_ERR_MSG(extack, "FDB add only supported for Ethernet devices");
+ return -EINVAL;
+ }
+
addr = nla_data(tb[NDA_LLADDR]);
err = fdb_vid_parse(tb[NDA_VLAN], &vid, extack);
return -EINVAL;
}
+ if (dev->type != ARPHRD_ETHER) {
+ NL_SET_ERR_MSG(extack, "FDB delete only supported for Ethernet devices");
+ return -EINVAL;
+ }
+
addr = nla_data(tb[NDA_LLADDR]);
err = fdb_vid_parse(tb[NDA_VLAN], &vid, extack);
{
int err;
+ if (dev->type != ARPHRD_ETHER)
+ return -EINVAL;
+
netif_addr_lock_bh(dev);
err = nlmsg_populate_fdb(skb, cb, dev, idx, &dev->uc);
if (err)
nf_reset(skb);
nf_reset_trace(skb);
+#ifdef CONFIG_NET_SWITCHDEV
+ skb->offload_fwd_mark = 0;
+ skb->offload_mr_fwd_mark = 0;
+#endif
+
if (!xnet)
return;
*
* This is a helper to do that correctly considering GSO_BY_FRAGS.
*
+ * @skb: GSO skb
+ *
* @seg_len: The segmented length (from skb_gso_*_seglen). In the
* GSO_BY_FRAGS case this will be [header sizes + GSO_BY_FRAGS].
*
#ifdef CONFIG_INET
if (family == AF_INET &&
+ protocol != IPPROTO_RAW &&
!rcu_access_pointer(inet_protos[protocol]))
return -ENOENT;
#endif
cpu_dp->orig_ethtool_ops = NULL;
}
+static ssize_t tagging_show(struct device *d, struct device_attribute *attr,
+ char *buf)
+{
+ struct net_device *dev = to_net_dev(d);
+ struct dsa_port *cpu_dp = dev->dsa_ptr;
+
+ return sprintf(buf, "%s\n",
+ dsa_tag_protocol_to_str(cpu_dp->tag_ops));
+}
+static DEVICE_ATTR_RO(tagging);
+
+static struct attribute *dsa_slave_attrs[] = {
+ &dev_attr_tagging.attr,
+ NULL
+};
+
+static const struct attribute_group dsa_group = {
+ .name = "dsa",
+ .attrs = dsa_slave_attrs,
+};
+
int dsa_master_setup(struct net_device *dev, struct dsa_port *cpu_dp)
{
+ int ret;
+
/* If we use a tagging format that doesn't have an ethertype
* field, make sure that all packets from this point on get
* sent to the tag format's receive function.
dev->dsa_ptr = cpu_dp;
- return dsa_master_ethtool_setup(dev);
+ ret = dsa_master_ethtool_setup(dev);
+ if (ret)
+ return ret;
+
+ ret = sysfs_create_group(&dev->dev.kobj, &dsa_group);
+ if (ret)
+ dsa_master_ethtool_teardown(dev);
+
+ return ret;
}
void dsa_master_teardown(struct net_device *dev)
{
+ sysfs_remove_group(&dev->dev.kobj, &dsa_group);
dsa_master_ethtool_teardown(dev);
dev->dsa_ptr = NULL;
.name = "dsa",
};
-static ssize_t tagging_show(struct device *d, struct device_attribute *attr,
- char *buf)
-{
- struct net_device *dev = to_net_dev(d);
- struct dsa_port *dp = dsa_slave_to_port(dev);
-
- return sprintf(buf, "%s\n",
- dsa_tag_protocol_to_str(dp->cpu_dp->tag_ops));
-}
-static DEVICE_ATTR_RO(tagging);
-
-static struct attribute *dsa_slave_attrs[] = {
- &dev_attr_tagging.attr,
- NULL
-};
-
-static const struct attribute_group dsa_group = {
- .name = "dsa",
- .attrs = dsa_slave_attrs,
-};
-
static void dsa_slave_phylink_validate(struct net_device *dev,
unsigned long *supported,
struct phylink_link_state *state)
goto out_phy;
}
- ret = sysfs_create_group(&slave_dev->dev.kobj, &dsa_group);
- if (ret)
- goto out_unreg;
-
return 0;
-out_unreg:
- unregister_netdev(slave_dev);
out_phy:
rtnl_lock();
phylink_disconnect_phy(p->dp->pl);
rtnl_unlock();
dsa_slave_notify(slave_dev, DSA_PORT_UNREGISTER);
- sysfs_remove_group(&slave_dev->dev.kobj, &dsa_group);
unregister_netdev(slave_dev);
phylink_destroy(dp->pl);
free_percpu(p->stats64);
#ifdef CONFIG_IP_MULTICAST
/* Parameter names and values are taken from igmp-v2-06 draft */
-#define IGMP_V1_ROUTER_PRESENT_TIMEOUT (400*HZ)
-#define IGMP_V2_ROUTER_PRESENT_TIMEOUT (400*HZ)
#define IGMP_V2_UNSOLICITED_REPORT_INTERVAL (10*HZ)
#define IGMP_V3_UNSOLICITED_REPORT_INTERVAL (1*HZ)
+#define IGMP_QUERY_INTERVAL (125*HZ)
#define IGMP_QUERY_RESPONSE_INTERVAL (10*HZ)
-#define IGMP_QUERY_ROBUSTNESS_VARIABLE 2
-
#define IGMP_INITIAL_REPORT_DELAY (1)
max_delay = IGMP_QUERY_RESPONSE_INTERVAL;
in_dev->mr_v1_seen = jiffies +
- IGMP_V1_ROUTER_PRESENT_TIMEOUT;
+ (in_dev->mr_qrv * in_dev->mr_qi) +
+ in_dev->mr_qri;
group = 0;
} else {
/* v2 router present */
max_delay = ih->code*(HZ/IGMP_TIMER_SCALE);
in_dev->mr_v2_seen = jiffies +
- IGMP_V2_ROUTER_PRESENT_TIMEOUT;
+ (in_dev->mr_qrv * in_dev->mr_qi) +
+ in_dev->mr_qri;
}
/* cancel the interface change timer */
in_dev->mr_ifc_count = 0;
if (!max_delay)
max_delay = 1; /* can't mod w/ 0 */
in_dev->mr_maxdelay = max_delay;
- if (ih3->qrv)
- in_dev->mr_qrv = ih3->qrv;
+
+ /* RFC3376, 4.1.6. QRV and 4.1.7. QQIC, when the most recently
+ * received value was zero, use the default or statically
+ * configured value.
+ */
+ in_dev->mr_qrv = ih3->qrv ?: net->ipv4.sysctl_igmp_qrv;
+ in_dev->mr_qi = IGMPV3_QQIC(ih3->qqic)*HZ ?: IGMP_QUERY_INTERVAL;
+
+ /* RFC3376, 8.3. Query Response Interval:
+ * The number of seconds represented by the [Query Response
+ * Interval] must be less than the [Query Interval].
+ */
+ if (in_dev->mr_qri >= in_dev->mr_qi)
+ in_dev->mr_qri = (in_dev->mr_qi/HZ - 1)*HZ;
+
if (!group) { /* general query */
if (ih3->nsrcs)
return true; /* no sources allowed */
ip_mc_dec_group(in_dev, IGMP_ALL_HOSTS);
}
-void ip_mc_init_dev(struct in_device *in_dev)
-{
#ifdef CONFIG_IP_MULTICAST
+static void ip_mc_reset(struct in_device *in_dev)
+{
struct net *net = dev_net(in_dev->dev);
+
+ in_dev->mr_qi = IGMP_QUERY_INTERVAL;
+ in_dev->mr_qri = IGMP_QUERY_RESPONSE_INTERVAL;
+ in_dev->mr_qrv = net->ipv4.sysctl_igmp_qrv;
+}
+#else
+static void ip_mc_reset(struct in_device *in_dev)
+{
+}
#endif
+
+void ip_mc_init_dev(struct in_device *in_dev)
+{
ASSERT_RTNL();
#ifdef CONFIG_IP_MULTICAST
timer_setup(&in_dev->mr_gq_timer, igmp_gq_timer_expire, 0);
timer_setup(&in_dev->mr_ifc_timer, igmp_ifc_timer_expire, 0);
- in_dev->mr_qrv = net->ipv4.sysctl_igmp_qrv;
#endif
+ ip_mc_reset(in_dev);
spin_lock_init(&in_dev->mc_tomb_lock);
}
void ip_mc_up(struct in_device *in_dev)
{
struct ip_mc_list *pmc;
-#ifdef CONFIG_IP_MULTICAST
- struct net *net = dev_net(in_dev->dev);
-#endif
ASSERT_RTNL();
-#ifdef CONFIG_IP_MULTICAST
- in_dev->mr_qrv = net->ipv4.sysctl_igmp_qrv;
-#endif
+ ip_mc_reset(in_dev);
ip_mc_inc_group(in_dev, IGMP_ALL_HOSTS);
for_each_pmc_rtnl(in_dev, pmc) {
}
static struct inet_frag_queue *inet_frag_create(struct netns_frags *nf,
- void *arg)
+ void *arg,
+ struct inet_frag_queue **prev)
{
struct inet_frags *f = nf->f;
struct inet_frag_queue *q;
- int err;
q = inet_frag_alloc(nf, f, arg);
- if (!q)
+ if (!q) {
+ *prev = ERR_PTR(-ENOMEM);
return NULL;
-
+ }
mod_timer(&q->timer, jiffies + nf->timeout);
- err = rhashtable_insert_fast(&nf->rhashtable, &q->node,
- f->rhash_params);
- if (err < 0) {
+ *prev = rhashtable_lookup_get_insert_key(&nf->rhashtable, &q->key,
+ &q->node, f->rhash_params);
+ if (*prev) {
q->flags |= INET_FRAG_COMPLETE;
inet_frag_kill(q);
inet_frag_destroy(q);
/* TODO : call from rcu_read_lock() and no longer use refcount_inc_not_zero() */
struct inet_frag_queue *inet_frag_find(struct netns_frags *nf, void *key)
{
- struct inet_frag_queue *fq;
+ struct inet_frag_queue *fq = NULL, *prev;
if (!nf->high_thresh || frag_mem_limit(nf) > nf->high_thresh)
return NULL;
rcu_read_lock();
- fq = rhashtable_lookup(&nf->rhashtable, key, nf->f->rhash_params);
- if (fq) {
+ prev = rhashtable_lookup(&nf->rhashtable, key, nf->f->rhash_params);
+ if (!prev)
+ fq = inet_frag_create(nf, key, &prev);
+ if (prev && !IS_ERR(prev)) {
+ fq = prev;
if (!refcount_inc_not_zero(&fq->refcnt))
fq = NULL;
- rcu_read_unlock();
- return fq;
}
rcu_read_unlock();
-
- return inet_frag_create(nf, key);
+ return fq;
}
EXPORT_SYMBOL(inet_frag_find);
struct rb_node *rbn;
int len;
int ihlen;
+ int delta;
int err;
u8 ecn;
if (len > 65535)
goto out_oversize;
+ delta = - head->truesize;
+
/* Head of list must not be cloned. */
if (skb_unclone(head, GFP_ATOMIC))
goto out_nomem;
+ delta += head->truesize;
+ if (delta)
+ add_frag_mem_limit(qp->q.net, delta);
+
/* If the first fragment is fragmented itself, we split
* it to two chunks: the first with data and paged part
* and the second, holding only fragments. */
if (ip_is_fragment(&iph)) {
skb = skb_share_check(skb, GFP_ATOMIC);
if (skb) {
- if (!pskb_may_pull(skb, netoff + iph.ihl * 4))
- return skb;
- if (pskb_trim_rcsum(skb, netoff + len))
- return skb;
+ if (!pskb_may_pull(skb, netoff + iph.ihl * 4)) {
+ kfree_skb(skb);
+ return NULL;
+ }
+ if (pskb_trim_rcsum(skb, netoff + len)) {
+ kfree_skb(skb);
+ return NULL;
+ }
memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
if (ip_defrag(net, skb, user))
return NULL;
list_for_each_entry_safe(skb, next, head, list) {
struct dst_entry *dst;
- list_del(&skb->list);
+ skb_list_del_init(skb);
/* if ingress device is enslaved to an L3 master device pass the
* skb to its handler for processing
*/
struct net_device *dev = skb->dev;
struct net *net = dev_net(dev);
- list_del(&skb->list);
+ skb_list_del_init(skb);
skb = ip_rcv_core(skb, net);
if (skb == NULL)
continue;
unsigned int fraglen;
unsigned int fraggap;
unsigned int alloclen;
- unsigned int pagedlen = 0;
+ unsigned int pagedlen;
struct sk_buff *skb_prev;
alloc_new_skb:
skb_prev = skb;
if (datalen > mtu - fragheaderlen)
datalen = maxfraglen - fragheaderlen;
fraglen = datalen + fragheaderlen;
+ pagedlen = 0;
if ((flags & MSG_MORE) &&
!(rt->dst.dev->features&NETIF_F_SG))
return -ENOPROTOOPT;
err = do_ip_setsockopt(sk, level, optname, optval, optlen);
-#ifdef CONFIG_BPFILTER
+#if IS_ENABLED(CONFIG_BPFILTER_UMH)
if (optname >= BPFILTER_IPT_SO_SET_REPLACE &&
optname < BPFILTER_IPT_SET_MAX)
err = bpfilter_ip_set_sockopt(sk, optname, optval, optlen);
int err;
err = do_ip_getsockopt(sk, level, optname, optval, optlen, 0);
-#ifdef CONFIG_BPFILTER
+#if IS_ENABLED(CONFIG_BPFILTER_UMH)
if (optname >= BPFILTER_IPT_SO_GET_INFO &&
optname < BPFILTER_IPT_GET_MAX)
err = bpfilter_ip_get_sockopt(sk, optname, optval, optlen);
err = do_ip_getsockopt(sk, level, optname, optval, optlen,
MSG_CMSG_COMPAT);
-#ifdef CONFIG_BPFILTER
+#if IS_ENABLED(CONFIG_BPFILTER_UMH)
if (optname >= BPFILTER_IPT_SO_GET_INFO &&
optname < BPFILTER_IPT_GET_MAX)
err = bpfilter_ip_get_sockopt(sk, optname, optval, optlen);
iph->version = 4;
iph->ihl = sizeof(struct iphdr) >> 2;
- iph->frag_off = df;
+ iph->frag_off = ip_mtu_locked(&rt->dst) ? 0 : df;
iph->protocol = proto;
iph->tos = tos;
iph->daddr = dst;
int ret;
ret = xt_register_target(&masquerade_tg_reg);
+ if (ret)
+ return ret;
- if (ret == 0)
- nf_nat_masquerade_ipv4_register_notifier();
+ ret = nf_nat_masquerade_ipv4_register_notifier();
+ if (ret)
+ xt_unregister_target(&masquerade_tg_reg);
return ret;
}
.notifier_call = masq_inet_event,
};
-static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0);
+static int masq_refcnt;
+static DEFINE_MUTEX(masq_mutex);
-void nf_nat_masquerade_ipv4_register_notifier(void)
+int nf_nat_masquerade_ipv4_register_notifier(void)
{
+ int ret = 0;
+
+ mutex_lock(&masq_mutex);
/* check if the notifier was already set */
- if (atomic_inc_return(&masquerade_notifier_refcount) > 1)
- return;
+ if (++masq_refcnt > 1)
+ goto out_unlock;
/* Register for device down reports */
- register_netdevice_notifier(&masq_dev_notifier);
+ ret = register_netdevice_notifier(&masq_dev_notifier);
+ if (ret)
+ goto err_dec;
/* Register IP address change reports */
- register_inetaddr_notifier(&masq_inet_notifier);
+ ret = register_inetaddr_notifier(&masq_inet_notifier);
+ if (ret)
+ goto err_unregister;
+
+ mutex_unlock(&masq_mutex);
+ return ret;
+
+err_unregister:
+ unregister_netdevice_notifier(&masq_dev_notifier);
+err_dec:
+ masq_refcnt--;
+out_unlock:
+ mutex_unlock(&masq_mutex);
+ return ret;
}
EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv4_register_notifier);
void nf_nat_masquerade_ipv4_unregister_notifier(void)
{
+ mutex_lock(&masq_mutex);
/* check if the notifier still has clients */
- if (atomic_dec_return(&masquerade_notifier_refcount) > 0)
- return;
+ if (--masq_refcnt > 0)
+ goto out_unlock;
unregister_netdevice_notifier(&masq_dev_notifier);
unregister_inetaddr_notifier(&masq_inet_notifier);
+out_unlock:
+ mutex_unlock(&masq_mutex);
}
EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv4_unregister_notifier);
if (ret < 0)
return ret;
- nf_nat_masquerade_ipv4_register_notifier();
+ ret = nf_nat_masquerade_ipv4_register_notifier();
+ if (ret)
+ nft_unregister_expr(&nft_masq_ipv4_type);
return ret;
}
ret = err;
goto out;
}
+ copied = -EAGAIN;
}
ret = copied;
out:
u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr;
u32 delta_us;
- if (!delta)
- delta = 1;
- delta_us = delta * (USEC_PER_SEC / TCP_TS_HZ);
- tcp_rcv_rtt_update(tp, delta_us, 0);
+ if (likely(delta < INT_MAX / (USEC_PER_SEC / TCP_TS_HZ))) {
+ if (!delta)
+ delta = 1;
+ delta_us = delta * (USEC_PER_SEC / TCP_TS_HZ);
+ tcp_rcv_rtt_update(tp, delta_us, 0);
+ }
}
}
if (seq_rtt_us < 0 && tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
flag & FLAG_ACKED) {
u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr;
- u32 delta_us = delta * (USEC_PER_SEC / TCP_TS_HZ);
- seq_rtt_us = ca_rtt_us = delta_us;
+ if (likely(delta < INT_MAX / (USEC_PER_SEC / TCP_TS_HZ))) {
+ seq_rtt_us = delta * (USEC_PER_SEC / TCP_TS_HZ);
+ ca_rtt_us = seq_rtt_us;
+ }
}
rs->rtt_us = ca_rtt_us; /* RTT of last (S)ACKed packet (or -1) */
if (seq_rtt_us < 0)
* If the sack array is full, forget about the last one.
*/
if (this_sack >= TCP_NUM_SACKS) {
- if (tp->compressed_ack)
+ if (tp->compressed_ack > TCP_FASTRETRANS_THRESH)
tcp_send_ack(sk);
this_sack--;
tp->rx_opt.num_sacks--;
if (TCP_SKB_CB(from)->has_rxtstamp) {
TCP_SKB_CB(to)->has_rxtstamp = true;
to->tstamp = from->tstamp;
+ skb_hwtstamps(to)->hwtstamp = skb_hwtstamps(from)->hwtstamp;
}
return true;
if (!tcp_is_sack(tp) ||
tp->compressed_ack >= sock_net(sk)->ipv4.sysctl_tcp_comp_sack_nr)
goto send_now;
- tp->compressed_ack++;
+
+ if (tp->compressed_ack_rcv_nxt != tp->rcv_nxt) {
+ tp->compressed_ack_rcv_nxt = tp->rcv_nxt;
+ if (tp->compressed_ack > TCP_FASTRETRANS_THRESH)
+ NET_ADD_STATS(sock_net(sk), LINUX_MIB_TCPACKCOMPRESSED,
+ tp->compressed_ack - TCP_FASTRETRANS_THRESH);
+ tp->compressed_ack = 0;
+ }
+
+ if (++tp->compressed_ack <= TCP_FASTRETRANS_THRESH)
+ goto send_now;
if (hrtimer_is_queued(&tp->compressed_ack_timer))
return;
{
struct tcp_sock *tp = tcp_sk(sk);
- if (unlikely(tp->compressed_ack)) {
+ if (unlikely(tp->compressed_ack > TCP_FASTRETRANS_THRESH)) {
NET_ADD_STATS(sock_net(sk), LINUX_MIB_TCPACKCOMPRESSED,
- tp->compressed_ack);
- tp->compressed_ack = 0;
+ tp->compressed_ack - TCP_FASTRETRANS_THRESH);
+ tp->compressed_ack = TCP_FASTRETRANS_THRESH;
if (hrtimer_try_to_cancel(&tp->compressed_ack_timer) == 1)
__sock_put(sk);
}
* This algorithm is from John Heffner.
*/
static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
- bool *is_cwnd_limited, u32 max_segs)
+ bool *is_cwnd_limited,
+ bool *is_rwnd_limited,
+ u32 max_segs)
{
const struct inet_connection_sock *icsk = inet_csk(sk);
u32 age, send_win, cong_win, limit, in_flight;
struct sk_buff *head;
int win_divisor;
- if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
- goto send_now;
-
if (icsk->icsk_ca_state >= TCP_CA_Recovery)
goto send_now;
if (age < (tp->srtt_us >> 4))
goto send_now;
- /* Ok, it looks like it is advisable to defer. */
+ /* Ok, it looks like it is advisable to defer.
+ * Three cases are tracked :
+ * 1) We are cwnd-limited
+ * 2) We are rwnd-limited
+ * 3) We are application limited.
+ */
+ if (cong_win < send_win) {
+ if (cong_win <= skb->len) {
+ *is_cwnd_limited = true;
+ return true;
+ }
+ } else {
+ if (send_win <= skb->len) {
+ *is_rwnd_limited = true;
+ return true;
+ }
+ }
- if (cong_win < send_win && cong_win <= skb->len)
- *is_cwnd_limited = true;
+ /* If this packet won't get more data, do not wait. */
+ if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
+ goto send_now;
return true;
} else {
if (!push_one &&
tcp_tso_should_defer(sk, skb, &is_cwnd_limited,
- max_segs))
+ &is_rwnd_limited, max_segs))
break;
}
goto rearm_timer;
}
skb = skb_rb_last(&sk->tcp_rtx_queue);
+ if (unlikely(!skb)) {
+ WARN_ONCE(tp->packets_out,
+ "invalid inflight: %u state %u cwnd %u mss %d\n",
+ tp->packets_out, sk->sk_state, tp->snd_cwnd, mss);
+ inet_csk(sk)->icsk_pending = 0;
+ return;
+ }
/* At most one outstanding TLP retransmission. */
if (tp->tlp_high_seq)
goto rearm_timer;
- /* Retransmit last segment. */
- if (WARN_ON(!skb))
- goto rearm_timer;
-
if (skb_still_in_host_queue(sk, skb))
goto rearm_timer;
TCP_SKB_CB(skb)->sacked |= TCPCB_EVER_RETRANS;
trace_tcp_retransmit_skb(sk, skb);
} else if (err != -EBUSY) {
- NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRETRANSFAIL);
+ NET_ADD_STATS(sock_net(sk), LINUX_MIB_TCPRETRANSFAIL, segs);
}
return err;
}
{
struct inet_connection_sock *icsk = inet_csk(sk);
u32 elapsed, start_ts;
+ s32 remaining;
start_ts = tcp_retransmit_stamp(sk);
if (!icsk->icsk_user_timeout || !start_ts)
return icsk->icsk_rto;
elapsed = tcp_time_stamp(tcp_sk(sk)) - start_ts;
- if (elapsed >= icsk->icsk_user_timeout)
+ remaining = icsk->icsk_user_timeout - elapsed;
+ if (remaining <= 0)
return 1; /* user timeout has passed; fire ASAP */
- else
- return min_t(u32, icsk->icsk_rto, msecs_to_jiffies(icsk->icsk_user_timeout - elapsed));
+
+ return min_t(u32, icsk->icsk_rto, msecs_to_jiffies(remaining));
}
/**
(boundary - linear_backoff_thresh) * TCP_RTO_MAX;
timeout = jiffies_to_msecs(timeout);
}
- return (tcp_time_stamp(tcp_sk(sk)) - start_ts) >= timeout;
+ return (s32)(tcp_time_stamp(tcp_sk(sk)) - start_ts - timeout) >= 0;
}
/* A write timeout has occurred. Process the after effects. */
return;
}
- if (icsk->icsk_probes_out > max_probes) {
+ if (icsk->icsk_probes_out >= max_probes) {
abort: tcp_write_err(sk);
} else {
/* Only send another probe if we didn't close things up. */
goto out_reset_timer;
}
+ __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPTIMEOUTS);
if (tcp_write_timeout(sk))
goto out;
if (icsk->icsk_retransmits == 0) {
- int mib_idx;
+ int mib_idx = 0;
if (icsk->icsk_ca_state == TCP_CA_Recovery) {
if (tcp_is_sack(tp))
mib_idx = LINUX_MIB_TCPSACKFAILURES;
else
mib_idx = LINUX_MIB_TCPRENOFAILURES;
- } else {
- mib_idx = LINUX_MIB_TCPTIMEOUTS;
}
- __NET_INC_STATS(sock_net(sk), mib_idx);
+ if (mib_idx)
+ __NET_INC_STATS(sock_net(sk), mib_idx);
}
tcp_enter_loss(sk);
bh_lock_sock(sk);
if (!sock_owned_by_user(sk)) {
- if (tp->compressed_ack)
+ if (tp->compressed_ack > TCP_FASTRETRANS_THRESH)
tcp_send_ack(sk);
} else {
if (!test_and_set_bit(TCP_DELACK_TIMER_DEFERRED,
static void addrconf_dad_work(struct work_struct *w);
static void addrconf_dad_completed(struct inet6_ifaddr *ifp, bool bump_id,
bool send_na);
-static void addrconf_dad_run(struct inet6_dev *idev);
+static void addrconf_dad_run(struct inet6_dev *idev, bool restart);
static void addrconf_rs_timer(struct timer_list *t);
static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifa);
static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifa);
void *ptr)
{
struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+ struct netdev_notifier_change_info *change_info;
struct netdev_notifier_changeupper_info *info;
struct inet6_dev *idev = __in6_dev_get(dev);
struct net *net = dev_net(dev);
break;
}
- if (idev) {
+ if (!IS_ERR_OR_NULL(idev)) {
if (idev->if_flags & IF_READY) {
/* device is already configured -
* but resend MLD reports, we might
* multicast snooping switches
*/
ipv6_mc_up(idev);
+ change_info = ptr;
+ if (change_info->flags_changed & IFF_NOARP)
+ addrconf_dad_run(idev, true);
rt6_sync_up(dev, RTNH_F_LINKDOWN);
break;
}
if (!IS_ERR_OR_NULL(idev)) {
if (run_pending)
- addrconf_dad_run(idev);
+ addrconf_dad_run(idev, false);
/* Device has an address by now */
rt6_sync_up(dev, RTNH_F_DEAD);
addrconf_verify_rtnl();
}
-static void addrconf_dad_run(struct inet6_dev *idev)
+static void addrconf_dad_run(struct inet6_dev *idev, bool restart)
{
struct inet6_ifaddr *ifp;
read_lock_bh(&idev->lock);
list_for_each_entry(ifp, &idev->addr_list, if_list) {
spin_lock(&ifp->lock);
- if (ifp->flags & IFA_F_TENTATIVE &&
- ifp->state == INET6_IFADDR_STATE_DAD)
+ if ((ifp->flags & IFA_F_TENTATIVE &&
+ ifp->state == INET6_IFADDR_STATE_DAD) || restart) {
+ if (restart)
+ ifp->state = INET6_IFADDR_STATE_PREDAD;
addrconf_dad_kick(ifp);
+ }
spin_unlock(&ifp->lock);
}
read_unlock_bh(&idev->lock);
err = ip6_flowlabel_init();
if (err)
goto ip6_flowlabel_fail;
+ err = ipv6_anycast_init();
+ if (err)
+ goto ipv6_anycast_fail;
err = addrconf_init();
if (err)
goto addrconf_fail;
ipv6_exthdrs_fail:
addrconf_cleanup();
addrconf_fail:
+ ipv6_anycast_cleanup();
+ipv6_anycast_fail:
ip6_flowlabel_cleanup();
ip6_flowlabel_fail:
ndisc_late_cleanup();
#include <net/checksum.h>
+#define IN6_ADDR_HSIZE_SHIFT 8
+#define IN6_ADDR_HSIZE BIT(IN6_ADDR_HSIZE_SHIFT)
+/* anycast address hash table
+ */
+static struct hlist_head inet6_acaddr_lst[IN6_ADDR_HSIZE];
+static DEFINE_SPINLOCK(acaddr_hash_lock);
+
static int ipv6_dev_ac_dec(struct net_device *dev, const struct in6_addr *addr);
+static u32 inet6_acaddr_hash(struct net *net, const struct in6_addr *addr)
+{
+ u32 val = ipv6_addr_hash(addr) ^ net_hash_mix(net);
+
+ return hash_32(val, IN6_ADDR_HSIZE_SHIFT);
+}
+
/*
* socket join an anycast group
*/
rtnl_unlock();
}
+static void ipv6_add_acaddr_hash(struct net *net, struct ifacaddr6 *aca)
+{
+ unsigned int hash = inet6_acaddr_hash(net, &aca->aca_addr);
+
+ spin_lock(&acaddr_hash_lock);
+ hlist_add_head_rcu(&aca->aca_addr_lst, &inet6_acaddr_lst[hash]);
+ spin_unlock(&acaddr_hash_lock);
+}
+
+static void ipv6_del_acaddr_hash(struct ifacaddr6 *aca)
+{
+ spin_lock(&acaddr_hash_lock);
+ hlist_del_init_rcu(&aca->aca_addr_lst);
+ spin_unlock(&acaddr_hash_lock);
+}
+
static void aca_get(struct ifacaddr6 *aca)
{
refcount_inc(&aca->aca_refcnt);
}
+static void aca_free_rcu(struct rcu_head *h)
+{
+ struct ifacaddr6 *aca = container_of(h, struct ifacaddr6, rcu);
+
+ fib6_info_release(aca->aca_rt);
+ kfree(aca);
+}
+
static void aca_put(struct ifacaddr6 *ac)
{
if (refcount_dec_and_test(&ac->aca_refcnt)) {
- fib6_info_release(ac->aca_rt);
- kfree(ac);
+ call_rcu(&ac->rcu, aca_free_rcu);
}
}
aca->aca_addr = *addr;
fib6_info_hold(f6i);
aca->aca_rt = f6i;
+ INIT_HLIST_NODE(&aca->aca_addr_lst);
aca->aca_users = 1;
/* aca_tstamp should be updated upon changes */
aca->aca_cstamp = aca->aca_tstamp = jiffies;
aca_get(aca);
write_unlock_bh(&idev->lock);
+ ipv6_add_acaddr_hash(net, aca);
+
ip6_ins_rt(net, f6i);
addrconf_join_solict(idev->dev, &aca->aca_addr);
else
idev->ac_list = aca->aca_next;
write_unlock_bh(&idev->lock);
+ ipv6_del_acaddr_hash(aca);
addrconf_leave_solict(idev, &aca->aca_addr);
ip6_del_rt(dev_net(idev->dev), aca->aca_rt);
idev->ac_list = aca->aca_next;
write_unlock_bh(&idev->lock);
+ ipv6_del_acaddr_hash(aca);
+
addrconf_leave_solict(idev, &aca->aca_addr);
ip6_del_rt(dev_net(idev->dev), aca->aca_rt);
bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
const struct in6_addr *addr)
{
+ unsigned int hash = inet6_acaddr_hash(net, addr);
+ struct net_device *nh_dev;
+ struct ifacaddr6 *aca;
bool found = false;
rcu_read_lock();
if (dev)
found = ipv6_chk_acast_dev(dev, addr);
else
- for_each_netdev_rcu(net, dev)
- if (ipv6_chk_acast_dev(dev, addr)) {
+ hlist_for_each_entry_rcu(aca, &inet6_acaddr_lst[hash],
+ aca_addr_lst) {
+ nh_dev = fib6_info_nh_dev(aca->aca_rt);
+ if (!nh_dev || !net_eq(dev_net(nh_dev), net))
+ continue;
+ if (ipv6_addr_equal(&aca->aca_addr, addr)) {
found = true;
break;
}
+ }
rcu_read_unlock();
return found;
}
remove_proc_entry("anycast6", net->proc_net);
}
#endif
+
+/* Init / cleanup code
+ */
+int __init ipv6_anycast_init(void)
+{
+ int i;
+
+ for (i = 0; i < IN6_ADDR_HSIZE; i++)
+ INIT_HLIST_HEAD(&inet6_acaddr_lst[i]);
+ return 0;
+}
+
+void ipv6_anycast_cleanup(void)
+{
+ int i;
+
+ spin_lock(&acaddr_hash_lock);
+ for (i = 0; i < IN6_ADDR_HSIZE; i++)
+ WARN_ON(!hlist_empty(&inet6_acaddr_lst[i]));
+ spin_unlock(&acaddr_hash_lock);
+}
/* fib entries are never clones */
if (arg.filter.flags & RTM_F_CLONED)
- return skb->len;
+ goto out;
w = (void *)cb->args[2];
if (!w) {
tb = fib6_get_table(net, arg.filter.table_id);
if (!tb) {
if (arg.filter.dump_all_families)
- return skb->len;
+ goto out;
NL_SET_ERR_MSG_MOD(cb->extack, "FIB table does not exist");
return -ENOENT;
list_for_each_entry_safe(skb, next, head, list) {
struct dst_entry *dst;
- list_del(&skb->list);
+ skb_list_del_init(skb);
/* if ingress device is enslaved to an L3 master device pass the
* skb to its handler for processing
*/
struct net_device *dev = skb->dev;
struct net *net = dev_net(dev);
- list_del(&skb->list);
+ skb_list_del_init(skb);
skb = ip6_rcv_core(skb, dev, net);
if (skb == NULL)
continue;
const struct ipv6_pinfo *np = inet6_sk(sk);
struct in6_addr *first_hop = &fl6->daddr;
struct dst_entry *dst = skb_dst(skb);
+ unsigned int head_room;
struct ipv6hdr *hdr;
u8 proto = fl6->flowi6_proto;
int seg_len = skb->len;
int hlimit = -1;
u32 mtu;
- if (opt) {
- unsigned int head_room;
+ head_room = sizeof(struct ipv6hdr) + LL_RESERVED_SPACE(dst->dev);
+ if (opt)
+ head_room += opt->opt_nflen + opt->opt_flen;
- /* First: exthdrs may take lots of space (~8K for now)
- MAX_HEADER is not enough.
- */
- head_room = opt->opt_nflen + opt->opt_flen;
- seg_len += head_room;
- head_room += sizeof(struct ipv6hdr) + LL_RESERVED_SPACE(dst->dev);
-
- if (skb_headroom(skb) < head_room) {
- struct sk_buff *skb2 = skb_realloc_headroom(skb, head_room);
- if (!skb2) {
- IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
- IPSTATS_MIB_OUTDISCARDS);
- kfree_skb(skb);
- return -ENOBUFS;
- }
- if (skb->sk)
- skb_set_owner_w(skb2, skb->sk);
- consume_skb(skb);
- skb = skb2;
+ if (unlikely(skb_headroom(skb) < head_room)) {
+ struct sk_buff *skb2 = skb_realloc_headroom(skb, head_room);
+ if (!skb2) {
+ IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
+ IPSTATS_MIB_OUTDISCARDS);
+ kfree_skb(skb);
+ return -ENOBUFS;
}
+ if (skb->sk)
+ skb_set_owner_w(skb2, skb->sk);
+ consume_skb(skb);
+ skb = skb2;
+ }
+
+ if (opt) {
+ seg_len += opt->opt_nflen + opt->opt_flen;
+
if (opt->opt_flen)
ipv6_push_frag_opts(skb, opt, &proto);
+
if (opt->opt_nflen)
ipv6_push_nfrag_opts(skb, opt, &proto, &first_hop,
&fl6->saddr);
unsigned int fraglen;
unsigned int fraggap;
unsigned int alloclen;
- unsigned int pagedlen = 0;
+ unsigned int pagedlen;
alloc_new_skb:
/* There's no room in the current skb */
if (skb)
if (datalen > (cork->length <= mtu && !(cork->flags & IPCORK_ALLFRAG) ? mtu : maxfraglen) - fragheaderlen)
datalen = maxfraglen - fragheaderlen - rt->dst.trailer_len;
fraglen = datalen + fragheaderlen;
+ pagedlen = 0;
if ((flags & MSG_MORE) &&
!(rt->dst.dev->features&NETIF_F_SG))
unsigned int hh_len;
struct dst_entry *dst;
struct flowi6 fl6 = {
- .flowi6_oif = sk ? sk->sk_bound_dev_if : 0,
+ .flowi6_oif = sk && sk->sk_bound_dev_if ? sk->sk_bound_dev_if :
+ rt6_need_strict(&iph->daddr) ? skb_dst(skb)->dev->ifindex : 0,
.flowi6_mark = skb->mark,
.flowi6_uid = sock_net_uid(net, sk),
.daddr = iph->daddr,
int err;
err = xt_register_target(&masquerade_tg6_reg);
- if (err == 0)
- nf_nat_masquerade_ipv6_register_notifier();
+ if (err)
+ return err;
+
+ err = nf_nat_masquerade_ipv6_register_notifier();
+ if (err)
+ xt_unregister_target(&masquerade_tg6_reg);
return err;
}
nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev, struct net_device *dev)
{
struct sk_buff *fp, *head = fq->q.fragments;
- int payload_len;
+ int payload_len, delta;
u8 ecn;
inet_frag_kill(&fq->q);
return false;
}
+ delta = - head->truesize;
+
/* Head of list must not be cloned. */
if (skb_unclone(head, GFP_ATOMIC))
return false;
+ delta += head->truesize;
+ if (delta)
+ add_frag_mem_limit(fq->q.net, delta);
+
/* If the first fragment is fragmented itself, we split
* it to two chunks: the first with data and paged part
* and the second, holding only fragments. */
*/
ret = -EINPROGRESS;
if (fq->q.flags == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
- fq->q.meat == fq->q.len &&
- nf_ct_frag6_reasm(fq, skb, dev))
- ret = 0;
- else
+ fq->q.meat == fq->q.len) {
+ unsigned long orefdst = skb->_skb_refdst;
+
+ skb->_skb_refdst = 0UL;
+ if (nf_ct_frag6_reasm(fq, skb, dev))
+ ret = 0;
+ skb->_skb_refdst = orefdst;
+ } else {
skb_dst_drop(skb);
+ }
out_unlock:
spin_unlock_bh(&fq->q.lock);
* of ipv6 addresses being deleted), we also need to add an upper
* limit to the number of queued work items.
*/
-static int masq_inet_event(struct notifier_block *this,
- unsigned long event, void *ptr)
+static int masq_inet6_event(struct notifier_block *this,
+ unsigned long event, void *ptr)
{
struct inet6_ifaddr *ifa = ptr;
const struct net_device *dev;
return NOTIFY_DONE;
}
-static struct notifier_block masq_inet_notifier = {
- .notifier_call = masq_inet_event,
+static struct notifier_block masq_inet6_notifier = {
+ .notifier_call = masq_inet6_event,
};
-static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0);
+static int masq_refcnt;
+static DEFINE_MUTEX(masq_mutex);
-void nf_nat_masquerade_ipv6_register_notifier(void)
+int nf_nat_masquerade_ipv6_register_notifier(void)
{
+ int ret = 0;
+
+ mutex_lock(&masq_mutex);
/* check if the notifier is already set */
- if (atomic_inc_return(&masquerade_notifier_refcount) > 1)
- return;
+ if (++masq_refcnt > 1)
+ goto out_unlock;
+
+ ret = register_netdevice_notifier(&masq_dev_notifier);
+ if (ret)
+ goto err_dec;
+
+ ret = register_inet6addr_notifier(&masq_inet6_notifier);
+ if (ret)
+ goto err_unregister;
- register_netdevice_notifier(&masq_dev_notifier);
- register_inet6addr_notifier(&masq_inet_notifier);
+ mutex_unlock(&masq_mutex);
+ return ret;
+
+err_unregister:
+ unregister_netdevice_notifier(&masq_dev_notifier);
+err_dec:
+ masq_refcnt--;
+out_unlock:
+ mutex_unlock(&masq_mutex);
+ return ret;
}
EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv6_register_notifier);
void nf_nat_masquerade_ipv6_unregister_notifier(void)
{
+ mutex_lock(&masq_mutex);
/* check if the notifier still has clients */
- if (atomic_dec_return(&masquerade_notifier_refcount) > 0)
- return;
+ if (--masq_refcnt > 0)
+ goto out_unlock;
- unregister_inet6addr_notifier(&masq_inet_notifier);
+ unregister_inet6addr_notifier(&masq_inet6_notifier);
unregister_netdevice_notifier(&masq_dev_notifier);
+out_unlock:
+ mutex_unlock(&masq_mutex);
}
EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv6_unregister_notifier);
if (ret < 0)
return ret;
- nf_nat_masquerade_ipv6_register_notifier();
+ ret = nf_nat_masquerade_ipv6_register_notifier();
+ if (ret)
+ nft_unregister_expr(&nft_masq_ipv6_type);
return ret;
}
{
struct net *net = container_of(fq->q.net, struct net, ipv6.frags);
struct sk_buff *fp, *head = fq->q.fragments;
- int payload_len;
+ int payload_len, delta;
unsigned int nhoff;
int sum_truesize;
u8 ecn;
if (payload_len > IPV6_MAXPLEN)
goto out_oversize;
+ delta = - head->truesize;
+
/* Head of list must not be cloned. */
if (skb_unclone(head, GFP_ATOMIC))
goto out_oom;
+ delta += head->truesize;
+ if (delta)
+ add_frag_mem_limit(fq->q.net, delta);
+
/* If the first fragment is fragmented itself, we split
* it to two chunks: the first with data and paged part
* and the second, holding only fragments. */
if (rt) {
rcu_read_lock();
if (rt->rt6i_flags & RTF_CACHE) {
- if (dst_hold_safe(&rt->dst))
- rt6_remove_exception_rt(rt);
+ rt6_remove_exception_rt(rt);
} else {
struct fib6_info *from;
struct fib6_node *fn;
void ip6_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, __be32 mtu)
{
+ int oif = sk->sk_bound_dev_if;
struct dst_entry *dst;
- ip6_update_pmtu(skb, sock_net(sk), mtu,
- sk->sk_bound_dev_if, sk->sk_mark, sk->sk_uid);
+ if (!oif && skb->dev)
+ oif = l3mdev_master_ifindex(skb->dev);
+
+ ip6_update_pmtu(skb, sock_net(sk), mtu, oif, sk->sk_mark, sk->sk_uid);
dst = __sk_dst_get(sk);
if (!dst || !dst->obsolete ||
if (cfg->fc_flags & RTF_GATEWAY &&
!ipv6_addr_equal(&cfg->fc_gateway, &rt->rt6i_gateway))
goto out;
- if (dst_hold_safe(&rt->dst))
- rc = rt6_remove_exception_rt(rt);
+
+ rc = rt6_remove_exception_rt(rt);
out:
return rc;
}
struct ipv6hdr *hdr = ipv6_hdr(skb);
struct flowi6 fl6;
+ memset(&fl6, 0, sizeof(fl6));
fl6.daddr = hdr->daddr;
fl6.saddr = hdr->saddr;
fl6.flowlabel = ip6_flowinfo(hdr);
goto err_sock;
}
- sk = sock->sk;
-
- sock_hold(sk);
- tunnel->sock = sk;
tunnel->l2tp_net = net;
-
pn = l2tp_pernet(net);
spin_lock_bh(&pn->l2tp_tunnel_list_lock);
list_add_rcu(&tunnel->list, &pn->l2tp_tunnel_list);
spin_unlock_bh(&pn->l2tp_tunnel_list_lock);
+ sk = sock->sk;
+ sock_hold(sk);
+ tunnel->sock = sk;
+
if (tunnel->encap == L2TP_ENCAPTYPE_UDP) {
struct udp_tunnel_sock_cfg udp_cfg = {
.sk_user_data = tunnel,
len = beacon->head_len + beacon->tail_len + beacon->beacon_ies_len +
beacon->proberesp_ies_len + beacon->assocresp_ies_len +
- beacon->probe_resp_len;
+ beacon->probe_resp_len + beacon->lci_len + beacon->civicloc_len;
new_beacon = kzalloc(sizeof(*new_beacon) + len, GFP_KERNEL);
if (!new_beacon)
memcpy(pos, beacon->probe_resp, beacon->probe_resp_len);
pos += beacon->probe_resp_len;
}
- if (beacon->ftm_responder)
- new_beacon->ftm_responder = beacon->ftm_responder;
+
+ /* might copy -1, meaning no changes requested */
+ new_beacon->ftm_responder = beacon->ftm_responder;
if (beacon->lci) {
new_beacon->lci_len = beacon->lci_len;
new_beacon->lci = pos;
if (local->open_count == 0)
ieee80211_clear_tx_pending(local);
+ sdata->vif.bss_conf.beacon_int = 0;
+
/*
* If the interface goes down while suspended, presumably because
* the device was unplugged and that happens before our resume,
{
struct ieee80211_if_managed *ifmgd = &sdata->u.mgd;
struct sta_info *sta;
+ bool result = true;
sdata_info(sdata, "authenticated\n");
ifmgd->auth_data->done = true;
sta = sta_info_get(sdata, bssid);
if (!sta) {
WARN_ONCE(1, "%s: STA %pM not found", sdata->name, bssid);
- return false;
+ result = false;
+ goto out;
}
if (sta_info_move_state(sta, IEEE80211_STA_AUTH)) {
sdata_info(sdata, "failed moving %pM to auth\n", bssid);
- return false;
+ result = false;
+ goto out;
}
- mutex_unlock(&sdata->local->sta_mtx);
- return true;
+out:
+ mutex_unlock(&sdata->local->sta_mtx);
+ return result;
}
static void ieee80211_rx_mgmt_auth(struct ieee80211_sub_if_data *sdata,
return RX_CONTINUE;
if (ieee80211_is_ctl(hdr->frame_control) ||
+ ieee80211_is_nullfunc(hdr->frame_control) ||
ieee80211_is_qos_nullfunc(hdr->frame_control) ||
is_multicast_ether_addr(hdr->addr1))
return RX_CONTINUE;
cfg80211_sta_opmode_change_notify(sdata->dev,
rx->sta->addr,
&sta_opmode,
- GFP_KERNEL);
+ GFP_ATOMIC);
goto handled;
}
case WLAN_HT_ACTION_NOTIFY_CHANWIDTH: {
cfg80211_sta_opmode_change_notify(sdata->dev,
rx->sta->addr,
&sta_opmode,
- GFP_KERNEL);
+ GFP_ATOMIC);
goto handled;
}
default:
/* Track when last TDLS packet was ACKed */
if (test_sta_flag(sta, WLAN_STA_TDLS_PEER_AUTH))
sta->status_stats.last_tdls_pkt_time = jiffies;
+ } else if (test_sta_flag(sta, WLAN_STA_PS_STA)) {
+ return;
} else {
ieee80211_lost_packet(sta, info);
}
if (ieee80211_hw_check(&tx->local->hw, QUEUE_CONTROL))
info->hw_queue = tx->sdata->vif.cab_queue;
- /* no stations in PS mode */
- if (!atomic_read(&ps->num_sta_ps))
+ /* no stations in PS mode and no buffered packets */
+ if (!atomic_read(&ps->num_sta_ps) && skb_queue_empty(&ps->bc_buf))
return TX_CONTINUE;
info->flags |= IEEE80211_TX_CTL_SEND_AFTER_DTIM;
MODULE_DESCRIPTION("core IP set support");
MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_IPSET);
-/* When the nfnl mutex is held: */
+/* When the nfnl mutex or ip_set_ref_lock is held: */
#define ip_set_dereference(p) \
- rcu_dereference_protected(p, lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))
+ rcu_dereference_protected(p, \
+ lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET) || \
+ lockdep_is_held(&ip_set_ref_lock))
#define ip_set(inst, id) \
ip_set_dereference((inst)->ip_set_list)[id]
+#define ip_set_ref_netlink(inst,id) \
+ rcu_dereference_raw((inst)->ip_set_list)[id]
/* The set types are implemented in modules and registered set types
* can be found in ip_set_type_list. Adding/deleting types is
EXPORT_SYMBOL_GPL(ip_set_put_byindex);
/* Get the name of a set behind a set index.
- * We assume the set is referenced, so it does exist and
- * can't be destroyed. The set cannot be renamed due to
- * the referencing either.
- *
+ * Set itself is protected by RCU, but its name isn't: to protect against
+ * renaming, grab ip_set_ref_lock as reader (see ip_set_rename()) and copy the
+ * name.
*/
-const char *
-ip_set_name_byindex(struct net *net, ip_set_id_t index)
+void
+ip_set_name_byindex(struct net *net, ip_set_id_t index, char *name)
{
- const struct ip_set *set = ip_set_rcu_get(net, index);
+ struct ip_set *set = ip_set_rcu_get(net, index);
BUG_ON(!set);
- BUG_ON(set->ref == 0);
- /* Referenced, so it's safe */
- return set->name;
+ read_lock_bh(&ip_set_ref_lock);
+ strncpy(name, set->name, IPSET_MAXNAMELEN);
+ read_unlock_bh(&ip_set_ref_lock);
}
EXPORT_SYMBOL_GPL(ip_set_name_byindex);
/* Wraparound */
goto cleanup;
- list = kcalloc(i, sizeof(struct ip_set *), GFP_KERNEL);
+ list = kvcalloc(i, sizeof(struct ip_set *), GFP_KERNEL);
if (!list)
goto cleanup;
/* nfnl mutex is held, both lists are valid */
/* Use new list */
index = inst->ip_set_max;
inst->ip_set_max = i;
- kfree(tmp);
+ kvfree(tmp);
ret = 0;
} else if (ret) {
goto cleanup;
if (!set)
return -ENOENT;
- read_lock_bh(&ip_set_ref_lock);
+ write_lock_bh(&ip_set_ref_lock);
if (set->ref != 0) {
ret = -IPSET_ERR_REFERENCED;
goto out;
strncpy(set->name, name2, IPSET_MAXNAMELEN);
out:
- read_unlock_bh(&ip_set_ref_lock);
+ write_unlock_bh(&ip_set_ref_lock);
return ret;
}
struct ip_set_net *inst =
(struct ip_set_net *)cb->args[IPSET_CB_NET];
ip_set_id_t index = (ip_set_id_t)cb->args[IPSET_CB_INDEX];
- struct ip_set *set = ip_set(inst, index);
+ struct ip_set *set = ip_set_ref_netlink(inst, index);
if (set->variant->uref)
set->variant->uref(set, cb, false);
release_refcount:
/* If there was an error or set is done, release set */
if (ret || !cb->args[IPSET_CB_ARG0]) {
- set = ip_set(inst, index);
+ set = ip_set_ref_netlink(inst, index);
if (set->variant->uref)
set->variant->uref(set, cb, false);
pr_debug("release set %s\n", set->name);
if (inst->ip_set_max >= IPSET_INVALID_ID)
inst->ip_set_max = IPSET_INVALID_ID - 1;
- list = kcalloc(inst->ip_set_max, sizeof(struct ip_set *), GFP_KERNEL);
+ list = kvcalloc(inst->ip_set_max, sizeof(struct ip_set *), GFP_KERNEL);
if (!list)
return -ENOMEM;
inst->is_deleted = false;
}
}
nfnl_unlock(NFNL_SUBSYS_IPSET);
- kfree(rcu_dereference_protected(inst->ip_set_list, 1));
+ kvfree(rcu_dereference_protected(inst->ip_set_list, 1));
}
static struct pernet_operations ip_set_net_ops = {
if (tb[IPSET_ATTR_CIDR]) {
e.cidr[0] = nla_get_u8(tb[IPSET_ATTR_CIDR]);
- if (!e.cidr[0] || e.cidr[0] > HOST_MASK)
+ if (e.cidr[0] > HOST_MASK)
return -IPSET_ERR_INVALID_CIDR;
}
if (tb[IPSET_ATTR_CIDR2]) {
e.cidr[1] = nla_get_u8(tb[IPSET_ATTR_CIDR2]);
- if (!e.cidr[1] || e.cidr[1] > HOST_MASK)
+ if (e.cidr[1] > HOST_MASK)
return -IPSET_ERR_INVALID_CIDR;
}
if (tb[IPSET_ATTR_CIDR]) {
e.cidr[0] = nla_get_u8(tb[IPSET_ATTR_CIDR]);
- if (!e.cidr[0] || e.cidr[0] > HOST_MASK)
+ if (e.cidr[0] > HOST_MASK)
return -IPSET_ERR_INVALID_CIDR;
}
if (tb[IPSET_ATTR_CIDR2]) {
e.cidr[1] = nla_get_u8(tb[IPSET_ATTR_CIDR2]);
- if (!e.cidr[1] || e.cidr[1] > HOST_MASK)
+ if (e.cidr[1] > HOST_MASK)
return -IPSET_ERR_INVALID_CIDR;
}
{
struct set_elem *e = container_of(rcu, struct set_elem, rcu);
struct ip_set *set = e->set;
- struct list_set *map = set->data;
- ip_set_put_byindex(map->net, e->id);
ip_set_ext_destroy(set, e);
kfree(e);
}
static inline void
list_set_del(struct ip_set *set, struct set_elem *e)
{
+ struct list_set *map = set->data;
+
set->elements--;
list_del_rcu(&e->list);
+ ip_set_put_byindex(map->net, e->id);
call_rcu(&e->rcu, __list_set_del_rcu);
}
static inline void
-list_set_replace(struct set_elem *e, struct set_elem *old)
+list_set_replace(struct ip_set *set, struct set_elem *e, struct set_elem *old)
{
+ struct list_set *map = set->data;
+
list_replace_rcu(&old->list, &e->list);
+ ip_set_put_byindex(map->net, old->id);
call_rcu(&old->rcu, __list_set_del_rcu);
}
INIT_LIST_HEAD(&e->list);
list_set_init_extensions(set, ext, e);
if (n)
- list_set_replace(e, n);
+ list_set_replace(set, e, n);
else if (next)
list_add_tail_rcu(&e->list, &next->list);
else if (prev)
const struct list_set *map = set->data;
struct nlattr *atd, *nested;
u32 i = 0, first = cb->args[IPSET_CB_ARG0];
+ char name[IPSET_MAXNAMELEN];
struct set_elem *e;
int ret = 0;
nested = ipset_nest_start(skb, IPSET_ATTR_DATA);
if (!nested)
goto nla_put_failure;
- if (nla_put_string(skb, IPSET_ATTR_NAME,
- ip_set_name_byindex(map->net, e->id)))
+ ip_set_name_byindex(map->net, e->id, name);
+ if (nla_put_string(skb, IPSET_ATTR_NAME, name))
goto nla_put_failure;
if (ip_set_put_extensions(skb, set, e, true))
goto nla_put_failure;
static struct notifier_block ip_vs_dst_notifier = {
.notifier_call = ip_vs_dst_event,
+#ifdef CONFIG_IP_VS_IPV6
+ .priority = ADDRCONF_NOTIFY_PRIORITY + 5,
+#endif
};
int __net_init ip_vs_control_net_init(struct netns_ipvs *ipvs)
EnterFunction(7);
/* Receive a packet */
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &iov, 1, buflen);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, buflen);
len = sock_recvmsg(sock, &msg, MSG_DONTWAIT);
if (len < 0)
return len;
struct nf_conntrack_zone zone;
int cpu;
u32 jiffies32;
+ bool dead;
struct rcu_head rcu_head;
};
conn->zone = *zone;
conn->cpu = raw_smp_processor_id();
conn->jiffies32 = (u32)jiffies;
- spin_lock(&list->list_lock);
+ conn->dead = false;
+ spin_lock_bh(&list->list_lock);
if (list->dead == true) {
kmem_cache_free(conncount_conn_cachep, conn);
- spin_unlock(&list->list_lock);
+ spin_unlock_bh(&list->list_lock);
return NF_CONNCOUNT_SKIP;
}
list_add_tail(&conn->node, &list->head);
list->count++;
- spin_unlock(&list->list_lock);
+ spin_unlock_bh(&list->list_lock);
return NF_CONNCOUNT_ADDED;
}
EXPORT_SYMBOL_GPL(nf_conncount_add);
{
bool free_entry = false;
- spin_lock(&list->list_lock);
+ spin_lock_bh(&list->list_lock);
- if (list->count == 0) {
- spin_unlock(&list->list_lock);
- return free_entry;
+ if (conn->dead) {
+ spin_unlock_bh(&list->list_lock);
+ return free_entry;
}
list->count--;
+ conn->dead = true;
list_del_rcu(&conn->node);
- if (list->count == 0)
+ if (list->count == 0) {
+ list->dead = true;
free_entry = true;
+ }
- spin_unlock(&list->list_lock);
+ spin_unlock_bh(&list->list_lock);
call_rcu(&conn->rcu_head, __conn_free);
return free_entry;
}
{
spin_lock_init(&list->list_lock);
INIT_LIST_HEAD(&list->head);
- list->count = 1;
+ list->count = 0;
list->dead = false;
}
EXPORT_SYMBOL_GPL(nf_conncount_list_init);
struct nf_conn *found_ct;
unsigned int collected = 0;
bool free_entry = false;
+ bool ret = false;
list_for_each_entry_safe(conn, conn_n, &list->head, node) {
found = find_or_evict(net, list, conn, &free_entry);
if (collected > CONNCOUNT_GC_MAX_NODES)
return false;
}
- return false;
+
+ spin_lock_bh(&list->list_lock);
+ if (!list->count) {
+ list->dead = true;
+ ret = true;
+ }
+ spin_unlock_bh(&list->list_lock);
+
+ return ret;
}
EXPORT_SYMBOL_GPL(nf_conncount_gc_list);
while (gc_count) {
rbconn = gc_nodes[--gc_count];
spin_lock(&rbconn->list.list_lock);
- if (rbconn->list.count == 0 && rbconn->list.dead == false) {
- rbconn->list.dead = true;
- rb_erase(&rbconn->node, root);
- call_rcu(&rbconn->rcu_head, __tree_nodes_free);
- }
+ rb_erase(&rbconn->node, root);
+ call_rcu(&rbconn->rcu_head, __tree_nodes_free);
spin_unlock(&rbconn->list.list_lock);
}
}
nf_conncount_list_init(&rbconn->list);
list_add(&conn->node, &rbconn->list.head);
count = 1;
+ rbconn->list.count = count;
rb_link_node(&rbconn->node, parent, rbnode);
rb_insert_color(&rbconn->node, root);
return drops;
}
-static noinline int early_drop(struct net *net, unsigned int _hash)
+static noinline int early_drop(struct net *net, unsigned int hash)
{
- unsigned int i;
+ unsigned int i, bucket;
for (i = 0; i < NF_CT_EVICTION_RANGE; i++) {
struct hlist_nulls_head *ct_hash;
- unsigned int hash, hsize, drops;
+ unsigned int hsize, drops;
rcu_read_lock();
nf_conntrack_get_ht(&ct_hash, &hsize);
- hash = reciprocal_scale(_hash++, hsize);
+ if (!i)
+ bucket = reciprocal_scale(hash, hsize);
+ else
+ bucket = (bucket + 1) % hsize;
- drops = early_drop_list(net, &ct_hash[hash]);
+ drops = early_drop_list(net, &ct_hash[bucket]);
rcu_read_unlock();
if (drops) {
},
};
-static inline struct nf_dccp_net *dccp_pernet(struct net *net)
-{
- return &net->ct.nf_ct_proto.dccp;
-}
-
static noinline bool
dccp_new(struct nf_conn *ct, const struct sk_buff *skb,
const struct dccp_hdr *dh)
state = dccp_state_table[CT_DCCP_ROLE_CLIENT][dh->dccph_type][CT_DCCP_NONE];
switch (state) {
default:
- dn = dccp_pernet(net);
+ dn = nf_dccp_pernet(net);
if (dn->dccp_loose == 0) {
msg = "not picking up existing connection ";
goto out_invalid;
timeouts = nf_ct_timeout_lookup(ct);
if (!timeouts)
- timeouts = dccp_pernet(nf_ct_net(ct))->dccp_timeout;
+ timeouts = nf_dccp_pernet(nf_ct_net(ct))->dccp_timeout;
nf_ct_refresh_acct(ct, ctinfo, skb, timeouts[new_state]);
return NF_ACCEPT;
static int dccp_timeout_nlattr_to_obj(struct nlattr *tb[],
struct net *net, void *data)
{
- struct nf_dccp_net *dn = dccp_pernet(net);
+ struct nf_dccp_net *dn = nf_dccp_pernet(net);
unsigned int *timeouts = data;
int i;
static int dccp_init_net(struct net *net)
{
- struct nf_dccp_net *dn = dccp_pernet(net);
+ struct nf_dccp_net *dn = nf_dccp_pernet(net);
struct nf_proto_net *pn = &dn->pn;
if (!pn->users) {
}
}
-static inline struct nf_generic_net *generic_pernet(struct net *net)
-{
- return &net->ct.nf_ct_proto.generic;
-}
-
static bool generic_pkt_to_tuple(const struct sk_buff *skb,
unsigned int dataoff,
struct net *net, struct nf_conntrack_tuple *tuple)
}
if (!timeout)
- timeout = &generic_pernet(nf_ct_net(ct))->timeout;
+ timeout = &nf_generic_pernet(nf_ct_net(ct))->timeout;
nf_ct_refresh_acct(ct, ctinfo, skb, *timeout);
return NF_ACCEPT;
static int generic_timeout_nlattr_to_obj(struct nlattr *tb[],
struct net *net, void *data)
{
- struct nf_generic_net *gn = generic_pernet(net);
+ struct nf_generic_net *gn = nf_generic_pernet(net);
unsigned int *timeout = data;
if (!timeout)
static int generic_init_net(struct net *net)
{
- struct nf_generic_net *gn = generic_pernet(net);
+ struct nf_generic_net *gn = nf_generic_pernet(net);
struct nf_proto_net *pn = &gn->pn;
gn->timeout = nf_ct_generic_timeout;
#include <linux/netfilter/nf_conntrack_proto_gre.h>
#include <linux/netfilter/nf_conntrack_pptp.h>
-enum grep_conntrack {
- GRE_CT_UNREPLIED,
- GRE_CT_REPLIED,
- GRE_CT_MAX
-};
-
static const unsigned int gre_timeouts[GRE_CT_MAX] = {
[GRE_CT_UNREPLIED] = 30*HZ,
[GRE_CT_REPLIED] = 180*HZ,
};
static unsigned int proto_gre_net_id __read_mostly;
-struct netns_proto_gre {
- struct nf_proto_net nf;
- rwlock_t keymap_lock;
- struct list_head keymap_list;
- unsigned int gre_timeouts[GRE_CT_MAX];
-};
static inline struct netns_proto_gre *gre_pernet(struct net *net)
{
{
int ret;
+ BUILD_BUG_ON(offsetof(struct netns_proto_gre, nf) != 0);
+
ret = register_pernet_subsys(&proto_gre_net_ops);
if (ret < 0)
goto out_pernet;
static const unsigned int nf_ct_icmp_timeout = 30*HZ;
-static inline struct nf_icmp_net *icmp_pernet(struct net *net)
-{
- return &net->ct.nf_ct_proto.icmp;
-}
-
static bool icmp_pkt_to_tuple(const struct sk_buff *skb, unsigned int dataoff,
struct net *net, struct nf_conntrack_tuple *tuple)
{
}
if (!timeout)
- timeout = &icmp_pernet(nf_ct_net(ct))->timeout;
+ timeout = &nf_icmp_pernet(nf_ct_net(ct))->timeout;
nf_ct_refresh_acct(ct, ctinfo, skb, *timeout);
return NF_ACCEPT;
struct net *net, void *data)
{
unsigned int *timeout = data;
- struct nf_icmp_net *in = icmp_pernet(net);
+ struct nf_icmp_net *in = nf_icmp_pernet(net);
if (tb[CTA_TIMEOUT_ICMP_TIMEOUT]) {
if (!timeout)
static int icmp_init_net(struct net *net)
{
- struct nf_icmp_net *in = icmp_pernet(net);
+ struct nf_icmp_net *in = nf_icmp_pernet(net);
struct nf_proto_net *pn = &in->pn;
in->timeout = nf_ct_icmp_timeout;
static const unsigned int nf_ct_icmpv6_timeout = 30*HZ;
-static inline struct nf_icmp_net *icmpv6_pernet(struct net *net)
-{
- return &net->ct.nf_ct_proto.icmpv6;
-}
-
static bool icmpv6_pkt_to_tuple(const struct sk_buff *skb,
unsigned int dataoff,
struct net *net,
static unsigned int *icmpv6_get_timeouts(struct net *net)
{
- return &icmpv6_pernet(net)->timeout;
+ return &nf_icmpv6_pernet(net)->timeout;
}
/* Returns verdict for packet, or -1 for invalid. */
struct net *net, void *data)
{
unsigned int *timeout = data;
- struct nf_icmp_net *in = icmpv6_pernet(net);
+ struct nf_icmp_net *in = nf_icmpv6_pernet(net);
if (!timeout)
timeout = icmpv6_get_timeouts(net);
static int icmpv6_init_net(struct net *net)
{
- struct nf_icmp_net *in = icmpv6_pernet(net);
+ struct nf_icmp_net *in = nf_icmpv6_pernet(net);
struct nf_proto_net *pn = &in->pn;
in->timeout = nf_ct_icmpv6_timeout;
}
};
-static inline struct nf_sctp_net *sctp_pernet(struct net *net)
-{
- return &net->ct.nf_ct_proto.sctp;
-}
-
#ifdef CONFIG_NF_CONNTRACK_PROCFS
/* Print out the private part of the conntrack. */
static void sctp_print_conntrack(struct seq_file *s, struct nf_conn *ct)
timeouts = nf_ct_timeout_lookup(ct);
if (!timeouts)
- timeouts = sctp_pernet(nf_ct_net(ct))->timeouts;
+ timeouts = nf_sctp_pernet(nf_ct_net(ct))->timeouts;
nf_ct_refresh_acct(ct, ctinfo, skb, timeouts[new_state]);
struct net *net, void *data)
{
unsigned int *timeouts = data;
- struct nf_sctp_net *sn = sctp_pernet(net);
+ struct nf_sctp_net *sn = nf_sctp_pernet(net);
int i;
/* set default SCTP timeouts. */
static int sctp_init_net(struct net *net)
{
- struct nf_sctp_net *sn = sctp_pernet(net);
+ struct nf_sctp_net *sn = nf_sctp_pernet(net);
struct nf_proto_net *pn = &sn->pn;
if (!pn->users) {
}
};
-static inline struct nf_tcp_net *tcp_pernet(struct net *net)
-{
- return &net->ct.nf_ct_proto.tcp;
-}
-
#ifdef CONFIG_NF_CONNTRACK_PROCFS
/* Print out the private part of the conntrack. */
static void tcp_print_conntrack(struct seq_file *s, struct nf_conn *ct)
const struct tcphdr *tcph)
{
struct net *net = nf_ct_net(ct);
- struct nf_tcp_net *tn = tcp_pernet(net);
+ struct nf_tcp_net *tn = nf_tcp_pernet(net);
struct ip_ct_tcp_state *sender = &state->seen[dir];
struct ip_ct_tcp_state *receiver = &state->seen[!dir];
const struct nf_conntrack_tuple *tuple = &ct->tuplehash[dir].tuple;
{
enum tcp_conntrack new_state;
struct net *net = nf_ct_net(ct);
- const struct nf_tcp_net *tn = tcp_pernet(net);
+ const struct nf_tcp_net *tn = nf_tcp_pernet(net);
const struct ip_ct_tcp_state *sender = &ct->proto.tcp.seen[0];
const struct ip_ct_tcp_state *receiver = &ct->proto.tcp.seen[1];
const struct nf_hook_state *state)
{
struct net *net = nf_ct_net(ct);
- struct nf_tcp_net *tn = tcp_pernet(net);
+ struct nf_tcp_net *tn = nf_tcp_pernet(net);
struct nf_conntrack_tuple *tuple;
enum tcp_conntrack new_state, old_state;
unsigned int index, *timeouts;
static int tcp_timeout_nlattr_to_obj(struct nlattr *tb[],
struct net *net, void *data)
{
- struct nf_tcp_net *tn = tcp_pernet(net);
+ struct nf_tcp_net *tn = nf_tcp_pernet(net);
unsigned int *timeouts = data;
int i;
static int tcp_init_net(struct net *net)
{
- struct nf_tcp_net *tn = tcp_pernet(net);
+ struct nf_tcp_net *tn = nf_tcp_pernet(net);
struct nf_proto_net *pn = &tn->pn;
if (!pn->users) {
[UDP_CT_REPLIED] = 180*HZ,
};
-static inline struct nf_udp_net *udp_pernet(struct net *net)
-{
- return &net->ct.nf_ct_proto.udp;
-}
-
static unsigned int *udp_get_timeouts(struct net *net)
{
- return udp_pernet(net)->timeouts;
+ return nf_udp_pernet(net)->timeouts;
}
static void udp_error_log(const struct sk_buff *skb,
struct net *net, void *data)
{
unsigned int *timeouts = data;
- struct nf_udp_net *un = udp_pernet(net);
+ struct nf_udp_net *un = nf_udp_pernet(net);
if (!timeouts)
timeouts = un->timeouts;
static int udp_init_net(struct net *net)
{
- struct nf_udp_net *un = udp_pernet(net);
+ struct nf_udp_net *un = nf_udp_pernet(net);
struct nf_proto_net *pn = &un->pn;
if (!pn->users) {
static void nf_tables_rule_destroy(const struct nft_ctx *ctx,
struct nft_rule *rule)
{
- struct nft_expr *expr;
+ struct nft_expr *expr, *next;
/*
* Careful: some expressions might not be initialized in case this
*/
expr = nft_expr_first(rule);
while (expr != nft_expr_last(rule) && expr->ops) {
+ next = nft_expr_next(expr);
nf_tables_expr_destroy(ctx, expr);
- expr = nft_expr_next(expr);
+ expr = next;
}
kfree(rule);
}
if (chain->use == UINT_MAX)
return -EOVERFLOW;
- }
-
- if (nla[NFTA_RULE_POSITION]) {
- if (!(nlh->nlmsg_flags & NLM_F_CREATE))
- return -EOPNOTSUPP;
- pos_handle = be64_to_cpu(nla_get_be64(nla[NFTA_RULE_POSITION]));
- old_rule = __nft_rule_lookup(chain, pos_handle);
- if (IS_ERR(old_rule)) {
- NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_POSITION]);
- return PTR_ERR(old_rule);
+ if (nla[NFTA_RULE_POSITION]) {
+ pos_handle = be64_to_cpu(nla_get_be64(nla[NFTA_RULE_POSITION]));
+ old_rule = __nft_rule_lookup(chain, pos_handle);
+ if (IS_ERR(old_rule)) {
+ NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_POSITION]);
+ return PTR_ERR(old_rule);
+ }
}
}
}
if (nlh->nlmsg_flags & NLM_F_REPLACE) {
- if (!nft_is_active_next(net, old_rule)) {
- err = -ENOENT;
- goto err2;
- }
- trans = nft_trans_rule_add(&ctx, NFT_MSG_DELRULE,
- old_rule);
+ trans = nft_trans_rule_add(&ctx, NFT_MSG_NEWRULE, rule);
if (trans == NULL) {
err = -ENOMEM;
goto err2;
}
- nft_deactivate_next(net, old_rule);
- chain->use--;
-
- if (nft_trans_rule_add(&ctx, NFT_MSG_NEWRULE, rule) == NULL) {
- err = -ENOMEM;
+ err = nft_delrule(&ctx, old_rule);
+ if (err < 0) {
+ nft_trans_destroy(trans);
goto err2;
}
call_rcu(&old->h, __nf_tables_commit_chain_free_rules_old);
}
-static void nf_tables_commit_chain_active(struct net *net, struct nft_chain *chain)
+static void nf_tables_commit_chain(struct net *net, struct nft_chain *chain)
{
struct nft_rule **g0, **g1;
bool next_genbit;
/* step 2. Make rules_gen_X visible to packet path */
list_for_each_entry(table, &net->nft.tables, list) {
- list_for_each_entry(chain, &table->chains, list) {
- if (!nft_is_active_next(net, chain))
- continue;
- nf_tables_commit_chain_active(net, chain);
- }
+ list_for_each_entry(chain, &table->chains, list)
+ nf_tables_commit_chain(net, chain);
}
/*
static int
cttimeout_default_fill_info(struct net *net, struct sk_buff *skb, u32 portid,
u32 seq, u32 type, int event, u16 l3num,
- const struct nf_conntrack_l4proto *l4proto)
+ const struct nf_conntrack_l4proto *l4proto,
+ const unsigned int *timeouts)
{
struct nlmsghdr *nlh;
struct nfgenmsg *nfmsg;
if (!nest_parms)
goto nla_put_failure;
- ret = l4proto->ctnl_timeout.obj_to_nlattr(skb, NULL);
+ ret = l4proto->ctnl_timeout.obj_to_nlattr(skb, timeouts);
if (ret < 0)
goto nla_put_failure;
struct netlink_ext_ack *extack)
{
const struct nf_conntrack_l4proto *l4proto;
+ unsigned int *timeouts = NULL;
struct sk_buff *skb2;
int ret, err;
__u16 l3num;
l4num = nla_get_u8(cda[CTA_TIMEOUT_L4PROTO]);
l4proto = nf_ct_l4proto_find_get(l4num);
- /* This protocol is not supported, skip. */
- if (l4proto->l4proto != l4num) {
- err = -EOPNOTSUPP;
+ err = -EOPNOTSUPP;
+ if (l4proto->l4proto != l4num)
goto err;
+
+ switch (l4proto->l4proto) {
+ case IPPROTO_ICMP:
+ timeouts = &nf_icmp_pernet(net)->timeout;
+ break;
+ case IPPROTO_TCP:
+ timeouts = nf_tcp_pernet(net)->timeouts;
+ break;
+ case IPPROTO_UDP: /* fallthrough */
+ case IPPROTO_UDPLITE:
+ timeouts = nf_udp_pernet(net)->timeouts;
+ break;
+ case IPPROTO_DCCP:
+#ifdef CONFIG_NF_CT_PROTO_DCCP
+ timeouts = nf_dccp_pernet(net)->dccp_timeout;
+#endif
+ break;
+ case IPPROTO_ICMPV6:
+ timeouts = &nf_icmpv6_pernet(net)->timeout;
+ break;
+ case IPPROTO_SCTP:
+#ifdef CONFIG_NF_CT_PROTO_SCTP
+ timeouts = nf_sctp_pernet(net)->timeouts;
+#endif
+ break;
+ case IPPROTO_GRE:
+#ifdef CONFIG_NF_CT_PROTO_GRE
+ if (l4proto->net_id) {
+ struct netns_proto_gre *net_gre;
+
+ net_gre = net_generic(net, *l4proto->net_id);
+ timeouts = net_gre->gre_timeouts;
+ }
+#endif
+ break;
+ case 255:
+ timeouts = &nf_generic_pernet(net)->timeout;
+ break;
+ default:
+ WARN_ONCE(1, "Missing timeouts for proto %d", l4proto->l4proto);
+ break;
}
+ if (!timeouts)
+ goto err;
+
skb2 = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (skb2 == NULL) {
err = -ENOMEM;
nlh->nlmsg_seq,
NFNL_MSG_TYPE(nlh->nlmsg_type),
IPCTNL_MSG_TIMEOUT_DEFAULT_SET,
- l3num,
- l4proto);
+ l3num, l4proto, timeouts);
if (ret <= 0) {
kfree_skb(skb2);
err = -ENOMEM;
return false;
}
-static int nft_compat_chain_validate_dependency(const char *tablename,
- const struct nft_chain *chain)
+static int nft_compat_chain_validate_dependency(const struct nft_ctx *ctx,
+ const char *tablename)
{
+ enum nft_chain_types type = NFT_CHAIN_T_DEFAULT;
+ const struct nft_chain *chain = ctx->chain;
const struct nft_base_chain *basechain;
if (!tablename ||
return 0;
basechain = nft_base_chain(chain);
- if (strcmp(tablename, "nat") == 0 &&
- basechain->type->type != NFT_CHAIN_T_NAT)
- return -EINVAL;
+ if (strcmp(tablename, "nat") == 0) {
+ if (ctx->family != NFPROTO_BRIDGE)
+ type = NFT_CHAIN_T_NAT;
+ if (basechain->type->type != type)
+ return -EINVAL;
+ }
return 0;
}
if (target->hooks && !(hook_mask & target->hooks))
return -EINVAL;
- ret = nft_compat_chain_validate_dependency(target->table,
- ctx->chain);
+ ret = nft_compat_chain_validate_dependency(ctx, target->table);
if (ret < 0)
return ret;
}
void *info)
{
struct xt_match *match = expr->ops->data;
+ struct module *me = match->me;
struct xt_mtdtor_param par;
par.net = ctx->net;
par.match->destroy(&par);
if (nft_xt_put(container_of(expr->ops, struct nft_xt, ops)))
- module_put(match->me);
+ module_put(me);
}
static void
if (match->hooks && !(hook_mask & match->hooks))
return -EINVAL;
- ret = nft_compat_chain_validate_dependency(match->table,
- ctx->chain);
+ ret = nft_compat_chain_validate_dependency(ctx, match->table);
if (ret < 0)
return ret;
}
{
int err;
- register_netdevice_notifier(&flow_offload_netdev_notifier);
+ err = register_netdevice_notifier(&flow_offload_netdev_notifier);
+ if (err)
+ goto err;
err = nft_register_expr(&nft_flow_offload_type);
if (err < 0)
register_expr:
unregister_netdevice_notifier(&flow_offload_netdev_notifier);
+err:
return err;
}
u32 modulus;
atomic_t counter;
u32 offset;
- struct nft_set *map;
};
static u32 nft_ng_inc_gen(struct nft_ng_inc *priv)
regs->data[priv->dreg] = nft_ng_inc_gen(priv);
}
-static void nft_ng_inc_map_eval(const struct nft_expr *expr,
- struct nft_regs *regs,
- const struct nft_pktinfo *pkt)
-{
- struct nft_ng_inc *priv = nft_expr_priv(expr);
- const struct nft_set *map = priv->map;
- const struct nft_set_ext *ext;
- u32 result;
- bool found;
-
- result = nft_ng_inc_gen(priv);
- found = map->ops->lookup(nft_net(pkt), map, &result, &ext);
-
- if (!found)
- return;
-
- nft_data_copy(®s->data[priv->dreg],
- nft_set_ext_data(ext), map->dlen);
-}
-
static const struct nla_policy nft_ng_policy[NFTA_NG_MAX + 1] = {
[NFTA_NG_DREG] = { .type = NLA_U32 },
[NFTA_NG_MODULUS] = { .type = NLA_U32 },
[NFTA_NG_TYPE] = { .type = NLA_U32 },
[NFTA_NG_OFFSET] = { .type = NLA_U32 },
- [NFTA_NG_SET_NAME] = { .type = NLA_STRING,
- .len = NFT_SET_MAXNAMELEN - 1 },
- [NFTA_NG_SET_ID] = { .type = NLA_U32 },
};
static int nft_ng_inc_init(const struct nft_ctx *ctx,
NFT_DATA_VALUE, sizeof(u32));
}
-static int nft_ng_inc_map_init(const struct nft_ctx *ctx,
- const struct nft_expr *expr,
- const struct nlattr * const tb[])
-{
- struct nft_ng_inc *priv = nft_expr_priv(expr);
- u8 genmask = nft_genmask_next(ctx->net);
-
- nft_ng_inc_init(ctx, expr, tb);
-
- priv->map = nft_set_lookup_global(ctx->net, ctx->table,
- tb[NFTA_NG_SET_NAME],
- tb[NFTA_NG_SET_ID], genmask);
-
- return PTR_ERR_OR_ZERO(priv->map);
-}
-
static int nft_ng_dump(struct sk_buff *skb, enum nft_registers dreg,
u32 modulus, enum nft_ng_types type, u32 offset)
{
priv->offset);
}
-static int nft_ng_inc_map_dump(struct sk_buff *skb,
- const struct nft_expr *expr)
-{
- const struct nft_ng_inc *priv = nft_expr_priv(expr);
-
- if (nft_ng_dump(skb, priv->dreg, priv->modulus,
- NFT_NG_INCREMENTAL, priv->offset) ||
- nla_put_string(skb, NFTA_NG_SET_NAME, priv->map->name))
- goto nla_put_failure;
-
- return 0;
-
-nla_put_failure:
- return -1;
-}
-
struct nft_ng_random {
enum nft_registers dreg:8;
u32 modulus;
u32 offset;
- struct nft_set *map;
};
static u32 nft_ng_random_gen(struct nft_ng_random *priv)
regs->data[priv->dreg] = nft_ng_random_gen(priv);
}
-static void nft_ng_random_map_eval(const struct nft_expr *expr,
- struct nft_regs *regs,
- const struct nft_pktinfo *pkt)
-{
- struct nft_ng_random *priv = nft_expr_priv(expr);
- const struct nft_set *map = priv->map;
- const struct nft_set_ext *ext;
- u32 result;
- bool found;
-
- result = nft_ng_random_gen(priv);
- found = map->ops->lookup(nft_net(pkt), map, &result, &ext);
- if (!found)
- return;
-
- nft_data_copy(®s->data[priv->dreg],
- nft_set_ext_data(ext), map->dlen);
-}
-
static int nft_ng_random_init(const struct nft_ctx *ctx,
const struct nft_expr *expr,
const struct nlattr * const tb[])
NFT_DATA_VALUE, sizeof(u32));
}
-static int nft_ng_random_map_init(const struct nft_ctx *ctx,
- const struct nft_expr *expr,
- const struct nlattr * const tb[])
-{
- struct nft_ng_random *priv = nft_expr_priv(expr);
- u8 genmask = nft_genmask_next(ctx->net);
-
- nft_ng_random_init(ctx, expr, tb);
- priv->map = nft_set_lookup_global(ctx->net, ctx->table,
- tb[NFTA_NG_SET_NAME],
- tb[NFTA_NG_SET_ID], genmask);
-
- return PTR_ERR_OR_ZERO(priv->map);
-}
-
static int nft_ng_random_dump(struct sk_buff *skb, const struct nft_expr *expr)
{
const struct nft_ng_random *priv = nft_expr_priv(expr);
priv->offset);
}
-static int nft_ng_random_map_dump(struct sk_buff *skb,
- const struct nft_expr *expr)
-{
- const struct nft_ng_random *priv = nft_expr_priv(expr);
-
- if (nft_ng_dump(skb, priv->dreg, priv->modulus,
- NFT_NG_RANDOM, priv->offset) ||
- nla_put_string(skb, NFTA_NG_SET_NAME, priv->map->name))
- goto nla_put_failure;
-
- return 0;
-
-nla_put_failure:
- return -1;
-}
-
static struct nft_expr_type nft_ng_type;
static const struct nft_expr_ops nft_ng_inc_ops = {
.type = &nft_ng_type,
.dump = nft_ng_inc_dump,
};
-static const struct nft_expr_ops nft_ng_inc_map_ops = {
- .type = &nft_ng_type,
- .size = NFT_EXPR_SIZE(sizeof(struct nft_ng_inc)),
- .eval = nft_ng_inc_map_eval,
- .init = nft_ng_inc_map_init,
- .dump = nft_ng_inc_map_dump,
-};
-
static const struct nft_expr_ops nft_ng_random_ops = {
.type = &nft_ng_type,
.size = NFT_EXPR_SIZE(sizeof(struct nft_ng_random)),
.dump = nft_ng_random_dump,
};
-static const struct nft_expr_ops nft_ng_random_map_ops = {
- .type = &nft_ng_type,
- .size = NFT_EXPR_SIZE(sizeof(struct nft_ng_random)),
- .eval = nft_ng_random_map_eval,
- .init = nft_ng_random_map_init,
- .dump = nft_ng_random_map_dump,
-};
-
static const struct nft_expr_ops *
nft_ng_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[])
{
switch (type) {
case NFT_NG_INCREMENTAL:
- if (tb[NFTA_NG_SET_NAME])
- return &nft_ng_inc_map_ops;
return &nft_ng_inc_ops;
case NFT_NG_RANDOM:
- if (tb[NFTA_NG_SET_NAME])
- return &nft_ng_random_map_ops;
return &nft_ng_random_ops;
}
int err;
u8 ttl;
- if (nla_get_u8(tb[NFTA_OSF_TTL])) {
+ if (tb[NFTA_OSF_TTL]) {
ttl = nla_get_u8(tb[NFTA_OSF_TTL]);
if (ttl > 2)
return -EINVAL;
schedule_work(&timer->work);
}
+static int idletimer_check_sysfs_name(const char *name, unsigned int size)
+{
+ int ret;
+
+ ret = xt_check_proc_name(name, size);
+ if (ret < 0)
+ return ret;
+
+ if (!strcmp(name, "power") ||
+ !strcmp(name, "subsystem") ||
+ !strcmp(name, "uevent"))
+ return -EINVAL;
+
+ return 0;
+}
+
static int idletimer_tg_create(struct idletimer_tg_info *info)
{
int ret;
goto out;
}
+ ret = idletimer_check_sysfs_name(info->label, sizeof(info->label));
+ if (ret < 0)
+ goto out_free_timer;
+
sysfs_attr_init(&info->timer->attr.attr);
info->timer->attr.attr.name = kstrdup(info->label, GFP_KERNEL);
if (!info->timer->attr.attr.name) {
return 0;
}
-static void __net_exit xt_rateest_net_exit(struct net *net)
-{
- struct xt_rateest_net *xn = net_generic(net, xt_rateest_id);
- int i;
-
- for (i = 0; i < ARRAY_SIZE(xn->hash); i++)
- WARN_ON_ONCE(!hlist_empty(&xn->hash[i]));
-}
-
static struct pernet_operations xt_rateest_net_ops = {
.init = xt_rateest_net_init,
- .exit = xt_rateest_net_exit,
.id = &xt_rateest_id,
.size = sizeof(struct xt_rateest_net),
};
/* copy match config into hashtable config */
ret = cfg_copy(&hinfo->cfg, (void *)cfg, 3);
-
- if (ret)
+ if (ret) {
+ vfree(hinfo);
return ret;
+ }
hinfo->cfg.size = size;
if (hinfo->cfg.max == 0)
int ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 1);
-
if (ret)
return ret;
int ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 2);
-
if (ret)
return ret;
return ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 1);
-
if (ret)
return ret;
return ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 2);
-
if (ret)
return ret;
&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
if (err) {
net_warn_ratelimited("openvswitch: zone: %u "
- "execeeds conntrack limit\n",
+ "exceeds conntrack limit\n",
info->zone.id);
return err;
}
&info->labels.mask);
if (err)
return err;
- } else if (labels_nonzero(&info->labels.mask)) {
+ } else if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) &&
+ labels_nonzero(&info->labels.mask)) {
err = ovs_ct_set_labels(ct, key, &info->labels.value,
&info->labels.mask);
if (err)
* is already present */
if (mac_proto != MAC_PROTO_NONE)
return -EINVAL;
- mac_proto = MAC_PROTO_NONE;
+ mac_proto = MAC_PROTO_ETHERNET;
break;
case OVS_ACTION_ATTR_POP_ETH:
return -EINVAL;
if (vlan_tci & htons(VLAN_TAG_PRESENT))
return -EINVAL;
- mac_proto = MAC_PROTO_ETHERNET;
+ mac_proto = MAC_PROTO_NONE;
break;
case OVS_ACTION_ATTR_PUSH_NSH:
void *ph;
__u32 ts;
- ph = skb_shinfo(skb)->destructor_arg;
+ ph = skb_zcopy_get_nouarg(skb);
packet_dec_pending(&po->tx_ring);
ts = __packet_set_timestamp(po, ph, skb);
skb->mark = po->sk.sk_mark;
skb->tstamp = sockc->transmit_time;
sock_tx_timestamp(&po->sk, sockc->tsflags, &skb_shinfo(skb)->tx_flags);
- skb_shinfo(skb)->destructor_arg = ph.raw;
+ skb_zcopy_set_nouarg(skb, ph.raw);
skb_reserve(skb, hlen);
skb_reset_network_header(skb);
* getting ACKs from the server. Returns a number representing the life state
* which can be compared to that returned by a previous call.
*
- * If this is a client call, ping ACKs will be sent to the server to find out
- * whether it's still responsive and whether the call is still alive on the
- * server.
+ * If the life state stalls, rxrpc_kernel_probe_life() should be called and
+ * then 2RTT waited.
*/
-u32 rxrpc_kernel_check_life(struct socket *sock, struct rxrpc_call *call)
+u32 rxrpc_kernel_check_life(const struct socket *sock,
+ const struct rxrpc_call *call)
{
return call->acks_latest;
}
EXPORT_SYMBOL(rxrpc_kernel_check_life);
+/**
+ * rxrpc_kernel_probe_life - Poke the peer to see if it's still alive
+ * @sock: The socket the call is on
+ * @call: The call to check
+ *
+ * In conjunction with rxrpc_kernel_check_life(), allow a kernel service to
+ * find out whether a call is still alive by pinging it. This should cause the
+ * life state to be bumped in about 2*RTT.
+ *
+ * The must be called in TASK_RUNNING state on pain of might_sleep() objecting.
+ */
+void rxrpc_kernel_probe_life(struct socket *sock, struct rxrpc_call *call)
+{
+ rxrpc_propose_ACK(call, RXRPC_ACK_PING, 0, 0, true, false,
+ rxrpc_propose_ack_ping_for_check_life);
+ rxrpc_send_ack_packet(call, true, NULL);
+}
+EXPORT_SYMBOL(rxrpc_kernel_probe_life);
+
/**
* rxrpc_kernel_get_epoch - Retrieve the epoch value from a call.
* @sock: The socket the call is on
* not hard-ACK'd packet follows this.
*/
rxrpc_seq_t tx_top; /* Highest Tx slot allocated. */
+ u16 tx_backoff; /* Delay to insert due to Tx failure */
/* TCP-style slow-start congestion control [RFC5681]. Since the SMSS
* is fixed, we keep these numbers in terms of segments (ie. DATA
else
ack_at = expiry;
+ ack_at += READ_ONCE(call->tx_backoff);
ack_at += now;
if (time_before(ack_at, call->ack_at)) {
WRITE_ONCE(call->ack_at, ack_at);
container_of(work, struct rxrpc_call, processor);
rxrpc_serial_t *send_ack;
unsigned long now, next, t;
+ unsigned int iterations = 0;
rxrpc_see_call(call);
call->debug_id, rxrpc_call_states[call->state], call->events);
recheck_state:
+ /* Limit the number of times we do this before returning to the manager */
+ iterations++;
+ if (iterations > 5)
+ goto requeue;
+
if (test_and_clear_bit(RXRPC_CALL_EV_ABORT, &call->events)) {
rxrpc_send_abort_packet(call);
goto recheck_state;
rxrpc_reduce_call_timer(call, next, now, rxrpc_timer_restart);
/* other events may have been raised since we started checking */
- if (call->events && call->state < RXRPC_CALL_COMPLETE) {
- __rxrpc_queue_call(call);
- goto out;
- }
+ if (call->events && call->state < RXRPC_CALL_COMPLETE)
+ goto requeue;
out_put:
rxrpc_put_call(call, rxrpc_call_put);
out:
_leave("");
+ return;
+
+requeue:
+ __rxrpc_queue_call(call);
+ goto out;
}
static const char rxrpc_keepalive_string[] = "";
+/*
+ * Increase Tx backoff on transmission failure and clear it on success.
+ */
+static void rxrpc_tx_backoff(struct rxrpc_call *call, int ret)
+{
+ if (ret < 0) {
+ u16 tx_backoff = READ_ONCE(call->tx_backoff);
+
+ if (tx_backoff < HZ)
+ WRITE_ONCE(call->tx_backoff, tx_backoff + 1);
+ } else {
+ WRITE_ONCE(call->tx_backoff, 0);
+ }
+}
+
/*
* Arrange for a keepalive ping a certain time after we last transmitted. This
* lets the far side know we're still interested in this call and helps keep
else
trace_rxrpc_tx_packet(call->debug_id, &pkt->whdr,
rxrpc_tx_point_call_ack);
+ rxrpc_tx_backoff(call, ret);
if (call->state < RXRPC_CALL_COMPLETE) {
if (ret < 0) {
rxrpc_propose_ACK(call, pkt->ack.reason,
ntohs(pkt->ack.maxSkew),
ntohl(pkt->ack.serial),
- true, true,
+ false, true,
rxrpc_propose_ack_retry_tx);
} else {
spin_lock_bh(&call->lock);
else
trace_rxrpc_tx_packet(call->debug_id, &pkt.whdr,
rxrpc_tx_point_call_abort);
-
+ rxrpc_tx_backoff(call, ret);
rxrpc_put_connection(conn);
return ret;
else
trace_rxrpc_tx_packet(call->debug_id, &whdr,
rxrpc_tx_point_call_data_nofrag);
+ rxrpc_tx_backoff(call, ret);
if (ret == -EMSGSIZE)
goto send_fragmentable;
rxrpc_reduce_call_timer(call, expect_rx_by, nowj,
rxrpc_timer_set_for_normal);
}
- }
- rxrpc_set_keepalive(call);
+ rxrpc_set_keepalive(call);
+ } else {
+ /* Cancel the call if the initial transmission fails,
+ * particularly if that's due to network routing issues that
+ * aren't going away anytime soon. The layer above can arrange
+ * the retransmission.
+ */
+ if (!test_and_set_bit(RXRPC_CALL_BEGAN_RX_TIMER, &call->flags))
+ rxrpc_set_call_completion(call, RXRPC_CALL_LOCAL_ERROR,
+ RX_USER_ABORT, ret);
+ }
_leave(" = %d [%u]", ret, call->peer->maxdata);
return ret;
else
trace_rxrpc_tx_packet(call->debug_id, &whdr,
rxrpc_tx_point_call_data_frag);
+ rxrpc_tx_backoff(call, ret);
up_write(&conn->params.local->defrag_sem);
goto done;
if (is_redirect) {
skb2->tc_redirected = 1;
skb2->tc_from_ingress = skb2->tc_at_ingress;
-
+ if (skb2->tc_from_ingress)
+ skb2->tstamp = 0;
/* let's the caller reinsert the packet, if possible */
if (use_reinsert) {
res->ingress = want_ingress;
goto out_release;
}
} else {
- return err;
+ ret = err;
+ goto out_free;
}
p = to_pedit(*a);
u32 tcfp_ewma_rate;
s64 tcfp_burst;
u32 tcfp_mtu;
- s64 tcfp_toks;
- s64 tcfp_ptoks;
s64 tcfp_mtu_ptoks;
- s64 tcfp_t_c;
struct psched_ratecfg rate;
bool rate_present;
struct psched_ratecfg peak;
struct tcf_police {
struct tc_action common;
struct tcf_police_params __rcu *params;
+
+ spinlock_t tcfp_lock ____cacheline_aligned_in_smp;
+ s64 tcfp_toks;
+ s64 tcfp_ptoks;
+ s64 tcfp_t_c;
};
#define to_police(pc) ((struct tcf_police *)pc)
int ovr, int bind, bool rtnl_held,
struct netlink_ext_ack *extack)
{
- int ret = 0, err;
+ int ret = 0, tcfp_result = TC_ACT_OK, err, size;
struct nlattr *tb[TCA_POLICE_MAX + 1];
struct tc_police *parm;
struct tcf_police *police;
struct tc_action_net *tn = net_generic(net, police_net_id);
struct tcf_police_params *new;
bool exists = false;
- int size;
if (nla == NULL)
return -EINVAL;
return ret;
}
ret = ACT_P_CREATED;
+ spin_lock_init(&(to_police(*a)->tcfp_lock));
} else if (!ovr) {
tcf_idr_release(*a, bind);
return -EEXIST;
goto failure;
}
+ if (tb[TCA_POLICE_RESULT]) {
+ tcfp_result = nla_get_u32(tb[TCA_POLICE_RESULT]);
+ if (TC_ACT_EXT_CMP(tcfp_result, TC_ACT_GOTO_CHAIN)) {
+ NL_SET_ERR_MSG(extack,
+ "goto chain not allowed on fallback");
+ err = -EINVAL;
+ goto failure;
+ }
+ }
+
new = kzalloc(sizeof(*new), GFP_KERNEL);
if (unlikely(!new)) {
err = -ENOMEM;
}
/* No failure allowed after this point */
+ new->tcfp_result = tcfp_result;
new->tcfp_mtu = parm->mtu;
if (!new->tcfp_mtu) {
new->tcfp_mtu = ~0;
}
new->tcfp_burst = PSCHED_TICKS2NS(parm->burst);
- new->tcfp_toks = new->tcfp_burst;
- if (new->peak_present) {
+ if (new->peak_present)
new->tcfp_mtu_ptoks = (s64)psched_l2t_ns(&new->peak,
new->tcfp_mtu);
- new->tcfp_ptoks = new->tcfp_mtu_ptoks;
- }
if (tb[TCA_POLICE_AVRATE])
new->tcfp_ewma_rate = nla_get_u32(tb[TCA_POLICE_AVRATE]);
- if (tb[TCA_POLICE_RESULT]) {
- new->tcfp_result = nla_get_u32(tb[TCA_POLICE_RESULT]);
- if (TC_ACT_EXT_CMP(new->tcfp_result, TC_ACT_GOTO_CHAIN)) {
- NL_SET_ERR_MSG(extack,
- "goto chain not allowed on fallback");
- err = -EINVAL;
- goto failure;
- }
- }
-
spin_lock_bh(&police->tcf_lock);
- new->tcfp_t_c = ktime_get_ns();
+ spin_lock_bh(&police->tcfp_lock);
+ police->tcfp_t_c = ktime_get_ns();
+ police->tcfp_toks = new->tcfp_burst;
+ if (new->peak_present)
+ police->tcfp_ptoks = new->tcfp_mtu_ptoks;
+ spin_unlock_bh(&police->tcfp_lock);
police->tcf_action = parm->action;
rcu_swap_protected(police->params,
new,
}
now = ktime_get_ns();
- toks = min_t(s64, now - p->tcfp_t_c, p->tcfp_burst);
+ spin_lock_bh(&police->tcfp_lock);
+ toks = min_t(s64, now - police->tcfp_t_c, p->tcfp_burst);
if (p->peak_present) {
- ptoks = toks + p->tcfp_ptoks;
+ ptoks = toks + police->tcfp_ptoks;
if (ptoks > p->tcfp_mtu_ptoks)
ptoks = p->tcfp_mtu_ptoks;
ptoks -= (s64)psched_l2t_ns(&p->peak,
qdisc_pkt_len(skb));
}
- toks += p->tcfp_toks;
+ toks += police->tcfp_toks;
if (toks > p->tcfp_burst)
toks = p->tcfp_burst;
toks -= (s64)psched_l2t_ns(&p->rate, qdisc_pkt_len(skb));
if ((toks|ptoks) >= 0) {
- p->tcfp_t_c = now;
- p->tcfp_toks = toks;
- p->tcfp_ptoks = ptoks;
+ police->tcfp_t_c = now;
+ police->tcfp_toks = toks;
+ police->tcfp_ptoks = ptoks;
+ spin_unlock_bh(&police->tcfp_lock);
ret = p->tcfp_result;
goto inc_drops;
}
+ spin_unlock_bh(&police->tcfp_lock);
}
inc_overlimits:
struct netlink_ext_ack *extack)
{
const struct nlattr *nla_enc_key, *nla_opt_key, *nla_opt_msk = NULL;
- int option_len, key_depth, msk_depth = 0;
+ int err, option_len, key_depth, msk_depth = 0;
+
+ err = nla_validate_nested(tb[TCA_FLOWER_KEY_ENC_OPTS],
+ TCA_FLOWER_KEY_ENC_OPTS_MAX,
+ enc_opts_policy, extack);
+ if (err)
+ return err;
nla_enc_key = nla_data(tb[TCA_FLOWER_KEY_ENC_OPTS]);
if (tb[TCA_FLOWER_KEY_ENC_OPTS_MASK]) {
+ err = nla_validate_nested(tb[TCA_FLOWER_KEY_ENC_OPTS_MASK],
+ TCA_FLOWER_KEY_ENC_OPTS_MAX,
+ enc_opts_policy, extack);
+ if (err)
+ return err;
+
nla_opt_msk = nla_data(tb[TCA_FLOWER_KEY_ENC_OPTS_MASK]);
msk_depth = nla_len(tb[TCA_FLOWER_KEY_ENC_OPTS_MASK]);
}
if (err)
goto errout_idr;
- if (!tc_skip_sw(fnew->flags)) {
- if (!fold && fl_lookup(fnew->mask, &fnew->mkey)) {
- err = -EEXIST;
- goto errout_mask;
- }
-
- err = rhashtable_insert_fast(&fnew->mask->ht, &fnew->ht_node,
- fnew->mask->filter_ht_params);
- if (err)
- goto errout_mask;
+ if (!fold && fl_lookup(fnew->mask, &fnew->mkey)) {
+ err = -EEXIST;
+ goto errout_mask;
}
+ err = rhashtable_insert_fast(&fnew->mask->ht, &fnew->ht_node,
+ fnew->mask->filter_ht_params);
+ if (err)
+ goto errout_mask;
+
if (!tc_skip_hw(fnew->flags)) {
err = fl_hw_replace_filter(tp, fnew, extack);
if (err)
struct cls_fl_head *head = rtnl_dereference(tp->root);
struct cls_fl_filter *f = arg;
- if (!tc_skip_sw(f->flags))
- rhashtable_remove_fast(&f->mask->ht, &f->ht_node,
- f->mask->filter_ht_params);
+ rhashtable_remove_fast(&f->mask->ht, &f->ht_node,
+ f->mask->filter_ht_params);
__fl_delete(tp, f, extack);
*last = list_empty(&head->masks);
return 0;
goto begin;
}
prefetch(&skb->end);
- f->credit -= qdisc_pkt_len(skb);
+ plen = qdisc_pkt_len(skb);
+ f->credit -= plen;
- if (ktime_to_ns(skb->tstamp) || !q->rate_enable)
+ if (!q->rate_enable)
goto out;
rate = q->flow_max_rate;
- if (skb->sk)
- rate = min(skb->sk->sk_pacing_rate, rate);
-
- if (rate <= q->low_rate_threshold) {
- f->credit = 0;
- plen = qdisc_pkt_len(skb);
- } else {
- plen = max(qdisc_pkt_len(skb), q->quantum);
- if (f->credit > 0)
- goto out;
+
+ /* If EDT time was provided for this skb, we need to
+ * update f->time_next_packet only if this qdisc enforces
+ * a flow max rate.
+ */
+ if (!skb->tstamp) {
+ if (skb->sk)
+ rate = min(skb->sk->sk_pacing_rate, rate);
+
+ if (rate <= q->low_rate_threshold) {
+ f->credit = 0;
+ } else {
+ plen = max(plen, q->quantum);
+ if (f->credit > 0)
+ goto out;
+ }
}
if (rate != ~0UL) {
u64 len = (u64)plen * NSEC_PER_SEC;
int count = 1;
int rc = NET_XMIT_SUCCESS;
+ /* Do not fool qdisc_drop_all() */
+ skb->prev = NULL;
+
/* Random duplication */
if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor))
++count;
*/
skb->dev = qdisc_dev(sch);
-#ifdef CONFIG_NET_CLS_ACT
- /*
- * If it's at ingress let's pretend the delay is
- * from the network (tstamp will be updated).
- */
- if (skb->tc_redirected && skb->tc_from_ingress)
- skb->tstamp = 0;
-#endif
-
if (q->slot.slot_next) {
q->slot.packets_left--;
q->slot.bytes_left -= qdisc_pkt_len(skb);
asoc->flowlabel = sp->flowlabel;
asoc->dscp = sp->dscp;
- /* Initialize default path MTU. */
- asoc->pathmtu = sp->pathmtu;
-
/* Set association default SACK delay */
asoc->sackdelay = msecs_to_jiffies(sp->sackdelay);
asoc->sackfreq = sp->sackfreq;
0, gfp))
goto fail_init;
+ /* Initialize default path MTU. */
+ asoc->pathmtu = sp->pathmtu;
+ sctp_assoc_update_frag_point(asoc);
+
/* Assume that peer would support both address types unless we are
* told otherwise.
*/
WARN_ON(atomic_read(&asoc->rmem_alloc));
- kfree(asoc);
+ kfree_rcu(asoc, rcu);
SCTP_DBG_OBJCNT_DEC(assoc);
}
void sctp_assoc_rm_peer(struct sctp_association *asoc,
struct sctp_transport *peer)
{
- struct list_head *pos;
- struct sctp_transport *transport;
+ struct sctp_transport *transport;
+ struct list_head *pos;
+ struct sctp_chunk *ch;
pr_debug("%s: association:%p addr:%pISpc\n",
__func__, asoc, &peer->ipaddr.sa);
*/
if (!list_empty(&peer->transmitted)) {
struct sctp_transport *active = asoc->peer.active_path;
- struct sctp_chunk *ch;
/* Reset the transport of each chunk on this list */
list_for_each_entry(ch, &peer->transmitted,
sctp_transport_hold(active);
}
+ list_for_each_entry(ch, &asoc->outqueue.out_chunk_list, list)
+ if (ch->transport == peer)
+ ch->transport = NULL;
+
asoc->peer.transport_count--;
sctp_transport_free(peer);
* the packet
*/
max_data = asoc->frag_point;
+ if (unlikely(!max_data)) {
+ max_data = sctp_min_frag_point(sctp_sk(asoc->base.sk),
+ sctp_datachk_len(&asoc->stream));
+ pr_warn_ratelimited("%s: asoc:%p frag_point is zero, forcing max_data to default minimum (%Zu)",
+ __func__, asoc, max_data);
+ }
/* If the the peer requested that we authenticate DATA chunks
* we need to account for bundling of the AUTH chunks along with
sctp_transport_route(tp, NULL, sp);
if (asoc->param_flags & SPP_PMTUD_ENABLE)
sctp_assoc_sync_pmtu(asoc);
+ } else if (!sctp_transport_pmtu_check(tp)) {
+ if (asoc->param_flags & SPP_PMTUD_ENABLE)
+ sctp_assoc_sync_pmtu(asoc);
}
if (asoc->pmtu_pending) {
return retval;
}
-static void sctp_packet_release_owner(struct sk_buff *skb)
-{
- sk_free(skb->sk);
-}
-
-static void sctp_packet_set_owner_w(struct sk_buff *skb, struct sock *sk)
-{
- skb_orphan(skb);
- skb->sk = sk;
- skb->destructor = sctp_packet_release_owner;
-
- /*
- * The data chunks have already been accounted for in sctp_sendmsg(),
- * therefore only reserve a single byte to keep socket around until
- * the packet has been transmitted.
- */
- refcount_inc(&sk->sk_wmem_alloc);
-}
-
static void sctp_packet_gso_append(struct sk_buff *head, struct sk_buff *skb)
{
if (SCTP_OUTPUT_CB(head)->last == head)
head->truesize += skb->truesize;
head->data_len += skb->len;
head->len += skb->len;
+ refcount_add(skb->truesize, &head->sk->sk_wmem_alloc);
__skb_header_release(skb);
}
if (!head)
goto out;
skb_reserve(head, packet->overhead + MAX_HEADER);
- sctp_packet_set_owner_w(head, sk);
+ skb_set_owner_w(head, sk);
/* set sctp header */
sh = skb_push(head, sizeof(struct sctphdr));
INIT_LIST_HEAD(&q->retransmit);
INIT_LIST_HEAD(&q->sacked);
INIT_LIST_HEAD(&q->abandoned);
- sctp_sched_set_sched(asoc, SCTP_SS_FCFS);
+ sctp_sched_set_sched(asoc, SCTP_SS_DEFAULT);
}
/* Free the outqueue structure and any related pending chunks.
asoc->c.sinit_max_instreams, gfp))
goto clean_up;
+ /* Update frag_point when stream_interleave may get changed. */
+ sctp_assoc_update_frag_point(asoc);
+
if (!asoc->temp && sctp_assoc_set_id(asoc, gfp))
goto clean_up;
__u16 datasize = asoc ? sctp_datachk_len(&asoc->stream) :
sizeof(struct sctp_data_chunk);
- min_len = sctp_mtu_payload(sp, SCTP_DEFAULT_MINSEGMENT,
- datasize);
+ min_len = sctp_min_frag_point(sp, datasize);
max_len = SCTP_MAX_CHUNK_LEN - datasize;
if (val < min_len || val > max_len)
unsigned int optlen)
{
struct sctp_assoc_value params;
- struct sctp_association *asoc;
- int retval = -EINVAL;
if (optlen != sizeof(params))
- goto out;
-
- if (copy_from_user(¶ms, optval, optlen)) {
- retval = -EFAULT;
- goto out;
- }
-
- asoc = sctp_id2assoc(sk, params.assoc_id);
- if (asoc) {
- asoc->prsctp_enable = !!params.assoc_value;
- } else if (!params.assoc_id) {
- struct sctp_sock *sp = sctp_sk(sk);
+ return -EINVAL;
- sp->ep->prsctp_enable = !!params.assoc_value;
- } else {
- goto out;
- }
+ if (copy_from_user(¶ms, optval, optlen))
+ return -EFAULT;
- retval = 0;
+ sctp_sk(sk)->ep->prsctp_enable = !!params.assoc_value;
-out:
- return retval;
+ return 0;
}
static int sctp_setsockopt_default_prinfo(struct sock *sk,
}
policy = params.sprstat_policy;
- if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+ if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+ ((policy & SCTP_PR_SCTP_ALL) && (policy & SCTP_PR_SCTP_MASK)))
goto out;
asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
if (!asoc)
goto out;
- if (policy & SCTP_PR_SCTP_ALL) {
+ if (policy == SCTP_PR_SCTP_ALL) {
params.sprstat_abandoned_unsent = 0;
params.sprstat_abandoned_sent = 0;
for (policy = 0; policy <= SCTP_PR_INDEX(MAX); policy++) {
}
policy = params.sprstat_policy;
- if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+ if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+ ((policy & SCTP_PR_SCTP_ALL) && (policy & SCTP_PR_SCTP_MASK)))
goto out;
asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
goto out;
}
- stream->incnt = incnt;
stream->outcnt = outcnt;
asoc->strreset_outstanding = !!out + !!in;
smc = smc_sk(sk);
/* cleanup for a dangling non-blocking connect */
+ if (smc->connect_info && sk->sk_state == SMC_INIT)
+ tcp_abort(smc->clcsock->sk, ECONNABORTED);
flush_work(&smc->connect_work);
kfree(smc->connect_info);
smc->connect_info = NULL;
mutex_lock(&smc_create_lgr_pending);
local_contact = smc_conn_create(smc, false, aclc->hdr.flag, ibdev,
- ibport, &aclc->lcl, NULL, 0);
+ ibport, ntoh24(aclc->qpn), &aclc->lcl,
+ NULL, 0);
if (local_contact < 0) {
if (local_contact == -ENOMEM)
reason_code = SMC_CLC_DECL_MEM;/* insufficient memory*/
int rc = 0;
mutex_lock(&smc_create_lgr_pending);
- local_contact = smc_conn_create(smc, true, aclc->hdr.flag, NULL, 0,
+ local_contact = smc_conn_create(smc, true, aclc->hdr.flag, NULL, 0, 0,
NULL, ismdev, aclc->gid);
if (local_contact < 0)
return smc_connect_abort(smc, SMC_CLC_DECL_MEM, 0);
int *local_contact)
{
/* allocate connection / link group */
- *local_contact = smc_conn_create(new_smc, false, 0, ibdev, ibport,
+ *local_contact = smc_conn_create(new_smc, false, 0, ibdev, ibport, 0,
&pclc->lcl, NULL, 0);
if (*local_contact < 0) {
if (*local_contact == -ENOMEM)
struct smc_clc_msg_smcd *pclc_smcd;
pclc_smcd = smc_get_clc_msg_smcd(pclc);
- *local_contact = smc_conn_create(new_smc, true, 0, NULL, 0, NULL,
+ *local_contact = smc_conn_create(new_smc, true, 0, NULL, 0, 0, NULL,
ismdev, pclc_smcd->gid);
if (*local_contact < 0) {
if (*local_contact == -ENOMEM)
sizeof(struct smc_cdc_msg) > SMC_WR_BUF_SIZE,
"must increase SMC_WR_BUF_SIZE to at least sizeof(struct smc_cdc_msg)");
BUILD_BUG_ON_MSG(
- sizeof(struct smc_cdc_msg) != SMC_WR_TX_SIZE,
+ offsetofend(struct smc_cdc_msg, reserved) > SMC_WR_TX_SIZE,
"must adapt SMC_WR_TX_SIZE to sizeof(struct smc_cdc_msg); if not all smc_wr upper layer protocols use the same message size any more, must start to set link->wr_tx_sges[i].length on each individual smc_wr_tx_send()");
BUILD_BUG_ON_MSG(
sizeof(struct smc_cdc_tx_pend) > SMC_WR_TX_PEND_PRIV_SIZE,
int smcd_cdc_msg_send(struct smc_connection *conn)
{
struct smc_sock *smc = container_of(conn, struct smc_sock, conn);
+ union smc_host_cursor curs;
struct smcd_cdc_msg cdc;
int rc, diff;
memset(&cdc, 0, sizeof(cdc));
cdc.common.type = SMC_CDC_MSG_TYPE;
- cdc.prod_wrap = conn->local_tx_ctrl.prod.wrap;
- cdc.prod_count = conn->local_tx_ctrl.prod.count;
-
- cdc.cons_wrap = conn->local_tx_ctrl.cons.wrap;
- cdc.cons_count = conn->local_tx_ctrl.cons.count;
- cdc.prod_flags = conn->local_tx_ctrl.prod_flags;
- cdc.conn_state_flags = conn->local_tx_ctrl.conn_state_flags;
+ curs.acurs.counter = atomic64_read(&conn->local_tx_ctrl.prod.acurs);
+ cdc.prod.wrap = curs.wrap;
+ cdc.prod.count = curs.count;
+ curs.acurs.counter = atomic64_read(&conn->local_tx_ctrl.cons.acurs);
+ cdc.cons.wrap = curs.wrap;
+ cdc.cons.count = curs.count;
+ cdc.cons.prod_flags = conn->local_tx_ctrl.prod_flags;
+ cdc.cons.conn_state_flags = conn->local_tx_ctrl.conn_state_flags;
rc = smcd_tx_ism_write(conn, &cdc, sizeof(cdc), 0, 1);
if (rc)
return rc;
- smc_curs_copy(&conn->rx_curs_confirmed, &conn->local_tx_ctrl.cons,
- conn);
+ smc_curs_copy(&conn->rx_curs_confirmed, &curs, conn);
/* Calculate transmitted data and increment free send buffer space */
diff = smc_curs_diff(conn->sndbuf_desc->len, &conn->tx_curs_fin,
&conn->tx_curs_sent);
static void smcd_cdc_rx_tsklet(unsigned long data)
{
struct smc_connection *conn = (struct smc_connection *)data;
+ struct smcd_cdc_msg *data_cdc;
struct smcd_cdc_msg cdc;
struct smc_sock *smc;
if (!conn)
return;
- memcpy(&cdc, conn->rmb_desc->cpu_addr, sizeof(cdc));
+ data_cdc = (struct smcd_cdc_msg *)conn->rmb_desc->cpu_addr;
+ smcd_curs_copy(&cdc.prod, &data_cdc->prod, conn);
+ smcd_curs_copy(&cdc.cons, &data_cdc->cons, conn);
smc = container_of(conn, struct smc_sock, conn);
smc_cdc_msg_recv(smc, (struct smc_cdc_msg *)&cdc);
}
struct smc_cdc_producer_flags prod_flags;
struct smc_cdc_conn_state_flags conn_state_flags;
u8 reserved[18];
-} __packed; /* format defined in RFC7609 */
+};
+
+/* SMC-D cursor format */
+union smcd_cdc_cursor {
+ struct {
+ u16 wrap;
+ u32 count;
+ struct smc_cdc_producer_flags prod_flags;
+ struct smc_cdc_conn_state_flags conn_state_flags;
+ } __packed;
+#ifdef KERNEL_HAS_ATOMIC64
+ atomic64_t acurs; /* for atomic processing */
+#else
+ u64 acurs; /* for atomic processing */
+#endif
+} __aligned(8);
/* CDC message for SMC-D */
struct smcd_cdc_msg {
struct smc_wr_rx_hdr common; /* Type = 0xFE */
u8 res1[7];
- u16 prod_wrap;
- u32 prod_count;
- u8 res2[2];
- u16 cons_wrap;
- u32 cons_count;
- struct smc_cdc_producer_flags prod_flags;
- struct smc_cdc_conn_state_flags conn_state_flags;
+ union smcd_cdc_cursor prod;
+ union smcd_cdc_cursor cons;
u8 res3[8];
-} __packed;
+} __aligned(8);
static inline bool smc_cdc_rxed_any_close(struct smc_connection *conn)
{
#endif
}
+static inline void smcd_curs_copy(union smcd_cdc_cursor *tgt,
+ union smcd_cdc_cursor *src,
+ struct smc_connection *conn)
+{
+#ifndef KERNEL_HAS_ATOMIC64
+ unsigned long flags;
+
+ spin_lock_irqsave(&conn->acurs_lock, flags);
+ tgt->acurs = src->acurs;
+ spin_unlock_irqrestore(&conn->acurs_lock, flags);
+#else
+ atomic64_set(&tgt->acurs, atomic64_read(&src->acurs));
+#endif
+}
+
/* calculate cursor difference between old and new, where old <= new */
static inline int smc_curs_diff(unsigned int size,
union smc_host_cursor *old,
static inline void smcd_cdc_msg_to_host(struct smc_host_cdc_msg *local,
struct smcd_cdc_msg *peer)
{
- local->prod.wrap = peer->prod_wrap;
- local->prod.count = peer->prod_count;
- local->cons.wrap = peer->cons_wrap;
- local->cons.count = peer->cons_count;
- local->prod_flags = peer->prod_flags;
- local->conn_state_flags = peer->conn_state_flags;
+ union smc_host_cursor temp;
+
+ temp.wrap = peer->prod.wrap;
+ temp.count = peer->prod.count;
+ atomic64_set(&local->prod.acurs, atomic64_read(&temp.acurs));
+
+ temp.wrap = peer->cons.wrap;
+ temp.count = peer->cons.count;
+ atomic64_set(&local->cons.acurs, atomic64_read(&temp.acurs));
+ local->prod_flags = peer->cons.prod_flags;
+ local->conn_state_flags = peer->cons.conn_state_flags;
}
static inline void smc_cdc_msg_to_host(struct smc_host_cdc_msg *local,
*/
krflags = MSG_PEEK | MSG_WAITALL;
smc->clcsock->sk->sk_rcvtimeo = CLC_WAIT_TIME;
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &vec, 1,
+ iov_iter_kvec(&msg.msg_iter, READ, &vec, 1,
sizeof(struct smc_clc_msg_hdr));
len = sock_recvmsg(smc->clcsock, &msg, krflags);
if (signal_pending(current)) {
/* receive the complete CLC message */
memset(&msg, 0, sizeof(struct msghdr));
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &vec, 1, datlen);
+ iov_iter_kvec(&msg.msg_iter, READ, &vec, 1, datlen);
krflags = MSG_WAITALL;
len = sock_recvmsg(smc->clcsock, &msg, krflags);
if (len < datlen || !smc_clc_msg_hdr_valid(clcm)) {
if (!lgr->is_smcd && lnk->state != SMC_LNK_INACTIVE)
smc_llc_link_inactive(lnk);
+ if (lgr->is_smcd)
+ smc_ism_signal_shutdown(lgr);
smc_lgr_free(lgr);
}
}
}
/* Called when SMC-D device is terminated or peer is lost */
-void smc_smcd_terminate(struct smcd_dev *dev, u64 peer_gid)
+void smc_smcd_terminate(struct smcd_dev *dev, u64 peer_gid, unsigned short vlan)
{
struct smc_link_group *lgr, *l;
LIST_HEAD(lgr_free_list);
list_for_each_entry_safe(lgr, l, &smc_lgr_list.list, list) {
if (lgr->is_smcd && lgr->smcd == dev &&
(!peer_gid || lgr->peer_gid == peer_gid) &&
- !list_empty(&lgr->list)) {
+ (vlan == VLAN_VID_MASK || lgr->vlan_id == vlan)) {
__smc_lgr_terminate(lgr);
list_move(&lgr->list, &lgr_free_list);
}
list_for_each_entry_safe(lgr, l, &lgr_free_list, list) {
list_del_init(&lgr->list);
cancel_delayed_work_sync(&lgr->free_work);
+ if (!peer_gid && vlan == VLAN_VID_MASK) /* dev terminated? */
+ smc_ism_signal_shutdown(lgr);
smc_lgr_free(lgr);
}
}
static bool smcr_lgr_match(struct smc_link_group *lgr,
struct smc_clc_msg_local *lcl,
- enum smc_lgr_role role)
+ enum smc_lgr_role role, u32 clcqpn)
{
return !memcmp(lgr->peer_systemid, lcl->id_for_peer,
SMC_SYSTEMID_LEN) &&
SMC_GID_SIZE) &&
!memcmp(lgr->lnk[SMC_SINGLE_LINK].peer_mac, lcl->mac,
sizeof(lcl->mac)) &&
- lgr->role == role;
+ lgr->role == role &&
+ (lgr->role == SMC_SERV ||
+ lgr->lnk[SMC_SINGLE_LINK].peer_qpn == clcqpn);
}
static bool smcd_lgr_match(struct smc_link_group *lgr,
/* create a new SMC connection (and a new link group if necessary) */
int smc_conn_create(struct smc_sock *smc, bool is_smcd, int srv_first_contact,
- struct smc_ib_device *smcibdev, u8 ibport,
+ struct smc_ib_device *smcibdev, u8 ibport, u32 clcqpn,
struct smc_clc_msg_local *lcl, struct smcd_dev *smcd,
u64 peer_gid)
{
list_for_each_entry(lgr, &smc_lgr_list.list, list) {
write_lock_bh(&lgr->conns_lock);
if ((is_smcd ? smcd_lgr_match(lgr, smcd, peer_gid) :
- smcr_lgr_match(lgr, lcl, role)) &&
+ smcr_lgr_match(lgr, lcl, role, clcqpn)) &&
!lgr->sync_err &&
lgr->vlan_id == vlan_id &&
(role == SMC_CLNT ||
smc_llc_link_inactive(lnk);
}
cancel_delayed_work_sync(&lgr->free_work);
+ if (lgr->is_smcd)
+ smc_ism_signal_shutdown(lgr);
smc_lgr_free(lgr); /* free link group */
}
}
void smc_lgr_forget(struct smc_link_group *lgr);
void smc_lgr_terminate(struct smc_link_group *lgr);
void smc_port_terminate(struct smc_ib_device *smcibdev, u8 ibport);
-void smc_smcd_terminate(struct smcd_dev *dev, u64 peer_gid);
+void smc_smcd_terminate(struct smcd_dev *dev, u64 peer_gid,
+ unsigned short vlan);
int smc_buf_create(struct smc_sock *smc, bool is_smcd);
int smc_uncompress_bufsize(u8 compressed);
int smc_rmb_rtoken_handling(struct smc_connection *conn,
void smc_conn_free(struct smc_connection *conn);
int smc_conn_create(struct smc_sock *smc, bool is_smcd, int srv_first_contact,
- struct smc_ib_device *smcibdev, u8 ibport,
+ struct smc_ib_device *smcibdev, u8 ibport, u32 clcqpn,
struct smc_clc_msg_local *lcl, struct smcd_dev *smcd,
u64 peer_gid);
void smcd_conn_free(struct smc_connection *conn);
#define ISM_EVENT_REQUEST 0x0001
#define ISM_EVENT_RESPONSE 0x0002
#define ISM_EVENT_REQUEST_IR 0x00000001
+#define ISM_EVENT_CODE_SHUTDOWN 0x80
#define ISM_EVENT_CODE_TESTLINK 0x83
+union smcd_sw_event_info {
+ u64 info;
+ struct {
+ u8 uid[SMC_LGR_ID_SIZE];
+ unsigned short vlan_id;
+ u16 code;
+ };
+};
+
static void smcd_handle_sw_event(struct smc_ism_event_work *wrk)
{
- union {
- u64 info;
- struct {
- u32 uid;
- unsigned short vlanid;
- u16 code;
- };
- } ev_info;
+ union smcd_sw_event_info ev_info;
+ ev_info.info = wrk->event.info;
switch (wrk->event.code) {
+ case ISM_EVENT_CODE_SHUTDOWN: /* Peer shut down DMBs */
+ smc_smcd_terminate(wrk->smcd, wrk->event.tok, ev_info.vlan_id);
+ break;
case ISM_EVENT_CODE_TESTLINK: /* Activity timer */
- ev_info.info = wrk->event.info;
if (ev_info.code == ISM_EVENT_REQUEST) {
ev_info.code = ISM_EVENT_RESPONSE;
wrk->smcd->ops->signal_event(wrk->smcd,
}
}
+int smc_ism_signal_shutdown(struct smc_link_group *lgr)
+{
+ int rc;
+ union smcd_sw_event_info ev_info;
+
+ memcpy(ev_info.uid, lgr->id, SMC_LGR_ID_SIZE);
+ ev_info.vlan_id = lgr->vlan_id;
+ ev_info.code = ISM_EVENT_REQUEST;
+ rc = lgr->smcd->ops->signal_event(lgr->smcd, lgr->peer_gid,
+ ISM_EVENT_REQUEST_IR,
+ ISM_EVENT_CODE_SHUTDOWN,
+ ev_info.info);
+ return rc;
+}
+
/* worker for SMC-D events */
static void smc_ism_event_work(struct work_struct *work)
{
switch (wrk->event.type) {
case ISM_EVENT_GID: /* GID event, token is peer GID */
- smc_smcd_terminate(wrk->smcd, wrk->event.tok);
+ smc_smcd_terminate(wrk->smcd, wrk->event.tok, VLAN_VID_MASK);
break;
case ISM_EVENT_DMB:
break;
spin_unlock(&smcd_dev_list.lock);
flush_workqueue(smcd->event_wq);
destroy_workqueue(smcd->event_wq);
- smc_smcd_terminate(smcd, 0);
+ smc_smcd_terminate(smcd, 0, VLAN_VID_MASK);
device_del(&smcd->dev);
}
int smc_ism_unregister_dmb(struct smcd_dev *dev, struct smc_buf_desc *dmb_desc);
int smc_ism_write(struct smcd_dev *dev, const struct smc_ism_position *pos,
void *data, size_t len);
+int smc_ism_signal_shutdown(struct smc_link_group *lgr);
#endif
pend = container_of(wr_pend_priv, struct smc_wr_tx_pend, priv);
if (pend->idx < link->wr_tx_cnt) {
+ u32 idx = pend->idx;
+
/* clear the full struct smc_wr_tx_pend including .priv */
memset(&link->wr_tx_pends[pend->idx], 0,
sizeof(link->wr_tx_pends[pend->idx]));
memset(&link->wr_tx_bufs[pend->idx], 0,
sizeof(link->wr_tx_bufs[pend->idx]));
- test_and_clear_bit(pend->idx, link->wr_tx_mask);
+ test_and_clear_bit(idx, link->wr_tx_mask);
return 1;
}
int kernel_sendmsg(struct socket *sock, struct msghdr *msg,
struct kvec *vec, size_t num, size_t size)
{
- iov_iter_kvec(&msg->msg_iter, WRITE | ITER_KVEC, vec, num, size);
+ iov_iter_kvec(&msg->msg_iter, WRITE, vec, num, size);
return sock_sendmsg(sock, msg);
}
EXPORT_SYMBOL(kernel_sendmsg);
if (!sock->ops->sendmsg_locked)
return sock_no_sendmsg_locked(sk, msg, size);
- iov_iter_kvec(&msg->msg_iter, WRITE | ITER_KVEC, vec, num, size);
+ iov_iter_kvec(&msg->msg_iter, WRITE, vec, num, size);
return sock->ops->sendmsg_locked(sk, msg, msg_data_left(msg));
}
mm_segment_t oldfs = get_fs();
int result;
- iov_iter_kvec(&msg->msg_iter, READ | ITER_KVEC, vec, num, size);
+ iov_iter_kvec(&msg->msg_iter, READ, vec, num, size);
set_fs(KERNEL_DS);
result = sock_recvmsg(sock, msg, flags);
set_fs(oldfs);
struct socket *sock = file->private_data;
if (unlikely(!sock->ops->splice_read))
- return -EINVAL;
+ return generic_file_splice_read(file, ppos, pipe, len, flags);
return sock->ops->splice_read(sock, ppos, pipe, len, flags);
}
{
struct auth_cred *acred = &container_of(cred, struct generic_cred,
gc_base)->acred;
- bool ret;
-
- get_rpccred(cred);
- ret = test_bit(RPC_CRED_KEY_EXPIRE_SOON, &acred->ac_flags);
- put_rpccred(cred);
-
- return ret;
+ return test_bit(RPC_CRED_KEY_EXPIRE_SOON, &acred->ac_flags);
}
static const struct rpc_credops generic_credops = {
return &gss_auth->rpc_auth;
}
+static struct gss_cred *
+gss_dup_cred(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
+{
+ struct gss_cred *new;
+
+ /* Make a copy of the cred so that we can reference count it */
+ new = kzalloc(sizeof(*gss_cred), GFP_NOIO);
+ if (new) {
+ struct auth_cred acred = {
+ .uid = gss_cred->gc_base.cr_uid,
+ };
+ struct gss_cl_ctx *ctx =
+ rcu_dereference_protected(gss_cred->gc_ctx, 1);
+
+ rpcauth_init_cred(&new->gc_base, &acred,
+ &gss_auth->rpc_auth,
+ &gss_nullops);
+ new->gc_base.cr_flags = 1UL << RPCAUTH_CRED_UPTODATE;
+ new->gc_service = gss_cred->gc_service;
+ new->gc_principal = gss_cred->gc_principal;
+ kref_get(&gss_auth->kref);
+ rcu_assign_pointer(new->gc_ctx, ctx);
+ gss_get_ctx(ctx);
+ }
+ return new;
+}
+
/*
- * gss_destroying_context will cause the RPCSEC_GSS to send a NULL RPC call
+ * gss_send_destroy_context will cause the RPCSEC_GSS to send a NULL RPC call
* to the server with the GSS control procedure field set to
* RPC_GSS_PROC_DESTROY. This should normally cause the server to release
* all RPCSEC_GSS state associated with that context.
*/
-static int
-gss_destroying_context(struct rpc_cred *cred)
+static void
+gss_send_destroy_context(struct rpc_cred *cred)
{
struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base);
struct gss_auth *gss_auth = container_of(cred->cr_auth, struct gss_auth, rpc_auth);
struct gss_cl_ctx *ctx = rcu_dereference_protected(gss_cred->gc_ctx, 1);
+ struct gss_cred *new;
struct rpc_task *task;
- if (test_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags) == 0)
- return 0;
+ new = gss_dup_cred(gss_auth, gss_cred);
+ if (new) {
+ ctx->gc_proc = RPC_GSS_PROC_DESTROY;
- ctx->gc_proc = RPC_GSS_PROC_DESTROY;
- cred->cr_ops = &gss_nullops;
+ task = rpc_call_null(gss_auth->client, &new->gc_base,
+ RPC_TASK_ASYNC|RPC_TASK_SOFT);
+ if (!IS_ERR(task))
+ rpc_put_task(task);
- /* Take a reference to ensure the cred will be destroyed either
- * by the RPC call or by the put_rpccred() below */
- get_rpccred(cred);
-
- task = rpc_call_null(gss_auth->client, cred, RPC_TASK_ASYNC|RPC_TASK_SOFT);
- if (!IS_ERR(task))
- rpc_put_task(task);
-
- put_rpccred(cred);
- return 1;
+ put_rpccred(&new->gc_base);
+ }
}
/* gss_destroy_cred (and gss_free_ctx) are used to clean up after failure
gss_destroy_cred(struct rpc_cred *cred)
{
- if (gss_destroying_context(cred))
- return;
+ if (test_and_clear_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags) != 0)
+ gss_send_destroy_context(cred);
gss_destroy_nullcred(cred);
}
for (i=0; i < rqstp->rq_enc_pages_num; i++)
__free_page(rqstp->rq_enc_pages[i]);
kfree(rqstp->rq_enc_pages);
+ rqstp->rq_release_snd_buf = NULL;
}
static int
struct xdr_buf *snd_buf = &rqstp->rq_snd_buf;
int first, last, i;
+ if (rqstp->rq_release_snd_buf)
+ rqstp->rq_release_snd_buf(rqstp);
+
if (snd_buf->page_len == 0) {
rqstp->rq_enc_pages_num = 0;
return 0;
static int
gss_import_v1_context(const void *p, const void *end, struct krb5_ctx *ctx)
{
+ u32 seq_send;
int tmp;
p = simple_get_bytes(p, end, &ctx->initiate, sizeof(ctx->initiate));
p = simple_get_bytes(p, end, &ctx->endtime, sizeof(ctx->endtime));
if (IS_ERR(p))
goto out_err;
- p = simple_get_bytes(p, end, &ctx->seq_send, sizeof(ctx->seq_send));
+ p = simple_get_bytes(p, end, &seq_send, sizeof(seq_send));
if (IS_ERR(p))
goto out_err;
+ atomic_set(&ctx->seq_send, seq_send);
p = simple_get_netobj(p, end, &ctx->mech_used);
if (IS_ERR(p))
goto out_err;
gss_import_v2_context(const void *p, const void *end, struct krb5_ctx *ctx,
gfp_t gfp_mask)
{
+ u64 seq_send64;
int keylen;
p = simple_get_bytes(p, end, &ctx->flags, sizeof(ctx->flags));
p = simple_get_bytes(p, end, &ctx->endtime, sizeof(ctx->endtime));
if (IS_ERR(p))
goto out_err;
- p = simple_get_bytes(p, end, &ctx->seq_send64, sizeof(ctx->seq_send64));
+ p = simple_get_bytes(p, end, &seq_send64, sizeof(seq_send64));
if (IS_ERR(p))
goto out_err;
+ atomic64_set(&ctx->seq_send64, seq_send64);
/* set seq_send for use by "older" enctypes */
- ctx->seq_send = ctx->seq_send64;
- if (ctx->seq_send64 != ctx->seq_send) {
- dprintk("%s: seq_send64 %lx, seq_send %x overflow?\n", __func__,
- (unsigned long)ctx->seq_send64, ctx->seq_send);
+ atomic_set(&ctx->seq_send, seq_send64);
+ if (seq_send64 != atomic_read(&ctx->seq_send)) {
+ dprintk("%s: seq_send64 %llx, seq_send %x overflow?\n", __func__,
+ seq_send64, atomic_read(&ctx->seq_send));
p = ERR_PTR(-EINVAL);
goto out_err;
}
return krb5_hdr;
}
-u32
-gss_seq_send_fetch_and_inc(struct krb5_ctx *ctx)
-{
- u32 old, seq_send = READ_ONCE(ctx->seq_send);
-
- do {
- old = seq_send;
- seq_send = cmpxchg(&ctx->seq_send, old, old + 1);
- } while (old != seq_send);
- return seq_send;
-}
-
-u64
-gss_seq_send64_fetch_and_inc(struct krb5_ctx *ctx)
-{
- u64 old, seq_send = READ_ONCE(ctx->seq_send);
-
- do {
- old = seq_send;
- seq_send = cmpxchg64(&ctx->seq_send64, old, old + 1);
- } while (old != seq_send);
- return seq_send;
-}
-
static u32
gss_get_mic_v1(struct krb5_ctx *ctx, struct xdr_buf *text,
struct xdr_netobj *token)
memcpy(ptr + GSS_KRB5_TOK_HDR_LEN, md5cksum.data, md5cksum.len);
- seq_send = gss_seq_send_fetch_and_inc(ctx);
+ seq_send = atomic_fetch_inc(&ctx->seq_send);
if (krb5_make_seq_num(ctx, ctx->seq, ctx->initiate ? 0 : 0xff,
seq_send, ptr + GSS_KRB5_TOK_HDR_LEN, ptr + 8))
/* Set up the sequence number. Now 64-bits in clear
* text and w/o direction indicator */
- seq_send_be64 = cpu_to_be64(gss_seq_send64_fetch_and_inc(ctx));
+ seq_send_be64 = cpu_to_be64(atomic64_fetch_inc(&ctx->seq_send64));
memcpy(krb5_hdr + 8, (char *) &seq_send_be64, 8);
if (ctx->initiate) {
memcpy(ptr + GSS_KRB5_TOK_HDR_LEN, md5cksum.data, md5cksum.len);
- seq_send = gss_seq_send_fetch_and_inc(kctx);
+ seq_send = atomic_fetch_inc(&kctx->seq_send);
/* XXX would probably be more efficient to compute checksum
* and encrypt at the same time: */
*be16ptr++ = 0;
be64ptr = (__be64 *)be16ptr;
- *be64ptr = cpu_to_be64(gss_seq_send64_fetch_and_inc(kctx));
+ *be64ptr = cpu_to_be64(atomic64_fetch_inc(&kctx->seq_send64));
err = (*kctx->gk5e->encrypt_v2)(kctx, offset, buf, pages);
if (err)
struct rpc_clnt *clnt = task->tk_client;
int status = task->tk_status;
+ /* Check if the task was already transmitted */
+ if (!test_bit(RPC_TASK_NEED_XMIT, &task->tk_runstate)) {
+ xprt_end_transmit(task);
+ task->tk_action = call_transmit_status;
+ return;
+ }
+
dprint_status(task);
trace_rpc_connect_status(task);
task->tk_status = 0;
/* Note: rpc_verify_header() may have freed the RPC slot */
if (task->tk_rqstp == req) {
+ xdr_free_bvec(&req->rq_rcv_buf);
req->rq_reply_bytes_recvd = req->rq_rcv_buf.len = 0;
if (task->tk_client->cl_discrtry)
xprt_conditional_disconnect(req->rq_xprt,
rqstp->rq_xprt_hlen = 0;
clear_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, iov, nr, buflen);
+ iov_iter_kvec(&msg.msg_iter, READ, iov, nr, buflen);
if (base != 0) {
iov_iter_advance(&msg.msg_iter, base);
buflen -= base;
static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
size_t nbytes)
{
- static __be32 *p;
+ __be32 *p;
int space_left;
int frag1bytes, frag2bytes;
WARN_ON_ONCE(xdr->iov);
return;
}
- if (fraglen) {
+ if (fraglen)
xdr->end = head->iov_base + head->iov_len;
- xdr->page_ptr--;
- }
/* (otherwise assume xdr->end is already set) */
+ xdr->page_ptr--;
head->iov_len = len;
buf->len = len;
xdr->p = head->iov_base + head->iov_len;
return;
if (xprt_test_and_set_connecting(xprt))
return;
- xprt->stat.connect_start = jiffies;
- xprt->ops->connect(xprt, task);
+ /* Race breaker */
+ if (!xprt_connected(xprt)) {
+ xprt->stat.connect_start = jiffies;
+ xprt->ops->connect(xprt, task);
+ } else {
+ xprt_clear_connecting(xprt);
+ task->tk_status = 0;
+ rpc_wake_up_queued_task(&xprt->pending, task);
+ }
}
xprt_release_write(xprt, task);
}
req->rq_snd_buf.buflen = 0;
req->rq_rcv_buf.len = 0;
req->rq_rcv_buf.buflen = 0;
+ req->rq_snd_buf.bvec = NULL;
+ req->rq_rcv_buf.bvec = NULL;
req->rq_release_snd_buf = NULL;
xprt_reset_majortimeo(req);
dprintk("RPC: %5u reserved req %p xid %08x\n", task->tk_pid,
{
size_t i,n;
- if (!(buf->flags & XDRBUF_SPARSE_PAGES))
+ if (!want || !(buf->flags & XDRBUF_SPARSE_PAGES))
return want;
- if (want > buf->page_len)
- want = buf->page_len;
n = (buf->page_base + want + PAGE_SIZE - 1) >> PAGE_SHIFT;
for (i = 0; i < n; i++) {
if (buf->pages[i])
continue;
buf->bvec[i].bv_page = buf->pages[i] = alloc_page(gfp);
if (!buf->pages[i]) {
- buf->page_len = (i * PAGE_SIZE) - buf->page_base;
- return buf->page_len;
+ i *= PAGE_SIZE;
+ return i > buf->page_base ? i - buf->page_base : 0;
}
}
return want;
xs_read_kvec(struct socket *sock, struct msghdr *msg, int flags,
struct kvec *kvec, size_t count, size_t seek)
{
- iov_iter_kvec(&msg->msg_iter, READ | ITER_KVEC, kvec, 1, count);
+ iov_iter_kvec(&msg->msg_iter, READ, kvec, 1, count);
return xs_sock_recvmsg(sock, msg, flags, seek);
}
struct bio_vec *bvec, unsigned long nr, size_t count,
size_t seek)
{
- iov_iter_bvec(&msg->msg_iter, READ | ITER_BVEC, bvec, nr, count);
+ iov_iter_bvec(&msg->msg_iter, READ, bvec, nr, count);
return xs_sock_recvmsg(sock, msg, flags, seek);
}
xs_read_discard(struct socket *sock, struct msghdr *msg, int flags,
size_t count)
{
- struct kvec kvec = { 0 };
- return xs_read_kvec(sock, msg, flags | MSG_TRUNC, &kvec, count, 0);
+ iov_iter_discard(&msg->msg_iter, READ, count);
+ return sock_recvmsg(sock, msg, flags);
}
static ssize_t
if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
goto out;
if (ret != want)
- goto eagain;
+ goto out;
seek = 0;
} else {
seek -= buf->head[0].iov_len;
offset += buf->head[0].iov_len;
}
- if (seek < buf->page_len) {
- want = xs_alloc_sparse_pages(buf,
- min_t(size_t, count - offset, buf->page_len),
- GFP_NOWAIT);
+
+ want = xs_alloc_sparse_pages(buf,
+ min_t(size_t, count - offset, buf->page_len),
+ GFP_NOWAIT);
+ if (seek < want) {
ret = xs_read_bvec(sock, msg, flags, buf->bvec,
xdr_buf_pagecount(buf),
want + buf->page_base,
if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
goto out;
if (ret != want)
- goto eagain;
+ goto out;
seek = 0;
} else {
- seek -= buf->page_len;
- offset += buf->page_len;
+ seek -= want;
+ offset += want;
}
+
if (seek < buf->tail[0].iov_len) {
want = min_t(size_t, count - offset, buf->tail[0].iov_len);
ret = xs_read_kvec(sock, msg, flags, &buf->tail[0], want, seek);
if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
goto out;
if (ret != want)
- goto eagain;
+ goto out;
} else
offset += buf->tail[0].iov_len;
ret = -EMSGSIZE;
- msg->msg_flags |= MSG_TRUNC;
out:
*read = offset - seek_init;
return ret;
-eagain:
- ret = -EAGAIN;
- goto out;
sock_err:
offset += seek;
goto out;
if (transport->recv.offset == transport->recv.len) {
if (xs_read_stream_request_done(transport))
msg->msg_flags |= MSG_EOR;
- return transport->recv.copied;
+ return read;
}
switch (ret) {
+ default:
+ break;
+ case -EFAULT:
case -EMSGSIZE:
- return transport->recv.copied;
+ msg->msg_flags |= MSG_TRUNC;
+ return read;
case 0:
return -ESHUTDOWN;
- default:
- if (ret < 0)
- return ret;
}
- return -EAGAIN;
+ return ret < 0 ? ret : read;
}
static size_t
ret = xs_read_stream_request(transport, msg, flags, req);
if (msg->msg_flags & (MSG_EOR|MSG_TRUNC))
- xprt_complete_bc_request(req, ret);
+ xprt_complete_bc_request(req, transport->recv.copied);
return ret;
}
spin_lock(&xprt->queue_lock);
if (msg->msg_flags & (MSG_EOR|MSG_TRUNC))
- xprt_complete_rqst(req->rq_task, ret);
+ xprt_complete_rqst(req->rq_task, transport->recv.copied);
xprt_unpin_rqst(req);
out:
spin_unlock(&xprt->queue_lock);
if (ret <= 0)
goto out_err;
transport->recv.offset = ret;
- if (ret != want) {
- ret = -EAGAIN;
- goto out_err;
- }
+ if (transport->recv.offset != want)
+ return transport->recv.offset;
transport->recv.len = be32_to_cpu(transport->recv.fraghdr) &
RPC_FRAGMENT_SIZE_MASK;
transport->recv.offset -= sizeof(transport->recv.fraghdr);
}
switch (be32_to_cpu(transport->recv.calldir)) {
+ default:
+ msg.msg_flags |= MSG_TRUNC;
+ break;
case RPC_CALL:
ret = xs_read_stream_call(transport, &msg, flags);
break;
goto out_err;
read += ret;
if (transport->recv.offset < transport->recv.len) {
+ if (!(msg.msg_flags & MSG_TRUNC))
+ return read;
+ msg.msg_flags = 0;
ret = xs_read_discard(transport->sock, &msg, flags,
transport->recv.len - transport->recv.offset);
if (ret <= 0)
transport->recv.offset += ret;
read += ret;
if (transport->recv.offset != transport->recv.len)
- return -EAGAIN;
+ return read;
}
if (xs_read_stream_request_done(transport)) {
trace_xs_stream_read_request(transport);
transport->recv.len = 0;
return read;
out_err:
- switch (ret) {
- case 0:
- case -ESHUTDOWN:
- xprt_force_disconnect(&transport->xprt);
- return -ESHUTDOWN;
- }
- return ret;
+ return ret != 0 ? ret : -ESHUTDOWN;
}
static void xs_stream_data_receive(struct sock_xprt *transport)
ssize_t ret = 0;
mutex_lock(&transport->recv_mutex);
+ clear_bit(XPRT_SOCK_DATA_READY, &transport->sock_state);
if (transport->sock == NULL)
goto out;
- clear_bit(XPRT_SOCK_DATA_READY, &transport->sock_state);
for (;;) {
ret = xs_read_stream(transport, MSG_DONTWAIT);
- if (ret <= 0)
+ if (ret < 0)
break;
read += ret;
cond_resched();
int err;
mutex_lock(&transport->recv_mutex);
+ clear_bit(XPRT_SOCK_DATA_READY, &transport->sock_state);
sk = transport->inet;
if (sk == NULL)
goto out;
- clear_bit(XPRT_SOCK_DATA_READY, &transport->sock_state);
for (;;) {
skb = skb_recv_udp(sk, 0, 1, &err);
if (skb == NULL)
/* Apply trial address if we just left trial period */
if (!trial && !self) {
- tipc_net_finalize(net, tn->trial_addr);
+ tipc_sched_net_finalize(net, tn->trial_addr);
+ msg_set_prevnode(buf_msg(d->skb), tn->trial_addr);
msg_set_type(buf_msg(d->skb), DSC_REQ_MSG);
}
goto exit;
}
- /* Trial period over ? */
- if (!time_before(jiffies, tn->addr_trial_end)) {
- /* Did we just leave it ? */
- if (!tipc_own_addr(net))
- tipc_net_finalize(net, tn->trial_addr);
-
- msg_set_type(buf_msg(d->skb), DSC_REQ_MSG);
- msg_set_prevnode(buf_msg(d->skb), tipc_own_addr(net));
+ /* Did we just leave trial period ? */
+ if (!time_before(jiffies, tn->addr_trial_end) && !tipc_own_addr(net)) {
+ mod_timer(&d->timer, jiffies + TIPC_DISC_INIT);
+ spin_unlock_bh(&d->lock);
+ tipc_sched_net_finalize(net, tn->trial_addr);
+ return;
}
/* Adjust timeout interval according to discovery phase */
d->timer_intv = TIPC_DISC_SLOW;
else if (!d->num_nodes && d->timer_intv > TIPC_DISC_FAST)
d->timer_intv = TIPC_DISC_FAST;
+ msg_set_type(buf_msg(d->skb), DSC_REQ_MSG);
+ msg_set_prevnode(buf_msg(d->skb), tn->trial_addr);
}
mod_timer(&d->timer, jiffies + d->timer_intv);
if (in_range(peers_prio, l->priority + 1, TIPC_MAX_LINK_PRI))
l->priority = peers_prio;
- /* ACTIVATE_MSG serves as PEER_RESET if link is already down */
- if (msg_peer_stopping(hdr))
+ /* If peer is going down we want full re-establish cycle */
+ if (msg_peer_stopping(hdr)) {
rc = tipc_link_fsm_evt(l, LINK_FAILURE_EVT);
- else if ((mtyp == RESET_MSG) || !link_is_up(l))
+ break;
+ }
+ /* ACTIVATE_MSG serves as PEER_RESET if link is already down */
+ if (mtyp == RESET_MSG || !link_is_up(l))
rc = tipc_link_fsm_evt(l, LINK_PEER_RESET_EVT);
/* ACTIVATE_MSG takes up link if it was already locally reset */
- if ((mtyp == ACTIVATE_MSG) && (l->state == LINK_ESTABLISHING))
+ if (mtyp == ACTIVATE_MSG && l->state == LINK_ESTABLISHING)
rc = TIPC_LINK_UP_EVT;
l->peer_session = msg_session(hdr);
* - A local spin_lock protecting the queue of subscriber events.
*/
+struct tipc_net_work {
+ struct work_struct work;
+ struct net *net;
+ u32 addr;
+};
+
+static void tipc_net_finalize(struct net *net, u32 addr);
+
int tipc_net_init(struct net *net, u8 *node_id, u32 addr)
{
if (tipc_own_id(net)) {
return 0;
}
-void tipc_net_finalize(struct net *net, u32 addr)
+static void tipc_net_finalize(struct net *net, u32 addr)
{
struct tipc_net *tn = tipc_net(net);
- if (!cmpxchg(&tn->node_addr, 0, addr)) {
- tipc_set_node_addr(net, addr);
- tipc_named_reinit(net);
- tipc_sk_reinit(net);
- tipc_nametbl_publish(net, TIPC_CFG_SRV, addr, addr,
- TIPC_CLUSTER_SCOPE, 0, addr);
- }
+ if (cmpxchg(&tn->node_addr, 0, addr))
+ return;
+ tipc_set_node_addr(net, addr);
+ tipc_named_reinit(net);
+ tipc_sk_reinit(net);
+ tipc_nametbl_publish(net, TIPC_CFG_SRV, addr, addr,
+ TIPC_CLUSTER_SCOPE, 0, addr);
+}
+
+static void tipc_net_finalize_work(struct work_struct *work)
+{
+ struct tipc_net_work *fwork;
+
+ fwork = container_of(work, struct tipc_net_work, work);
+ tipc_net_finalize(fwork->net, fwork->addr);
+ kfree(fwork);
+}
+
+void tipc_sched_net_finalize(struct net *net, u32 addr)
+{
+ struct tipc_net_work *fwork = kzalloc(sizeof(*fwork), GFP_ATOMIC);
+
+ if (!fwork)
+ return;
+ INIT_WORK(&fwork->work, tipc_net_finalize_work);
+ fwork->net = net;
+ fwork->addr = addr;
+ schedule_work(&fwork->work);
}
void tipc_net_stop(struct net *net)
extern const struct nla_policy tipc_nl_net_policy[];
int tipc_net_init(struct net *net, u8 *node_id, u32 addr);
-void tipc_net_finalize(struct net *net, u32 addr);
+void tipc_sched_net_finalize(struct net *net, u32 addr);
void tipc_net_stop(struct net *net);
int tipc_nl_net_dump(struct sk_buff *skb, struct netlink_callback *cb);
int tipc_nl_net_set(struct sk_buff *skb, struct genl_info *info);
/* tipc_node_cleanup - delete nodes that does not
* have active links for NODE_CLEANUP_AFTER time
*/
-static int tipc_node_cleanup(struct tipc_node *peer)
+static bool tipc_node_cleanup(struct tipc_node *peer)
{
struct tipc_net *tn = tipc_net(peer->net);
bool deleted = false;
- spin_lock_bh(&tn->node_list_lock);
+ /* If lock held by tipc_node_stop() the node will be deleted anyway */
+ if (!spin_trylock_bh(&tn->node_list_lock))
+ return false;
+
tipc_node_write_lock(peer);
if (!node_is_up(peer) && time_after(jiffies, peer->delete_at)) {
/**
* tipc_sk_anc_data_recv - optionally capture ancillary data for received message
* @m: descriptor for message info
- * @msg: received message header
+ * @skb: received message buffer
* @tsk: TIPC port associated with message
*
* Note: Ancillary data is not captured if not requested by receiver.
*
* Returns 0 if successful, otherwise errno
*/
-static int tipc_sk_anc_data_recv(struct msghdr *m, struct tipc_msg *msg,
+static int tipc_sk_anc_data_recv(struct msghdr *m, struct sk_buff *skb,
struct tipc_sock *tsk)
{
+ struct tipc_msg *msg;
u32 anc_data[3];
u32 err;
u32 dest_type;
if (likely(m->msg_controllen == 0))
return 0;
+ msg = buf_msg(skb);
/* Optionally capture errored message object(s) */
err = msg ? msg_errcode(msg) : 0;
if (res)
return res;
if (anc_data[1]) {
+ if (skb_linearize(skb))
+ return -ENOMEM;
+ msg = buf_msg(skb);
res = put_cmsg(m, SOL_TIPC, TIPC_RETDATA, anc_data[1],
msg_data(msg));
if (res)
/* Collect msg meta data, including error code and rejected data */
tipc_sk_set_orig_addr(m, skb);
- rc = tipc_sk_anc_data_recv(m, hdr, tsk);
+ rc = tipc_sk_anc_data_recv(m, skb, tsk);
if (unlikely(rc))
goto exit;
+ hdr = buf_msg(skb);
/* Capture data if non-error msg, otherwise just set return value */
if (likely(!err)) {
/* Collect msg meta data, incl. error code and rejected data */
if (!copied) {
tipc_sk_set_orig_addr(m, skb);
- rc = tipc_sk_anc_data_recv(m, hdr, tsk);
+ rc = tipc_sk_anc_data_recv(m, skb, tsk);
if (rc)
break;
+ hdr = buf_msg(skb);
}
/* Copy data if msg ok, otherwise return error/partial data */
iov.iov_base = &s;
iov.iov_len = sizeof(s);
msg.msg_name = NULL;
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &iov, 1, iov.iov_len);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, iov.iov_len);
ret = sock_recvmsg(con->sock, &msg, MSG_DONTWAIT);
if (ret == -EWOULDBLOCK)
return -EWOULDBLOCK;
iov.iov_base = kaddr + offset;
iov.iov_len = size;
- iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, &iov, 1, size);
+ iov_iter_kvec(&msg_iter, WRITE, &iov, 1, size);
rc = tls_push_data(sk, &msg_iter, size,
flags, TLS_RECORD_TYPE_DATA);
kunmap(page);
{
struct iov_iter msg_iter;
- iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, NULL, 0, 0);
+ iov_iter_kvec(&msg_iter, WRITE, NULL, 0, 0);
return tls_push_data(sk, &msg_iter, 0, flags, TLS_RECORD_TYPE_DATA);
}
struct crypto_tfm *tfm = crypto_aead_tfm(ctx->aead_send);
bool async_capable = tfm->__crt_alg->cra_flags & CRYPTO_ALG_ASYNC;
unsigned char record_type = TLS_RECORD_TYPE_DATA;
- bool is_kvec = msg->msg_iter.type & ITER_KVEC;
+ bool is_kvec = iov_iter_is_kvec(&msg->msg_iter);
bool eor = !(msg->msg_flags & MSG_MORE);
size_t try_to_copy, copied = 0;
struct sk_msg *msg_pl, *msg_en;
bool cmsg = false;
int target, err = 0;
long timeo;
- bool is_kvec = msg->msg_iter.type & ITER_KVEC;
+ bool is_kvec = iov_iter_is_kvec(&msg->msg_iter);
int num_async = 0;
flags |= nonblock;
p1 = (u8*)(ht_capa);
p2 = (u8*)(ht_capa_mask);
- for (i = 0; i<sizeof(*ht_capa); i++)
+ for (i = 0; i < sizeof(*ht_capa); i++)
p1[i] &= p2[i];
}
-/* Do a logical ht_capa &= ht_capa_mask. */
+/* Do a logical vht_capa &= vht_capa_mask. */
void cfg80211_oper_and_vht_capa(struct ieee80211_vht_cap *vht_capa,
const struct ieee80211_vht_cap *vht_capa_mask)
{
}
memset(¶ms, 0, sizeof(params));
+ params.beacon_csa.ftm_responder = -1;
if (!info->attrs[NL80211_ATTR_WIPHY_FREQ] ||
!info->attrs[NL80211_ATTR_CH_SWITCH_COUNT])
* All devices must be idle as otherwise if you are actively
* scanning some new beacon hints could be learned and would
* count as new regulatory hints.
+ * Also if there is any other active beaconing interface we
+ * need not issue a disconnect hint and reset any info such
+ * as chan dfs state, etc.
*/
list_for_each_entry(rdev, &cfg80211_rdev_list, list) {
list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) {
wdev_lock(wdev);
- if (wdev->conn || wdev->current_bss)
+ if (wdev->conn || wdev->current_bss ||
+ cfg80211_beaconing_iface_active(wdev))
is_all_idle = false;
wdev_unlock(wdev);
}
cfg80211_oper_and_ht_capa(&connect->ht_capa_mask,
rdev->wiphy.ht_capa_mod_mask);
+ cfg80211_oper_and_vht_capa(&connect->vht_capa_mask,
+ rdev->wiphy.vht_capa_mod_mask);
if (connkeys && connkeys->def >= 0) {
int idx;
ies[pos + ext],
ext == 2))
pos = skip_ie(ies, ielen, pos);
+ else
+ break;
}
} else {
pos = skip_ie(ies, ielen, pos);
}
len = *skb->data;
- needed = 1 + (len >> 4) + (len & 0x0f);
+ needed = 1 + ((len >> 4) + (len & 0x0f) + 1) / 2;
if (!pskb_may_pull(skb, needed)) {
/* packet is too short to hold the addresses it claims
sk_for_each(s, &x25_list)
if ((!strcmp(addr->x25_addr,
x25_sk(s)->source_addr.x25_addr) ||
- !strcmp(addr->x25_addr,
+ !strcmp(x25_sk(s)->source_addr.x25_addr,
null_x25_address.x25_addr)) &&
s->sk_state == TCP_LISTEN) {
/*
goto out;
}
- len = strlen(addr->sx25_addr.x25_addr);
- for (i = 0; i < len; i++) {
- if (!isdigit(addr->sx25_addr.x25_addr[i])) {
- rc = -EINVAL;
- goto out;
+ /* check for the null_x25_address */
+ if (strcmp(addr->sx25_addr.x25_addr, null_x25_address.x25_addr)) {
+
+ len = strlen(addr->sx25_addr.x25_addr);
+ for (i = 0; i < len; i++) {
+ if (!isdigit(addr->sx25_addr.x25_addr[i])) {
+ rc = -EINVAL;
+ goto out;
+ }
}
}
sk->sk_state_change(sk);
break;
}
+ case X25_CALL_REQUEST:
+ /* call collision */
+ x25->causediag.cause = 0x01;
+ x25->causediag.diagnostic = 0x48;
+
+ x25_write_internal(sk, X25_CLEAR_REQUEST);
+ x25_disconnect(sk, EISCONN, 0x01, 0x48);
+ break;
+
case X25_CLEAR_REQUEST:
if (!pskb_may_pull(skb, X25_STD_MIN_LEN + 2))
goto out_clear;
config XFRM_OFFLOAD
bool
- depends on XFRM
config XFRM_ALGO
tristate
struct xfrm_mgr *km;
struct xfrm_policy *pol = NULL;
-#ifdef CONFIG_COMPAT
if (in_compat_syscall())
return -EOPNOTSUPP;
-#endif
if (!optval && !optlen) {
xfrm_sk_policy_insert(sk, XFRM_POLICY_IN, NULL);
const struct xfrm_link *link;
int type, err;
-#ifdef CONFIG_COMPAT
if (in_compat_syscall())
return -EOPNOTSUPP;
-#endif
type = nlh->nlmsg_type;
if (type > XFRM_MSG_MAX)
if (res < 0)
perror("HIDIOCSFEATURE");
else
- printf("ioctl HIDIOCGFEATURE returned: %d\n", res);
+ printf("ioctl HIDIOCSFEATURE returned: %d\n", res);
/* Get Feature */
buf[0] = 0x9; /* Report Number */
cc-disable-warning = $(call try-run,\
$(CC) -Werror $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -W$(strip $(1)) -c -x c /dev/null -o "$$TMP",-Wno-$(strip $(1)))
-# cc-name
-# Expands to either gcc or clang
-cc-name = $(shell $(CC) -v 2>&1 | grep -q "clang version" && echo clang || echo gcc)
-
# cc-version
cc-version = $(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-version.sh $(CC))
-# cc-fullversion
-cc-fullversion = $(shell $(CONFIG_SHELL) \
- $(srctree)/scripts/gcc-version.sh -p $(CC))
-
# cc-ifversion
# Usage: EXTRA_CFLAGS += $(call cc-ifversion, -lt, 0402, -O1)
cc-ifversion = $(shell [ $(cc-version) $(1) $(2) ] && echo $(3) || echo $(4))
objtool_args += --no-unreachable
endif
ifdef CONFIG_RETPOLINE
-ifneq ($(RETPOLINE_CFLAGS),)
objtool_args += --retpoline
endif
-endif
ifdef CONFIG_MODVERSIONS
warning-1 += $(call cc-option, -Wunused-but-set-variable)
warning-1 += $(call cc-option, -Wunused-const-variable)
warning-1 += $(call cc-option, -Wpacked-not-aligned)
+warning-1 += $(call cc-option, -Wstringop-truncation)
warning-1 += $(call cc-disable-warning, missing-field-initializers)
warning-1 += $(call cc-disable-warning, sign-compare)
KBUILD_CFLAGS += $(warning)
else
-ifeq ($(cc-name),clang)
+ifdef CONFIG_CC_IS_CLANG
KBUILD_CFLAGS += $(call cc-disable-warning, initializer-overrides)
KBUILD_CFLAGS += $(call cc-disable-warning, unused-value)
KBUILD_CFLAGS += $(call cc-disable-warning, format)
gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_RANDSTRUCT_PERFORMANCE) \
+= -fplugin-arg-randomize_layout_plugin-performance-mode
+gcc-plugin-$(CONFIG_GCC_PLUGIN_STACKLEAK) += stackleak_plugin.so
+gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STACKLEAK) \
+ += -DSTACKLEAK_PLUGIN
+gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STACKLEAK) \
+ += -fplugin-arg-stackleak_plugin-track-min-size=$(CONFIG_STACKLEAK_TRACK_MIN_SIZE)
+ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+ DISABLE_STACKLEAK_PLUGIN += -fplugin-arg-stackleak_plugin-disable
+endif
+export DISABLE_STACKLEAK_PLUGIN
+
# All the plugin CFLAGS are collected here in case a build target needs to
# filter them out of the KBUILD_CFLAGS.
GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y))
# Try to figure out the source directory prefix so we can remove it from the
# addr2line output. HACK ALERT: This assumes that start_kernel() is in
-# kernel/init.c! This only works for vmlinux. Otherwise it falls back to
+# init/main.c! This only works for vmlinux. Otherwise it falls back to
# printing the absolute path.
find_dir_prefix() {
local objfile=$1
in structures. This reduces the performance hit of RANDSTRUCT
at the cost of weakened randomization.
+config GCC_PLUGIN_STACKLEAK
+ bool "Erase the kernel stack before returning from syscalls"
+ depends on GCC_PLUGINS
+ depends on HAVE_ARCH_STACKLEAK
+ help
+ This option makes the kernel erase the kernel stack before
+ returning from system calls. That reduces the information which
+ kernel stack leak bugs can reveal and blocks some uninitialized
+ stack variable attacks.
+
+ The tradeoff is the performance impact: on a single CPU system kernel
+ compilation sees a 1% slowdown, other systems and workloads may vary
+ and you are advised to test this feature on your expected workload
+ before deploying it.
+
+ This plugin was ported from grsecurity/PaX. More information at:
+ * https://grsecurity.net/
+ * https://pax.grsecurity.net/
+
+config STACKLEAK_TRACK_MIN_SIZE
+ int "Minimum stack frame size of functions tracked by STACKLEAK"
+ default 100
+ range 0 4096
+ depends on GCC_PLUGIN_STACKLEAK
+ help
+ The STACKLEAK gcc plugin instruments the kernel code for tracking
+ the lowest border of the kernel stack (and for some other purposes).
+ It inserts the stackleak_track_stack() call for the functions with
+ a stack frame size greater than or equal to this parameter.
+ If unsure, leave the default value 100.
+
+config STACKLEAK_METRICS
+ bool "Show STACKLEAK metrics in the /proc file system"
+ depends on GCC_PLUGIN_STACKLEAK
+ depends on PROC_FS
+ help
+ If this is set, STACKLEAK metrics for every task are available in
+ the /proc file system. In particular, /proc/<pid>/stack_depth
+ shows the maximum kernel stack consumption for the current and
+ previous syscalls. Although this information is not precise, it
+ can be useful for estimating the STACKLEAK performance impact for
+ your workloads.
+
+config STACKLEAK_RUNTIME_DISABLE
+ bool "Allow runtime disabling of kernel stack erasing"
+ depends on GCC_PLUGIN_STACKLEAK
+ help
+ This option provides 'stack_erasing' sysctl, which can be used in
+ runtime to control kernel stack erasing for kernels built with
+ CONFIG_GCC_PLUGIN_STACKLEAK.
+
endif
--- /dev/null
+/*
+ * Copyright 2011-2017 by the PaX Team <pageexec@freemail.hu>
+ * Modified by Alexander Popov <alex.popov@linux.com>
+ * Licensed under the GPL v2
+ *
+ * Note: the choice of the license means that the compilation process is
+ * NOT 'eligible' as defined by gcc's library exception to the GPL v3,
+ * but for the kernel it doesn't matter since it doesn't link against
+ * any of the gcc libraries
+ *
+ * This gcc plugin is needed for tracking the lowest border of the kernel stack.
+ * It instruments the kernel code inserting stackleak_track_stack() calls:
+ * - after alloca();
+ * - for the functions with a stack frame size greater than or equal
+ * to the "track-min-size" plugin parameter.
+ *
+ * This plugin is ported from grsecurity/PaX. For more information see:
+ * https://grsecurity.net/
+ * https://pax.grsecurity.net/
+ *
+ * Debugging:
+ * - use fprintf() to stderr, debug_generic_expr(), debug_gimple_stmt(),
+ * print_rtl() and print_simple_rtl();
+ * - add "-fdump-tree-all -fdump-rtl-all" to the plugin CFLAGS in
+ * Makefile.gcc-plugins to see the verbose dumps of the gcc passes;
+ * - use gcc -E to understand the preprocessing shenanigans;
+ * - use gcc with enabled CFG/GIMPLE/SSA verification (--enable-checking).
+ */
+
+#include "gcc-common.h"
+
+__visible int plugin_is_GPL_compatible;
+
+static int track_frame_size = -1;
+static const char track_function[] = "stackleak_track_stack";
+
+/*
+ * Mark these global variables (roots) for gcc garbage collector since
+ * they point to the garbage-collected memory.
+ */
+static GTY(()) tree track_function_decl;
+
+static struct plugin_info stackleak_plugin_info = {
+ .version = "201707101337",
+ .help = "track-min-size=nn\ttrack stack for functions with a stack frame size >= nn bytes\n"
+ "disable\t\tdo not activate the plugin\n"
+};
+
+static void stackleak_add_track_stack(gimple_stmt_iterator *gsi, bool after)
+{
+ gimple stmt;
+ gcall *stackleak_track_stack;
+ cgraph_node_ptr node;
+ int frequency;
+ basic_block bb;
+
+ /* Insert call to void stackleak_track_stack(void) */
+ stmt = gimple_build_call(track_function_decl, 0);
+ stackleak_track_stack = as_a_gcall(stmt);
+ if (after) {
+ gsi_insert_after(gsi, stackleak_track_stack,
+ GSI_CONTINUE_LINKING);
+ } else {
+ gsi_insert_before(gsi, stackleak_track_stack, GSI_SAME_STMT);
+ }
+
+ /* Update the cgraph */
+ bb = gimple_bb(stackleak_track_stack);
+ node = cgraph_get_create_node(track_function_decl);
+ gcc_assert(node);
+ frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+ cgraph_create_edge(cgraph_get_node(current_function_decl), node,
+ stackleak_track_stack, bb->count, frequency);
+}
+
+static bool is_alloca(gimple stmt)
+{
+ if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA))
+ return true;
+
+#if BUILDING_GCC_VERSION >= 4007
+ if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA_WITH_ALIGN))
+ return true;
+#endif
+
+ return false;
+}
+
+/*
+ * Work with the GIMPLE representation of the code. Insert the
+ * stackleak_track_stack() call after alloca() and into the beginning
+ * of the function if it is not instrumented.
+ */
+static unsigned int stackleak_instrument_execute(void)
+{
+ basic_block bb, entry_bb;
+ bool prologue_instrumented = false, is_leaf = true;
+ gimple_stmt_iterator gsi;
+
+ /*
+ * ENTRY_BLOCK_PTR is a basic block which represents possible entry
+ * point of a function. This block does not contain any code and
+ * has a CFG edge to its successor.
+ */
+ gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+ entry_bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+
+ /*
+ * Loop through the GIMPLE statements in each of cfun basic blocks.
+ * cfun is a global variable which represents the function that is
+ * currently processed.
+ */
+ FOR_EACH_BB_FN(bb, cfun) {
+ for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
+ gimple stmt;
+
+ stmt = gsi_stmt(gsi);
+
+ /* Leaf function is a function which makes no calls */
+ if (is_gimple_call(stmt))
+ is_leaf = false;
+
+ if (!is_alloca(stmt))
+ continue;
+
+ /* Insert stackleak_track_stack() call after alloca() */
+ stackleak_add_track_stack(&gsi, true);
+ if (bb == entry_bb)
+ prologue_instrumented = true;
+ }
+ }
+
+ if (prologue_instrumented)
+ return 0;
+
+ /*
+ * Special cases to skip the instrumentation.
+ *
+ * Taking the address of static inline functions materializes them,
+ * but we mustn't instrument some of them as the resulting stack
+ * alignment required by the function call ABI will break other
+ * assumptions regarding the expected (but not otherwise enforced)
+ * register clobbering ABI.
+ *
+ * Case in point: native_save_fl on amd64 when optimized for size
+ * clobbers rdx if it were instrumented here.
+ *
+ * TODO: any more special cases?
+ */
+ if (is_leaf &&
+ !TREE_PUBLIC(current_function_decl) &&
+ DECL_DECLARED_INLINE_P(current_function_decl)) {
+ return 0;
+ }
+
+ if (is_leaf &&
+ !strncmp(IDENTIFIER_POINTER(DECL_NAME(current_function_decl)),
+ "_paravirt_", 10)) {
+ return 0;
+ }
+
+ /* Insert stackleak_track_stack() call at the function beginning */
+ bb = entry_bb;
+ if (!single_pred_p(bb)) {
+ /* gcc_assert(bb_loop_depth(bb) ||
+ (bb->flags & BB_IRREDUCIBLE_LOOP)); */
+ split_edge(single_succ_edge(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+ gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+ bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+ }
+ gsi = gsi_after_labels(bb);
+ stackleak_add_track_stack(&gsi, false);
+
+ return 0;
+}
+
+static bool large_stack_frame(void)
+{
+#if BUILDING_GCC_VERSION >= 8000
+ return maybe_ge(get_frame_size(), track_frame_size);
+#else
+ return (get_frame_size() >= track_frame_size);
+#endif
+}
+
+/*
+ * Work with the RTL representation of the code.
+ * Remove the unneeded stackleak_track_stack() calls from the functions
+ * which don't call alloca() and don't have a large enough stack frame size.
+ */
+static unsigned int stackleak_cleanup_execute(void)
+{
+ rtx_insn *insn, *next;
+
+ if (cfun->calls_alloca)
+ return 0;
+
+ if (large_stack_frame())
+ return 0;
+
+ /*
+ * Find stackleak_track_stack() calls. Loop through the chain of insns,
+ * which is an RTL representation of the code for a function.
+ *
+ * The example of a matching insn:
+ * (call_insn 8 4 10 2 (call (mem (symbol_ref ("stackleak_track_stack")
+ * [flags 0x41] <function_decl 0x7f7cd3302a80 stackleak_track_stack>)
+ * [0 stackleak_track_stack S1 A8]) (0)) 675 {*call} (expr_list
+ * (symbol_ref ("stackleak_track_stack") [flags 0x41] <function_decl
+ * 0x7f7cd3302a80 stackleak_track_stack>) (expr_list (0) (nil))) (nil))
+ */
+ for (insn = get_insns(); insn; insn = next) {
+ rtx body;
+
+ next = NEXT_INSN(insn);
+
+ /* Check the expression code of the insn */
+ if (!CALL_P(insn))
+ continue;
+
+ /*
+ * Check the expression code of the insn body, which is an RTL
+ * Expression (RTX) describing the side effect performed by
+ * that insn.
+ */
+ body = PATTERN(insn);
+
+ if (GET_CODE(body) == PARALLEL)
+ body = XVECEXP(body, 0, 0);
+
+ if (GET_CODE(body) != CALL)
+ continue;
+
+ /*
+ * Check the first operand of the call expression. It should
+ * be a mem RTX describing the needed subroutine with a
+ * symbol_ref RTX.
+ */
+ body = XEXP(body, 0);
+ if (GET_CODE(body) != MEM)
+ continue;
+
+ body = XEXP(body, 0);
+ if (GET_CODE(body) != SYMBOL_REF)
+ continue;
+
+ if (SYMBOL_REF_DECL(body) != track_function_decl)
+ continue;
+
+ /* Delete the stackleak_track_stack() call */
+ delete_insn_and_edges(insn);
+#if BUILDING_GCC_VERSION >= 4007 && BUILDING_GCC_VERSION < 8000
+ if (GET_CODE(next) == NOTE &&
+ NOTE_KIND(next) == NOTE_INSN_CALL_ARG_LOCATION) {
+ insn = next;
+ next = NEXT_INSN(insn);
+ delete_insn_and_edges(insn);
+ }
+#endif
+ }
+
+ return 0;
+}
+
+static bool stackleak_gate(void)
+{
+ tree section;
+
+ section = lookup_attribute("section",
+ DECL_ATTRIBUTES(current_function_decl));
+ if (section && TREE_VALUE(section)) {
+ section = TREE_VALUE(TREE_VALUE(section));
+
+ if (!strncmp(TREE_STRING_POINTER(section), ".init.text", 10))
+ return false;
+ if (!strncmp(TREE_STRING_POINTER(section), ".devinit.text", 13))
+ return false;
+ if (!strncmp(TREE_STRING_POINTER(section), ".cpuinit.text", 13))
+ return false;
+ if (!strncmp(TREE_STRING_POINTER(section), ".meminit.text", 13))
+ return false;
+ }
+
+ return track_frame_size >= 0;
+}
+
+/* Build the function declaration for stackleak_track_stack() */
+static void stackleak_start_unit(void *gcc_data __unused,
+ void *user_data __unused)
+{
+ tree fntype;
+
+ /* void stackleak_track_stack(void) */
+ fntype = build_function_type_list(void_type_node, NULL_TREE);
+ track_function_decl = build_fn_decl(track_function, fntype);
+ DECL_ASSEMBLER_NAME(track_function_decl); /* for LTO */
+ TREE_PUBLIC(track_function_decl) = 1;
+ TREE_USED(track_function_decl) = 1;
+ DECL_EXTERNAL(track_function_decl) = 1;
+ DECL_ARTIFICIAL(track_function_decl) = 1;
+ DECL_PRESERVE_P(track_function_decl) = 1;
+}
+
+/*
+ * Pass gate function is a predicate function that gets executed before the
+ * corresponding pass. If the return value is 'true' the pass gets executed,
+ * otherwise, it is skipped.
+ */
+static bool stackleak_instrument_gate(void)
+{
+ return stackleak_gate();
+}
+
+#define PASS_NAME stackleak_instrument
+#define PROPERTIES_REQUIRED PROP_gimple_leh | PROP_cfg
+#define TODO_FLAGS_START TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts
+#define TODO_FLAGS_FINISH TODO_verify_ssa | TODO_verify_stmts | TODO_dump_func \
+ | TODO_update_ssa | TODO_rebuild_cgraph_edges
+#include "gcc-generate-gimple-pass.h"
+
+static bool stackleak_cleanup_gate(void)
+{
+ return stackleak_gate();
+}
+
+#define PASS_NAME stackleak_cleanup
+#define TODO_FLAGS_FINISH TODO_dump_func
+#include "gcc-generate-rtl-pass.h"
+
+/*
+ * Every gcc plugin exports a plugin_init() function that is called right
+ * after the plugin is loaded. This function is responsible for registering
+ * the plugin callbacks and doing other required initialization.
+ */
+__visible int plugin_init(struct plugin_name_args *plugin_info,
+ struct plugin_gcc_version *version)
+{
+ const char * const plugin_name = plugin_info->base_name;
+ const int argc = plugin_info->argc;
+ const struct plugin_argument * const argv = plugin_info->argv;
+ int i = 0;
+
+ /* Extra GGC root tables describing our GTY-ed data */
+ static const struct ggc_root_tab gt_ggc_r_gt_stackleak[] = {
+ {
+ .base = &track_function_decl,
+ .nelt = 1,
+ .stride = sizeof(track_function_decl),
+ .cb = >_ggc_mx_tree_node,
+ .pchw = >_pch_nx_tree_node
+ },
+ LAST_GGC_ROOT_TAB
+ };
+
+ /*
+ * The stackleak_instrument pass should be executed before the
+ * "optimized" pass, which is the control flow graph cleanup that is
+ * performed just before expanding gcc trees to the RTL. In former
+ * versions of the plugin this new pass was inserted before the
+ * "tree_profile" pass, which is currently called "profile".
+ */
+ PASS_INFO(stackleak_instrument, "optimized", 1,
+ PASS_POS_INSERT_BEFORE);
+
+ /*
+ * The stackleak_cleanup pass should be executed before the "*free_cfg"
+ * pass. It's the moment when the stack frame size is already final,
+ * function prologues and epilogues are generated, and the
+ * machine-dependent code transformations are not done.
+ */
+ PASS_INFO(stackleak_cleanup, "*free_cfg", 1, PASS_POS_INSERT_BEFORE);
+
+ if (!plugin_default_version_check(version, &gcc_version)) {
+ error(G_("incompatible gcc/plugin versions"));
+ return 1;
+ }
+
+ /* Parse the plugin arguments */
+ for (i = 0; i < argc; i++) {
+ if (!strcmp(argv[i].key, "disable"))
+ return 0;
+
+ if (!strcmp(argv[i].key, "track-min-size")) {
+ if (!argv[i].value) {
+ error(G_("no value supplied for option '-fplugin-arg-%s-%s'"),
+ plugin_name, argv[i].key);
+ return 1;
+ }
+
+ track_frame_size = atoi(argv[i].value);
+ if (track_frame_size < 0) {
+ error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"),
+ plugin_name, argv[i].key, argv[i].value);
+ return 1;
+ }
+ } else {
+ error(G_("unknown option '-fplugin-arg-%s-%s'"),
+ plugin_name, argv[i].key);
+ return 1;
+ }
+ }
+
+ /* Give the information about the plugin */
+ register_callback(plugin_name, PLUGIN_INFO, NULL,
+ &stackleak_plugin_info);
+
+ /* Register to be called before processing a translation unit */
+ register_callback(plugin_name, PLUGIN_START_UNIT,
+ &stackleak_start_unit, NULL);
+
+ /* Register an extra GCC garbage collector (GGC) root table */
+ register_callback(plugin_name, PLUGIN_REGISTER_GGC_ROOTS, NULL,
+ (void *)>_ggc_r_gt_stackleak);
+
+ /*
+ * Hook into the Pass Manager to register new gcc passes.
+ *
+ * The stack frame size info is available only at the last RTL pass,
+ * when it's too late to insert complex code like a function call.
+ * So we register two gcc passes to instrument every function at first
+ * and remove the unneeded instrumentation later.
+ */
+ register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+ &stackleak_instrument_pass_info);
+ register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+ &stackleak_cleanup_pass_info);
+
+ return 0;
+}
$(simple-targets): $(obj)/conf
$< $(silent) --$@ $(Kconfig)
-PHONY += oldnoconfig silentoldconfig savedefconfig defconfig
-
-# oldnoconfig is an alias of olddefconfig, because people already are dependent
-# on its behavior (sets new symbols to their default value but not 'n') with the
-# counter-intuitive name.
-oldnoconfig: olddefconfig
- @echo " WARNING: \"oldnoconfig\" target will be removed after Linux 4.19"
- @echo " Please use \"olddefconfig\" instead, which is an alias."
-
-# We do not expect manual invokcation of "silentoldcofig" (or "syncconfig").
-silentoldconfig: syncconfig
- @echo " WARNING: \"silentoldconfig\" has been renamed to \"syncconfig\""
- @echo " and is now an internal implementation detail."
- @echo " What you want is probably \"oldconfig\"."
- @echo " \"silentoldconfig\" will be removed after Linux 4.19"
+PHONY += savedefconfig defconfig
savedefconfig: $(obj)/conf
$< $(silent) --$@=defconfig $(Kconfig)
{"randconfig", no_argument, NULL, randconfig},
{"listnewconfig", no_argument, NULL, listnewconfig},
{"olddefconfig", no_argument, NULL, olddefconfig},
- /*
- * oldnoconfig is an alias of olddefconfig, because people already
- * are dependent on its behavior(sets new symbols to their default
- * value but not 'n') with the counter-intuitive name.
- */
- {"oldnoconfig", no_argument, NULL, olddefconfig},
{NULL, 0, NULL, 0}
};
printf(" --syncconfig Similar to oldconfig but generates configuration in\n"
" include/{generated/,config/}\n");
printf(" --olddefconfig Same as oldconfig but sets new symbols to their default value\n");
- printf(" --oldnoconfig An alias of olddefconfig\n");
printf(" --defconfig <file> New config with default defined in <file>\n");
printf(" --savedefconfig <file> Save the minimal current configuration to <file>\n");
printf(" --allnoconfig New config where all options are answered with no\n");
echo " -n use allnoconfig instead of alldefconfig"
echo " -r list redundant entries when merging fragments"
echo " -O dir to put generated output files. Consider setting \$KCONFIG_CONFIG instead."
+ echo
+ echo "Used prefix: '$CONFIG_PREFIX'. You can redefine it with \$CONFIG_ environment variable."
}
RUNMAKE=true
ALLTARGET=alldefconfig
WARNREDUN=false
OUTPUT=.
+CONFIG_PREFIX=${CONFIG_-CONFIG_}
while true; do
case $1 in
fi
MERGE_LIST=$*
-SED_CONFIG_EXP="s/^\(# \)\{0,1\}\(CONFIG_[a-zA-Z0-9_]*\)[= ].*/\2/p"
+SED_CONFIG_EXP1="s/^\(${CONFIG_PREFIX}[a-zA-Z0-9_]*\)=.*/\1/p"
+SED_CONFIG_EXP2="s/^# \(${CONFIG_PREFIX}[a-zA-Z0-9_]*\) is not set$/\1/p"
+
TMP_FILE=$(mktemp ./.tmp.config.XXXXXXXXXX)
echo "Using $INITFILE as base"
echo "The merge file '$MERGE_FILE' does not exist. Exit." >&2
exit 1
fi
- CFG_LIST=$(sed -n "$SED_CONFIG_EXP" $MERGE_FILE)
+ CFG_LIST=$(sed -n -e "$SED_CONFIG_EXP1" -e "$SED_CONFIG_EXP2" $MERGE_FILE)
for CFG in $CFG_LIST ; do
grep -q -w $CFG $TMP_FILE || continue
# Check all specified config values took (might have missed-dependency issues)
-for CFG in $(sed -n "$SED_CONFIG_EXP" $TMP_FILE); do
+for CFG in $(sed -n -e "$SED_CONFIG_EXP1" -e "$SED_CONFIG_EXP2" $TMP_FILE); do
REQUESTED_VAL=$(grep -w -e "$CFG" $TMP_FILE)
ACTUAL_VAL=$(grep -w -e "$CFG" "$KCONFIG_CONFIG")
cp System.map "$tmpdir/boot/System.map-$version"
cp $KCONFIG_CONFIG "$tmpdir/boot/config-$version"
fi
-cp "$($MAKE -s image_name)" "$tmpdir/$installed_image_path"
+cp "$($MAKE -s -f $srctree/Makefile image_name)" "$tmpdir/$installed_image_path"
-if grep -q "^CONFIG_OF=y" $KCONFIG_CONFIG ; then
+if grep -q "^CONFIG_OF_EARLY_FLATTREE=y" $KCONFIG_CONFIG ; then
# Only some architectures with OF support have this target
- if grep -q dtbs_install "${srctree}/arch/$SRCARCH/Makefile"; then
+ if [ -d "${srctree}/arch/$SRCARCH/boot/dts" ]; then
$MAKE KBUILD_SRC= INSTALL_DTBS_PATH="$tmpdir/usr/lib/$packagename" dtbs_install
fi
fi
version=$KERNELRELEASE
if [ -n "$KDEB_PKGVERSION" ]; then
packageversion=$KDEB_PKGVERSION
+ revision=${packageversion##*-}
else
revision=$(cat .version 2>/dev/null||echo 1)
packageversion=$version-$revision
#!$(command -v $MAKE) -f
build:
- \$(MAKE) KERNELRELEASE=${version} ARCH=${ARCH} KBUILD_SRC=
+ \$(MAKE) KERNELRELEASE=${version} ARCH=${ARCH} \
+ KBUILD_BUILD_VERSION=${revision} KBUILD_SRC=
binary-arch:
- \$(MAKE) KERNELRELEASE=${version} ARCH=${ARCH} KBUILD_SRC= intdeb-pkg
+ \$(MAKE) KERNELRELEASE=${version} ARCH=${ARCH} \
+ KBUILD_BUILD_VERSION=${revision} KBUILD_SRC= intdeb-pkg
clean:
rm -rf debian/*tmp debian/files
# how we were called determines which rpms we build and how we build them
if [ "$1" = prebuilt ]; then
S=DEL
+ MAKE="$MAKE -f $srctree/Makefile"
else
S=
fi
$S %setup -q
$S
$S %build
-$S make %{?_smp_mflags} KBUILD_BUILD_VERSION=%{release}
+$S $MAKE %{?_smp_mflags} KBUILD_BUILD_VERSION=%{release}
$S
%install
mkdir -p %{buildroot}/boot
%ifarch ia64
mkdir -p %{buildroot}/boot/efi
- cp \$(make image_name) %{buildroot}/boot/efi/vmlinuz-$KERNELRELEASE
+ cp \$($MAKE image_name) %{buildroot}/boot/efi/vmlinuz-$KERNELRELEASE
ln -s efi/vmlinuz-$KERNELRELEASE %{buildroot}/boot/
%else
- cp \$(make image_name) %{buildroot}/boot/vmlinuz-$KERNELRELEASE
+ cp \$($MAKE image_name) %{buildroot}/boot/vmlinuz-$KERNELRELEASE
%endif
-$M make %{?_smp_mflags} INSTALL_MOD_PATH=%{buildroot} KBUILD_SRC= modules_install
- make %{?_smp_mflags} INSTALL_HDR_PATH=%{buildroot}/usr KBUILD_SRC= headers_install
+$M $MAKE %{?_smp_mflags} INSTALL_MOD_PATH=%{buildroot} modules_install
+ $MAKE %{?_smp_mflags} INSTALL_HDR_PATH=%{buildroot}/usr headers_install
cp System.map %{buildroot}/boot/System.map-$KERNELRELEASE
cp .config %{buildroot}/boot/config-$KERNELRELEASE
bzip2 -9 --keep vmlinux
fi
# Check for uncommitted changes
- if git status -uno --porcelain | grep -qv '^.. scripts/package'; then
+ if git diff-index --name-only HEAD | grep -qv "^scripts/package"; then
printf '%s' -dirty
fi
self.curline = 0
try:
for line in fd:
- line = line.decode(locale.getpreferredencoding(False), errors='ignore')
self.curline += 1
if self.curline > maxlines:
break
* When we have processed a group that starts off with a known-false
* #if/#elif sequence (which has therefore been deleted) followed by a
* #elif that we don't understand and therefore must keep, we edit the
- * latter into a #if to keep the nesting correct. We use strncpy() to
+ * latter into a #if to keep the nesting correct. We use memcpy() to
* overwrite the 4 byte token "elif" with "if " without a '\0' byte.
*
* When we find a true #elif in a group, the following block will
static void Itrue (void) { Ftrue(); ignoreon(); }
static void Ifalse(void) { Ffalse(); ignoreon(); }
/* modify this line */
-static void Mpass (void) { strncpy(keyword, "if ", 4); Pelif(); }
+static void Mpass (void) { memcpy(keyword, "if ", 4); Pelif(); }
static void Mtrue (void) { keywordedit("else"); state(IS_TRUE_MIDDLE); }
static void Melif (void) { keywordedit("endif"); state(IS_FALSE_TRAILER); }
static void Melse (void) { keywordedit("endif"); state(IS_FALSE_ELSE); }
if (error)
return error;
- parent = aa_get_ns(dir->i_private);
+ parent = aa_get_ns(dir->i_private);
/* rmdir calls the generic securityfs functions to remove files
* from the apparmor dir. It is up to the apparmor ns locking
* to avoid races.
/* update caching of label on file_ctx */
spin_lock(&fctx->lock);
old = rcu_dereference_protected(fctx->label,
- spin_is_locked(&fctx->lock));
+ lockdep_is_held(&fctx->lock));
l = aa_label_merge(old, label, GFP_ATOMIC);
if (l) {
if (l != old) {
{
struct aa_label *label = aa_current_raw_label();
+ might_sleep();
+
if (label_is_stale(label)) {
label = aa_get_newest_label(label);
if (aa_replace_current_label(label) == 0)
__e; \
})
+struct aa_secmark {
+ u8 audit;
+ u8 deny;
+ u32 secid;
+ char *label;
+};
+
extern struct aa_sfs_entry aa_sfs_entry_network[];
void audit_net_cb(struct audit_buffer *ab, void *va);
int aa_sock_file_perm(struct aa_label *label, const char *op, u32 request,
struct socket *sock);
+int apparmor_secmark_check(struct aa_label *label, char *op, u32 request,
+ u32 secid, struct sock *sk);
+
#endif /* __AA_NET_H */
struct aa_rlimit rlimits;
+ int secmark_count;
+ struct aa_secmark *secmark;
+
struct aa_loaddata *rawdata;
unsigned char *hash;
char *dirname;
/* secid value that will not be allocated */
#define AA_SECID_INVALID 0
+/* secid value that matches any other secid */
+#define AA_SECID_WILDCARD 1
+
struct aa_label *aa_secid_to_label(u32 secid);
int apparmor_secid_to_secctx(u32 secid, char **secdata, u32 *seclen);
int apparmor_secctx_to_secid(const char *secdata, u32 seclen, u32 *secid);
const char *end = fqname + n;
const char *name = skipn_spaces(fqname, n);
- if (!name)
- return NULL;
*ns_name = NULL;
*ns_len = 0;
+
+ if (!name)
+ return NULL;
+
if (name[0] == ':') {
char *split = strnchr(&name[1], end - &name[1], ':');
*ns_name = skipn_spaces(&name[1], end - &name[1]);
#include <linux/sysctl.h>
#include <linux/audit.h>
#include <linux/user_namespace.h>
+#include <linux/netfilter_ipv4.h>
+#include <linux/netfilter_ipv6.h>
#include <net/sock.h>
#include "include/apparmor.h"
struct aa_label *tracer, *tracee;
int error;
- tracer = begin_current_label_crit_section();
+ tracer = __begin_current_label_crit_section();
tracee = aa_get_task_label(child);
error = aa_may_ptrace(tracer, tracee,
(mode & PTRACE_MODE_READ) ? AA_PTRACE_READ
: AA_PTRACE_TRACE);
aa_put_label(tracee);
- end_current_label_crit_section(tracer);
+ __end_current_label_crit_section(tracer);
return error;
}
struct aa_label *tracer, *tracee;
int error;
- tracee = begin_current_label_crit_section();
+ tracee = __begin_current_label_crit_section();
tracer = aa_get_task_label(parent);
error = aa_may_ptrace(tracer, tracee, AA_PTRACE_TRACE);
aa_put_label(tracer);
- end_current_label_crit_section(tracee);
+ __end_current_label_crit_section(tracee);
return error;
}
return aa_sock_perm(OP_SHUTDOWN, AA_MAY_SHUTDOWN, sock);
}
+#ifdef CONFIG_NETWORK_SECMARK
/**
* apparmor_socket_sock_recv_skb - check perms before associating skb to sk
*
*/
static int apparmor_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb)
{
- return 0;
+ struct aa_sk_ctx *ctx = SK_CTX(sk);
+
+ if (!skb->secmark)
+ return 0;
+
+ return apparmor_secmark_check(ctx->label, OP_RECVMSG, AA_MAY_RECEIVE,
+ skb->secmark, sk);
}
+#endif
static struct aa_label *sk_peer_label(struct sock *sk)
ctx->label = aa_get_current_label();
}
+#ifdef CONFIG_NETWORK_SECMARK
+static int apparmor_inet_conn_request(struct sock *sk, struct sk_buff *skb,
+ struct request_sock *req)
+{
+ struct aa_sk_ctx *ctx = SK_CTX(sk);
+
+ if (!skb->secmark)
+ return 0;
+
+ return apparmor_secmark_check(ctx->label, OP_CONNECT, AA_MAY_CONNECT,
+ skb->secmark, sk);
+}
+#endif
+
static struct security_hook_list apparmor_hooks[] __lsm_ro_after_init = {
LSM_HOOK_INIT(ptrace_access_check, apparmor_ptrace_access_check),
LSM_HOOK_INIT(ptrace_traceme, apparmor_ptrace_traceme),
LSM_HOOK_INIT(socket_getsockopt, apparmor_socket_getsockopt),
LSM_HOOK_INIT(socket_setsockopt, apparmor_socket_setsockopt),
LSM_HOOK_INIT(socket_shutdown, apparmor_socket_shutdown),
+#ifdef CONFIG_NETWORK_SECMARK
LSM_HOOK_INIT(socket_sock_rcv_skb, apparmor_socket_sock_rcv_skb),
+#endif
LSM_HOOK_INIT(socket_getpeersec_stream,
apparmor_socket_getpeersec_stream),
LSM_HOOK_INIT(socket_getpeersec_dgram,
apparmor_socket_getpeersec_dgram),
LSM_HOOK_INIT(sock_graft, apparmor_sock_graft),
+#ifdef CONFIG_NETWORK_SECMARK
+ LSM_HOOK_INIT(inet_conn_request, apparmor_inet_conn_request),
+#endif
LSM_HOOK_INIT(cred_alloc_blank, apparmor_cred_alloc_blank),
LSM_HOOK_INIT(cred_free, apparmor_cred_free),
}
#endif /* CONFIG_SYSCTL */
+#if defined(CONFIG_NETFILTER) && defined(CONFIG_NETWORK_SECMARK)
+static unsigned int apparmor_ip_postroute(void *priv,
+ struct sk_buff *skb,
+ const struct nf_hook_state *state)
+{
+ struct aa_sk_ctx *ctx;
+ struct sock *sk;
+
+ if (!skb->secmark)
+ return NF_ACCEPT;
+
+ sk = skb_to_full_sk(skb);
+ if (sk == NULL)
+ return NF_ACCEPT;
+
+ ctx = SK_CTX(sk);
+ if (!apparmor_secmark_check(ctx->label, OP_SENDMSG, AA_MAY_SEND,
+ skb->secmark, sk))
+ return NF_ACCEPT;
+
+ return NF_DROP_ERR(-ECONNREFUSED);
+
+}
+
+static unsigned int apparmor_ipv4_postroute(void *priv,
+ struct sk_buff *skb,
+ const struct nf_hook_state *state)
+{
+ return apparmor_ip_postroute(priv, skb, state);
+}
+
+static unsigned int apparmor_ipv6_postroute(void *priv,
+ struct sk_buff *skb,
+ const struct nf_hook_state *state)
+{
+ return apparmor_ip_postroute(priv, skb, state);
+}
+
+static const struct nf_hook_ops apparmor_nf_ops[] = {
+ {
+ .hook = apparmor_ipv4_postroute,
+ .pf = NFPROTO_IPV4,
+ .hooknum = NF_INET_POST_ROUTING,
+ .priority = NF_IP_PRI_SELINUX_FIRST,
+ },
+#if IS_ENABLED(CONFIG_IPV6)
+ {
+ .hook = apparmor_ipv6_postroute,
+ .pf = NFPROTO_IPV6,
+ .hooknum = NF_INET_POST_ROUTING,
+ .priority = NF_IP6_PRI_SELINUX_FIRST,
+ },
+#endif
+};
+
+static int __net_init apparmor_nf_register(struct net *net)
+{
+ int ret;
+
+ ret = nf_register_net_hooks(net, apparmor_nf_ops,
+ ARRAY_SIZE(apparmor_nf_ops));
+ return ret;
+}
+
+static void __net_exit apparmor_nf_unregister(struct net *net)
+{
+ nf_unregister_net_hooks(net, apparmor_nf_ops,
+ ARRAY_SIZE(apparmor_nf_ops));
+}
+
+static struct pernet_operations apparmor_net_ops = {
+ .init = apparmor_nf_register,
+ .exit = apparmor_nf_unregister,
+};
+
+static int __init apparmor_nf_ip_init(void)
+{
+ int err;
+
+ if (!apparmor_enabled)
+ return 0;
+
+ err = register_pernet_subsys(&apparmor_net_ops);
+ if (err)
+ panic("Apparmor: register_pernet_subsys: error %d\n", err);
+
+ return 0;
+}
+__initcall(apparmor_nf_ip_init);
+#endif
+
static int __init apparmor_init(void)
{
int error;
#include "include/label.h"
#include "include/net.h"
#include "include/policy.h"
+#include "include/secid.h"
#include "net_names.h"
static int aa_label_sk_perm(struct aa_label *label, const char *op, u32 request,
struct sock *sk)
{
- struct aa_profile *profile;
- DEFINE_AUDIT_SK(sa, op, sk);
+ int error = 0;
AA_BUG(!label);
AA_BUG(!sk);
- if (unconfined(label))
- return 0;
+ if (!unconfined(label)) {
+ struct aa_profile *profile;
+ DEFINE_AUDIT_SK(sa, op, sk);
- return fn_for_each_confined(label, profile,
- aa_profile_af_sk_perm(profile, &sa, request, sk));
+ error = fn_for_each_confined(label, profile,
+ aa_profile_af_sk_perm(profile, &sa, request, sk));
+ }
+
+ return error;
}
int aa_sk_perm(const char *op, u32 request, struct sock *sk)
return aa_label_sk_perm(label, op, request, sock->sk);
}
+
+#ifdef CONFIG_NETWORK_SECMARK
+static int apparmor_secmark_init(struct aa_secmark *secmark)
+{
+ struct aa_label *label;
+
+ if (secmark->label[0] == '*') {
+ secmark->secid = AA_SECID_WILDCARD;
+ return 0;
+ }
+
+ label = aa_label_strn_parse(&root_ns->unconfined->label,
+ secmark->label, strlen(secmark->label),
+ GFP_ATOMIC, false, false);
+
+ if (IS_ERR(label))
+ return PTR_ERR(label);
+
+ secmark->secid = label->secid;
+
+ return 0;
+}
+
+static int aa_secmark_perm(struct aa_profile *profile, u32 request, u32 secid,
+ struct common_audit_data *sa, struct sock *sk)
+{
+ int i, ret;
+ struct aa_perms perms = { };
+
+ if (profile->secmark_count == 0)
+ return 0;
+
+ for (i = 0; i < profile->secmark_count; i++) {
+ if (!profile->secmark[i].secid) {
+ ret = apparmor_secmark_init(&profile->secmark[i]);
+ if (ret)
+ return ret;
+ }
+
+ if (profile->secmark[i].secid == secid ||
+ profile->secmark[i].secid == AA_SECID_WILDCARD) {
+ if (profile->secmark[i].deny)
+ perms.deny = ALL_PERMS_MASK;
+ else
+ perms.allow = ALL_PERMS_MASK;
+
+ if (profile->secmark[i].audit)
+ perms.audit = ALL_PERMS_MASK;
+ }
+ }
+
+ aa_apply_modes_to_perms(profile, &perms);
+
+ return aa_check_perms(profile, &perms, request, sa, audit_net_cb);
+}
+
+int apparmor_secmark_check(struct aa_label *label, char *op, u32 request,
+ u32 secid, struct sock *sk)
+{
+ struct aa_profile *profile;
+ DEFINE_AUDIT_SK(sa, op, sk);
+
+ return fn_for_each_confined(label, profile,
+ aa_secmark_perm(profile, request, secid,
+ &sa, sk));
+}
+#endif
for (i = 0; i < profile->xattr_count; i++)
kzfree(profile->xattrs[i]);
kzfree(profile->xattrs);
+ for (i = 0; i < profile->secmark_count; i++)
+ kzfree(profile->secmark[i].label);
+ kzfree(profile->secmark);
kzfree(profile->dirname);
aa_put_dfa(profile->xmatch);
aa_put_dfa(profile->policy.dfa);
return 0;
}
+static bool unpack_u8(struct aa_ext *e, u8 *data, const char *name)
+{
+ if (unpack_nameX(e, AA_U8, name)) {
+ if (!inbounds(e, sizeof(u8)))
+ return 0;
+ if (data)
+ *data = get_unaligned((u8 *)e->pos);
+ e->pos += sizeof(u8);
+ return 1;
+ }
+ return 0;
+}
+
static bool unpack_u32(struct aa_ext *e, u32 *data, const char *name)
{
if (unpack_nameX(e, AA_U32, name)) {
return 0;
}
+static bool unpack_secmark(struct aa_ext *e, struct aa_profile *profile)
+{
+ void *pos = e->pos;
+ int i, size;
+
+ if (unpack_nameX(e, AA_STRUCT, "secmark")) {
+ size = unpack_array(e, NULL);
+
+ profile->secmark = kcalloc(size, sizeof(struct aa_secmark),
+ GFP_KERNEL);
+ if (!profile->secmark)
+ goto fail;
+
+ profile->secmark_count = size;
+
+ for (i = 0; i < size; i++) {
+ if (!unpack_u8(e, &profile->secmark[i].audit, NULL))
+ goto fail;
+ if (!unpack_u8(e, &profile->secmark[i].deny, NULL))
+ goto fail;
+ if (!unpack_strdup(e, &profile->secmark[i].label, NULL))
+ goto fail;
+ }
+ if (!unpack_nameX(e, AA_ARRAYEND, NULL))
+ goto fail;
+ if (!unpack_nameX(e, AA_STRUCTEND, NULL))
+ goto fail;
+ }
+
+ return 1;
+
+fail:
+ if (profile->secmark) {
+ for (i = 0; i < size; i++)
+ kfree(profile->secmark[i].label);
+ kfree(profile->secmark);
+ profile->secmark_count = 0;
+ }
+
+ e->pos = pos;
+ return 0;
+}
+
static bool unpack_rlimits(struct aa_ext *e, struct aa_profile *profile)
{
void *pos = e->pos;
goto fail;
}
+ if (!unpack_secmark(e, profile)) {
+ info = "failed to unpack profile secmark rules";
+ goto fail;
+ }
+
if (unpack_nameX(e, AA_STRUCT, "policydb")) {
/* generic policy dfa - optional and may be NULL */
info = "failed to unpack policydb";
* secids - do not pin labels with a refcount. They rely on the label
* properly updating/freeing them
*/
-
-#define AA_FIRST_SECID 1
+#define AA_FIRST_SECID 2
static DEFINE_IDR(aa_secids);
static DEFINE_SPINLOCK(secid_lock);
pks.pkey_algo = "rsa";
pks.hash_algo = hash_algo_name[hdr->hash_algo];
+ pks.encoding = "pkcs1";
pks.digest = (u8 *)data;
pks.digest_size = datalen;
pks.s = hdr->sig;
obj-$(CONFIG_SYSCTL) += sysctl.o
obj-$(CONFIG_PERSISTENT_KEYRINGS) += persistent.o
obj-$(CONFIG_KEY_DH_OPERATIONS) += dh.o
+obj-$(CONFIG_ASYMMETRIC_KEY_TYPE) += keyctl_pkey.o
#
# Key types
return keyctl_restrict_keyring(arg2, compat_ptr(arg3),
compat_ptr(arg4));
+ case KEYCTL_PKEY_QUERY:
+ if (arg3 != 0)
+ return -EINVAL;
+ return keyctl_pkey_query(arg2,
+ compat_ptr(arg4),
+ compat_ptr(arg5));
+
+ case KEYCTL_PKEY_ENCRYPT:
+ case KEYCTL_PKEY_DECRYPT:
+ case KEYCTL_PKEY_SIGN:
+ return keyctl_pkey_e_d_s(option,
+ compat_ptr(arg2), compat_ptr(arg3),
+ compat_ptr(arg4), compat_ptr(arg5));
+
+ case KEYCTL_PKEY_VERIFY:
+ return keyctl_pkey_verify(compat_ptr(arg2), compat_ptr(arg3),
+ compat_ptr(arg4), compat_ptr(arg5));
+
default:
return -EOPNOTSUPP;
}
#endif
#endif
+#ifdef CONFIG_ASYMMETRIC_KEY_TYPE
+extern long keyctl_pkey_query(key_serial_t,
+ const char __user *,
+ struct keyctl_pkey_query __user *);
+
+extern long keyctl_pkey_verify(const struct keyctl_pkey_params __user *,
+ const char __user *,
+ const void __user *, const void __user *);
+
+extern long keyctl_pkey_e_d_s(int,
+ const struct keyctl_pkey_params __user *,
+ const char __user *,
+ const void __user *, void __user *);
+#else
+static inline long keyctl_pkey_query(key_serial_t id,
+ const char __user *_info,
+ struct keyctl_pkey_query __user *_res)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline long keyctl_pkey_verify(const struct keyctl_pkey_params __user *params,
+ const char __user *_info,
+ const void __user *_in,
+ const void __user *_in2)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline long keyctl_pkey_e_d_s(int op,
+ const struct keyctl_pkey_params __user *params,
+ const char __user *_info,
+ const void __user *_in,
+ void __user *_out)
+{
+ return -EOPNOTSUPP;
+}
+#endif
+
/*
* Debugging key validation
*/
(const char __user *) arg3,
(const char __user *) arg4);
+ case KEYCTL_PKEY_QUERY:
+ if (arg3 != 0)
+ return -EINVAL;
+ return keyctl_pkey_query((key_serial_t)arg2,
+ (const char __user *)arg4,
+ (struct keyctl_pkey_query *)arg5);
+
+ case KEYCTL_PKEY_ENCRYPT:
+ case KEYCTL_PKEY_DECRYPT:
+ case KEYCTL_PKEY_SIGN:
+ return keyctl_pkey_e_d_s(
+ option,
+ (const struct keyctl_pkey_params __user *)arg2,
+ (const char __user *)arg3,
+ (const void __user *)arg4,
+ (void __user *)arg5);
+
+ case KEYCTL_PKEY_VERIFY:
+ return keyctl_pkey_verify(
+ (const struct keyctl_pkey_params __user *)arg2,
+ (const char __user *)arg3,
+ (const void __user *)arg4,
+ (const void __user *)arg5);
+
default:
return -EOPNOTSUPP;
}
--- /dev/null
+/* Public-key operation keyctls
+ *
+ * Copyright (C) 2016 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#include <linux/slab.h>
+#include <linux/err.h>
+#include <linux/key.h>
+#include <linux/keyctl.h>
+#include <linux/parser.h>
+#include <linux/uaccess.h>
+#include <keys/user-type.h>
+#include "internal.h"
+
+static void keyctl_pkey_params_free(struct kernel_pkey_params *params)
+{
+ kfree(params->info);
+ key_put(params->key);
+}
+
+enum {
+ Opt_err = -1,
+ Opt_enc, /* "enc=<encoding>" eg. "enc=oaep" */
+ Opt_hash, /* "hash=<digest-name>" eg. "hash=sha1" */
+};
+
+static const match_table_t param_keys = {
+ { Opt_enc, "enc=%s" },
+ { Opt_hash, "hash=%s" },
+ { Opt_err, NULL }
+};
+
+/*
+ * Parse the information string which consists of key=val pairs.
+ */
+static int keyctl_pkey_params_parse(struct kernel_pkey_params *params)
+{
+ unsigned long token_mask = 0;
+ substring_t args[MAX_OPT_ARGS];
+ char *c = params->info, *p, *q;
+ int token;
+
+ while ((p = strsep(&c, " \t"))) {
+ if (*p == '\0' || *p == ' ' || *p == '\t')
+ continue;
+ token = match_token(p, param_keys, args);
+ if (__test_and_set_bit(token, &token_mask))
+ return -EINVAL;
+ q = args[0].from;
+ if (!q[0])
+ return -EINVAL;
+
+ switch (token) {
+ case Opt_enc:
+ params->encoding = q;
+ break;
+
+ case Opt_hash:
+ params->hash_algo = q;
+ break;
+
+ default:
+ return -EINVAL;
+ }
+ }
+
+ return 0;
+}
+
+/*
+ * Interpret parameters. Callers must always call the free function
+ * on params, even if an error is returned.
+ */
+static int keyctl_pkey_params_get(key_serial_t id,
+ const char __user *_info,
+ struct kernel_pkey_params *params)
+{
+ key_ref_t key_ref;
+ void *p;
+ int ret;
+
+ memset(params, 0, sizeof(*params));
+ params->encoding = "raw";
+
+ p = strndup_user(_info, PAGE_SIZE);
+ if (IS_ERR(p))
+ return PTR_ERR(p);
+ params->info = p;
+
+ ret = keyctl_pkey_params_parse(params);
+ if (ret < 0)
+ return ret;
+
+ key_ref = lookup_user_key(id, 0, KEY_NEED_SEARCH);
+ if (IS_ERR(key_ref))
+ return PTR_ERR(key_ref);
+ params->key = key_ref_to_ptr(key_ref);
+
+ if (!params->key->type->asym_query)
+ return -EOPNOTSUPP;
+
+ return 0;
+}
+
+/*
+ * Get parameters from userspace. Callers must always call the free function
+ * on params, even if an error is returned.
+ */
+static int keyctl_pkey_params_get_2(const struct keyctl_pkey_params __user *_params,
+ const char __user *_info,
+ int op,
+ struct kernel_pkey_params *params)
+{
+ struct keyctl_pkey_params uparams;
+ struct kernel_pkey_query info;
+ int ret;
+
+ memset(params, 0, sizeof(*params));
+ params->encoding = "raw";
+
+ if (copy_from_user(&uparams, _params, sizeof(uparams)) != 0)
+ return -EFAULT;
+
+ ret = keyctl_pkey_params_get(uparams.key_id, _info, params);
+ if (ret < 0)
+ return ret;
+
+ ret = params->key->type->asym_query(params, &info);
+ if (ret < 0)
+ return ret;
+
+ switch (op) {
+ case KEYCTL_PKEY_ENCRYPT:
+ case KEYCTL_PKEY_DECRYPT:
+ if (uparams.in_len > info.max_enc_size ||
+ uparams.out_len > info.max_dec_size)
+ return -EINVAL;
+ break;
+ case KEYCTL_PKEY_SIGN:
+ case KEYCTL_PKEY_VERIFY:
+ if (uparams.in_len > info.max_sig_size ||
+ uparams.out_len > info.max_data_size)
+ return -EINVAL;
+ break;
+ default:
+ BUG();
+ }
+
+ params->in_len = uparams.in_len;
+ params->out_len = uparams.out_len;
+ return 0;
+}
+
+/*
+ * Query information about an asymmetric key.
+ */
+long keyctl_pkey_query(key_serial_t id,
+ const char __user *_info,
+ struct keyctl_pkey_query __user *_res)
+{
+ struct kernel_pkey_params params;
+ struct kernel_pkey_query res;
+ long ret;
+
+ memset(¶ms, 0, sizeof(params));
+
+ ret = keyctl_pkey_params_get(id, _info, ¶ms);
+ if (ret < 0)
+ goto error;
+
+ ret = params.key->type->asym_query(¶ms, &res);
+ if (ret < 0)
+ goto error;
+
+ ret = -EFAULT;
+ if (copy_to_user(_res, &res, sizeof(res)) == 0 &&
+ clear_user(_res->__spare, sizeof(_res->__spare)) == 0)
+ ret = 0;
+
+error:
+ keyctl_pkey_params_free(¶ms);
+ return ret;
+}
+
+/*
+ * Encrypt/decrypt/sign
+ *
+ * Encrypt data, decrypt data or sign data using a public key.
+ *
+ * _info is a string of supplementary information in key=val format. For
+ * instance, it might contain:
+ *
+ * "enc=pkcs1 hash=sha256"
+ *
+ * where enc= specifies the encoding and hash= selects the OID to go in that
+ * particular encoding if required. If enc= isn't supplied, it's assumed that
+ * the caller is supplying raw values.
+ *
+ * If successful, the amount of data written into the output buffer is
+ * returned.
+ */
+long keyctl_pkey_e_d_s(int op,
+ const struct keyctl_pkey_params __user *_params,
+ const char __user *_info,
+ const void __user *_in,
+ void __user *_out)
+{
+ struct kernel_pkey_params params;
+ void *in, *out;
+ long ret;
+
+ ret = keyctl_pkey_params_get_2(_params, _info, op, ¶ms);
+ if (ret < 0)
+ goto error_params;
+
+ ret = -EOPNOTSUPP;
+ if (!params.key->type->asym_eds_op)
+ goto error_params;
+
+ switch (op) {
+ case KEYCTL_PKEY_ENCRYPT:
+ params.op = kernel_pkey_encrypt;
+ break;
+ case KEYCTL_PKEY_DECRYPT:
+ params.op = kernel_pkey_decrypt;
+ break;
+ case KEYCTL_PKEY_SIGN:
+ params.op = kernel_pkey_sign;
+ break;
+ default:
+ BUG();
+ }
+
+ in = memdup_user(_in, params.in_len);
+ if (IS_ERR(in)) {
+ ret = PTR_ERR(in);
+ goto error_params;
+ }
+
+ ret = -ENOMEM;
+ out = kmalloc(params.out_len, GFP_KERNEL);
+ if (!out)
+ goto error_in;
+
+ ret = params.key->type->asym_eds_op(¶ms, in, out);
+ if (ret < 0)
+ goto error_out;
+
+ if (copy_to_user(_out, out, ret) != 0)
+ ret = -EFAULT;
+
+error_out:
+ kfree(out);
+error_in:
+ kfree(in);
+error_params:
+ keyctl_pkey_params_free(¶ms);
+ return ret;
+}
+
+/*
+ * Verify a signature.
+ *
+ * Verify a public key signature using the given key, or if not given, search
+ * for a matching key.
+ *
+ * _info is a string of supplementary information in key=val format. For
+ * instance, it might contain:
+ *
+ * "enc=pkcs1 hash=sha256"
+ *
+ * where enc= specifies the signature blob encoding and hash= selects the OID
+ * to go in that particular encoding. If enc= isn't supplied, it's assumed
+ * that the caller is supplying raw values.
+ *
+ * If successful, 0 is returned.
+ */
+long keyctl_pkey_verify(const struct keyctl_pkey_params __user *_params,
+ const char __user *_info,
+ const void __user *_in,
+ const void __user *_in2)
+{
+ struct kernel_pkey_params params;
+ void *in, *in2;
+ long ret;
+
+ ret = keyctl_pkey_params_get_2(_params, _info, KEYCTL_PKEY_VERIFY,
+ ¶ms);
+ if (ret < 0)
+ goto error_params;
+
+ ret = -EOPNOTSUPP;
+ if (!params.key->type->asym_verify_signature)
+ goto error_params;
+
+ in = memdup_user(_in, params.in_len);
+ if (IS_ERR(in)) {
+ ret = PTR_ERR(in);
+ goto error_params;
+ }
+
+ in2 = memdup_user(_in2, params.in2_len);
+ if (IS_ERR(in2)) {
+ ret = PTR_ERR(in2);
+ goto error_in;
+ }
+
+ params.op = kernel_pkey_verify;
+ ret = params.key->type->asym_verify_signature(¶ms, in, in2);
+
+ kfree(in2);
+error_in:
+ kfree(in);
+error_params:
+ keyctl_pkey_params_free(¶ms);
+ return ret;
+}
#include <linux/tpm.h>
#include <linux/tpm_command.h>
-#include "trusted.h"
+#include <keys/trusted.h>
static const char hmac_alg[] = "hmac(sha1)";
static const char hash_alg[] = "sha1";
/*
* calculate authorization info fields to send to TPM
*/
-static int TSS_authhmac(unsigned char *digest, const unsigned char *key,
+int TSS_authhmac(unsigned char *digest, const unsigned char *key,
unsigned int keylen, unsigned char *h1,
unsigned char *h2, unsigned char h3, ...)
{
kzfree(sdesc);
return ret;
}
+EXPORT_SYMBOL_GPL(TSS_authhmac);
/*
* verify the AUTH1_COMMAND (Seal) result from TPM
*/
-static int TSS_checkhmac1(unsigned char *buffer,
+int TSS_checkhmac1(unsigned char *buffer,
const uint32_t command,
const unsigned char *ononce,
const unsigned char *key,
kzfree(sdesc);
return ret;
}
+EXPORT_SYMBOL_GPL(TSS_checkhmac1);
/*
* verify the AUTH2_COMMAND (unseal) result from TPM
* For key specific tpm requests, we will generate and send our
* own TPM command packets using the drivers send function.
*/
-static int trusted_tpm_send(unsigned char *cmd, size_t buflen)
+int trusted_tpm_send(unsigned char *cmd, size_t buflen)
{
int rc;
rc = -EPERM;
return rc;
}
+EXPORT_SYMBOL_GPL(trusted_tpm_send);
/*
* Lock a trusted key, by extending a selected PCR.
/*
* Create an object independent authorisation protocol (oiap) session
*/
-static int oiap(struct tpm_buf *tb, uint32_t *handle, unsigned char *nonce)
+int oiap(struct tpm_buf *tb, uint32_t *handle, unsigned char *nonce)
{
int ret;
TPM_NONCE_SIZE);
return 0;
}
+EXPORT_SYMBOL_GPL(oiap);
struct tpm_digests {
unsigned char encauth[SHA1_DIGEST_SIZE];
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __TRUSTED_KEY_H
-#define __TRUSTED_KEY_H
-
-/* implementation specific TPM constants */
-#define MAX_BUF_SIZE 512
-#define TPM_GETRANDOM_SIZE 14
-#define TPM_OSAP_SIZE 36
-#define TPM_OIAP_SIZE 10
-#define TPM_SEAL_SIZE 87
-#define TPM_UNSEAL_SIZE 104
-#define TPM_SIZE_OFFSET 2
-#define TPM_RETURN_OFFSET 6
-#define TPM_DATA_OFFSET 10
-
-#define LOAD32(buffer, offset) (ntohl(*(uint32_t *)&buffer[offset]))
-#define LOAD32N(buffer, offset) (*(uint32_t *)&buffer[offset])
-#define LOAD16(buffer, offset) (ntohs(*(uint16_t *)&buffer[offset]))
-
-struct tpm_buf {
- int len;
- unsigned char data[MAX_BUF_SIZE];
-};
-
-#define INIT_BUF(tb) (tb->len = 0)
-
-struct osapsess {
- uint32_t handle;
- unsigned char secret[SHA1_DIGEST_SIZE];
- unsigned char enonce[TPM_NONCE_SIZE];
-};
-
-/* discrete values, but have to store in uint16_t for TPM use */
-enum {
- SEAL_keytype = 1,
- SRK_keytype = 4
-};
-
-#define TPM_DEBUG 0
-
-#if TPM_DEBUG
-static inline void dump_options(struct trusted_key_options *o)
-{
- pr_info("trusted_key: sealing key type %d\n", o->keytype);
- pr_info("trusted_key: sealing key handle %0X\n", o->keyhandle);
- pr_info("trusted_key: pcrlock %d\n", o->pcrlock);
- pr_info("trusted_key: pcrinfo %d\n", o->pcrinfo_len);
- print_hex_dump(KERN_INFO, "pcrinfo ", DUMP_PREFIX_NONE,
- 16, 1, o->pcrinfo, o->pcrinfo_len, 0);
-}
-
-static inline void dump_payload(struct trusted_key_payload *p)
-{
- pr_info("trusted_key: key_len %d\n", p->key_len);
- print_hex_dump(KERN_INFO, "key ", DUMP_PREFIX_NONE,
- 16, 1, p->key, p->key_len, 0);
- pr_info("trusted_key: bloblen %d\n", p->blob_len);
- print_hex_dump(KERN_INFO, "blob ", DUMP_PREFIX_NONE,
- 16, 1, p->blob, p->blob_len, 0);
- pr_info("trusted_key: migratable %d\n", p->migratable);
-}
-
-static inline void dump_sess(struct osapsess *s)
-{
- print_hex_dump(KERN_INFO, "trusted-key: handle ", DUMP_PREFIX_NONE,
- 16, 1, &s->handle, 4, 0);
- pr_info("trusted-key: secret:\n");
- print_hex_dump(KERN_INFO, "", DUMP_PREFIX_NONE,
- 16, 1, &s->secret, SHA1_DIGEST_SIZE, 0);
- pr_info("trusted-key: enonce:\n");
- print_hex_dump(KERN_INFO, "", DUMP_PREFIX_NONE,
- 16, 1, &s->enonce, SHA1_DIGEST_SIZE, 0);
-}
-
-static inline void dump_tpm_buf(unsigned char *buf)
-{
- int len;
-
- pr_info("\ntrusted-key: tpm buffer\n");
- len = LOAD32(buf, TPM_SIZE_OFFSET);
- print_hex_dump(KERN_INFO, "", DUMP_PREFIX_NONE, 16, 1, buf, len, 0);
-}
-#else
-static inline void dump_options(struct trusted_key_options *o)
-{
-}
-
-static inline void dump_payload(struct trusted_key_payload *p)
-{
-}
-
-static inline void dump_sess(struct osapsess *s)
-{
-}
-
-static inline void dump_tpm_buf(unsigned char *buf)
-{
-}
-#endif
-
-static inline void store8(struct tpm_buf *buf, const unsigned char value)
-{
- buf->data[buf->len++] = value;
-}
-
-static inline void store16(struct tpm_buf *buf, const uint16_t value)
-{
- *(uint16_t *) & buf->data[buf->len] = htons(value);
- buf->len += sizeof value;
-}
-
-static inline void store32(struct tpm_buf *buf, const uint32_t value)
-{
- *(uint32_t *) & buf->data[buf->len] = htonl(value);
- buf->len += sizeof value;
-}
-
-static inline void storebytes(struct tpm_buf *buf, const unsigned char *in,
- const int len)
-{
- memcpy(buf->data + buf->len, in, len);
- buf->len += len;
-}
-#endif
addr_buf = address;
while (walk_size < addrlen) {
+ if (walk_size + sizeof(sa_family_t) > addrlen)
+ return -EINVAL;
+
addr = addr_buf;
switch (addr->sa_family) {
case AF_UNSPEC:
{ RTM_NEWSTATS, NETLINK_ROUTE_SOCKET__NLMSG_READ },
{ RTM_GETSTATS, NETLINK_ROUTE_SOCKET__NLMSG_READ },
{ RTM_NEWCACHEREPORT, NETLINK_ROUTE_SOCKET__NLMSG_READ },
+ { RTM_NEWCHAIN, NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
+ { RTM_DELCHAIN, NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
+ { RTM_GETCHAIN, NETLINK_ROUTE_SOCKET__NLMSG_READ },
};
static const struct nlmsg_perm nlmsg_tcpdiag_perms[] =
switch (sclass) {
case SECCLASS_NETLINK_ROUTE_SOCKET:
- /* RTM_MAX always point to RTM_SETxxxx, ie RTM_NEWxxx + 3 */
+ /* RTM_MAX always points to RTM_SETxxxx, ie RTM_NEWxxx + 3.
+ * If the BUILD_BUG_ON() below fails you must update the
+ * structures at the top of this file with the new mappings
+ * before updating the BUILD_BUG_ON() macro!
+ */
BUILD_BUG_ON(RTM_MAX != (RTM_NEWCHAIN + 3));
err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms,
sizeof(nlmsg_route_perms));
break;
case SECCLASS_NETLINK_XFRM_SOCKET:
+ /* If the BUILD_BUG_ON() below fails you must update the
+ * structures at the top of this file with the new mappings
+ * before updating the BUILD_BUG_ON() macro!
+ */
BUILD_BUG_ON(XFRM_MSG_MAX != XFRM_MSG_MAPPING);
err = nlmsg_perm(nlmsg_type, perm, nlmsg_xfrm_perms,
sizeof(nlmsg_xfrm_perms));
char *rangep[2];
if (!pol->mls_enabled) {
- if ((def_sid != SECSID_NULL && oldc) || (*scontext) == '\0')
- return 0;
- return -EINVAL;
+ /*
+ * With no MLS, only return -EINVAL if there is a MLS field
+ * and it did not come from an xattr.
+ */
+ if (oldc && def_sid == SECSID_NULL)
+ return -EINVAL;
+ return 0;
}
/*
return 0;
}
+/* add a new kcontrol object; call with card->controls_rwsem locked */
+static int __snd_ctl_add(struct snd_card *card, struct snd_kcontrol *kcontrol)
+{
+ struct snd_ctl_elem_id id;
+ unsigned int idx;
+ unsigned int count;
+
+ id = kcontrol->id;
+ if (id.index > UINT_MAX - kcontrol->count)
+ return -EINVAL;
+
+ if (snd_ctl_find_id(card, &id)) {
+ dev_err(card->dev,
+ "control %i:%i:%i:%s:%i is already present\n",
+ id.iface, id.device, id.subdevice, id.name, id.index);
+ return -EBUSY;
+ }
+
+ if (snd_ctl_find_hole(card, kcontrol->count) < 0)
+ return -ENOMEM;
+
+ list_add_tail(&kcontrol->list, &card->controls);
+ card->controls_count += kcontrol->count;
+ kcontrol->id.numid = card->last_numid + 1;
+ card->last_numid += kcontrol->count;
+
+ id = kcontrol->id;
+ count = kcontrol->count;
+ for (idx = 0; idx < count; idx++, id.index++, id.numid++)
+ snd_ctl_notify(card, SNDRV_CTL_EVENT_MASK_ADD, &id);
+
+ return 0;
+}
+
/**
* snd_ctl_add - add the control instance to the card
* @card: the card instance
*/
int snd_ctl_add(struct snd_card *card, struct snd_kcontrol *kcontrol)
{
- struct snd_ctl_elem_id id;
- unsigned int idx;
- unsigned int count;
int err = -EINVAL;
if (! kcontrol)
return err;
if (snd_BUG_ON(!card || !kcontrol->info))
goto error;
- id = kcontrol->id;
- if (id.index > UINT_MAX - kcontrol->count)
- goto error;
down_write(&card->controls_rwsem);
- if (snd_ctl_find_id(card, &id)) {
- up_write(&card->controls_rwsem);
- dev_err(card->dev, "control %i:%i:%i:%s:%i is already present\n",
- id.iface,
- id.device,
- id.subdevice,
- id.name,
- id.index);
- err = -EBUSY;
- goto error;
- }
- if (snd_ctl_find_hole(card, kcontrol->count) < 0) {
- up_write(&card->controls_rwsem);
- err = -ENOMEM;
- goto error;
- }
- list_add_tail(&kcontrol->list, &card->controls);
- card->controls_count += kcontrol->count;
- kcontrol->id.numid = card->last_numid + 1;
- card->last_numid += kcontrol->count;
- id = kcontrol->id;
- count = kcontrol->count;
+ err = __snd_ctl_add(card, kcontrol);
up_write(&card->controls_rwsem);
- for (idx = 0; idx < count; idx++, id.index++, id.numid++)
- snd_ctl_notify(card, SNDRV_CTL_EVENT_MASK_ADD, &id);
+ if (err < 0)
+ goto error;
return 0;
error:
kctl->tlv.c = snd_ctl_elem_user_tlv;
/* This function manage to free the instance on failure. */
- err = snd_ctl_add(card, kctl);
- if (err < 0)
- return err;
+ down_write(&card->controls_rwsem);
+ err = __snd_ctl_add(card, kctl);
+ if (err < 0) {
+ snd_ctl_free_one(kctl);
+ goto unlock;
+ }
offset = snd_ctl_get_ioff(kctl, &info->id);
snd_ctl_build_ioff(&info->id, kctl, offset);
/*
* which locks the element.
*/
- down_write(&card->controls_rwsem);
card->user_ctl_count++;
- up_write(&card->controls_rwsem);
+ unlock:
+ up_write(&card->controls_rwsem);
return 0;
}
runtime->oss.channels = params_channels(params);
runtime->oss.rate = params_rate(params);
- vfree(runtime->oss.buffer);
- runtime->oss.buffer = vmalloc(runtime->oss.period_bytes);
+ kvfree(runtime->oss.buffer);
+ runtime->oss.buffer = kvzalloc(runtime->oss.period_bytes, GFP_KERNEL);
if (!runtime->oss.buffer) {
err = -ENOMEM;
goto failure;
{
struct snd_pcm_runtime *runtime;
runtime = substream->runtime;
- vfree(runtime->oss.buffer);
+ kvfree(runtime->oss.buffer);
runtime->oss.buffer = NULL;
#ifdef CONFIG_SND_PCM_OSS_PLUGINS
snd_pcm_oss_plugin_clear(substream);
return -ENXIO;
size /= 8;
if (plugin->buf_frames < frames) {
- vfree(plugin->buf);
- plugin->buf = vmalloc(size);
+ kvfree(plugin->buf);
+ plugin->buf = kvzalloc(size, GFP_KERNEL);
plugin->buf_frames = frames;
}
if (!plugin->buf) {
if (plugin->private_free)
plugin->private_free(plugin);
kfree(plugin->buf_channels);
- vfree(plugin->buf);
+ kvfree(plugin->buf);
kfree(plugin);
return 0;
}
#include <sound/timer.h>
#include <sound/minors.h>
#include <linux/uio.h>
+#include <linux/delay.h>
#include "pcm_local.h"
* and this may lead to a deadlock when the code path takes read sem
* twice (e.g. one in snd_pcm_action_nonatomic() and another in
* snd_pcm_stream_lock()). As a (suboptimal) workaround, let writer to
- * spin until it gets the lock.
+ * sleep until all the readers are completed without blocking by writer.
*/
-static inline void down_write_nonblock(struct rw_semaphore *lock)
+static inline void down_write_nonfifo(struct rw_semaphore *lock)
{
while (!down_write_trylock(lock))
- cond_resched();
+ msleep(1);
}
#define PCM_LOCK_DEFAULT 0
res = -ENOMEM;
goto _nolock;
}
- down_write_nonblock(&snd_pcm_link_rwsem);
+ down_write_nonfifo(&snd_pcm_link_rwsem);
write_lock_irq(&snd_pcm_link_rwlock);
if (substream->runtime->status->state == SNDRV_PCM_STATE_OPEN ||
substream->runtime->status->state != substream1->runtime->status->state ||
struct snd_pcm_substream *s;
int res = 0;
- down_write_nonblock(&snd_pcm_link_rwsem);
+ down_write_nonfifo(&snd_pcm_link_rwsem);
write_lock_irq(&snd_pcm_link_rwlock);
if (!snd_pcm_stream_linked(substream)) {
res = -EALREADY;
static void pcm_release_private(struct snd_pcm_substream *substream)
{
- snd_pcm_unlink(substream);
+ if (snd_pcm_stream_linked(substream))
+ snd_pcm_unlink(substream);
}
void snd_pcm_release_substream(struct snd_pcm_substream *substream)
struct snd_interval *s = hw_param_interval(params, rule->var);
const struct snd_interval *r =
hw_param_interval_c(params, SNDRV_PCM_HW_PARAM_RATE);
- struct snd_interval t = {
- .min = s->min, .max = s->max, .integer = 1,
- };
+ struct snd_interval t = {0};
+ unsigned int step = 0;
int i;
for (i = 0; i < CIP_SFC_COUNT; ++i) {
- unsigned int rate = amdtp_rate_table[i];
- unsigned int step = amdtp_syt_intervals[i];
-
- if (!snd_interval_test(r, rate))
- continue;
-
- t.min = roundup(t.min, step);
- t.max = rounddown(t.max, step);
+ if (snd_interval_test(r, amdtp_rate_table[i]))
+ step = max(step, amdtp_syt_intervals[i]);
}
- if (snd_interval_checkempty(&t))
- return -EINVAL;
+ t.min = roundup(s->min, step);
+ t.max = rounddown(s->max, step);
+ t.integer = 1;
return snd_interval_refine(s, &t);
}
-static int apply_constraint_to_rate(struct snd_pcm_hw_params *params,
- struct snd_pcm_hw_rule *rule)
-{
- struct snd_interval *r =
- hw_param_interval(params, SNDRV_PCM_HW_PARAM_RATE);
- const struct snd_interval *s = hw_param_interval_c(params, rule->deps[0]);
- struct snd_interval t = {
- .min = UINT_MAX, .max = 0, .integer = 1,
- };
- int i;
-
- for (i = 0; i < CIP_SFC_COUNT; ++i) {
- unsigned int step = amdtp_syt_intervals[i];
- unsigned int rate = amdtp_rate_table[i];
-
- if (s->min % step || s->max % step)
- continue;
-
- t.min = min(t.min, rate);
- t.max = max(t.max, rate);
- }
-
- return snd_interval_refine(r, &t);
-}
-
/**
* amdtp_stream_add_pcm_hw_constraints - add hw constraints for PCM substream
* @s: the AMDTP stream, which must be initialized.
*/
err = snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_PERIOD_SIZE,
apply_constraint_to_size, NULL,
+ SNDRV_PCM_HW_PARAM_PERIOD_SIZE,
SNDRV_PCM_HW_PARAM_RATE, -1);
if (err < 0)
goto end;
- err = snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_RATE,
- apply_constraint_to_rate, NULL,
- SNDRV_PCM_HW_PARAM_PERIOD_SIZE, -1);
- if (err < 0)
- goto end;
err = snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_BUFFER_SIZE,
apply_constraint_to_size, NULL,
+ SNDRV_PCM_HW_PARAM_BUFFER_SIZE,
SNDRV_PCM_HW_PARAM_RATE, -1);
if (err < 0)
goto end;
- err = snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_RATE,
- apply_constraint_to_rate, NULL,
- SNDRV_PCM_HW_PARAM_BUFFER_SIZE, -1);
- if (err < 0)
- goto end;
end:
return err;
}
cancel_delayed_work_sync(&dice->dwork);
if (dice->registered) {
- /* No need to wait for releasing card object in this context. */
- snd_card_free_when_closed(dice->card);
+ // Block till all of ALSA character devices are released.
+ snd_card_free(dice->card);
}
mutex_destroy(&dice->mutex);
if (err < 0) {
if (chip->release_dma)
chip->release_dma(chip, chip->dma_private_data, chip->dma1);
- snd_free_pages(runtime->dma_area, runtime->dma_bytes);
return err;
}
chip->playback_substream = substream;
if (err < 0) {
if (chip->release_dma)
chip->release_dma(chip, chip->dma_private_data, chip->dma2);
- snd_free_pages(runtime->dma_area, runtime->dma_bytes);
return err;
}
chip->capture_substream = substream;
{
struct snd_ac97 *ac97 = snd_kcontrol_chip(kcontrol);
int reg = kcontrol->private_value & 0xff;
- int shift = (kcontrol->private_value >> 8) & 0xff;
+ int shift = (kcontrol->private_value >> 8) & 0x0f;
int mask = (kcontrol->private_value >> 16) & 0xff;
// int invert = (kcontrol->private_value >> 24) & 0xff;
unsigned short value, old, new;
#define SPI_PL_BIT_R_R (2<<7) /* right channel = right */
#define SPI_PL_BIT_R_C (3<<7) /* right channel = (L+R)/2 */
#define SPI_IZD_REG 2
-#define SPI_IZD_BIT (1<<4) /* infinite zero detect */
+#define SPI_IZD_BIT (0<<4) /* infinite zero detect */
#define SPI_FMT_REG 3
#define SPI_FMT_BIT_RJ (0<<0) /* right justified mode */
/* https://bugzilla.redhat.com/show_bug.cgi?id=1525104 */
SND_PCI_QUIRK(0x1849, 0xc892, "Asrock B85M-ITX", 0),
/* https://bugzilla.redhat.com/show_bug.cgi?id=1525104 */
+ SND_PCI_QUIRK(0x1849, 0x0397, "Asrock N68C-S UCC", 0),
+ /* https://bugzilla.redhat.com/show_bug.cgi?id=1525104 */
SND_PCI_QUIRK(0x1849, 0x7662, "Asrock H81M-HDS", 0),
/* https://bugzilla.redhat.com/show_bug.cgi?id=1525104 */
SND_PCI_QUIRK(0x1043, 0x8733, "Asus Prime X370-Pro", 0),
/* AMD Hudson */
{ PCI_DEVICE(0x1022, 0x780d),
.driver_data = AZX_DRIVER_GENERIC | AZX_DCAPS_PRESET_ATI_SB },
+ /* AMD Stoney */
+ { PCI_DEVICE(0x1022, 0x157a),
+ .driver_data = AZX_DRIVER_GENERIC | AZX_DCAPS_PRESET_ATI_SB |
+ AZX_DCAPS_PM_RUNTIME },
/* AMD Raven */
{ PCI_DEVICE(0x1022, 0x15e3),
.driver_data = AZX_DRIVER_GENERIC | AZX_DCAPS_PRESET_ATI_SB |
SND_PCI_QUIRK(0x1028, 0x0708, "Alienware 15 R2 2016", QUIRK_ALIENWARE),
SND_PCI_QUIRK(0x1102, 0x0010, "Sound Blaster Z", QUIRK_SBZ),
SND_PCI_QUIRK(0x1102, 0x0023, "Sound Blaster Z", QUIRK_SBZ),
+ SND_PCI_QUIRK(0x1102, 0x0033, "Sound Blaster ZxR", QUIRK_SBZ),
SND_PCI_QUIRK(0x1458, 0xA016, "Recon3Di", QUIRK_R3DI),
SND_PCI_QUIRK(0x1458, 0xA026, "Gigabyte G1.Sniper Z97", QUIRK_R3DI),
SND_PCI_QUIRK(0x1458, 0xA036, "Gigabyte GA-Z170X-Gaming 7", QUIRK_R3DI),
snd_hda_power_down(codec);
if (spec->mem_base)
- iounmap(spec->mem_base);
+ pci_iounmap(codec->bus->pci, spec->mem_base);
kfree(spec->spec_init_verbs);
kfree(codec->spec);
}
break;
case QUIRK_AE5:
codec_dbg(codec, "%s: QUIRK_AE5 applied.\n", __func__);
- snd_hda_apply_pincfgs(codec, r3di_pincfgs);
+ snd_hda_apply_pincfgs(codec, ae5_pincfgs);
break;
}
case 0x10ec0285:
case 0x10ec0298:
case 0x10ec0289:
+ case 0x10ec0300:
alc_update_coef_idx(codec, 0x10, 1<<9, 0);
break;
case 0x10ec0275:
ALC269_TYPE_ALC215,
ALC269_TYPE_ALC225,
ALC269_TYPE_ALC294,
+ ALC269_TYPE_ALC300,
ALC269_TYPE_ALC700,
};
case ALC269_TYPE_ALC215:
case ALC269_TYPE_ALC225:
case ALC269_TYPE_ALC294:
+ case ALC269_TYPE_ALC300:
case ALC269_TYPE_ALC700:
ssids = alc269_ssids;
break;
{ 0x19, 0x21a11010 }, /* dock mic */
{ }
};
+ /* Assure the speaker pin to be coupled with DAC NID 0x03; otherwise
+ * the speaker output becomes too low by some reason on Thinkpads with
+ * ALC298 codec
+ */
+ static hda_nid_t preferred_pairs[] = {
+ 0x14, 0x03, 0x17, 0x02, 0x21, 0x02,
+ 0
+ };
struct alc_spec *spec = codec->spec;
if (action == HDA_FIXUP_ACT_PRE_PROBE) {
+ spec->gen.preferred_dacs = preferred_pairs;
spec->parse_flags = HDA_PINCFG_NO_HP_FIXUP;
snd_hda_apply_pincfgs(codec, pincfgs);
} else if (action == HDA_FIXUP_ACT_INIT) {
spec->gen.preferred_dacs = preferred_pairs;
}
+/* The DAC of NID 0x3 will introduce click/pop noise on headphones, so invalidate it */
+static void alc285_fixup_invalidate_dacs(struct hda_codec *codec,
+ const struct hda_fixup *fix, int action)
+{
+ if (action != HDA_FIXUP_ACT_PRE_PROBE)
+ return;
+
+ snd_hda_override_wcaps(codec, 0x03, 0);
+}
+
/* for hda_fixup_thinkpad_acpi() */
#include "thinkpad_helper.c"
ALC255_FIXUP_DELL_HEADSET_MIC,
ALC295_FIXUP_HP_X360,
ALC221_FIXUP_HP_HEADSET_MIC,
+ ALC285_FIXUP_LENOVO_HEADPHONE_NOISE,
+ ALC295_FIXUP_HP_AUTO_MUTE,
+ ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE,
};
static const struct hda_fixup alc269_fixups[] = {
[ALC269_FIXUP_HP_MUTE_LED_MIC3] = {
.type = HDA_FIXUP_FUNC,
.v.func = alc269_fixup_hp_mute_led_mic3,
+ .chained = true,
+ .chain_id = ALC295_FIXUP_HP_AUTO_MUTE
},
[ALC269_FIXUP_HP_GPIO_LED] = {
.type = HDA_FIXUP_FUNC,
.chained = true,
.chain_id = ALC269_FIXUP_HEADSET_MIC
},
+ [ALC285_FIXUP_LENOVO_HEADPHONE_NOISE] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc285_fixup_invalidate_dacs,
+ },
+ [ALC295_FIXUP_HP_AUTO_MUTE] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc_fixup_auto_mute_via_amp,
+ },
+ [ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE] = {
+ .type = HDA_FIXUP_PINS,
+ .v.pins = (const struct hda_pintbl[]) {
+ { 0x18, 0x01a1913c }, /* use as headset mic, without its own jack detect */
+ { }
+ },
+ .chained = true,
+ .chain_id = ALC269_FIXUP_HEADSET_MIC
+ },
};
static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1025, 0x0762, "Acer Aspire E1-472", ALC271_FIXUP_HP_GATE_MIC_JACK_E1_572),
SND_PCI_QUIRK(0x1025, 0x0775, "Acer Aspire E1-572", ALC271_FIXUP_HP_GATE_MIC_JACK_E1_572),
SND_PCI_QUIRK(0x1025, 0x079b, "Acer Aspire V5-573G", ALC282_FIXUP_ASPIRE_V5_PINS),
+ SND_PCI_QUIRK(0x1025, 0x102b, "Acer Aspire C24-860", ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1025, 0x106d, "Acer Cloudbook 14", ALC283_FIXUP_CHROME_BOOK),
+ SND_PCI_QUIRK(0x1025, 0x128f, "Acer Veriton Z6860G", ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1025, 0x1290, "Acer Veriton Z4860G", ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1025, 0x1291, "Acer Veriton Z4660G", ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1028, 0x0470, "Dell M101z", ALC269_FIXUP_DELL_M101Z),
SND_PCI_QUIRK(0x1028, 0x054b, "Dell XPS one 2710", ALC275_FIXUP_DELL_XPS),
SND_PCI_QUIRK(0x1028, 0x05bd, "Dell Latitude E6440", ALC292_FIXUP_DELL_E7X),
SND_PCI_QUIRK(0x103c, 0x2336, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC1),
SND_PCI_QUIRK(0x103c, 0x2337, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC1),
SND_PCI_QUIRK(0x103c, 0x221c, "HP EliteBook 755 G2", ALC280_FIXUP_HP_HEADSET_MIC),
+ SND_PCI_QUIRK(0x103c, 0x820d, "HP Pavilion 15", ALC269_FIXUP_HP_MUTE_LED_MIC3),
SND_PCI_QUIRK(0x103c, 0x8256, "HP", ALC221_FIXUP_HP_FRONT_MIC),
SND_PCI_QUIRK(0x103c, 0x827e, "HP x360", ALC295_FIXUP_HP_X360),
SND_PCI_QUIRK(0x103c, 0x82bf, "HP", ALC221_FIXUP_HP_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x144d, 0xc740, "Samsung Ativ book 8 (NP870Z5G)", ALC269_FIXUP_ATIV_BOOK_8),
SND_PCI_QUIRK(0x1458, 0xfa53, "Gigabyte BXBT-2807", ALC283_FIXUP_HEADSET_MIC),
SND_PCI_QUIRK(0x1462, 0xb120, "MSI Cubi MS-B120", ALC283_FIXUP_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1462, 0xb171, "Cubi N 8GL (MS-B171)", ALC283_FIXUP_HEADSET_MIC),
SND_PCI_QUIRK(0x17aa, 0x1036, "Lenovo P520", ALC233_FIXUP_LENOVO_MULTI_CODECS),
SND_PCI_QUIRK(0x17aa, 0x20f2, "Thinkpad SL410/510", ALC269_FIXUP_SKU_IGNORE),
SND_PCI_QUIRK(0x17aa, 0x215e, "Thinkpad L512", ALC269_FIXUP_SKU_IGNORE),
{0x12, 0x90a60130},
{0x19, 0x03a11020},
{0x21, 0x0321101f}),
+ SND_HDA_PIN_QUIRK(0x10ec0285, 0x17aa, "Lenovo", ALC285_FIXUP_LENOVO_HEADPHONE_NOISE,
+ {0x12, 0x90a60130},
+ {0x14, 0x90170110},
+ {0x19, 0x04a11040},
+ {0x21, 0x04211020}),
+ SND_HDA_PIN_QUIRK(0x10ec0286, 0x1025, "Acer", ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE,
+ {0x12, 0x90a60130},
+ {0x17, 0x90170110},
+ {0x21, 0x02211020}),
SND_HDA_PIN_QUIRK(0x10ec0288, 0x1028, "Dell", ALC288_FIXUP_DELL1_MIC_NO_PRESENCE,
{0x12, 0x90a60120},
{0x14, 0x90170110},
spec->gen.mixer_nid = 0; /* ALC2x4 does not have any loopback mixer path */
alc_update_coef_idx(codec, 0x6b, 0x0018, (1<<4) | (1<<3)); /* UAJ MIC Vref control by verb */
break;
+ case 0x10ec0300:
+ spec->codec_variant = ALC269_TYPE_ALC300;
+ spec->gen.mixer_nid = 0; /* no loopback on ALC300 */
+ break;
case 0x10ec0700:
case 0x10ec0701:
case 0x10ec0703:
HDA_CODEC_ENTRY(0x10ec0295, "ALC295", patch_alc269),
HDA_CODEC_ENTRY(0x10ec0298, "ALC298", patch_alc269),
HDA_CODEC_ENTRY(0x10ec0299, "ALC299", patch_alc269),
+ HDA_CODEC_ENTRY(0x10ec0300, "ALC300", patch_alc269),
HDA_CODEC_REV_ENTRY(0x10ec0861, 0x100340, "ALC660", patch_alc861),
HDA_CODEC_ENTRY(0x10ec0660, "ALC660-VD", patch_alc861vd),
HDA_CODEC_ENTRY(0x10ec0861, "ALC861", patch_alc861),
removefunc = false;
}
if (led_set_func(TPACPI_LED_MICMUTE, false) >= 0 &&
- snd_hda_gen_add_micmute_led(codec,
- update_tpacpi_micmute) > 0)
+ !snd_hda_gen_add_micmute_led(codec,
+ update_tpacpi_micmute))
removefunc = false;
}
*/
snd_hdac_codec_read(hdev, hdev->afg, 0, AC_VERB_SET_POWER_STATE,
AC_PWRST_D3);
- err = snd_hdac_display_power(bus, false);
- if (err < 0) {
- dev_err(dev, "Cannot turn on display power on i915\n");
- return err;
- }
hlink = snd_hdac_ext_bus_get_link(bus, dev_name(dev));
if (!hlink) {
snd_hdac_ext_bus_link_put(bus, hlink);
- return 0;
+ err = snd_hdac_display_power(bus, false);
+ if (err < 0)
+ dev_err(dev, "Cannot turn off display power on i915\n");
+
+ return err;
}
static int hdac_hdmi_runtime_resume(struct device *dev)
#define PCM186X_MAX_REGISTER PCM186X_CURR_TRIM_CTRL
/* PCM186X_PAGE */
-#define PCM186X_RESET 0xff
+#define PCM186X_RESET 0xfe
/* PCM186X_ADCX_INPUT_SEL_X */
#define PCM186X_ADC_INPUT_SEL_POL BIT(7)
};
static const struct snd_soc_dapm_widget pcm3060_dapm_widgets[] = {
- SND_SOC_DAPM_OUTPUT("OUTL+"),
- SND_SOC_DAPM_OUTPUT("OUTR+"),
- SND_SOC_DAPM_OUTPUT("OUTL-"),
- SND_SOC_DAPM_OUTPUT("OUTR-"),
+ SND_SOC_DAPM_OUTPUT("OUTL"),
+ SND_SOC_DAPM_OUTPUT("OUTR"),
SND_SOC_DAPM_INPUT("INL"),
SND_SOC_DAPM_INPUT("INR"),
};
static const struct snd_soc_dapm_route pcm3060_dapm_map[] = {
- { "OUTL+", NULL, "Playback" },
- { "OUTR+", NULL, "Playback" },
- { "OUTL-", NULL, "Playback" },
- { "OUTR-", NULL, "Playback" },
+ { "OUTL", NULL, "Playback" },
+ { "OUTR", NULL, "Playback" },
{ "Capture", NULL, "INL" },
{ "Capture", NULL, "INR" },
static void wm_adsp2_show_fw_status(struct wm_adsp *dsp)
{
- u16 scratch[4];
+ unsigned int scratch[4];
+ unsigned int addr = dsp->base + ADSP2_SCRATCH0;
+ unsigned int i;
int ret;
- ret = regmap_raw_read(dsp->regmap, dsp->base + ADSP2_SCRATCH0,
- scratch, sizeof(scratch));
- if (ret) {
- adsp_err(dsp, "Failed to read SCRATCH regs: %d\n", ret);
- return;
+ for (i = 0; i < ARRAY_SIZE(scratch); ++i) {
+ ret = regmap_read(dsp->regmap, addr + i, &scratch[i]);
+ if (ret) {
+ adsp_err(dsp, "Failed to read SCRATCH%u: %d\n", i, ret);
+ return;
+ }
}
adsp_dbg(dsp, "FW SCRATCH 0:0x%x 1:0x%x 2:0x%x 3:0x%x\n",
- be16_to_cpu(scratch[0]),
- be16_to_cpu(scratch[1]),
- be16_to_cpu(scratch[2]),
- be16_to_cpu(scratch[3]));
+ scratch[0], scratch[1], scratch[2], scratch[3]);
}
static void wm_adsp2v2_show_fw_status(struct wm_adsp *dsp)
{
- u32 scratch[2];
+ unsigned int scratch[2];
int ret;
- ret = regmap_raw_read(dsp->regmap, dsp->base + ADSP2V2_SCRATCH0_1,
- scratch, sizeof(scratch));
-
+ ret = regmap_read(dsp->regmap, dsp->base + ADSP2V2_SCRATCH0_1,
+ &scratch[0]);
if (ret) {
- adsp_err(dsp, "Failed to read SCRATCH regs: %d\n", ret);
+ adsp_err(dsp, "Failed to read SCRATCH0_1: %d\n", ret);
return;
}
- scratch[0] = be32_to_cpu(scratch[0]);
- scratch[1] = be32_to_cpu(scratch[1]);
+ ret = regmap_read(dsp->regmap, dsp->base + ADSP2V2_SCRATCH2_3,
+ &scratch[1]);
+ if (ret) {
+ adsp_err(dsp, "Failed to read SCRATCH2_3: %d\n", ret);
+ return;
+ }
adsp_dbg(dsp, "FW SCRATCH 0:0x%x 1:0x%x 2:0x%x 3:0x%x\n",
scratch[0] & 0xFFFF,
codec, then enable this option by saying Y or m. This is a
recommended option
-config SND_SOC_INTEL_SKYLAKE_SSP_CLK
- tristate
-
config SND_SOC_INTEL_SKYLAKE
tristate "SKL/BXT/KBL/GLK/CNL... Platforms"
depends on PCI && ACPI
+ select SND_SOC_INTEL_SKYLAKE_COMMON
+ help
+ If you have a Intel Skylake/Broxton/ApolloLake/KabyLake/
+ GeminiLake or CannonLake platform with the DSP enabled in the BIOS
+ then enable this option by saying Y or m.
+
+if SND_SOC_INTEL_SKYLAKE
+
+config SND_SOC_INTEL_SKYLAKE_SSP_CLK
+ tristate
+
+config SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC
+ bool "HDAudio codec support"
+ help
+ If you have a Intel Skylake/Broxton/ApolloLake/KabyLake/
+ GeminiLake or CannonLake platform with an HDaudio codec
+ then enable this option by saying Y
+
+config SND_SOC_INTEL_SKYLAKE_COMMON
+ tristate
select SND_HDA_EXT_CORE
select SND_HDA_DSP_LOADER
select SND_SOC_TOPOLOGY
select SND_SOC_INTEL_SST
+ select SND_SOC_HDAC_HDA if SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC
select SND_SOC_ACPI_INTEL_MATCH
help
If you have a Intel Skylake/Broxton/ApolloLake/KabyLake/
GeminiLake or CannonLake platform with the DSP enabled in the BIOS
then enable this option by saying Y or m.
+endif ## SND_SOC_INTEL_SKYLAKE
+
config SND_SOC_ACPI_INTEL_MATCH
tristate
select SND_SOC_ACPI if ACPI
Say Y if you have such a device.
If unsure select "N".
-config SND_SOC_INTEL_SKL_HDA_DSP_GENERIC_MACH
- tristate "SKL/KBL/BXT/APL with HDA Codecs"
- select SND_SOC_HDAC_HDMI
- select SND_SOC_HDAC_HDA
- help
- This adds support for ASoC machine driver for Intel platforms
- SKL/KBL/BXT/APL with iDisp, HDA audio codecs.
- Say Y or m if you have such a device. This is a recommended option.
- If unsure select "N".
-
config SND_SOC_INTEL_GLK_RT5682_MAX98357A_MACH
tristate "GLK with RT5682 and MAX98357A in I2S Mode"
depends on MFD_INTEL_LPSS && I2C && ACPI
endif ## SND_SOC_INTEL_SKYLAKE
+if SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC
+
+config SND_SOC_INTEL_SKL_HDA_DSP_GENERIC_MACH
+ tristate "SKL/KBL/BXT/APL with HDA Codecs"
+ select SND_SOC_HDAC_HDMI
+ # SND_SOC_HDAC_HDA is already selected
+ help
+ This adds support for ASoC machine driver for Intel platforms
+ SKL/KBL/BXT/APL with iDisp, HDA audio codecs.
+ Say Y or m if you have such a device. This is a recommended option.
+ If unsure select "N".
+
+endif ## SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC
+
endif ## SND_SOC_INTEL_MACH
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
+#include <linux/dmi.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
#define CHT_PLAT_CLK_3_HZ 19200000
#define CHT_CODEC_DAI "HiFi"
+#define QUIRK_PMC_PLT_CLK_0 0x01
+
struct cht_mc_private {
struct clk *mclk;
struct snd_soc_jack jack;
.num_controls = ARRAY_SIZE(cht_mc_controls),
};
+static const struct dmi_system_id cht_max98090_quirk_table[] = {
+ {
+ /* Swanky model Chromebook (Toshiba Chromebook 2) */
+ .matches = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "Swanky"),
+ },
+ .driver_data = (void *)QUIRK_PMC_PLT_CLK_0,
+ },
+ {}
+};
+
static int snd_cht_mc_probe(struct platform_device *pdev)
{
+ const struct dmi_system_id *dmi_id;
struct device *dev = &pdev->dev;
int ret_val = 0;
struct cht_mc_private *drv;
+ const char *mclk_name;
+ int quirks = 0;
+
+ dmi_id = dmi_first_match(cht_max98090_quirk_table);
+ if (dmi_id)
+ quirks = (unsigned long)dmi_id->driver_data;
drv = devm_kzalloc(&pdev->dev, sizeof(*drv), GFP_KERNEL);
if (!drv)
snd_soc_card_cht.dev = &pdev->dev;
snd_soc_card_set_drvdata(&snd_soc_card_cht, drv);
- drv->mclk = devm_clk_get(&pdev->dev, "pmc_plt_clk_3");
+ if (quirks & QUIRK_PMC_PLT_CLK_0)
+ mclk_name = "pmc_plt_clk_0";
+ else
+ mclk_name = "pmc_plt_clk_3";
+
+ drv->mclk = devm_clk_get(&pdev->dev, mclk_name);
if (IS_ERR(drv->mclk)) {
dev_err(&pdev->dev,
- "Failed to get MCLK from pmc_plt_clk_3: %ld\n",
- PTR_ERR(drv->mclk));
+ "Failed to get MCLK from %s: %ld\n",
+ mclk_name, PTR_ERR(drv->mclk));
return PTR_ERR(drv->mclk);
}
#include "skl.h"
#include "skl-sst-dsp.h"
#include "skl-sst-ipc.h"
+#if IS_ENABLED(CONFIG_SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC)
#include "../../../soc/codecs/hdac_hda.h"
+#endif
/*
* initialize the PCI registers
platform_device_unregister(skl->clk_dev);
}
+#if IS_ENABLED(CONFIG_SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC)
+
#define IDISP_INTEL_VENDOR_ID 0x80860000
/*
#endif
}
+#endif /* CONFIG_SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC */
+
/*
* Probe the given codec address
*/
(AC_VERB_PARAMETERS << 8) | AC_PAR_VENDOR_ID;
unsigned int res = -1;
struct skl *skl = bus_to_skl(bus);
+#if IS_ENABLED(CONFIG_SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC)
struct hdac_hda_priv *hda_codec;
- struct hdac_device *hdev;
int err;
+#endif
+ struct hdac_device *hdev;
mutex_lock(&bus->cmd_mutex);
snd_hdac_bus_send_cmd(bus, cmd);
return -EIO;
dev_dbg(bus->dev, "codec #%d probed OK: %x\n", addr, res);
+#if IS_ENABLED(CONFIG_SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC)
hda_codec = devm_kzalloc(&skl->pci->dev, sizeof(*hda_codec),
GFP_KERNEL);
if (!hda_codec)
load_codec_module(&hda_codec->codec);
}
return 0;
+#else
+ hdev = devm_kzalloc(&skl->pci->dev, sizeof(*hdev), GFP_KERNEL);
+ if (!hdev)
+ return -ENOMEM;
+
+ return snd_hdac_ext_bus_device_init(bus, addr, hdev);
+#endif /* CONFIG_SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC */
}
/* Codec initialization */
}
}
+ /*
+ * we are done probing so decrement link counts
+ */
+ list_for_each_entry(hlink, &bus->hlink_list, list)
+ snd_hdac_ext_bus_link_put(bus, hlink);
+
if (IS_ENABLED(CONFIG_SND_SOC_HDAC_HDMI)) {
err = snd_hdac_display_power(bus, false);
if (err < 0) {
}
}
- /*
- * we are done probing so decrement link counts
- */
- list_for_each_entry(hlink, &bus->hlink_list, list)
- snd_hdac_ext_bus_link_put(bus, hlink);
-
/* configure PM */
pm_runtime_put_noidle(bus->dev);
pm_runtime_allow(bus->dev);
hbus = skl_to_hbus(skl);
bus = skl_to_bus(skl);
-#if IS_ENABLED(CONFIG_SND_SOC_HDAC_HDA)
+#if IS_ENABLED(CONFIG_SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC)
ext_ops = snd_soc_hdac_hda_get_ops();
#endif
snd_hdac_ext_bus_init(bus, &pci->dev, &bus_core_ops, io_ops, ext_ops);
#include "../codecs/twl6040.h"
struct abe_twl6040 {
+ struct snd_soc_card card;
+ struct snd_soc_dai_link dai_links[2];
int jack_detection; /* board can detect jack events */
int mclk_freq; /* MCLK frequency speed for twl6040 */
};
ARRAY_SIZE(dmic_audio_map));
}
-/* Digital audio interface glue - connects codec <--> CPU */
-static struct snd_soc_dai_link abe_twl6040_dai_links[] = {
- {
- .name = "TWL6040",
- .stream_name = "TWL6040",
- .codec_dai_name = "twl6040-legacy",
- .codec_name = "twl6040-codec",
- .init = omap_abe_twl6040_init,
- .ops = &omap_abe_ops,
- },
- {
- .name = "DMIC",
- .stream_name = "DMIC Capture",
- .codec_dai_name = "dmic-hifi",
- .codec_name = "dmic-codec",
- .init = omap_abe_dmic_init,
- .ops = &omap_abe_dmic_ops,
- },
-};
-
-/* Audio machine driver */
-static struct snd_soc_card omap_abe_card = {
- .owner = THIS_MODULE,
-
- .dapm_widgets = twl6040_dapm_widgets,
- .num_dapm_widgets = ARRAY_SIZE(twl6040_dapm_widgets),
- .dapm_routes = audio_map,
- .num_dapm_routes = ARRAY_SIZE(audio_map),
-};
-
static int omap_abe_probe(struct platform_device *pdev)
{
struct device_node *node = pdev->dev.of_node;
- struct snd_soc_card *card = &omap_abe_card;
+ struct snd_soc_card *card;
struct device_node *dai_node;
struct abe_twl6040 *priv;
int num_links = 0;
return -ENODEV;
}
- card->dev = &pdev->dev;
-
priv = devm_kzalloc(&pdev->dev, sizeof(struct abe_twl6040), GFP_KERNEL);
if (priv == NULL)
return -ENOMEM;
+ card = &priv->card;
+ card->dev = &pdev->dev;
+ card->owner = THIS_MODULE;
+ card->dapm_widgets = twl6040_dapm_widgets;
+ card->num_dapm_widgets = ARRAY_SIZE(twl6040_dapm_widgets);
+ card->dapm_routes = audio_map;
+ card->num_dapm_routes = ARRAY_SIZE(audio_map);
+
if (snd_soc_of_parse_card_name(card, "ti,model")) {
dev_err(&pdev->dev, "Card name is not provided\n");
return -ENODEV;
dev_err(&pdev->dev, "McPDM node is not provided\n");
return -EINVAL;
}
- abe_twl6040_dai_links[0].cpu_of_node = dai_node;
- abe_twl6040_dai_links[0].platform_of_node = dai_node;
+
+ priv->dai_links[0].name = "DMIC";
+ priv->dai_links[0].stream_name = "TWL6040";
+ priv->dai_links[0].cpu_of_node = dai_node;
+ priv->dai_links[0].platform_of_node = dai_node;
+ priv->dai_links[0].codec_dai_name = "twl6040-legacy";
+ priv->dai_links[0].codec_name = "twl6040-codec";
+ priv->dai_links[0].init = omap_abe_twl6040_init;
+ priv->dai_links[0].ops = &omap_abe_ops;
dai_node = of_parse_phandle(node, "ti,dmic", 0);
if (dai_node) {
num_links = 2;
- abe_twl6040_dai_links[1].cpu_of_node = dai_node;
- abe_twl6040_dai_links[1].platform_of_node = dai_node;
+ priv->dai_links[1].name = "TWL6040";
+ priv->dai_links[1].stream_name = "DMIC Capture";
+ priv->dai_links[1].cpu_of_node = dai_node;
+ priv->dai_links[1].platform_of_node = dai_node;
+ priv->dai_links[1].codec_dai_name = "dmic-hifi";
+ priv->dai_links[1].codec_name = "dmic-codec";
+ priv->dai_links[1].init = omap_abe_dmic_init;
+ priv->dai_links[1].ops = &omap_abe_dmic_ops;
} else {
num_links = 1;
}
return -ENODEV;
}
- card->dai_link = abe_twl6040_dai_links;
+ card->dai_link = priv->dai_links;
card->num_links = num_links;
snd_soc_card_set_drvdata(card, priv);
struct device *dev;
void __iomem *io_base;
struct clk *fclk;
+ struct pm_qos_request pm_qos_req;
+ int latency;
int fclk_freq;
int out_freq;
int clk_div;
mutex_lock(&dmic->mutex);
+ pm_qos_remove_request(&dmic->pm_qos_req);
+
if (!dai->active)
dmic->active = 0;
/* packet size is threshold * channels */
dma_data = snd_soc_dai_get_dma_data(dai, substream);
dma_data->maxburst = dmic->threshold * channels;
+ dmic->latency = (OMAP_DMIC_THRES_MAX - dmic->threshold) * USEC_PER_SEC /
+ params_rate(params);
return 0;
}
struct omap_dmic *dmic = snd_soc_dai_get_drvdata(dai);
u32 ctrl;
+ if (pm_qos_request_active(&dmic->pm_qos_req))
+ pm_qos_update_request(&dmic->pm_qos_req, dmic->latency);
+
/* Configure uplink threshold */
omap_dmic_write(dmic, OMAP_DMIC_FIFO_CTRL_REG, dmic->threshold);
pkt_size = channels;
}
- latency = ((((buffer_size - pkt_size) / channels) * 1000)
- / (params->rate_num / params->rate_den));
-
+ latency = (buffer_size - pkt_size) / channels;
+ latency = latency * USEC_PER_SEC /
+ (params->rate_num / params->rate_den);
mcbsp->latency[substream->stream] = latency;
omap_mcbsp_set_threshold(substream, pkt_size);
unsigned long phys_base;
void __iomem *io_base;
int irq;
+ struct pm_qos_request pm_qos_req;
+ int latency[2];
struct mutex mutex;
struct snd_soc_dai *dai)
{
struct omap_mcpdm *mcpdm = snd_soc_dai_get_drvdata(dai);
+ int tx = (substream->stream == SNDRV_PCM_STREAM_PLAYBACK);
+ int stream1 = tx ? SNDRV_PCM_STREAM_PLAYBACK : SNDRV_PCM_STREAM_CAPTURE;
+ int stream2 = tx ? SNDRV_PCM_STREAM_CAPTURE : SNDRV_PCM_STREAM_PLAYBACK;
mutex_lock(&mcpdm->mutex);
}
}
+ if (mcpdm->latency[stream2])
+ pm_qos_update_request(&mcpdm->pm_qos_req,
+ mcpdm->latency[stream2]);
+ else if (mcpdm->latency[stream1])
+ pm_qos_remove_request(&mcpdm->pm_qos_req);
+
+ mcpdm->latency[stream1] = 0;
+
mutex_unlock(&mcpdm->mutex);
}
int stream = substream->stream;
struct snd_dmaengine_dai_dma_data *dma_data;
u32 threshold;
- int channels;
+ int channels, latency;
int link_mask = 0;
channels = params_channels(params);
dma_data->maxburst =
(MCPDM_DN_THRES_MAX - threshold) * channels;
+ latency = threshold;
} else {
/* If playback is not running assume a stereo stream to come */
if (!mcpdm->config[!stream].link_mask)
mcpdm->config[!stream].link_mask = (0x3 << 3);
dma_data->maxburst = threshold * channels;
+ latency = (MCPDM_DN_THRES_MAX - threshold);
}
+ /*
+ * The DMA must act to a DMA request within latency time (usec) to avoid
+ * under/overflow
+ */
+ mcpdm->latency[stream] = latency * USEC_PER_SEC / params_rate(params);
+
+ if (!mcpdm->latency[stream])
+ mcpdm->latency[stream] = 10;
+
/* Check if we need to restart McPDM with this stream */
if (mcpdm->config[stream].link_mask &&
mcpdm->config[stream].link_mask != link_mask)
struct snd_soc_dai *dai)
{
struct omap_mcpdm *mcpdm = snd_soc_dai_get_drvdata(dai);
+ struct pm_qos_request *pm_qos_req = &mcpdm->pm_qos_req;
+ int tx = (substream->stream == SNDRV_PCM_STREAM_PLAYBACK);
+ int stream1 = tx ? SNDRV_PCM_STREAM_PLAYBACK : SNDRV_PCM_STREAM_CAPTURE;
+ int stream2 = tx ? SNDRV_PCM_STREAM_CAPTURE : SNDRV_PCM_STREAM_PLAYBACK;
+ int latency = mcpdm->latency[stream2];
+
+ /* Prevent omap hardware from hitting off between FIFO fills */
+ if (!latency || mcpdm->latency[stream1] < latency)
+ latency = mcpdm->latency[stream1];
+
+ if (pm_qos_request_active(pm_qos_req))
+ pm_qos_update_request(pm_qos_req, latency);
+ else if (latency)
+ pm_qos_add_request(pm_qos_req, PM_QOS_CPU_DMA_LATENCY, latency);
if (!omap_mcpdm_active(mcpdm)) {
omap_mcpdm_start(mcpdm);
free_irq(mcpdm->irq, (void *)mcpdm);
pm_runtime_disable(mcpdm->dev);
+ if (pm_qos_request_active(&mcpdm->pm_qos_req))
+ pm_qos_remove_request(&mcpdm->pm_qos_req);
+
return 0;
}
struct device_node *cpu = NULL;
struct device *dev = card->dev;
struct snd_soc_dai_link *link;
+ struct of_phandle_args args;
int ret, num_links;
ret = snd_soc_of_parse_card_name(card, "model");
goto err;
}
- link->cpu_of_node = of_parse_phandle(cpu, "sound-dai", 0);
- if (!link->cpu_of_node) {
+ ret = of_parse_phandle_with_args(cpu, "sound-dai",
+ "#sound-dai-cells", 0, &args);
+ if (ret) {
dev_err(card->dev, "error getting cpu phandle\n");
- ret = -EINVAL;
goto err;
}
+ link->cpu_of_node = args.np;
+ link->id = args.args[0];
ret = snd_soc_of_get_dai_name(cpu, &link->cpu_dai_name);
if (ret) {
}
static const struct snd_soc_dapm_widget q6afe_dai_widgets[] = {
- SND_SOC_DAPM_AIF_OUT("HDMI_RX", "HDMI Playback", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SLIMBUS_0_RX", "Slimbus Playback", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SLIMBUS_1_RX", "Slimbus1 Playback", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SLIMBUS_2_RX", "Slimbus2 Playback", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SLIMBUS_3_RX", "Slimbus3 Playback", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SLIMBUS_4_RX", "Slimbus4 Playback", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SLIMBUS_5_RX", "Slimbus5 Playback", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SLIMBUS_6_RX", "Slimbus6 Playback", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SLIMBUS_0_TX", "Slimbus Capture", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SLIMBUS_1_TX", "Slimbus1 Capture", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SLIMBUS_2_TX", "Slimbus2 Capture", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SLIMBUS_3_TX", "Slimbus3 Capture", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SLIMBUS_4_TX", "Slimbus4 Capture", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SLIMBUS_5_TX", "Slimbus5 Capture", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SLIMBUS_6_TX", "Slimbus6 Capture", 0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUAT_MI2S_RX", "Quaternary MI2S Playback",
+ SND_SOC_DAPM_AIF_IN("HDMI_RX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_IN("SLIMBUS_0_RX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_IN("SLIMBUS_1_RX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_IN("SLIMBUS_2_RX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_IN("SLIMBUS_3_RX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_IN("SLIMBUS_4_RX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_IN("SLIMBUS_5_RX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_IN("SLIMBUS_6_RX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_OUT("SLIMBUS_0_TX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_OUT("SLIMBUS_1_TX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_OUT("SLIMBUS_2_TX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_OUT("SLIMBUS_3_TX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_OUT("SLIMBUS_4_TX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_OUT("SLIMBUS_5_TX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_OUT("SLIMBUS_6_TX", NULL, 0, 0, 0, 0),
+ SND_SOC_DAPM_AIF_IN("QUAT_MI2S_RX", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUAT_MI2S_TX", "Quaternary MI2S Capture",
+ SND_SOC_DAPM_AIF_OUT("QUAT_MI2S_TX", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("TERT_MI2S_RX", "Tertiary MI2S Playback",
+ SND_SOC_DAPM_AIF_IN("TERT_MI2S_RX", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("TERT_MI2S_TX", "Tertiary MI2S Capture",
+ SND_SOC_DAPM_AIF_OUT("TERT_MI2S_TX", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SEC_MI2S_RX", "Secondary MI2S Playback",
+ SND_SOC_DAPM_AIF_IN("SEC_MI2S_RX", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SEC_MI2S_TX", "Secondary MI2S Capture",
+ SND_SOC_DAPM_AIF_OUT("SEC_MI2S_TX", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SEC_MI2S_RX_SD1",
+ SND_SOC_DAPM_AIF_IN("SEC_MI2S_RX_SD1",
"Secondary MI2S Playback SD1",
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("PRI_MI2S_RX", "Primary MI2S Playback",
+ SND_SOC_DAPM_AIF_IN("PRI_MI2S_RX", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("PRI_MI2S_TX", "Primary MI2S Capture",
+ SND_SOC_DAPM_AIF_OUT("PRI_MI2S_TX", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_RX_0", "Primary TDM0 Playback",
+ SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_RX_0", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_RX_1", "Primary TDM1 Playback",
+ SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_RX_1", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_RX_2", "Primary TDM2 Playback",
+ SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_RX_2", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_RX_3", "Primary TDM3 Playback",
+ SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_RX_3", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_RX_4", "Primary TDM4 Playback",
+ SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_RX_4", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_RX_5", "Primary TDM5 Playback",
+ SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_RX_5", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_RX_6", "Primary TDM6 Playback",
+ SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_RX_6", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_RX_7", "Primary TDM7 Playback",
+ SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_RX_7", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_TX_0", "Primary TDM0 Capture",
+ SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_TX_0", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_TX_1", "Primary TDM1 Capture",
+ SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_TX_1", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_TX_2", "Primary TDM2 Capture",
+ SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_TX_2", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_TX_3", "Primary TDM3 Capture",
+ SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_TX_3", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_TX_4", "Primary TDM4 Capture",
+ SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_TX_4", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_TX_5", "Primary TDM5 Capture",
+ SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_TX_5", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_TX_6", "Primary TDM6 Capture",
+ SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_TX_6", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("PRIMARY_TDM_TX_7", "Primary TDM7 Capture",
+ SND_SOC_DAPM_AIF_OUT("PRIMARY_TDM_TX_7", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SEC_TDM_RX_0", "Secondary TDM0 Playback",
+ SND_SOC_DAPM_AIF_IN("SEC_TDM_RX_0", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SEC_TDM_RX_1", "Secondary TDM1 Playback",
+ SND_SOC_DAPM_AIF_IN("SEC_TDM_RX_1", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SEC_TDM_RX_2", "Secondary TDM2 Playback",
+ SND_SOC_DAPM_AIF_IN("SEC_TDM_RX_2", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SEC_TDM_RX_3", "Secondary TDM3 Playback",
+ SND_SOC_DAPM_AIF_IN("SEC_TDM_RX_3", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SEC_TDM_RX_4", "Secondary TDM4 Playback",
+ SND_SOC_DAPM_AIF_IN("SEC_TDM_RX_4", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SEC_TDM_RX_5", "Secondary TDM5 Playback",
+ SND_SOC_DAPM_AIF_IN("SEC_TDM_RX_5", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SEC_TDM_RX_6", "Secondary TDM6 Playback",
+ SND_SOC_DAPM_AIF_IN("SEC_TDM_RX_6", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("SEC_TDM_RX_7", "Secondary TDM7 Playback",
+ SND_SOC_DAPM_AIF_IN("SEC_TDM_RX_7", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SEC_TDM_TX_0", "Secondary TDM0 Capture",
+ SND_SOC_DAPM_AIF_OUT("SEC_TDM_TX_0", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SEC_TDM_TX_1", "Secondary TDM1 Capture",
+ SND_SOC_DAPM_AIF_OUT("SEC_TDM_TX_1", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SEC_TDM_TX_2", "Secondary TDM2 Capture",
+ SND_SOC_DAPM_AIF_OUT("SEC_TDM_TX_2", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SEC_TDM_TX_3", "Secondary TDM3 Capture",
+ SND_SOC_DAPM_AIF_OUT("SEC_TDM_TX_3", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SEC_TDM_TX_4", "Secondary TDM4 Capture",
+ SND_SOC_DAPM_AIF_OUT("SEC_TDM_TX_4", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SEC_TDM_TX_5", "Secondary TDM5 Capture",
+ SND_SOC_DAPM_AIF_OUT("SEC_TDM_TX_5", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SEC_TDM_TX_6", "Secondary TDM6 Capture",
+ SND_SOC_DAPM_AIF_OUT("SEC_TDM_TX_6", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("SEC_TDM_TX_7", "Secondary TDM7 Capture",
+ SND_SOC_DAPM_AIF_OUT("SEC_TDM_TX_7", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("TERT_TDM_RX_0", "Tertiary TDM0 Playback",
+ SND_SOC_DAPM_AIF_IN("TERT_TDM_RX_0", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("TERT_TDM_RX_1", "Tertiary TDM1 Playback",
+ SND_SOC_DAPM_AIF_IN("TERT_TDM_RX_1", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("TERT_TDM_RX_2", "Tertiary TDM2 Playback",
+ SND_SOC_DAPM_AIF_IN("TERT_TDM_RX_2", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("TERT_TDM_RX_3", "Tertiary TDM3 Playback",
+ SND_SOC_DAPM_AIF_IN("TERT_TDM_RX_3", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("TERT_TDM_RX_4", "Tertiary TDM4 Playback",
+ SND_SOC_DAPM_AIF_IN("TERT_TDM_RX_4", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("TERT_TDM_RX_5", "Tertiary TDM5 Playback",
+ SND_SOC_DAPM_AIF_IN("TERT_TDM_RX_5", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("TERT_TDM_RX_6", "Tertiary TDM6 Playback",
+ SND_SOC_DAPM_AIF_IN("TERT_TDM_RX_6", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("TERT_TDM_RX_7", "Tertiary TDM7 Playback",
+ SND_SOC_DAPM_AIF_IN("TERT_TDM_RX_7", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("TERT_TDM_TX_0", "Tertiary TDM0 Capture",
+ SND_SOC_DAPM_AIF_OUT("TERT_TDM_TX_0", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("TERT_TDM_TX_1", "Tertiary TDM1 Capture",
+ SND_SOC_DAPM_AIF_OUT("TERT_TDM_TX_1", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("TERT_TDM_TX_2", "Tertiary TDM2 Capture",
+ SND_SOC_DAPM_AIF_OUT("TERT_TDM_TX_2", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("TERT_TDM_TX_3", "Tertiary TDM3 Capture",
+ SND_SOC_DAPM_AIF_OUT("TERT_TDM_TX_3", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("TERT_TDM_TX_4", "Tertiary TDM4 Capture",
+ SND_SOC_DAPM_AIF_OUT("TERT_TDM_TX_4", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("TERT_TDM_TX_5", "Tertiary TDM5 Capture",
+ SND_SOC_DAPM_AIF_OUT("TERT_TDM_TX_5", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("TERT_TDM_TX_6", "Tertiary TDM6 Capture",
+ SND_SOC_DAPM_AIF_OUT("TERT_TDM_TX_6", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("TERT_TDM_TX_7", "Tertiary TDM7 Capture",
+ SND_SOC_DAPM_AIF_OUT("TERT_TDM_TX_7", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUAT_TDM_RX_0", "Quaternary TDM0 Playback",
+ SND_SOC_DAPM_AIF_IN("QUAT_TDM_RX_0", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUAT_TDM_RX_1", "Quaternary TDM1 Playback",
+ SND_SOC_DAPM_AIF_IN("QUAT_TDM_RX_1", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUAT_TDM_RX_2", "Quaternary TDM2 Playback",
+ SND_SOC_DAPM_AIF_IN("QUAT_TDM_RX_2", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUAT_TDM_RX_3", "Quaternary TDM3 Playback",
+ SND_SOC_DAPM_AIF_IN("QUAT_TDM_RX_3", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUAT_TDM_RX_4", "Quaternary TDM4 Playback",
+ SND_SOC_DAPM_AIF_IN("QUAT_TDM_RX_4", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUAT_TDM_RX_5", "Quaternary TDM5 Playback",
+ SND_SOC_DAPM_AIF_IN("QUAT_TDM_RX_5", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUAT_TDM_RX_6", "Quaternary TDM6 Playback",
+ SND_SOC_DAPM_AIF_IN("QUAT_TDM_RX_6", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUAT_TDM_RX_7", "Quaternary TDM7 Playback",
+ SND_SOC_DAPM_AIF_IN("QUAT_TDM_RX_7", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUAT_TDM_TX_0", "Quaternary TDM0 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUAT_TDM_TX_0", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUAT_TDM_TX_1", "Quaternary TDM1 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUAT_TDM_TX_1", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUAT_TDM_TX_2", "Quaternary TDM2 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUAT_TDM_TX_2", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUAT_TDM_TX_3", "Quaternary TDM3 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUAT_TDM_TX_3", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUAT_TDM_TX_4", "Quaternary TDM4 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUAT_TDM_TX_4", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUAT_TDM_TX_5", "Quaternary TDM5 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUAT_TDM_TX_5", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUAT_TDM_TX_6", "Quaternary TDM6 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUAT_TDM_TX_6", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUAT_TDM_TX_7", "Quaternary TDM7 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUAT_TDM_TX_7", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUIN_TDM_RX_0", "Quinary TDM0 Playback",
+ SND_SOC_DAPM_AIF_IN("QUIN_TDM_RX_0", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUIN_TDM_RX_1", "Quinary TDM1 Playback",
+ SND_SOC_DAPM_AIF_IN("QUIN_TDM_RX_1", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUIN_TDM_RX_2", "Quinary TDM2 Playback",
+ SND_SOC_DAPM_AIF_IN("QUIN_TDM_RX_2", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUIN_TDM_RX_3", "Quinary TDM3 Playback",
+ SND_SOC_DAPM_AIF_IN("QUIN_TDM_RX_3", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUIN_TDM_RX_4", "Quinary TDM4 Playback",
+ SND_SOC_DAPM_AIF_IN("QUIN_TDM_RX_4", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUIN_TDM_RX_5", "Quinary TDM5 Playback",
+ SND_SOC_DAPM_AIF_IN("QUIN_TDM_RX_5", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUIN_TDM_RX_6", "Quinary TDM6 Playback",
+ SND_SOC_DAPM_AIF_IN("QUIN_TDM_RX_6", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_OUT("QUIN_TDM_RX_7", "Quinary TDM7 Playback",
+ SND_SOC_DAPM_AIF_IN("QUIN_TDM_RX_7", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUIN_TDM_TX_0", "Quinary TDM0 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUIN_TDM_TX_0", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUIN_TDM_TX_1", "Quinary TDM1 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUIN_TDM_TX_1", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUIN_TDM_TX_2", "Quinary TDM2 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUIN_TDM_TX_2", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUIN_TDM_TX_3", "Quinary TDM3 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUIN_TDM_TX_3", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUIN_TDM_TX_4", "Quinary TDM4 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUIN_TDM_TX_4", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUIN_TDM_TX_5", "Quinary TDM5 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUIN_TDM_TX_5", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUIN_TDM_TX_6", "Quinary TDM6 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUIN_TDM_TX_6", NULL,
0, 0, 0, 0),
- SND_SOC_DAPM_AIF_IN("QUIN_TDM_TX_7", "Quinary TDM7 Capture",
+ SND_SOC_DAPM_AIF_OUT("QUIN_TDM_TX_7", NULL,
0, 0, 0, 0),
};
#define AFE_PORT_I2S_SD1 0x2
#define AFE_PORT_I2S_SD2 0x3
#define AFE_PORT_I2S_SD3 0x4
-#define AFE_PORT_I2S_SD0_MASK BIT(0x1)
-#define AFE_PORT_I2S_SD1_MASK BIT(0x2)
-#define AFE_PORT_I2S_SD2_MASK BIT(0x3)
-#define AFE_PORT_I2S_SD3_MASK BIT(0x4)
-#define AFE_PORT_I2S_SD0_1_MASK GENMASK(2, 1)
-#define AFE_PORT_I2S_SD2_3_MASK GENMASK(4, 3)
-#define AFE_PORT_I2S_SD0_1_2_MASK GENMASK(3, 1)
-#define AFE_PORT_I2S_SD0_1_2_3_MASK GENMASK(4, 1)
+#define AFE_PORT_I2S_SD0_MASK BIT(0x0)
+#define AFE_PORT_I2S_SD1_MASK BIT(0x1)
+#define AFE_PORT_I2S_SD2_MASK BIT(0x2)
+#define AFE_PORT_I2S_SD3_MASK BIT(0x3)
+#define AFE_PORT_I2S_SD0_1_MASK GENMASK(1, 0)
+#define AFE_PORT_I2S_SD2_3_MASK GENMASK(3, 2)
+#define AFE_PORT_I2S_SD0_1_2_MASK GENMASK(2, 0)
+#define AFE_PORT_I2S_SD0_1_2_3_MASK GENMASK(3, 0)
#define AFE_PORT_I2S_QUAD01 0x5
#define AFE_PORT_I2S_QUAD23 0x6
#define AFE_PORT_I2S_6CHS 0x7
.rate_max = 48000, \
}, \
.name = "MultiMedia"#num, \
- .probe = fe_dai_probe, \
.id = MSM_FRONTEND_DAI_MULTIMEDIA##num, \
}
}
}
-static const struct snd_soc_dapm_route afe_pcm_routes[] = {
- {"MM_DL1", NULL, "MultiMedia1 Playback" },
- {"MM_DL2", NULL, "MultiMedia2 Playback" },
- {"MM_DL3", NULL, "MultiMedia3 Playback" },
- {"MM_DL4", NULL, "MultiMedia4 Playback" },
- {"MM_DL5", NULL, "MultiMedia5 Playback" },
- {"MM_DL6", NULL, "MultiMedia6 Playback" },
- {"MM_DL7", NULL, "MultiMedia7 Playback" },
- {"MM_DL7", NULL, "MultiMedia8 Playback" },
- {"MultiMedia1 Capture", NULL, "MM_UL1"},
- {"MultiMedia2 Capture", NULL, "MM_UL2"},
- {"MultiMedia3 Capture", NULL, "MM_UL3"},
- {"MultiMedia4 Capture", NULL, "MM_UL4"},
- {"MultiMedia5 Capture", NULL, "MM_UL5"},
- {"MultiMedia6 Capture", NULL, "MM_UL6"},
- {"MultiMedia7 Capture", NULL, "MM_UL7"},
- {"MultiMedia8 Capture", NULL, "MM_UL8"},
-
-};
-
-static int fe_dai_probe(struct snd_soc_dai *dai)
-{
- struct snd_soc_dapm_context *dapm;
-
- dapm = snd_soc_component_get_dapm(dai->component);
- snd_soc_dapm_add_routes(dapm, afe_pcm_routes,
- ARRAY_SIZE(afe_pcm_routes));
-
- return 0;
-}
-
-
static const struct snd_soc_component_driver q6asm_fe_dai_component = {
.name = DRV_NAME,
.ops = &q6asm_dai_ops,
{"MM_UL6", NULL, "MultiMedia6 Mixer"},
{"MM_UL7", NULL, "MultiMedia7 Mixer"},
{"MM_UL8", NULL, "MultiMedia8 Mixer"},
+
+ {"MM_DL1", NULL, "MultiMedia1 Playback" },
+ {"MM_DL2", NULL, "MultiMedia2 Playback" },
+ {"MM_DL3", NULL, "MultiMedia3 Playback" },
+ {"MM_DL4", NULL, "MultiMedia4 Playback" },
+ {"MM_DL5", NULL, "MultiMedia5 Playback" },
+ {"MM_DL6", NULL, "MultiMedia6 Playback" },
+ {"MM_DL7", NULL, "MultiMedia7 Playback" },
+ {"MM_DL8", NULL, "MultiMedia8 Playback" },
+
+ {"MultiMedia1 Capture", NULL, "MM_UL1"},
+ {"MultiMedia2 Capture", NULL, "MM_UL2"},
+ {"MultiMedia3 Capture", NULL, "MM_UL3"},
+ {"MultiMedia4 Capture", NULL, "MM_UL4"},
+ {"MultiMedia5 Capture", NULL, "MM_UL5"},
+ {"MultiMedia6 Capture", NULL, "MM_UL6"},
+ {"MultiMedia7 Capture", NULL, "MM_UL7"},
+ {"MultiMedia8 Capture", NULL, "MM_UL8"},
+
};
static int routing_hw_params(struct snd_pcm_substream *substream,
static const struct snd_dmaengine_pcm_config rk_dmaengine_pcm_config = {
.pcm_hardware = &snd_rockchip_hardware,
+ .prepare_slave_config = snd_dmaengine_pcm_prepare_slave_config,
.prealloc_buffer_size = 32 * 1024,
};
if (rsnd_ssi_is_multi_slave(mod, io))
return 0;
- if (ssi->rate) {
+ if (ssi->usrcnt > 1) {
if (ssi->rate != rate) {
dev_err(dev, "SSI parent/child should use same rate\n");
return -EINVAL;
snd_soc_acpi_find_machine(struct snd_soc_acpi_mach *machines)
{
struct snd_soc_acpi_mach *mach;
+ struct snd_soc_acpi_mach *mach_alt;
for (mach = machines; mach->id[0]; mach++) {
if (acpi_dev_present(mach->id, NULL, -1)) {
- if (mach->machine_quirk)
- mach = mach->machine_quirk(mach);
+ if (mach->machine_quirk) {
+ mach_alt = mach->machine_quirk(mach);
+ if (!mach_alt)
+ continue; /* not full match, ignore */
+ mach = mach_alt;
+ }
+
return mach;
}
}
}
card->instantiated = 1;
+ dapm_mark_endpoints_dirty(card);
snd_soc_dapm_sync(&card->dapm);
mutex_unlock(&card->mutex);
mutex_unlock(&client_mutex);
char *mclk_name, *p, *s = (char *)pname;
int ret, i = 0;
- mclk = devm_kzalloc(dev, sizeof(mclk), GFP_KERNEL);
+ mclk = devm_kzalloc(dev, sizeof(*mclk), GFP_KERNEL);
if (!mclk)
return -ENOMEM;
config SND_SUN50I_CODEC_ANALOG
tristate "Allwinner sun50i Codec Analog Controls Support"
depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
- select SND_SUNXI_ADDA_PR_REGMAP
+ select SND_SUN8I_ADDA_PR_REGMAP
help
Say Y or M if you want to add support for the analog controls for
the codec embedded in Allwinner A64 SoC.
{ "Right Digital DAC Mixer", "AIF1 Slot 0 Digital DAC Playback Switch",
"AIF1 Slot 0 Right"},
- /* ADC routes */
+ /* ADC Routes */
+ { "AIF1 Slot 0 Right ADC", NULL, "ADC" },
+ { "AIF1 Slot 0 Left ADC", NULL, "ADC" },
+
+ /* ADC Mixer Routes */
{ "Left Digital ADC Mixer", "AIF1 Data Digital ADC Capture Switch",
"AIF1 Slot 0 Left ADC" },
{ "Right Digital ADC Mixer", "AIF1 Data Digital ADC Capture Switch",
static int sun8i_codec_remove(struct platform_device *pdev)
{
- struct snd_soc_card *card = platform_get_drvdata(pdev);
- struct sun8i_codec *scodec = snd_soc_card_get_drvdata(card);
-
pm_runtime_disable(&pdev->dev);
if (!pm_runtime_status_suspended(&pdev->dev))
sun8i_codec_runtime_suspend(&pdev->dev);
- clk_disable_unprepare(scodec->clk_module);
- clk_disable_unprepare(scodec->clk_bus);
-
return 0;
}
runtime->hw = snd_cs4231_playback;
err = snd_cs4231_open(chip, CS4231_MODE_PLAY);
- if (err < 0) {
- snd_free_pages(runtime->dma_area, runtime->dma_bytes);
+ if (err < 0)
return err;
- }
chip->playback_substream = substream;
chip->p_periods_sent = 0;
snd_pcm_set_sync(substream);
runtime->hw = snd_cs4231_capture;
err = snd_cs4231_open(chip, CS4231_MODE_RECORD);
- if (err < 0) {
- snd_free_pages(runtime->dma_area, runtime->dma_bytes);
+ if (err < 0)
return err;
- }
chip->capture_substream = substream;
chip->c_periods_sent = 0;
snd_pcm_set_sync(substream);
__error:
if (chip) {
+ /* chip->active is inside the chip->card object,
+ * decrement before memory is possibly returned.
+ */
+ atomic_dec(&chip->active);
if (!chip->num_interfaces)
snd_card_free(chip->card);
- atomic_dec(&chip->active);
}
mutex_unlock(®ister_mutex);
return err;
.ifnum = QUIRK_NO_INTERFACE
}
},
+/* Dell WD19 Dock */
+{
+ USB_DEVICE(0x0bda, 0x402e),
+ .driver_info = (unsigned long) & (const struct snd_usb_audio_quirk) {
+ .vendor_name = "Dell",
+ .product_name = "WD19 Dock",
+ .profile_name = "Dell-WD15-Dock",
+ .ifnum = QUIRK_NO_INTERFACE
+ }
+},
#undef USB_DEVICE_VENDOR_SPEC
return SNDRV_PCM_FMTBIT_DSD_U32_BE;
break;
+ case USB_ID(0x152a, 0x85de): /* SMSL D1 DAC */
case USB_ID(0x16d0, 0x09dd): /* Encore mDSD */
case USB_ID(0x0d8c, 0x0316): /* Hegel HD12 DSD */
case USB_ID(0x16b0, 0x06b2): /* NuPrime DAC-10 */
#define wmb() asm volatile("dmb ishst" ::: "memory")
#define rmb() asm volatile("dmb ishld" ::: "memory")
-#define smp_store_release(p, v) \
-do { \
- union { typeof(*p) __val; char __c[1]; } __u = \
- { .__val = (__force typeof(*p)) (v) }; \
- \
- switch (sizeof(*p)) { \
- case 1: \
- asm volatile ("stlrb %w1, %0" \
- : "=Q" (*p) \
- : "r" (*(__u8 *)__u.__c) \
- : "memory"); \
- break; \
- case 2: \
- asm volatile ("stlrh %w1, %0" \
- : "=Q" (*p) \
- : "r" (*(__u16 *)__u.__c) \
- : "memory"); \
- break; \
- case 4: \
- asm volatile ("stlr %w1, %0" \
- : "=Q" (*p) \
- : "r" (*(__u32 *)__u.__c) \
- : "memory"); \
- break; \
- case 8: \
- asm volatile ("stlr %1, %0" \
- : "=Q" (*p) \
- : "r" (*(__u64 *)__u.__c) \
- : "memory"); \
- break; \
- default: \
- /* Only to shut up gcc ... */ \
- mb(); \
- break; \
- } \
+#define smp_store_release(p, v) \
+do { \
+ union { typeof(*p) __val; char __c[1]; } __u = \
+ { .__val = (v) }; \
+ \
+ switch (sizeof(*p)) { \
+ case 1: \
+ asm volatile ("stlrb %w1, %0" \
+ : "=Q" (*p) \
+ : "r" (*(__u8_alias_t *)__u.__c) \
+ : "memory"); \
+ break; \
+ case 2: \
+ asm volatile ("stlrh %w1, %0" \
+ : "=Q" (*p) \
+ : "r" (*(__u16_alias_t *)__u.__c) \
+ : "memory"); \
+ break; \
+ case 4: \
+ asm volatile ("stlr %w1, %0" \
+ : "=Q" (*p) \
+ : "r" (*(__u32_alias_t *)__u.__c) \
+ : "memory"); \
+ break; \
+ case 8: \
+ asm volatile ("stlr %1, %0" \
+ : "=Q" (*p) \
+ : "r" (*(__u64_alias_t *)__u.__c) \
+ : "memory"); \
+ break; \
+ default: \
+ /* Only to shut up gcc ... */ \
+ mb(); \
+ break; \
+ } \
} while (0)
-#define smp_load_acquire(p) \
-({ \
- union { typeof(*p) __val; char __c[1]; } __u; \
- \
- switch (sizeof(*p)) { \
- case 1: \
- asm volatile ("ldarb %w0, %1" \
- : "=r" (*(__u8 *)__u.__c) \
- : "Q" (*p) : "memory"); \
- break; \
- case 2: \
- asm volatile ("ldarh %w0, %1" \
- : "=r" (*(__u16 *)__u.__c) \
- : "Q" (*p) : "memory"); \
- break; \
- case 4: \
- asm volatile ("ldar %w0, %1" \
- : "=r" (*(__u32 *)__u.__c) \
- : "Q" (*p) : "memory"); \
- break; \
- case 8: \
- asm volatile ("ldar %0, %1" \
- : "=r" (*(__u64 *)__u.__c) \
- : "Q" (*p) : "memory"); \
- break; \
- default: \
- /* Only to shut up gcc ... */ \
- mb(); \
- break; \
- } \
- __u.__val; \
+#define smp_load_acquire(p) \
+({ \
+ union { typeof(*p) __val; char __c[1]; } __u = \
+ { .__c = { 0 } }; \
+ \
+ switch (sizeof(*p)) { \
+ case 1: \
+ asm volatile ("ldarb %w0, %1" \
+ : "=r" (*(__u8_alias_t *)__u.__c) \
+ : "Q" (*p) : "memory"); \
+ break; \
+ case 2: \
+ asm volatile ("ldarh %w0, %1" \
+ : "=r" (*(__u16_alias_t *)__u.__c) \
+ : "Q" (*p) : "memory"); \
+ break; \
+ case 4: \
+ asm volatile ("ldar %w0, %1" \
+ : "=r" (*(__u32_alias_t *)__u.__c) \
+ : "Q" (*p) : "memory"); \
+ break; \
+ case 8: \
+ asm volatile ("ldar %0, %1" \
+ : "=r" (*(__u64_alias_t *)__u.__c) \
+ : "Q" (*p) : "memory"); \
+ break; \
+ default: \
+ /* Only to shut up gcc ... */ \
+ mb(); \
+ break; \
+ } \
+ __u.__val; \
})
#endif /* _TOOLS_LINUX_ASM_AARCH64_BARRIER_H */
*/
#define __ARCH_WANT_RENAMEAT
+#define __ARCH_WANT_NEW_STAT
#include <asm-generic/unistd.h>
#define KVM_REG_PPC_DEC_EXPIRY (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xbe)
#define KVM_REG_PPC_ONLINE (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xbf)
+#define KVM_REG_PPC_PTCR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc0)
/* Transactional Memory checkpointed state:
* This is all GPRs, all VSX regs and a subset of SPRs
#define KVM_S390_VM_CRYPTO_ENABLE_DEA_KW 1
#define KVM_S390_VM_CRYPTO_DISABLE_AES_KW 2
#define KVM_S390_VM_CRYPTO_DISABLE_DEA_KW 3
+#define KVM_S390_VM_CRYPTO_ENABLE_APIE 4
+#define KVM_S390_VM_CRYPTO_DISABLE_APIE 5
/* kvm attributes for migration mode */
#define KVM_S390_VM_MIGRATION_STOP 0
#define X86_FEATURE_LA57 (16*32+16) /* 5-level page tables */
#define X86_FEATURE_RDPID (16*32+22) /* RDPID instruction */
#define X86_FEATURE_CLDEMOTE (16*32+25) /* CLDEMOTE instruction */
+#define X86_FEATURE_MOVDIRI (16*32+27) /* MOVDIRI instruction */
+#define X86_FEATURE_MOVDIR64B (16*32+28) /* MOVDIR64B instruction */
/* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */
#define X86_FEATURE_OVERFLOW_RECOV (17*32+ 0) /* MCA overflow recovery support */
__u8 injected;
__u8 nr;
__u8 has_error_code;
- union {
- __u8 pad;
- __u8 pending;
- };
+ __u8 pending;
__u32 error_code;
} exception;
struct {
#define KVM_STATE_NESTED_GUEST_MODE 0x00000001
#define KVM_STATE_NESTED_RUN_PENDING 0x00000002
+#define KVM_STATE_NESTED_EVMCS 0x00000004
#define KVM_STATE_NESTED_SMM_GUEST_MODE 0x00000001
#define KVM_STATE_NESTED_SMM_VMXON 0x00000002
SEE ALSO
========
- **bpftool**\ (8), **bpftool-prog**\ (8), **bpftool-map**\ (8)
+ **bpf**\ (2),
+ **bpf-helpers**\ (7),
+ **bpftool**\ (8),
+ **bpftool-prog**\ (8),
+ **bpftool-map**\ (8),
+ **bpftool-net**\ (8),
+ **bpftool-perf**\ (8)
SEE ALSO
========
- **bpftool**\ (8), **bpftool-prog**\ (8), **bpftool-cgroup**\ (8)
+ **bpf**\ (2),
+ **bpf-helpers**\ (7),
+ **bpftool**\ (8),
+ **bpftool-prog**\ (8),
+ **bpftool-cgroup**\ (8),
+ **bpftool-net**\ (8),
+ **bpftool-perf**\ (8)
SEE ALSO
========
- **bpftool**\ (8), **bpftool-prog**\ (8), **bpftool-map**\ (8)
+ **bpf**\ (2),
+ **bpf-helpers**\ (7),
+ **bpftool**\ (8),
+ **bpftool-prog**\ (8),
+ **bpftool-map**\ (8),
+ **bpftool-cgroup**\ (8),
+ **bpftool-perf**\ (8)
SEE ALSO
========
- **bpftool**\ (8), **bpftool-prog**\ (8), **bpftool-map**\ (8)
+ **bpf**\ (2),
+ **bpf-helpers**\ (7),
+ **bpftool**\ (8),
+ **bpftool-prog**\ (8),
+ **bpftool-map**\ (8),
+ **bpftool-cgroup**\ (8),
+ **bpftool-net**\ (8)
Generate human-readable JSON output. Implies **-j**.
-f, --bpffs
- Show file names of pinned programs.
+ When showing BPF programs, show file names of pinned
+ programs.
EXAMPLES
========
SEE ALSO
========
- **bpftool**\ (8), **bpftool-map**\ (8), **bpftool-cgroup**\ (8)
+ **bpf**\ (2),
+ **bpf-helpers**\ (7),
+ **bpftool**\ (8),
+ **bpftool-map**\ (8),
+ **bpftool-cgroup**\ (8),
+ **bpftool-net**\ (8),
+ **bpftool-perf**\ (8)
SEE ALSO
========
- **bpftool-map**\ (8), **bpftool-prog**\ (8), **bpftool-cgroup**\ (8)
- **bpftool-perf**\ (8), **bpftool-net**\ (8)
+ **bpf**\ (2),
+ **bpf-helpers**\ (7),
+ **bpftool-prog**\ (8),
+ **bpftool-map**\ (8),
+ **bpftool-cgroup**\ (8),
+ **bpftool-net**\ (8),
+ **bpftool-perf**\ (8)
}
static int btf_dumper_modifier(const struct btf_dumper *d, __u32 type_id,
- const void *data)
+ __u8 bit_offset, const void *data)
{
int actual_type_id;
if (actual_type_id < 0)
return actual_type_id;
- return btf_dumper_do_type(d, actual_type_id, 0, data);
+ return btf_dumper_do_type(d, actual_type_id, bit_offset, data);
}
static void btf_dumper_enum(const void *data, json_writer_t *jw)
case BTF_KIND_VOLATILE:
case BTF_KIND_CONST:
case BTF_KIND_RESTRICT:
- return btf_dumper_modifier(d, type_id, data);
+ return btf_dumper_modifier(d, type_id, bit_offset, data);
default:
jsonw_printf(d->jw, "(unsupported-kind");
return -EINVAL;
return 0;
}
-int open_obj_pinned(char *path)
+int open_obj_pinned(char *path, bool quiet)
{
int fd;
fd = bpf_obj_get(path);
if (fd < 0) {
- p_err("bpf obj get (%s): %s", path,
- errno == EACCES && !is_bpffs(dirname(path)) ?
- "directory not in bpf file system (bpffs)" :
- strerror(errno));
+ if (!quiet)
+ p_err("bpf obj get (%s): %s", path,
+ errno == EACCES && !is_bpffs(dirname(path)) ?
+ "directory not in bpf file system (bpffs)" :
+ strerror(errno));
return -1;
}
enum bpf_obj_type type;
int fd;
- fd = open_obj_pinned(path);
+ fd = open_obj_pinned(path, false);
if (fd < 0)
return -1;
return NULL;
}
- while ((n = getline(&line, &line_n, fdi))) {
+ while ((n = getline(&line, &line_n, fdi)) > 0) {
char *value;
int len;
while ((ftse = fts_read(fts))) {
if (!(ftse->fts_info & FTS_F))
continue;
- fd = open_obj_pinned(ftse->fts_path);
+ fd = open_obj_pinned(ftse->fts_path, true);
if (fd < 0)
continue;
int get_fd_type(int fd);
const char *get_fd_type_name(enum bpf_obj_type type);
char *get_fdinfo(int fd, const char *key);
-int open_obj_pinned(char *path);
+int open_obj_pinned(char *path, bool quiet);
int open_obj_pinned_any(char *path, enum bpf_obj_type exp_type);
int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(__u32));
int do_pin_fd(int fd, const char *name);
if (!hash_empty(prog_table.table)) {
struct pinned_obj *obj;
- printf("\n");
hash_for_each_possible(prog_table.table, obj, hash, info->id) {
if (obj->id == info->id)
- printf("\tpinned %s\n", obj->path);
+ printf("\n\tpinned %s", obj->path);
}
}
}
NEXT_ARG();
} else if (is_prefix(*argv, "map")) {
+ void *new_map_replace;
char *endptr, *name;
int fd;
if (fd < 0)
goto err_free_reuse_maps;
- map_replace = reallocarray(map_replace, old_map_fds + 1,
- sizeof(*map_replace));
- if (!map_replace) {
+ new_map_replace = reallocarray(map_replace,
+ old_map_fds + 1,
+ sizeof(*map_replace));
+ if (!new_map_replace) {
p_err("mem alloc failed");
goto err_free_reuse_maps;
}
+ map_replace = new_map_replace;
+
map_replace[old_map_fds].idx = idx;
map_replace[old_map_fds].name = name;
map_replace[old_map_fds].fd = fd;
dwarf_getlocations \
fortify-source \
sync-compare-and-swap \
+ get_current_dir_name \
glibc \
gtk2 \
gtk2-infobar \
test-dwarf_getlocations.bin \
test-fortify-source.bin \
test-sync-compare-and-swap.bin \
+ test-get_current_dir_name.bin \
test-glibc.bin \
test-gtk2.bin \
test-gtk2-infobar.bin \
$(OUTPUT)test-libelf.bin:
$(BUILD) -lelf
+$(OUTPUT)test-get_current_dir_name.bin:
+ $(BUILD)
+
$(OUTPUT)test-glibc.bin:
$(BUILD)
# include "test-libelf-mmap.c"
#undef main
+#define main main_test_get_current_dir_name
+# include "test-get_current_dir_name.c"
+#undef main
+
#define main main_test_glibc
# include "test-glibc.c"
#undef main
main_test_hello();
main_test_libelf();
main_test_libelf_mmap();
+ main_test_get_current_dir_name();
main_test_glibc();
main_test_dwarf();
main_test_dwarf_getlocations();
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <unistd.h>
+#include <stdlib.h>
+
+int main(void)
+{
+ free(get_current_dir_name());
+ return 0;
+}
#define TIOCGPTLCK _IOR('T', 0x39, int) /* Get Pty lock state */
#define TIOCGEXCL _IOR('T', 0x40, int) /* Get exclusive mode state */
#define TIOCGPTPEER _IO('T', 0x41) /* Safely open the slave */
+#define TIOCGISO7816 _IOR('T', 0x42, struct serial_iso7816)
+#define TIOCSISO7816 _IOWR('T', 0x43, struct serial_iso7816)
#define FIONCLEX 0x5450
#define FIOCLEX 0x5451
/* fs/stat.c */
#define __NR_readlinkat 78
__SYSCALL(__NR_readlinkat, sys_readlinkat)
+#if defined(__ARCH_WANT_NEW_STAT) || defined(__ARCH_WANT_STAT64)
#define __NR3264_fstatat 79
__SC_3264(__NR3264_fstatat, sys_fstatat64, sys_newfstatat)
#define __NR3264_fstat 80
__SC_3264(__NR3264_fstat, sys_fstat64, sys_newfstat)
+#endif
/* fs/sync.c */
#define __NR_sync 81
*/
#define I915_PARAM_CS_TIMESTAMP_FREQUENCY 51
+/*
+ * Once upon a time we supposed that writes through the GGTT would be
+ * immediately in physical memory (once flushed out of the CPU path). However,
+ * on a few different processors and chipsets, this is not necessarily the case
+ * as the writes appear to be buffered internally. Thus a read of the backing
+ * storage (physical memory) via a different path (with different physical tags
+ * to the indirect write via the GGTT) will see stale values from before
+ * the GGTT write. Inside the kernel, we can for the most part keep track of
+ * the different read/write domains in use (e.g. set-domain), but the assumption
+ * of coherency is baked into the ABI, hence reporting its true state in this
+ * parameter.
+ *
+ * Reports true when writes via mmap_gtt are immediately visible following an
+ * lfence to flush the WCB.
+ *
+ * Reports false when writes via mmap_gtt are indeterminately delayed in an in
+ * internal buffer and are _not_ immediately visible to third parties accessing
+ * directly via mmap_cpu/mmap_wc. Use of mmap_gtt as part of an IPC
+ * communications channel when reporting false is strongly disadvised.
+ */
+#define I915_PARAM_MMAP_GTT_COHERENT 52
+
typedef struct drm_i915_getparam {
__s32 param;
/*
* Return
* 0 on success, or a negative error in case of failure.
*
- * struct bpf_sock *bpf_sk_lookup_tcp(void *ctx, struct bpf_sock_tuple *tuple, u32 tuple_size, u32 netns, u64 flags)
+ * struct bpf_sock *bpf_sk_lookup_tcp(void *ctx, struct bpf_sock_tuple *tuple, u32 tuple_size, u64 netns, u64 flags)
* Description
* Look for TCP socket matching *tuple*, optionally in a child
* network namespace *netns*. The return value must be checked,
* **sizeof**\ (*tuple*\ **->ipv6**)
* Look for an IPv6 socket.
*
- * If the *netns* is zero, then the socket lookup table in the
- * netns associated with the *ctx* will be used. For the TC hooks,
- * this in the netns of the device in the skb. For socket hooks,
- * this in the netns of the socket. If *netns* is non-zero, then
- * it specifies the ID of the netns relative to the netns
- * associated with the *ctx*.
+ * If the *netns* is a negative signed 32-bit integer, then the
+ * socket lookup table in the netns associated with the *ctx* will
+ * will be used. For the TC hooks, this is the netns of the device
+ * in the skb. For socket hooks, this is the netns of the socket.
+ * If *netns* is any other signed 32-bit value greater than or
+ * equal to zero then it specifies the ID of the netns relative to
+ * the netns associated with the *ctx*. *netns* values beyond the
+ * range of 32-bit integers are reserved for future use.
*
* All values for *flags* are reserved for future usage, and must
* be left at zero.
* **CONFIG_NET** configuration option.
* Return
* Pointer to *struct bpf_sock*, or NULL in case of failure.
+ * For sockets with reuseport option, the *struct bpf_sock*
+ * result is from reuse->socks[] using the hash of the tuple.
*
- * struct bpf_sock *bpf_sk_lookup_udp(void *ctx, struct bpf_sock_tuple *tuple, u32 tuple_size, u32 netns, u64 flags)
+ * struct bpf_sock *bpf_sk_lookup_udp(void *ctx, struct bpf_sock_tuple *tuple, u32 tuple_size, u64 netns, u64 flags)
* Description
* Look for UDP socket matching *tuple*, optionally in a child
* network namespace *netns*. The return value must be checked,
* **sizeof**\ (*tuple*\ **->ipv6**)
* Look for an IPv6 socket.
*
- * If the *netns* is zero, then the socket lookup table in the
- * netns associated with the *ctx* will be used. For the TC hooks,
- * this in the netns of the device in the skb. For socket hooks,
- * this in the netns of the socket. If *netns* is non-zero, then
- * it specifies the ID of the netns relative to the netns
- * associated with the *ctx*.
+ * If the *netns* is a negative signed 32-bit integer, then the
+ * socket lookup table in the netns associated with the *ctx* will
+ * will be used. For the TC hooks, this is the netns of the device
+ * in the skb. For socket hooks, this is the netns of the socket.
+ * If *netns* is any other signed 32-bit value greater than or
+ * equal to zero then it specifies the ID of the netns relative to
+ * the netns associated with the *ctx*. *netns* values beyond the
+ * range of 32-bit integers are reserved for future use.
*
* All values for *flags* are reserved for future usage, and must
* be left at zero.
* **CONFIG_NET** configuration option.
* Return
* Pointer to *struct bpf_sock*, or NULL in case of failure.
+ * For sockets with reuseport option, the *struct bpf_sock*
+ * result is from reuse->socks[] using the hash of the tuple.
*
* int bpf_sk_release(struct bpf_sock *sk)
* Description
/* BPF_FUNC_perf_event_output for sk_buff input context. */
#define BPF_F_CTXLEN_MASK (0xfffffULL << 32)
+/* Current network namespace */
+#define BPF_F_CURRENT_NETNS (-1L)
+
/* Mode for BPF_FUNC_skb_adjust_room helper. */
enum bpf_adj_room_mode {
BPF_ADJ_ROOM_NET,
BPF_LWT_ENCAP_SEG6_INLINE
};
+#define __bpf_md_ptr(type, name) \
+union { \
+ type name; \
+ __u64 :64; \
+} __attribute__((aligned(8)))
+
/* user accessible mirror of in-kernel sk_buff.
* new fields can only be added to the end of this structure
*/
/* ... here. */
__u32 data_meta;
- struct bpf_flow_keys *flow_keys;
+ __bpf_md_ptr(struct bpf_flow_keys *, flow_keys);
};
struct bpf_tunnel_key {
* be added to the end of this structure
*/
struct sk_msg_md {
- void *data;
- void *data_end;
+ __bpf_md_ptr(void *, data);
+ __bpf_md_ptr(void *, data_end);
__u32 family;
__u32 remote_ip4; /* Stored in network byte order */
* Start of directly accessible data. It begins from
* the tcp/udp header.
*/
- void *data;
- void *data_end; /* End of directly accessible data */
+ __bpf_md_ptr(void *, data);
+ /* End of directly accessible data */
+ __bpf_md_ptr(void *, data_end);
/*
* Total length of packet (starting from the tcp/udp header).
* Note that the directly accessible bytes (data_end - data)
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_LINUX_FS_H
+#define _UAPI_LINUX_FS_H
+
+/*
+ * This file has definitions for some important file table structures
+ * and constants and structures used by various generic file system
+ * ioctl's. Please do not make any changes in this file before
+ * sending patches for review to linux-fsdevel@vger.kernel.org and
+ * linux-api@vger.kernel.org.
+ */
+
+#include <linux/limits.h>
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+/*
+ * It's silly to have NR_OPEN bigger than NR_FILE, but you can change
+ * the file limit at runtime and only root can increase the per-process
+ * nr_file rlimit, so it's safe to set up a ridiculously high absolute
+ * upper limit on files-per-process.
+ *
+ * Some programs (notably those using select()) may have to be
+ * recompiled to take full advantage of the new limits..
+ */
+
+/* Fixed constants first: */
+#undef NR_OPEN
+#define INR_OPEN_CUR 1024 /* Initial setting for nfile rlimits */
+#define INR_OPEN_MAX 4096 /* Hard limit for nfile rlimits */
+
+#define BLOCK_SIZE_BITS 10
+#define BLOCK_SIZE (1<<BLOCK_SIZE_BITS)
+
+#define SEEK_SET 0 /* seek relative to beginning of file */
+#define SEEK_CUR 1 /* seek relative to current file position */
+#define SEEK_END 2 /* seek relative to end of file */
+#define SEEK_DATA 3 /* seek to the next data */
+#define SEEK_HOLE 4 /* seek to the next hole */
+#define SEEK_MAX SEEK_HOLE
+
+#define RENAME_NOREPLACE (1 << 0) /* Don't overwrite target */
+#define RENAME_EXCHANGE (1 << 1) /* Exchange source and dest */
+#define RENAME_WHITEOUT (1 << 2) /* Whiteout source */
+
+struct file_clone_range {
+ __s64 src_fd;
+ __u64 src_offset;
+ __u64 src_length;
+ __u64 dest_offset;
+};
+
+struct fstrim_range {
+ __u64 start;
+ __u64 len;
+ __u64 minlen;
+};
+
+/* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions */
+#define FILE_DEDUPE_RANGE_SAME 0
+#define FILE_DEDUPE_RANGE_DIFFERS 1
+
+/* from struct btrfs_ioctl_file_extent_same_info */
+struct file_dedupe_range_info {
+ __s64 dest_fd; /* in - destination file */
+ __u64 dest_offset; /* in - start of extent in destination */
+ __u64 bytes_deduped; /* out - total # of bytes we were able
+ * to dedupe from this file. */
+ /* status of this dedupe operation:
+ * < 0 for error
+ * == FILE_DEDUPE_RANGE_SAME if dedupe succeeds
+ * == FILE_DEDUPE_RANGE_DIFFERS if data differs
+ */
+ __s32 status; /* out - see above description */
+ __u32 reserved; /* must be zero */
+};
+
+/* from struct btrfs_ioctl_file_extent_same_args */
+struct file_dedupe_range {
+ __u64 src_offset; /* in - start of extent in source */
+ __u64 src_length; /* in - length of extent */
+ __u16 dest_count; /* in - total elements in info array */
+ __u16 reserved1; /* must be zero */
+ __u32 reserved2; /* must be zero */
+ struct file_dedupe_range_info info[0];
+};
+
+/* And dynamically-tunable limits and defaults: */
+struct files_stat_struct {
+ unsigned long nr_files; /* read only */
+ unsigned long nr_free_files; /* read only */
+ unsigned long max_files; /* tunable */
+};
+
+struct inodes_stat_t {
+ long nr_inodes;
+ long nr_unused;
+ long dummy[5]; /* padding for sysctl ABI compatibility */
+};
+
+
+#define NR_FILE 8192 /* this can well be larger on a larger system */
+
+
+/*
+ * These are the fs-independent mount-flags: up to 32 flags are supported
+ */
+#define MS_RDONLY 1 /* Mount read-only */
+#define MS_NOSUID 2 /* Ignore suid and sgid bits */
+#define MS_NODEV 4 /* Disallow access to device special files */
+#define MS_NOEXEC 8 /* Disallow program execution */
+#define MS_SYNCHRONOUS 16 /* Writes are synced at once */
+#define MS_REMOUNT 32 /* Alter flags of a mounted FS */
+#define MS_MANDLOCK 64 /* Allow mandatory locks on an FS */
+#define MS_DIRSYNC 128 /* Directory modifications are synchronous */
+#define MS_NOATIME 1024 /* Do not update access times. */
+#define MS_NODIRATIME 2048 /* Do not update directory access times */
+#define MS_BIND 4096
+#define MS_MOVE 8192
+#define MS_REC 16384
+#define MS_VERBOSE 32768 /* War is peace. Verbosity is silence.
+ MS_VERBOSE is deprecated. */
+#define MS_SILENT 32768
+#define MS_POSIXACL (1<<16) /* VFS does not apply the umask */
+#define MS_UNBINDABLE (1<<17) /* change to unbindable */
+#define MS_PRIVATE (1<<18) /* change to private */
+#define MS_SLAVE (1<<19) /* change to slave */
+#define MS_SHARED (1<<20) /* change to shared */
+#define MS_RELATIME (1<<21) /* Update atime relative to mtime/ctime. */
+#define MS_KERNMOUNT (1<<22) /* this is a kern_mount call */
+#define MS_I_VERSION (1<<23) /* Update inode I_version field */
+#define MS_STRICTATIME (1<<24) /* Always perform atime updates */
+#define MS_LAZYTIME (1<<25) /* Update the on-disk [acm]times lazily */
+
+/* These sb flags are internal to the kernel */
+#define MS_SUBMOUNT (1<<26)
+#define MS_NOREMOTELOCK (1<<27)
+#define MS_NOSEC (1<<28)
+#define MS_BORN (1<<29)
+#define MS_ACTIVE (1<<30)
+#define MS_NOUSER (1<<31)
+
+/*
+ * Superblock flags that can be altered by MS_REMOUNT
+ */
+#define MS_RMT_MASK (MS_RDONLY|MS_SYNCHRONOUS|MS_MANDLOCK|MS_I_VERSION|\
+ MS_LAZYTIME)
+
+/*
+ * Old magic mount flag and mask
+ */
+#define MS_MGC_VAL 0xC0ED0000
+#define MS_MGC_MSK 0xffff0000
+
+/*
+ * Structure for FS_IOC_FSGETXATTR[A] and FS_IOC_FSSETXATTR.
+ */
+struct fsxattr {
+ __u32 fsx_xflags; /* xflags field value (get/set) */
+ __u32 fsx_extsize; /* extsize field value (get/set)*/
+ __u32 fsx_nextents; /* nextents field value (get) */
+ __u32 fsx_projid; /* project identifier (get/set) */
+ __u32 fsx_cowextsize; /* CoW extsize field value (get/set)*/
+ unsigned char fsx_pad[8];
+};
+
+/*
+ * Flags for the fsx_xflags field
+ */
+#define FS_XFLAG_REALTIME 0x00000001 /* data in realtime volume */
+#define FS_XFLAG_PREALLOC 0x00000002 /* preallocated file extents */
+#define FS_XFLAG_IMMUTABLE 0x00000008 /* file cannot be modified */
+#define FS_XFLAG_APPEND 0x00000010 /* all writes append */
+#define FS_XFLAG_SYNC 0x00000020 /* all writes synchronous */
+#define FS_XFLAG_NOATIME 0x00000040 /* do not update access time */
+#define FS_XFLAG_NODUMP 0x00000080 /* do not include in backups */
+#define FS_XFLAG_RTINHERIT 0x00000100 /* create with rt bit set */
+#define FS_XFLAG_PROJINHERIT 0x00000200 /* create with parents projid */
+#define FS_XFLAG_NOSYMLINKS 0x00000400 /* disallow symlink creation */
+#define FS_XFLAG_EXTSIZE 0x00000800 /* extent size allocator hint */
+#define FS_XFLAG_EXTSZINHERIT 0x00001000 /* inherit inode extent size */
+#define FS_XFLAG_NODEFRAG 0x00002000 /* do not defragment */
+#define FS_XFLAG_FILESTREAM 0x00004000 /* use filestream allocator */
+#define FS_XFLAG_DAX 0x00008000 /* use DAX for IO */
+#define FS_XFLAG_COWEXTSIZE 0x00010000 /* CoW extent size allocator hint */
+#define FS_XFLAG_HASATTR 0x80000000 /* no DIFLAG for this */
+
+/* the read-only stuff doesn't really belong here, but any other place is
+ probably as bad and I don't want to create yet another include file. */
+
+#define BLKROSET _IO(0x12,93) /* set device read-only (0 = read-write) */
+#define BLKROGET _IO(0x12,94) /* get read-only status (0 = read_write) */
+#define BLKRRPART _IO(0x12,95) /* re-read partition table */
+#define BLKGETSIZE _IO(0x12,96) /* return device size /512 (long *arg) */
+#define BLKFLSBUF _IO(0x12,97) /* flush buffer cache */
+#define BLKRASET _IO(0x12,98) /* set read ahead for block device */
+#define BLKRAGET _IO(0x12,99) /* get current read ahead setting */
+#define BLKFRASET _IO(0x12,100)/* set filesystem (mm/filemap.c) read-ahead */
+#define BLKFRAGET _IO(0x12,101)/* get filesystem (mm/filemap.c) read-ahead */
+#define BLKSECTSET _IO(0x12,102)/* set max sectors per request (ll_rw_blk.c) */
+#define BLKSECTGET _IO(0x12,103)/* get max sectors per request (ll_rw_blk.c) */
+#define BLKSSZGET _IO(0x12,104)/* get block device sector size */
+#if 0
+#define BLKPG _IO(0x12,105)/* See blkpg.h */
+
+/* Some people are morons. Do not use sizeof! */
+
+#define BLKELVGET _IOR(0x12,106,size_t)/* elevator get */
+#define BLKELVSET _IOW(0x12,107,size_t)/* elevator set */
+/* This was here just to show that the number is taken -
+ probably all these _IO(0x12,*) ioctls should be moved to blkpg.h. */
+#endif
+/* A jump here: 108-111 have been used for various private purposes. */
+#define BLKBSZGET _IOR(0x12,112,size_t)
+#define BLKBSZSET _IOW(0x12,113,size_t)
+#define BLKGETSIZE64 _IOR(0x12,114,size_t) /* return device size in bytes (u64 *arg) */
+#define BLKTRACESETUP _IOWR(0x12,115,struct blk_user_trace_setup)
+#define BLKTRACESTART _IO(0x12,116)
+#define BLKTRACESTOP _IO(0x12,117)
+#define BLKTRACETEARDOWN _IO(0x12,118)
+#define BLKDISCARD _IO(0x12,119)
+#define BLKIOMIN _IO(0x12,120)
+#define BLKIOOPT _IO(0x12,121)
+#define BLKALIGNOFF _IO(0x12,122)
+#define BLKPBSZGET _IO(0x12,123)
+#define BLKDISCARDZEROES _IO(0x12,124)
+#define BLKSECDISCARD _IO(0x12,125)
+#define BLKROTATIONAL _IO(0x12,126)
+#define BLKZEROOUT _IO(0x12,127)
+/*
+ * A jump here: 130-131 are reserved for zoned block devices
+ * (see uapi/linux/blkzoned.h)
+ */
+
+#define BMAP_IOCTL 1 /* obsolete - kept for compatibility */
+#define FIBMAP _IO(0x00,1) /* bmap access */
+#define FIGETBSZ _IO(0x00,2) /* get the block size used for bmap */
+#define FIFREEZE _IOWR('X', 119, int) /* Freeze */
+#define FITHAW _IOWR('X', 120, int) /* Thaw */
+#define FITRIM _IOWR('X', 121, struct fstrim_range) /* Trim */
+#define FICLONE _IOW(0x94, 9, int)
+#define FICLONERANGE _IOW(0x94, 13, struct file_clone_range)
+#define FIDEDUPERANGE _IOWR(0x94, 54, struct file_dedupe_range)
+
+#define FSLABEL_MAX 256 /* Max chars for the interface; each fs may differ */
+
+#define FS_IOC_GETFLAGS _IOR('f', 1, long)
+#define FS_IOC_SETFLAGS _IOW('f', 2, long)
+#define FS_IOC_GETVERSION _IOR('v', 1, long)
+#define FS_IOC_SETVERSION _IOW('v', 2, long)
+#define FS_IOC_FIEMAP _IOWR('f', 11, struct fiemap)
+#define FS_IOC32_GETFLAGS _IOR('f', 1, int)
+#define FS_IOC32_SETFLAGS _IOW('f', 2, int)
+#define FS_IOC32_GETVERSION _IOR('v', 1, int)
+#define FS_IOC32_SETVERSION _IOW('v', 2, int)
+#define FS_IOC_FSGETXATTR _IOR('X', 31, struct fsxattr)
+#define FS_IOC_FSSETXATTR _IOW('X', 32, struct fsxattr)
+#define FS_IOC_GETFSLABEL _IOR(0x94, 49, char[FSLABEL_MAX])
+#define FS_IOC_SETFSLABEL _IOW(0x94, 50, char[FSLABEL_MAX])
+
+/*
+ * File system encryption support
+ */
+/* Policy provided via an ioctl on the topmost directory */
+#define FS_KEY_DESCRIPTOR_SIZE 8
+
+#define FS_POLICY_FLAGS_PAD_4 0x00
+#define FS_POLICY_FLAGS_PAD_8 0x01
+#define FS_POLICY_FLAGS_PAD_16 0x02
+#define FS_POLICY_FLAGS_PAD_32 0x03
+#define FS_POLICY_FLAGS_PAD_MASK 0x03
+#define FS_POLICY_FLAGS_VALID 0x03
+
+/* Encryption algorithms */
+#define FS_ENCRYPTION_MODE_INVALID 0
+#define FS_ENCRYPTION_MODE_AES_256_XTS 1
+#define FS_ENCRYPTION_MODE_AES_256_GCM 2
+#define FS_ENCRYPTION_MODE_AES_256_CBC 3
+#define FS_ENCRYPTION_MODE_AES_256_CTS 4
+#define FS_ENCRYPTION_MODE_AES_128_CBC 5
+#define FS_ENCRYPTION_MODE_AES_128_CTS 6
+#define FS_ENCRYPTION_MODE_SPECK128_256_XTS 7 /* Removed, do not use. */
+#define FS_ENCRYPTION_MODE_SPECK128_256_CTS 8 /* Removed, do not use. */
+
+struct fscrypt_policy {
+ __u8 version;
+ __u8 contents_encryption_mode;
+ __u8 filenames_encryption_mode;
+ __u8 flags;
+ __u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
+};
+
+#define FS_IOC_SET_ENCRYPTION_POLICY _IOR('f', 19, struct fscrypt_policy)
+#define FS_IOC_GET_ENCRYPTION_PWSALT _IOW('f', 20, __u8[16])
+#define FS_IOC_GET_ENCRYPTION_POLICY _IOW('f', 21, struct fscrypt_policy)
+
+/* Parameters for passing an encryption key into the kernel keyring */
+#define FS_KEY_DESC_PREFIX "fscrypt:"
+#define FS_KEY_DESC_PREFIX_SIZE 8
+
+/* Structure that userspace passes to the kernel keyring */
+#define FS_MAX_KEY_SIZE 64
+
+struct fscrypt_key {
+ __u32 mode;
+ __u8 raw[FS_MAX_KEY_SIZE];
+ __u32 size;
+};
+
+/*
+ * Inode flags (FS_IOC_GETFLAGS / FS_IOC_SETFLAGS)
+ *
+ * Note: for historical reasons, these flags were originally used and
+ * defined for use by ext2/ext3, and then other file systems started
+ * using these flags so they wouldn't need to write their own version
+ * of chattr/lsattr (which was shipped as part of e2fsprogs). You
+ * should think twice before trying to use these flags in new
+ * contexts, or trying to assign these flags, since they are used both
+ * as the UAPI and the on-disk encoding for ext2/3/4. Also, we are
+ * almost out of 32-bit flags. :-)
+ *
+ * We have recently hoisted FS_IOC_FSGETXATTR / FS_IOC_FSSETXATTR from
+ * XFS to the generic FS level interface. This uses a structure that
+ * has padding and hence has more room to grow, so it may be more
+ * appropriate for many new use cases.
+ *
+ * Please do not change these flags or interfaces before checking with
+ * linux-fsdevel@vger.kernel.org and linux-api@vger.kernel.org.
+ */
+#define FS_SECRM_FL 0x00000001 /* Secure deletion */
+#define FS_UNRM_FL 0x00000002 /* Undelete */
+#define FS_COMPR_FL 0x00000004 /* Compress file */
+#define FS_SYNC_FL 0x00000008 /* Synchronous updates */
+#define FS_IMMUTABLE_FL 0x00000010 /* Immutable file */
+#define FS_APPEND_FL 0x00000020 /* writes to file may only append */
+#define FS_NODUMP_FL 0x00000040 /* do not dump file */
+#define FS_NOATIME_FL 0x00000080 /* do not update atime */
+/* Reserved for compression usage... */
+#define FS_DIRTY_FL 0x00000100
+#define FS_COMPRBLK_FL 0x00000200 /* One or more compressed clusters */
+#define FS_NOCOMP_FL 0x00000400 /* Don't compress */
+/* End compression flags --- maybe not all used */
+#define FS_ENCRYPT_FL 0x00000800 /* Encrypted file */
+#define FS_BTREE_FL 0x00001000 /* btree format dir */
+#define FS_INDEX_FL 0x00001000 /* hash-indexed directory */
+#define FS_IMAGIC_FL 0x00002000 /* AFS directory */
+#define FS_JOURNAL_DATA_FL 0x00004000 /* Reserved for ext3 */
+#define FS_NOTAIL_FL 0x00008000 /* file tail should not be merged */
+#define FS_DIRSYNC_FL 0x00010000 /* dirsync behaviour (directories only) */
+#define FS_TOPDIR_FL 0x00020000 /* Top of directory hierarchies*/
+#define FS_HUGE_FILE_FL 0x00040000 /* Reserved for ext4 */
+#define FS_EXTENT_FL 0x00080000 /* Extents */
+#define FS_EA_INODE_FL 0x00200000 /* Inode used for large EA */
+#define FS_EOFBLOCKS_FL 0x00400000 /* Reserved for ext4 */
+#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
+#define FS_INLINE_DATA_FL 0x10000000 /* Reserved for ext4 */
+#define FS_PROJINHERIT_FL 0x20000000 /* Create with parents projid */
+#define FS_RESERVED_FL 0x80000000 /* reserved for ext2 lib */
+
+#define FS_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */
+#define FS_FL_USER_MODIFIABLE 0x000380FF /* User modifiable flags */
+
+
+#define SYNC_FILE_RANGE_WAIT_BEFORE 1
+#define SYNC_FILE_RANGE_WRITE 2
+#define SYNC_FILE_RANGE_WAIT_AFTER 4
+
+/*
+ * Flags for preadv2/pwritev2:
+ */
+
+typedef int __bitwise __kernel_rwf_t;
+
+/* high priority request, poll if possible */
+#define RWF_HIPRI ((__force __kernel_rwf_t)0x00000001)
+
+/* per-IO O_DSYNC */
+#define RWF_DSYNC ((__force __kernel_rwf_t)0x00000002)
+
+/* per-IO O_SYNC */
+#define RWF_SYNC ((__force __kernel_rwf_t)0x00000004)
+
+/* per-IO, return -EAGAIN if operation would block */
+#define RWF_NOWAIT ((__force __kernel_rwf_t)0x00000008)
+
+/* per-IO O_APPEND */
+#define RWF_APPEND ((__force __kernel_rwf_t)0x00000010)
+
+/* mask of flags supported by the kernel */
+#define RWF_SUPPORTED (RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
+ RWF_APPEND)
+
+#endif /* _UAPI_LINUX_FS_H */
IFLA_BR_MCAST_STATS_ENABLED,
IFLA_BR_MCAST_IGMP_VERSION,
IFLA_BR_MCAST_MLD_VERSION,
+ IFLA_BR_VLAN_STATS_PER_PORT,
__IFLA_BR_MAX,
};
struct kvm_coalesced_mmio_zone {
__u64 addr;
__u32 size;
- __u32 pad;
+ union {
+ __u32 pad;
+ __u32 pio;
+ };
};
struct kvm_coalesced_mmio {
__u64 phys_addr;
__u32 len;
- __u32 pad;
+ union {
+ __u32 pad;
+ __u32 pio;
+ };
__u8 data[8];
};
#define KVM_S390_SIE_PAGE_OFFSET 1
+/*
+ * On arm64, machine type can be used to request the physical
+ * address size for the VM. Bits[7-0] are reserved for the guest
+ * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
+ * value 0 implies the default IPA size, 40bits.
+ */
+#define KVM_VM_TYPE_ARM_IPA_SIZE_MASK 0xffULL
+#define KVM_VM_TYPE_ARM_IPA_SIZE(x) \
+ ((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
/*
* ioctls for /dev/kvm fds:
*/
#define KVM_CAP_HYPERV_SEND_IPI 161
#define KVM_CAP_COALESCED_PIO 162
#define KVM_CAP_HYPERV_ENLIGHTENED_VMCS 163
+#define KVM_CAP_EXCEPTION_PAYLOAD 164
+#define KVM_CAP_ARM_VM_IPA_SIZE 165
#ifdef KVM_CAP_IRQ_ROUTING
#define MAP_HUGE_2MB HUGETLB_FLAG_ENCODE_2MB
#define MAP_HUGE_8MB HUGETLB_FLAG_ENCODE_8MB
#define MAP_HUGE_16MB HUGETLB_FLAG_ENCODE_16MB
+#define MAP_HUGE_32MB HUGETLB_FLAG_ENCODE_32MB
#define MAP_HUGE_256MB HUGETLB_FLAG_ENCODE_256MB
+#define MAP_HUGE_512MB HUGETLB_FLAG_ENCODE_512MB
#define MAP_HUGE_1GB HUGETLB_FLAG_ENCODE_1GB
#define MAP_HUGE_2GB HUGETLB_FLAG_ENCODE_2GB
#define MAP_HUGE_16GB HUGETLB_FLAG_ENCODE_16GB
#define NETLINK_LIST_MEMBERSHIPS 9
#define NETLINK_CAP_ACK 10
#define NETLINK_EXT_ACK 11
+#define NETLINK_DUMP_STRICT_CHK 12
struct nl_pktinfo {
__u32 group;
*
* PERF_RECORD_MISC_MMAP_DATA - PERF_RECORD_MMAP* events
* PERF_RECORD_MISC_COMM_EXEC - PERF_RECORD_COMM event
+ * PERF_RECORD_MISC_FORK_EXEC - PERF_RECORD_FORK event (perf internal)
* PERF_RECORD_MISC_SWITCH_OUT - PERF_RECORD_SWITCH* events
*/
#define PERF_RECORD_MISC_MMAP_DATA (1 << 13)
#define PERF_RECORD_MISC_COMM_EXEC (1 << 13)
+#define PERF_RECORD_MISC_FORK_EXEC (1 << 13)
#define PERF_RECORD_MISC_SWITCH_OUT (1 << 13)
/*
* These PERF_RECORD_MISC_* flags below are safely reused
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef __LINUX_PKT_CLS_H
+#define __LINUX_PKT_CLS_H
+
+#include <linux/types.h>
+#include <linux/pkt_sched.h>
+
+#define TC_COOKIE_MAX_SIZE 16
+
+/* Action attributes */
+enum {
+ TCA_ACT_UNSPEC,
+ TCA_ACT_KIND,
+ TCA_ACT_OPTIONS,
+ TCA_ACT_INDEX,
+ TCA_ACT_STATS,
+ TCA_ACT_PAD,
+ TCA_ACT_COOKIE,
+ __TCA_ACT_MAX
+};
+
+#define TCA_ACT_MAX __TCA_ACT_MAX
+#define TCA_OLD_COMPAT (TCA_ACT_MAX+1)
+#define TCA_ACT_MAX_PRIO 32
+#define TCA_ACT_BIND 1
+#define TCA_ACT_NOBIND 0
+#define TCA_ACT_UNBIND 1
+#define TCA_ACT_NOUNBIND 0
+#define TCA_ACT_REPLACE 1
+#define TCA_ACT_NOREPLACE 0
+
+#define TC_ACT_UNSPEC (-1)
+#define TC_ACT_OK 0
+#define TC_ACT_RECLASSIFY 1
+#define TC_ACT_SHOT 2
+#define TC_ACT_PIPE 3
+#define TC_ACT_STOLEN 4
+#define TC_ACT_QUEUED 5
+#define TC_ACT_REPEAT 6
+#define TC_ACT_REDIRECT 7
+#define TC_ACT_TRAP 8 /* For hw path, this means "trap to cpu"
+ * and don't further process the frame
+ * in hardware. For sw path, this is
+ * equivalent of TC_ACT_STOLEN - drop
+ * the skb and act like everything
+ * is alright.
+ */
+#define TC_ACT_VALUE_MAX TC_ACT_TRAP
+
+/* There is a special kind of actions called "extended actions",
+ * which need a value parameter. These have a local opcode located in
+ * the highest nibble, starting from 1. The rest of the bits
+ * are used to carry the value. These two parts together make
+ * a combined opcode.
+ */
+#define __TC_ACT_EXT_SHIFT 28
+#define __TC_ACT_EXT(local) ((local) << __TC_ACT_EXT_SHIFT)
+#define TC_ACT_EXT_VAL_MASK ((1 << __TC_ACT_EXT_SHIFT) - 1)
+#define TC_ACT_EXT_OPCODE(combined) ((combined) & (~TC_ACT_EXT_VAL_MASK))
+#define TC_ACT_EXT_CMP(combined, opcode) (TC_ACT_EXT_OPCODE(combined) == opcode)
+
+#define TC_ACT_JUMP __TC_ACT_EXT(1)
+#define TC_ACT_GOTO_CHAIN __TC_ACT_EXT(2)
+#define TC_ACT_EXT_OPCODE_MAX TC_ACT_GOTO_CHAIN
+
+/* Action type identifiers*/
+enum {
+ TCA_ID_UNSPEC=0,
+ TCA_ID_POLICE=1,
+ /* other actions go here */
+ __TCA_ID_MAX=255
+};
+
+#define TCA_ID_MAX __TCA_ID_MAX
+
+struct tc_police {
+ __u32 index;
+ int action;
+#define TC_POLICE_UNSPEC TC_ACT_UNSPEC
+#define TC_POLICE_OK TC_ACT_OK
+#define TC_POLICE_RECLASSIFY TC_ACT_RECLASSIFY
+#define TC_POLICE_SHOT TC_ACT_SHOT
+#define TC_POLICE_PIPE TC_ACT_PIPE
+
+ __u32 limit;
+ __u32 burst;
+ __u32 mtu;
+ struct tc_ratespec rate;
+ struct tc_ratespec peakrate;
+ int refcnt;
+ int bindcnt;
+ __u32 capab;
+};
+
+struct tcf_t {
+ __u64 install;
+ __u64 lastuse;
+ __u64 expires;
+ __u64 firstuse;
+};
+
+struct tc_cnt {
+ int refcnt;
+ int bindcnt;
+};
+
+#define tc_gen \
+ __u32 index; \
+ __u32 capab; \
+ int action; \
+ int refcnt; \
+ int bindcnt
+
+enum {
+ TCA_POLICE_UNSPEC,
+ TCA_POLICE_TBF,
+ TCA_POLICE_RATE,
+ TCA_POLICE_PEAKRATE,
+ TCA_POLICE_AVRATE,
+ TCA_POLICE_RESULT,
+ TCA_POLICE_TM,
+ TCA_POLICE_PAD,
+ __TCA_POLICE_MAX
+#define TCA_POLICE_RESULT TCA_POLICE_RESULT
+};
+
+#define TCA_POLICE_MAX (__TCA_POLICE_MAX - 1)
+
+/* tca flags definitions */
+#define TCA_CLS_FLAGS_SKIP_HW (1 << 0) /* don't offload filter to HW */
+#define TCA_CLS_FLAGS_SKIP_SW (1 << 1) /* don't use filter in SW */
+#define TCA_CLS_FLAGS_IN_HW (1 << 2) /* filter is offloaded to HW */
+#define TCA_CLS_FLAGS_NOT_IN_HW (1 << 3) /* filter isn't offloaded to HW */
+#define TCA_CLS_FLAGS_VERBOSE (1 << 4) /* verbose logging */
+
+/* U32 filters */
+
+#define TC_U32_HTID(h) ((h)&0xFFF00000)
+#define TC_U32_USERHTID(h) (TC_U32_HTID(h)>>20)
+#define TC_U32_HASH(h) (((h)>>12)&0xFF)
+#define TC_U32_NODE(h) ((h)&0xFFF)
+#define TC_U32_KEY(h) ((h)&0xFFFFF)
+#define TC_U32_UNSPEC 0
+#define TC_U32_ROOT (0xFFF00000)
+
+enum {
+ TCA_U32_UNSPEC,
+ TCA_U32_CLASSID,
+ TCA_U32_HASH,
+ TCA_U32_LINK,
+ TCA_U32_DIVISOR,
+ TCA_U32_SEL,
+ TCA_U32_POLICE,
+ TCA_U32_ACT,
+ TCA_U32_INDEV,
+ TCA_U32_PCNT,
+ TCA_U32_MARK,
+ TCA_U32_FLAGS,
+ TCA_U32_PAD,
+ __TCA_U32_MAX
+};
+
+#define TCA_U32_MAX (__TCA_U32_MAX - 1)
+
+struct tc_u32_key {
+ __be32 mask;
+ __be32 val;
+ int off;
+ int offmask;
+};
+
+struct tc_u32_sel {
+ unsigned char flags;
+ unsigned char offshift;
+ unsigned char nkeys;
+
+ __be16 offmask;
+ __u16 off;
+ short offoff;
+
+ short hoff;
+ __be32 hmask;
+ struct tc_u32_key keys[0];
+};
+
+struct tc_u32_mark {
+ __u32 val;
+ __u32 mask;
+ __u32 success;
+};
+
+struct tc_u32_pcnt {
+ __u64 rcnt;
+ __u64 rhit;
+ __u64 kcnts[0];
+};
+
+/* Flags */
+
+#define TC_U32_TERMINAL 1
+#define TC_U32_OFFSET 2
+#define TC_U32_VAROFFSET 4
+#define TC_U32_EAT 8
+
+#define TC_U32_MAXDEPTH 8
+
+
+/* RSVP filter */
+
+enum {
+ TCA_RSVP_UNSPEC,
+ TCA_RSVP_CLASSID,
+ TCA_RSVP_DST,
+ TCA_RSVP_SRC,
+ TCA_RSVP_PINFO,
+ TCA_RSVP_POLICE,
+ TCA_RSVP_ACT,
+ __TCA_RSVP_MAX
+};
+
+#define TCA_RSVP_MAX (__TCA_RSVP_MAX - 1 )
+
+struct tc_rsvp_gpi {
+ __u32 key;
+ __u32 mask;
+ int offset;
+};
+
+struct tc_rsvp_pinfo {
+ struct tc_rsvp_gpi dpi;
+ struct tc_rsvp_gpi spi;
+ __u8 protocol;
+ __u8 tunnelid;
+ __u8 tunnelhdr;
+ __u8 pad;
+};
+
+/* ROUTE filter */
+
+enum {
+ TCA_ROUTE4_UNSPEC,
+ TCA_ROUTE4_CLASSID,
+ TCA_ROUTE4_TO,
+ TCA_ROUTE4_FROM,
+ TCA_ROUTE4_IIF,
+ TCA_ROUTE4_POLICE,
+ TCA_ROUTE4_ACT,
+ __TCA_ROUTE4_MAX
+};
+
+#define TCA_ROUTE4_MAX (__TCA_ROUTE4_MAX - 1)
+
+
+/* FW filter */
+
+enum {
+ TCA_FW_UNSPEC,
+ TCA_FW_CLASSID,
+ TCA_FW_POLICE,
+ TCA_FW_INDEV, /* used by CONFIG_NET_CLS_IND */
+ TCA_FW_ACT, /* used by CONFIG_NET_CLS_ACT */
+ TCA_FW_MASK,
+ __TCA_FW_MAX
+};
+
+#define TCA_FW_MAX (__TCA_FW_MAX - 1)
+
+/* TC index filter */
+
+enum {
+ TCA_TCINDEX_UNSPEC,
+ TCA_TCINDEX_HASH,
+ TCA_TCINDEX_MASK,
+ TCA_TCINDEX_SHIFT,
+ TCA_TCINDEX_FALL_THROUGH,
+ TCA_TCINDEX_CLASSID,
+ TCA_TCINDEX_POLICE,
+ TCA_TCINDEX_ACT,
+ __TCA_TCINDEX_MAX
+};
+
+#define TCA_TCINDEX_MAX (__TCA_TCINDEX_MAX - 1)
+
+/* Flow filter */
+
+enum {
+ FLOW_KEY_SRC,
+ FLOW_KEY_DST,
+ FLOW_KEY_PROTO,
+ FLOW_KEY_PROTO_SRC,
+ FLOW_KEY_PROTO_DST,
+ FLOW_KEY_IIF,
+ FLOW_KEY_PRIORITY,
+ FLOW_KEY_MARK,
+ FLOW_KEY_NFCT,
+ FLOW_KEY_NFCT_SRC,
+ FLOW_KEY_NFCT_DST,
+ FLOW_KEY_NFCT_PROTO_SRC,
+ FLOW_KEY_NFCT_PROTO_DST,
+ FLOW_KEY_RTCLASSID,
+ FLOW_KEY_SKUID,
+ FLOW_KEY_SKGID,
+ FLOW_KEY_VLAN_TAG,
+ FLOW_KEY_RXHASH,
+ __FLOW_KEY_MAX,
+};
+
+#define FLOW_KEY_MAX (__FLOW_KEY_MAX - 1)
+
+enum {
+ FLOW_MODE_MAP,
+ FLOW_MODE_HASH,
+};
+
+enum {
+ TCA_FLOW_UNSPEC,
+ TCA_FLOW_KEYS,
+ TCA_FLOW_MODE,
+ TCA_FLOW_BASECLASS,
+ TCA_FLOW_RSHIFT,
+ TCA_FLOW_ADDEND,
+ TCA_FLOW_MASK,
+ TCA_FLOW_XOR,
+ TCA_FLOW_DIVISOR,
+ TCA_FLOW_ACT,
+ TCA_FLOW_POLICE,
+ TCA_FLOW_EMATCHES,
+ TCA_FLOW_PERTURB,
+ __TCA_FLOW_MAX
+};
+
+#define TCA_FLOW_MAX (__TCA_FLOW_MAX - 1)
+
+/* Basic filter */
+
+enum {
+ TCA_BASIC_UNSPEC,
+ TCA_BASIC_CLASSID,
+ TCA_BASIC_EMATCHES,
+ TCA_BASIC_ACT,
+ TCA_BASIC_POLICE,
+ __TCA_BASIC_MAX
+};
+
+#define TCA_BASIC_MAX (__TCA_BASIC_MAX - 1)
+
+
+/* Cgroup classifier */
+
+enum {
+ TCA_CGROUP_UNSPEC,
+ TCA_CGROUP_ACT,
+ TCA_CGROUP_POLICE,
+ TCA_CGROUP_EMATCHES,
+ __TCA_CGROUP_MAX,
+};
+
+#define TCA_CGROUP_MAX (__TCA_CGROUP_MAX - 1)
+
+/* BPF classifier */
+
+#define TCA_BPF_FLAG_ACT_DIRECT (1 << 0)
+
+enum {
+ TCA_BPF_UNSPEC,
+ TCA_BPF_ACT,
+ TCA_BPF_POLICE,
+ TCA_BPF_CLASSID,
+ TCA_BPF_OPS_LEN,
+ TCA_BPF_OPS,
+ TCA_BPF_FD,
+ TCA_BPF_NAME,
+ TCA_BPF_FLAGS,
+ TCA_BPF_FLAGS_GEN,
+ TCA_BPF_TAG,
+ TCA_BPF_ID,
+ __TCA_BPF_MAX,
+};
+
+#define TCA_BPF_MAX (__TCA_BPF_MAX - 1)
+
+/* Flower classifier */
+
+enum {
+ TCA_FLOWER_UNSPEC,
+ TCA_FLOWER_CLASSID,
+ TCA_FLOWER_INDEV,
+ TCA_FLOWER_ACT,
+ TCA_FLOWER_KEY_ETH_DST, /* ETH_ALEN */
+ TCA_FLOWER_KEY_ETH_DST_MASK, /* ETH_ALEN */
+ TCA_FLOWER_KEY_ETH_SRC, /* ETH_ALEN */
+ TCA_FLOWER_KEY_ETH_SRC_MASK, /* ETH_ALEN */
+ TCA_FLOWER_KEY_ETH_TYPE, /* be16 */
+ TCA_FLOWER_KEY_IP_PROTO, /* u8 */
+ TCA_FLOWER_KEY_IPV4_SRC, /* be32 */
+ TCA_FLOWER_KEY_IPV4_SRC_MASK, /* be32 */
+ TCA_FLOWER_KEY_IPV4_DST, /* be32 */
+ TCA_FLOWER_KEY_IPV4_DST_MASK, /* be32 */
+ TCA_FLOWER_KEY_IPV6_SRC, /* struct in6_addr */
+ TCA_FLOWER_KEY_IPV6_SRC_MASK, /* struct in6_addr */
+ TCA_FLOWER_KEY_IPV6_DST, /* struct in6_addr */
+ TCA_FLOWER_KEY_IPV6_DST_MASK, /* struct in6_addr */
+ TCA_FLOWER_KEY_TCP_SRC, /* be16 */
+ TCA_FLOWER_KEY_TCP_DST, /* be16 */
+ TCA_FLOWER_KEY_UDP_SRC, /* be16 */
+ TCA_FLOWER_KEY_UDP_DST, /* be16 */
+
+ TCA_FLOWER_FLAGS,
+ TCA_FLOWER_KEY_VLAN_ID, /* be16 */
+ TCA_FLOWER_KEY_VLAN_PRIO, /* u8 */
+ TCA_FLOWER_KEY_VLAN_ETH_TYPE, /* be16 */
+
+ TCA_FLOWER_KEY_ENC_KEY_ID, /* be32 */
+ TCA_FLOWER_KEY_ENC_IPV4_SRC, /* be32 */
+ TCA_FLOWER_KEY_ENC_IPV4_SRC_MASK,/* be32 */
+ TCA_FLOWER_KEY_ENC_IPV4_DST, /* be32 */
+ TCA_FLOWER_KEY_ENC_IPV4_DST_MASK,/* be32 */
+ TCA_FLOWER_KEY_ENC_IPV6_SRC, /* struct in6_addr */
+ TCA_FLOWER_KEY_ENC_IPV6_SRC_MASK,/* struct in6_addr */
+ TCA_FLOWER_KEY_ENC_IPV6_DST, /* struct in6_addr */
+ TCA_FLOWER_KEY_ENC_IPV6_DST_MASK,/* struct in6_addr */
+
+ TCA_FLOWER_KEY_TCP_SRC_MASK, /* be16 */
+ TCA_FLOWER_KEY_TCP_DST_MASK, /* be16 */
+ TCA_FLOWER_KEY_UDP_SRC_MASK, /* be16 */
+ TCA_FLOWER_KEY_UDP_DST_MASK, /* be16 */
+ TCA_FLOWER_KEY_SCTP_SRC_MASK, /* be16 */
+ TCA_FLOWER_KEY_SCTP_DST_MASK, /* be16 */
+
+ TCA_FLOWER_KEY_SCTP_SRC, /* be16 */
+ TCA_FLOWER_KEY_SCTP_DST, /* be16 */
+
+ TCA_FLOWER_KEY_ENC_UDP_SRC_PORT, /* be16 */
+ TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK, /* be16 */
+ TCA_FLOWER_KEY_ENC_UDP_DST_PORT, /* be16 */
+ TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK, /* be16 */
+
+ TCA_FLOWER_KEY_FLAGS, /* be32 */
+ TCA_FLOWER_KEY_FLAGS_MASK, /* be32 */
+
+ TCA_FLOWER_KEY_ICMPV4_CODE, /* u8 */
+ TCA_FLOWER_KEY_ICMPV4_CODE_MASK,/* u8 */
+ TCA_FLOWER_KEY_ICMPV4_TYPE, /* u8 */
+ TCA_FLOWER_KEY_ICMPV4_TYPE_MASK,/* u8 */
+ TCA_FLOWER_KEY_ICMPV6_CODE, /* u8 */
+ TCA_FLOWER_KEY_ICMPV6_CODE_MASK,/* u8 */
+ TCA_FLOWER_KEY_ICMPV6_TYPE, /* u8 */
+ TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,/* u8 */
+
+ TCA_FLOWER_KEY_ARP_SIP, /* be32 */
+ TCA_FLOWER_KEY_ARP_SIP_MASK, /* be32 */
+ TCA_FLOWER_KEY_ARP_TIP, /* be32 */
+ TCA_FLOWER_KEY_ARP_TIP_MASK, /* be32 */
+ TCA_FLOWER_KEY_ARP_OP, /* u8 */
+ TCA_FLOWER_KEY_ARP_OP_MASK, /* u8 */
+ TCA_FLOWER_KEY_ARP_SHA, /* ETH_ALEN */
+ TCA_FLOWER_KEY_ARP_SHA_MASK, /* ETH_ALEN */
+ TCA_FLOWER_KEY_ARP_THA, /* ETH_ALEN */
+ TCA_FLOWER_KEY_ARP_THA_MASK, /* ETH_ALEN */
+
+ TCA_FLOWER_KEY_MPLS_TTL, /* u8 - 8 bits */
+ TCA_FLOWER_KEY_MPLS_BOS, /* u8 - 1 bit */
+ TCA_FLOWER_KEY_MPLS_TC, /* u8 - 3 bits */
+ TCA_FLOWER_KEY_MPLS_LABEL, /* be32 - 20 bits */
+
+ TCA_FLOWER_KEY_TCP_FLAGS, /* be16 */
+ TCA_FLOWER_KEY_TCP_FLAGS_MASK, /* be16 */
+
+ TCA_FLOWER_KEY_IP_TOS, /* u8 */
+ TCA_FLOWER_KEY_IP_TOS_MASK, /* u8 */
+ TCA_FLOWER_KEY_IP_TTL, /* u8 */
+ TCA_FLOWER_KEY_IP_TTL_MASK, /* u8 */
+
+ TCA_FLOWER_KEY_CVLAN_ID, /* be16 */
+ TCA_FLOWER_KEY_CVLAN_PRIO, /* u8 */
+ TCA_FLOWER_KEY_CVLAN_ETH_TYPE, /* be16 */
+
+ TCA_FLOWER_KEY_ENC_IP_TOS, /* u8 */
+ TCA_FLOWER_KEY_ENC_IP_TOS_MASK, /* u8 */
+ TCA_FLOWER_KEY_ENC_IP_TTL, /* u8 */
+ TCA_FLOWER_KEY_ENC_IP_TTL_MASK, /* u8 */
+
+ TCA_FLOWER_KEY_ENC_OPTS,
+ TCA_FLOWER_KEY_ENC_OPTS_MASK,
+
+ TCA_FLOWER_IN_HW_COUNT,
+
+ __TCA_FLOWER_MAX,
+};
+
+#define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1)
+
+enum {
+ TCA_FLOWER_KEY_ENC_OPTS_UNSPEC,
+ TCA_FLOWER_KEY_ENC_OPTS_GENEVE, /* Nested
+ * TCA_FLOWER_KEY_ENC_OPT_GENEVE_
+ * attributes
+ */
+ __TCA_FLOWER_KEY_ENC_OPTS_MAX,
+};
+
+#define TCA_FLOWER_KEY_ENC_OPTS_MAX (__TCA_FLOWER_KEY_ENC_OPTS_MAX - 1)
+
+enum {
+ TCA_FLOWER_KEY_ENC_OPT_GENEVE_UNSPEC,
+ TCA_FLOWER_KEY_ENC_OPT_GENEVE_CLASS, /* u16 */
+ TCA_FLOWER_KEY_ENC_OPT_GENEVE_TYPE, /* u8 */
+ TCA_FLOWER_KEY_ENC_OPT_GENEVE_DATA, /* 4 to 128 bytes */
+
+ __TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX,
+};
+
+#define TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX \
+ (__TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX - 1)
+
+enum {
+ TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = (1 << 0),
+ TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST = (1 << 1),
+};
+
+/* Match-all classifier */
+
+enum {
+ TCA_MATCHALL_UNSPEC,
+ TCA_MATCHALL_CLASSID,
+ TCA_MATCHALL_ACT,
+ TCA_MATCHALL_FLAGS,
+ __TCA_MATCHALL_MAX,
+};
+
+#define TCA_MATCHALL_MAX (__TCA_MATCHALL_MAX - 1)
+
+/* Extended Matches */
+
+struct tcf_ematch_tree_hdr {
+ __u16 nmatches;
+ __u16 progid;
+};
+
+enum {
+ TCA_EMATCH_TREE_UNSPEC,
+ TCA_EMATCH_TREE_HDR,
+ TCA_EMATCH_TREE_LIST,
+ __TCA_EMATCH_TREE_MAX
+};
+#define TCA_EMATCH_TREE_MAX (__TCA_EMATCH_TREE_MAX - 1)
+
+struct tcf_ematch_hdr {
+ __u16 matchid;
+ __u16 kind;
+ __u16 flags;
+ __u16 pad; /* currently unused */
+};
+
+/* 0 1
+ * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+ * +-----------------------+-+-+---+
+ * | Unused |S|I| R |
+ * +-----------------------+-+-+---+
+ *
+ * R(2) ::= relation to next ematch
+ * where: 0 0 END (last ematch)
+ * 0 1 AND
+ * 1 0 OR
+ * 1 1 Unused (invalid)
+ * I(1) ::= invert result
+ * S(1) ::= simple payload
+ */
+#define TCF_EM_REL_END 0
+#define TCF_EM_REL_AND (1<<0)
+#define TCF_EM_REL_OR (1<<1)
+#define TCF_EM_INVERT (1<<2)
+#define TCF_EM_SIMPLE (1<<3)
+
+#define TCF_EM_REL_MASK 3
+#define TCF_EM_REL_VALID(v) (((v) & TCF_EM_REL_MASK) != TCF_EM_REL_MASK)
+
+enum {
+ TCF_LAYER_LINK,
+ TCF_LAYER_NETWORK,
+ TCF_LAYER_TRANSPORT,
+ __TCF_LAYER_MAX
+};
+#define TCF_LAYER_MAX (__TCF_LAYER_MAX - 1)
+
+/* Ematch type assignments
+ * 1..32767 Reserved for ematches inside kernel tree
+ * 32768..65535 Free to use, not reliable
+ */
+#define TCF_EM_CONTAINER 0
+#define TCF_EM_CMP 1
+#define TCF_EM_NBYTE 2
+#define TCF_EM_U32 3
+#define TCF_EM_META 4
+#define TCF_EM_TEXT 5
+#define TCF_EM_VLAN 6
+#define TCF_EM_CANID 7
+#define TCF_EM_IPSET 8
+#define TCF_EM_IPT 9
+#define TCF_EM_MAX 9
+
+enum {
+ TCF_EM_PROG_TC
+};
+
+enum {
+ TCF_EM_OPND_EQ,
+ TCF_EM_OPND_GT,
+ TCF_EM_OPND_LT
+};
+
+#endif
#define PR_SET_SPECULATION_CTRL 53
/* Speculation control variants */
# define PR_SPEC_STORE_BYPASS 0
+# define PR_SPEC_INDIRECT_BRANCH 1
/* Return and control values for PR_SET/GET_SPECULATION_CTRL */
# define PR_SPEC_NOT_AFFECTED 0
# define PR_SPEC_PRCTL (1UL << 0)
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2015 Jiri Pirko <jiri@resnulli.us>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef __LINUX_TC_BPF_H
+#define __LINUX_TC_BPF_H
+
+#include <linux/pkt_cls.h>
+
+#define TCA_ACT_BPF 13
+
+struct tc_act_bpf {
+ tc_gen;
+};
+
+enum {
+ TCA_ACT_BPF_UNSPEC,
+ TCA_ACT_BPF_TM,
+ TCA_ACT_BPF_PARMS,
+ TCA_ACT_BPF_OPS_LEN,
+ TCA_ACT_BPF_OPS,
+ TCA_ACT_BPF_FD,
+ TCA_ACT_BPF_NAME,
+ TCA_ACT_BPF_PAD,
+ TCA_ACT_BPF_TAG,
+ TCA_ACT_BPF_ID,
+ __TCA_ACT_BPF_MAX,
+};
+#define TCA_ACT_BPF_MAX (__TCA_ACT_BPF_MAX - 1)
+
+#endif
#define SNDRV_TIMER_PSFLG_EARLY_EVENT (1<<2) /* write early event to the poll queue */
struct snd_timer_params {
- unsigned int flags; /* flags - SNDRV_MIXER_PSFLG_* */
+ unsigned int flags; /* flags - SNDRV_TIMER_PSFLG_* */
unsigned int ticks; /* requested resolution in ticks */
unsigned int queue_size; /* total size of queue (32-1024) */
unsigned int reserved0; /* reserved, was: failure locations */
prog->expected_attach_type = type;
}
-#define BPF_PROG_SEC_IMPL(string, ptype, eatype, atype) \
- { string, sizeof(string) - 1, ptype, eatype, atype }
+#define BPF_PROG_SEC_IMPL(string, ptype, eatype, is_attachable, atype) \
+ { string, sizeof(string) - 1, ptype, eatype, is_attachable, atype }
/* Programs that can NOT be attached. */
-#define BPF_PROG_SEC(string, ptype) BPF_PROG_SEC_IMPL(string, ptype, 0, -EINVAL)
+#define BPF_PROG_SEC(string, ptype) BPF_PROG_SEC_IMPL(string, ptype, 0, 0, 0)
/* Programs that can be attached. */
#define BPF_APROG_SEC(string, ptype, atype) \
- BPF_PROG_SEC_IMPL(string, ptype, 0, atype)
+ BPF_PROG_SEC_IMPL(string, ptype, 0, 1, atype)
/* Programs that must specify expected attach type at load time. */
#define BPF_EAPROG_SEC(string, ptype, eatype) \
- BPF_PROG_SEC_IMPL(string, ptype, eatype, eatype)
+ BPF_PROG_SEC_IMPL(string, ptype, eatype, 1, eatype)
/* Programs that can be attached but attach type can't be identified by section
* name. Kept for backward compatibility.
size_t len;
enum bpf_prog_type prog_type;
enum bpf_attach_type expected_attach_type;
+ int is_attachable;
enum bpf_attach_type attach_type;
} section_names[] = {
BPF_PROG_SEC("socket", BPF_PROG_TYPE_SOCKET_FILTER),
for (i = 0; i < ARRAY_SIZE(section_names); i++) {
if (strncmp(name, section_names[i].sec, section_names[i].len))
continue;
- if (section_names[i].attach_type == -EINVAL)
+ if (!section_names[i].is_attachable)
return -EINVAL;
*attach_type = section_names[i].attach_type;
return 0;
case OPTION_INTEGER:
case OPTION_UINTEGER:
case OPTION_LONG:
+ case OPTION_ULONG:
case OPTION_U64:
default:
break;
case OPTION_INTEGER:
case OPTION_UINTEGER:
case OPTION_LONG:
+ case OPTION_ULONG:
case OPTION_U64:
default:
break;
return opterror(opt, "expects a numerical value", flags);
return 0;
+ case OPTION_ULONG:
+ if (unset) {
+ *(unsigned long *)opt->value = 0;
+ return 0;
+ }
+ if (opt->flags & PARSE_OPT_OPTARG && !p->opt) {
+ *(unsigned long *)opt->value = opt->defval;
+ return 0;
+ }
+ if (get_arg(p, opt, flags, &arg))
+ return -1;
+ *(unsigned long *)opt->value = strtoul(arg, (char **)&s, 10);
+ if (*s)
+ return opterror(opt, "expects a numerical value", flags);
+ return 0;
+
case OPTION_U64:
if (unset) {
*(u64 *)opt->value = 0;
case OPTION_ARGUMENT:
break;
case OPTION_LONG:
+ case OPTION_ULONG:
case OPTION_U64:
case OPTION_INTEGER:
case OPTION_UINTEGER:
OPTION_STRING,
OPTION_INTEGER,
OPTION_LONG,
+ OPTION_ULONG,
OPTION_CALLBACK,
OPTION_U64,
OPTION_UINTEGER,
#define OPT_INTEGER(s, l, v, h) { .type = OPTION_INTEGER, .short_name = (s), .long_name = (l), .value = check_vtype(v, int *), .help = (h) }
#define OPT_UINTEGER(s, l, v, h) { .type = OPTION_UINTEGER, .short_name = (s), .long_name = (l), .value = check_vtype(v, unsigned int *), .help = (h) }
#define OPT_LONG(s, l, v, h) { .type = OPTION_LONG, .short_name = (s), .long_name = (l), .value = check_vtype(v, long *), .help = (h) }
+#define OPT_ULONG(s, l, v, h) { .type = OPTION_ULONG, .short_name = (s), .long_name = (l), .value = check_vtype(v, unsigned long *), .help = (h) }
#define OPT_U64(s, l, v, h) { .type = OPTION_U64, .short_name = (s), .long_name = (l), .value = check_vtype(v, u64 *), .help = (h) }
#define OPT_STRING(s, l, v, a, h) { .type = OPTION_STRING, .short_name = (s), .long_name = (l), .value = check_vtype(v, const char **), .argh = (a), .help = (h) }
#define OPT_STRING_OPTARG(s, l, v, a, h, d) \
struct symbol *pfunc = insn->func->pfunc;
unsigned int prev_offset = 0;
- list_for_each_entry_from(rela, &file->rodata->rela->rela_list, list) {
+ list_for_each_entry_from(rela, &table->rela_sec->rela_list, list) {
if (rela == next_table)
break;
{
struct rela *text_rela, *rodata_rela;
struct instruction *orig_insn = insn;
+ struct section *rodata_sec;
unsigned long table_offset;
/*
/* look for a relocation which references .rodata */
text_rela = find_rela_by_dest_range(insn->sec, insn->offset,
insn->len);
- if (!text_rela || text_rela->sym != file->rodata->sym)
+ if (!text_rela || text_rela->sym->type != STT_SECTION ||
+ !text_rela->sym->sec->rodata)
continue;
table_offset = text_rela->addend;
+ rodata_sec = text_rela->sym->sec;
+
if (text_rela->type == R_X86_64_PC32)
table_offset += 4;
* Make sure the .rodata address isn't associated with a
* symbol. gcc jump tables are anonymous data.
*/
- if (find_symbol_containing(file->rodata, table_offset))
+ if (find_symbol_containing(rodata_sec, table_offset))
continue;
- rodata_rela = find_rela_by_dest(file->rodata, table_offset);
+ rodata_rela = find_rela_by_dest(rodata_sec, table_offset);
if (rodata_rela) {
/*
* Use of RIP-relative switch jumps is quite rare, and
struct symbol *func;
int ret;
- if (!file->rodata || !file->rodata->rela)
+ if (!file->rodata)
return 0;
for_each_sec(file, sec) {
return 0;
}
+static void mark_rodata(struct objtool_file *file)
+{
+ struct section *sec;
+ bool found = false;
+
+ /*
+ * This searches for the .rodata section or multiple .rodata.func_name
+ * sections if -fdata-sections is being used. The .str.1.1 and .str.1.8
+ * rodata sections are ignored as they don't contain jump tables.
+ */
+ for_each_sec(file, sec) {
+ if (!strncmp(sec->name, ".rodata", 7) &&
+ !strstr(sec->name, ".str1.")) {
+ sec->rodata = true;
+ found = true;
+ }
+ }
+
+ file->rodata = found;
+}
+
static int decode_sections(struct objtool_file *file)
{
int ret;
+ mark_rodata(file);
+
ret = decode_instructions(file);
if (ret)
return ret;
INIT_LIST_HEAD(&file.insn_list);
hash_init(file.insn_hash);
file.whitelist = find_section_by_name(file.elf, ".discard.func_stack_frame_non_standard");
- file.rodata = find_section_by_name(file.elf, ".rodata");
file.c_file = find_section_by_name(file.elf, ".comment");
file.ignore_unreachables = no_unreachable;
file.hints = false;
struct elf *elf;
struct list_head insn_list;
DECLARE_HASHTABLE(insn_hash, 16);
- struct section *rodata, *whitelist;
- bool ignore_unreachables, c_file, hints;
+ struct section *whitelist;
+ bool ignore_unreachables, c_file, hints, rodata;
};
int check(const char *objname, bool orc);
#include "elf.h"
#include "warn.h"
+#define MAX_NAME_LEN 128
+
struct section *find_section_by_name(struct elf *elf, const char *name)
{
struct section *sec;
/* Create parent/child links for any cold subfunctions */
list_for_each_entry(sec, &elf->sections, list) {
list_for_each_entry(sym, &sec->symbol_list, list) {
+ char pname[MAX_NAME_LEN + 1];
+ size_t pnamelen;
if (sym->type != STT_FUNC)
continue;
sym->pfunc = sym->cfunc = sym;
- coldstr = strstr(sym->name, ".cold.");
+ coldstr = strstr(sym->name, ".cold");
if (!coldstr)
continue;
- coldstr[0] = '\0';
- pfunc = find_symbol_by_name(elf, sym->name);
- coldstr[0] = '.';
+ pnamelen = coldstr - sym->name;
+ if (pnamelen > MAX_NAME_LEN) {
+ WARN("%s(): parent function name exceeds maximum length of %d characters",
+ sym->name, MAX_NAME_LEN);
+ return -1;
+ }
+
+ strncpy(pname, sym->name, pnamelen);
+ pname[pnamelen] = '\0';
+ pfunc = find_symbol_by_name(elf, pname);
if (!pfunc) {
WARN("%s(): can't find parent function",
sym->name);
- goto err;
+ return -1;
}
sym->pfunc = pfunc;
rela->offset = rela->rela.r_offset;
symndx = GELF_R_SYM(rela->rela.r_info);
rela->sym = find_symbol_by_index(elf, symndx);
+ rela->rela_sec = sec;
if (!rela->sym) {
WARN("can't find rela entry symbol %d for %s",
symndx, sec->name);
char *name;
int idx;
unsigned int len;
- bool changed, text;
+ bool changed, text, rodata;
};
struct symbol {
struct list_head list;
struct hlist_node hash;
GElf_Rela rela;
+ struct section *rela_sec;
struct symbol *sym;
unsigned int type;
unsigned long offset;
--- /dev/null
+
+For --xed the xed tool is needed. Here is how to install it:
+
+ $ git clone https://github.com/intelxed/mbuild.git mbuild
+ $ git clone https://github.com/intelxed/xed
+ $ cd xed
+ $ ./mfile.py --share
+ $ ./mfile.py examples
+ $ sudo ./mfile.py --prefix=/usr/local install
+ $ sudo ldconfig
+ $ sudo cp obj/examples/xed /usr/local/bin
+
+Basic xed testing:
+
+ $ xed | head -3
+ ERROR: required argument(s) were missing
+ Copyright (C) 2017, Intel Corporation. All rights reserved.
+ XED version: [v10.0-328-g7d62c8c49b7b]
+ $
While it is possible to create scripts to analyze the data, an alternative
approach is available to export the data to a sqlite or postgresql database.
Refer to script export-to-sqlite.py or export-to-postgresql.py for more details,
-and to script call-graph-from-sql.py for an example of using the database.
+and to script exported-sql-viewer.py for an example of using the database.
There is also script intel-pt-events.py which provides an example of how to
unpack the raw data for power events and PTWRITE.
l synthesize last branch entries (use with i or x)
s skip initial number of events
- The default is all events i.e. the same as --itrace=ibxwpe
+ The default is all events i.e. the same as --itrace=ibxwpe,
+ except for perf script where it is --itrace=ce
- In addition, the period (default 100000) for instructions events
- can be specified in units of:
+ In addition, the period (default 100000, except for perf script where it is 1)
+ for instructions events can be specified in units of:
i instructions
t ticks
S - read sample value (PERF_SAMPLE_READ)
D - pin the event to the PMU
W - group is weak and will fallback to non-group if not schedulable,
- only supported in 'perf stat' for now.
The 'p' modifier can be used for specifying how precise the instruction
address should be. The 'p' modifier can be specified multiple times:
will be printed. Each entry has function name and file/line. Enabled by
default, disable with --no-inline.
+--insn-trace::
+ Show instruction stream for intel_pt traces. Combine with --xed to
+ show disassembly.
+
+--xed::
+ Run xed disassembler on output. Requires installing the xed disassembler.
+
+--call-trace::
+ Show call stream for intel_pt traces. The CPUs are interleaved, but
+ can be filtered with -C.
+
+--call-ret-trace::
+ Show call and return stream for intel_pt traces.
+
+--graph-function::
+ For itrace only show specified functions and their callees for
+ itrace. Multiple functions can be separated by comma.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script-perl[1],
--hierarchy::
Enable hierarchy output.
+--overwrite::
+ Enable this to use just the most recent records, which helps in high core count
+ machines such as Knights Landing/Mill, but right now is disabled by default as
+ the pausing used in this technique is leading to loss of metadata events such
+ as PERF_RECORD_MMAP which makes 'perf top' unable to resolve samples, leading
+ to lots of unknown samples appearing on the UI. Enable this if you are in such
+ machines and profiling a workload that doesn't creates short lived threads and/or
+ doesn't uses many executable mmap operations. Work is being planed to solve
+ this situation, till then, this will remain disabled by default.
+
--force::
Don't do ownership validation.
--kernel-syscall-graph::
Show the kernel callchains on the syscall exit path.
+--max-events=N::
+ Stop after processing N events. Note that strace-like events are considered
+ only at exit time or when a syscall is interrupted, i.e. in those cases this
+ option is equivalent to the number of lines printed.
+
--max-stack::
Set the stack depth limit when parsing the callchain, anything
beyond the specified depth will be ignored. Note that at this point
As you can see, there was major pagefault in python process, from
CRYPTO_push_info_ routine which faulted somewhere in libcrypto.so.
+Trace the first 4 open, openat or open_by_handle_at syscalls (in the future more syscalls may match here):
+
+ $ perf trace -e open* --max-events 4
+ [root@jouet perf]# trace -e open* --max-events 4
+ 2272.992 ( 0.037 ms): gnome-shell/1370 openat(dfd: CWD, filename: /proc/self/stat) = 31
+ 2277.481 ( 0.139 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65
+ 3026.398 ( 0.076 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65
+ 4294.665 ( 0.015 ms): sed/15879 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
+ $
+
+Trace the first minor page fault when running a workload:
+
+ # perf trace -F min --max-stack=7 --max-events 1 sleep 1
+ 0.000 ( 0.000 ms): sleep/18006 minfault [__clear_user+0x1a] => 0x5626efa56080 (?k)
+ __clear_user ([kernel.kallsyms])
+ load_elf_binary ([kernel.kallsyms])
+ search_binary_handler ([kernel.kallsyms])
+ __do_execve_file.isra.33 ([kernel.kallsyms])
+ __x64_sys_execve ([kernel.kallsyms])
+ do_syscall_64 ([kernel.kallsyms])
+ entry_SYSCALL_64 ([kernel.kallsyms])
+ #
+
+Trace the next min page page fault to take place on the first CPU:
+
+ # perf trace -F min --call-graph=dwarf --max-events 1 --cpu 0
+ 0.000 ( 0.000 ms): Web Content/17136 minfault [js::gc::Chunk::fetchNextDecommittedArena+0x4b] => 0x7fbe6181b000 (?.)
+ js::gc::FreeSpan::initAsEmpty (inlined)
+ js::gc::Arena::setAsNotAllocated (inlined)
+ js::gc::Chunk::fetchNextDecommittedArena (/usr/lib64/firefox/libxul.so)
+ js::gc::Chunk::allocateArena (/usr/lib64/firefox/libxul.so)
+ js::gc::GCRuntime::allocateArena (/usr/lib64/firefox/libxul.so)
+ js::gc::ArenaLists::allocateFromArena (/usr/lib64/firefox/libxul.so)
+ js::gc::GCRuntime::tryNewTenuredThing<JSString, (js::AllowGC)1> (inlined)
+ js::AllocateString<JSString, (js::AllowGC)1> (/usr/lib64/firefox/libxul.so)
+ js::Allocate<JSThinInlineString, (js::AllowGC)1> (inlined)
+ JSThinInlineString::new_<(js::AllowGC)1> (inlined)
+ AllocateInlineString<(js::AllowGC)1, unsigned char> (inlined)
+ js::ConcatStrings<(js::AllowGC)1> (/usr/lib64/firefox/libxul.so)
+ [0x18b26e6bc2bd] (/tmp/perf-17136.map)
+ #
+
+Trace the next two sched:sched_switch events, four block:*_plug events, the
+next block:*_unplug and the next three net:*dev_queue events, this last one
+with a backtrace of at most 16 entries, system wide:
+
+ # perf trace -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/
+ 0.000 :0/0 sched:sched_switch:swapper/2:0 [120] S ==> rcu_sched:10 [120]
+ 0.015 rcu_sched/10 sched:sched_switch:rcu_sched:10 [120] R ==> swapper/2:0 [120]
+ 254.198 irq/50-iwlwifi/680 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051f600 len=66
+ __dev_queue_xmit ([kernel.kallsyms])
+ 273.977 :0/0 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051f600 len=78
+ __dev_queue_xmit ([kernel.kallsyms])
+ 274.007 :0/0 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051ff00 len=78
+ __dev_queue_xmit ([kernel.kallsyms])
+ 2930.140 kworker/u16:58/2722 block:block_plug:[kworker/u16:58]
+ 2930.162 kworker/u16:58/2722 block:block_unplug:[kworker/u16:58] 1
+ 4466.094 jbd2/dm-2-8/748 block:block_plug:[jbd2/dm-2-8]
+ 8050.123 kworker/u16:30/2694 block:block_plug:[kworker/u16:30]
+ 8050.271 kworker/u16:30/2694 block:block_plug:[kworker/u16:30]
+ #
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script[1]
endif
endif
+ifeq ($(feature-get_current_dir_name), 1)
+ CFLAGS += -DHAVE_GET_CURRENT_DIR_NAME
+endif
+
+
ifdef NO_LIBELF
NO_DWARF := 1
NO_DEMANGLE := 1
include ../scripts/Makefile.include
+include ../scripts/Makefile.arch
# The default target of this Makefile is...
all:
SHELL = $(SHELL_PATH)
linux_uapi_dir := $(srctree)/tools/include/uapi/linux
+asm_generic_uapi_dir := $(srctree)/tools/include/uapi/asm-generic
+arch_asm_uapi_dir := $(srctree)/tools/arch/$(SRCARCH)/include/uapi/asm/
beauty_outdir := $(OUTPUT)trace/beauty/generated
beauty_ioctl_outdir := $(beauty_outdir)/ioctl
$(madvise_behavior_array): $(madvise_hdr_dir)/mman-common.h $(madvise_behavior_tbl)
$(Q)$(SHELL) '$(madvise_behavior_tbl)' $(madvise_hdr_dir) > $@
+mmap_flags_array := $(beauty_outdir)/mmap_flags_array.c
+mmap_flags_tbl := $(srctree)/tools/perf/trace/beauty/mmap_flags.sh
+
+$(mmap_flags_array): $(asm_generic_uapi_dir)/mman.h $(asm_generic_uapi_dir)/mman-common.h $(arch_asm_uapi_dir)/mman.h $(mmap_flags_tbl)
+ $(Q)$(SHELL) '$(mmap_flags_tbl)' $(asm_generic_uapi_dir) $(arch_asm_uapi_dir) > $@
+
+mount_flags_array := $(beauty_outdir)/mount_flags_array.c
+mount_flags_tbl := $(srctree)/tools/perf/trace/beauty/mount_flags.sh
+
+$(mount_flags_array): $(linux_uapi_dir)/fs.h $(mount_flags_tbl)
+ $(Q)$(SHELL) '$(mount_flags_tbl)' $(linux_uapi_dir) > $@
+
prctl_option_array := $(beauty_outdir)/prctl_option_array.c
prctl_hdr_dir := $(srctree)/tools/include/uapi/linux/
prctl_option_tbl := $(srctree)/tools/perf/trace/beauty/prctl_option.sh
$(socket_ipproto_array) \
$(vhost_virtio_ioctl_array) \
$(madvise_behavior_array) \
+ $(mmap_flags_array) \
+ $(mount_flags_array) \
$(perf_ioctl_array) \
$(prctl_option_array) \
$(arch_errno_name_array)
$(OUTPUT)tests/llvm-src-{base,kbuild,prologue,relocation}.c \
$(OUTPUT)pmu-events/pmu-events.c \
$(OUTPUT)$(madvise_behavior_array) \
+ $(OUTPUT)$(mmap_flags_array) \
+ $(OUTPUT)$(mount_flags_array) \
$(OUTPUT)$(drm_ioctl_array) \
$(OUTPUT)$(pkey_alloc_access_rights_array) \
$(OUTPUT)$(sndrv_ctl_ioctl_array) \
{
local sc nr last_sc
- create_table_exe=`mktemp /tmp/create-table-XXXXXX`
+ create_table_exe=`mktemp ${TMPDIR:-/tmp}/create-table-XXXXXX`
{
ifndef NO_DWARF
PERF_HAVE_DWARF_REGS := 1
endif
+
+PERF_HAVE_JITDUMP := 1
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+
+static int is_branch_cond(const char *cond)
+{
+ if (cond[0] == '\0')
+ return 1;
+
+ if (cond[0] == 'a' && cond[1] == '\0')
+ return 1;
+
+ if (cond[0] == 'c' &&
+ (cond[1] == 'c' || cond[1] == 's') &&
+ cond[2] == '\0')
+ return 1;
+
+ if (cond[0] == 'e' &&
+ (cond[1] == '\0' ||
+ (cond[1] == 'q' && cond[2] == '\0')))
+ return 1;
+
+ if (cond[0] == 'g' &&
+ (cond[1] == '\0' ||
+ (cond[1] == 't' && cond[2] == '\0') ||
+ (cond[1] == 'e' && cond[2] == '\0') ||
+ (cond[1] == 'e' && cond[2] == 'u' && cond[3] == '\0')))
+ return 1;
+
+ if (cond[0] == 'l' &&
+ (cond[1] == '\0' ||
+ (cond[1] == 't' && cond[2] == '\0') ||
+ (cond[1] == 'u' && cond[2] == '\0') ||
+ (cond[1] == 'e' && cond[2] == '\0') ||
+ (cond[1] == 'e' && cond[2] == 'u' && cond[3] == '\0')))
+ return 1;
+
+ if (cond[0] == 'n' &&
+ (cond[1] == '\0' ||
+ (cond[1] == 'e' && cond[2] == '\0') ||
+ (cond[1] == 'z' && cond[2] == '\0') ||
+ (cond[1] == 'e' && cond[2] == 'g' && cond[3] == '\0')))
+ return 1;
+
+ if (cond[0] == 'b' &&
+ cond[1] == 'p' &&
+ cond[2] == 'o' &&
+ cond[3] == 's' &&
+ cond[4] == '\0')
+ return 1;
+
+ if (cond[0] == 'v' &&
+ (cond[1] == 'c' || cond[1] == 's') &&
+ cond[2] == '\0')
+ return 1;
+
+ if (cond[0] == 'b' &&
+ cond[1] == 'z' &&
+ cond[2] == '\0')
+ return 1;
+
+ return 0;
+}
+
+static int is_branch_reg_cond(const char *cond)
+{
+ if ((cond[0] == 'n' || cond[0] == 'l') &&
+ cond[1] == 'z' &&
+ cond[2] == '\0')
+ return 1;
+
+ if (cond[0] == 'z' &&
+ cond[1] == '\0')
+ return 1;
+
+ if ((cond[0] == 'g' || cond[0] == 'l') &&
+ cond[1] == 'e' &&
+ cond[2] == 'z' &&
+ cond[3] == '\0')
+ return 1;
+
+ if (cond[0] == 'g' &&
+ cond[1] == 'z' &&
+ cond[2] == '\0')
+ return 1;
+
+ return 0;
+}
+
+static int is_branch_float_cond(const char *cond)
+{
+ if (cond[0] == '\0')
+ return 1;
+
+ if ((cond[0] == 'a' || cond[0] == 'e' ||
+ cond[0] == 'z' || cond[0] == 'g' ||
+ cond[0] == 'l' || cond[0] == 'n' ||
+ cond[0] == 'o' || cond[0] == 'u') &&
+ cond[1] == '\0')
+ return 1;
+
+ if (((cond[0] == 'g' && cond[1] == 'e') ||
+ (cond[0] == 'l' && (cond[1] == 'e' ||
+ cond[1] == 'g')) ||
+ (cond[0] == 'n' && (cond[1] == 'e' ||
+ cond[1] == 'z')) ||
+ (cond[0] == 'u' && (cond[1] == 'e' ||
+ cond[1] == 'g' ||
+ cond[1] == 'l'))) &&
+ cond[2] == '\0')
+ return 1;
+
+ if (cond[0] == 'u' &&
+ (cond[1] == 'g' || cond[1] == 'l') &&
+ cond[2] == 'e' &&
+ cond[3] == '\0')
+ return 1;
+
+ return 0;
+}
+
+static struct ins_ops *sparc__associate_instruction_ops(struct arch *arch, const char *name)
+{
+ struct ins_ops *ops = NULL;
+
+ if (!strcmp(name, "call") ||
+ !strcmp(name, "jmp") ||
+ !strcmp(name, "jmpl")) {
+ ops = &call_ops;
+ } else if (!strcmp(name, "ret") ||
+ !strcmp(name, "retl") ||
+ !strcmp(name, "return")) {
+ ops = &ret_ops;
+ } else if (!strcmp(name, "mov")) {
+ ops = &mov_ops;
+ } else {
+ if (name[0] == 'c' &&
+ (name[1] == 'w' || name[1] == 'x'))
+ name += 2;
+
+ if (name[0] == 'b') {
+ const char *cond = name + 1;
+
+ if (cond[0] == 'r') {
+ if (is_branch_reg_cond(cond + 1))
+ ops = &jump_ops;
+ } else if (is_branch_cond(cond)) {
+ ops = &jump_ops;
+ }
+ } else if (name[0] == 'f' && name[1] == 'b') {
+ if (is_branch_float_cond(name + 2))
+ ops = &jump_ops;
+ }
+ }
+
+ if (ops)
+ arch__associate_ins_ops(arch, name, ops);
+
+ return ops;
+}
+
+static int sparc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
+{
+ if (!arch->initialized) {
+ arch->initialized = true;
+ arch->associate_instruction_ops = sparc__associate_instruction_ops;
+ arch->objdump.comment_char = '#';
+ }
+
+ return 0;
+}
ui__warning("%s\n", msg);
goto try_again;
}
-
+ if ((errno == EINVAL || errno == EBADF) &&
+ pos->leader != pos &&
+ pos->weak_group) {
+ pos = perf_evlist__reset_weak_group(evlist, pos);
+ goto try_again;
+ }
rc = -errno;
perf_evsel__open_strerror(pos, &opts->target,
errno, msg, sizeof(msg));
if (!rec->opts.full_auxtrace)
perf_header__clear_feat(&session->header, HEADER_AUXTRACE);
+ if (!(rec->opts.use_clockid && rec->opts.clockid_res_ns))
+ perf_header__clear_feat(&session->header, HEADER_CLOCKID);
+
perf_header__clear_feat(&session->header, HEADER_STAT);
}
record__init_features(rec);
+ if (rec->opts.use_clockid && rec->opts.clockid_res_ns)
+ session->header.env.clockid_res_ns = rec->opts.clockid_res_ns;
+
if (forks) {
err = perf_evlist__prepare_workload(rec->evlist, &opts->target,
argv, data->is_pipe,
CLOCKID_END,
};
+static int get_clockid_res(clockid_t clk_id, u64 *res_ns)
+{
+ struct timespec res;
+
+ *res_ns = 0;
+ if (!clock_getres(clk_id, &res))
+ *res_ns = res.tv_nsec + res.tv_sec * NSEC_PER_SEC;
+ else
+ pr_warning("WARNING: Failed to determine specified clock resolution.\n");
+
+ return 0;
+}
+
static int parse_clockid(const struct option *opt, const char *str, int unset)
{
struct record_opts *opts = (struct record_opts *)opt->value;
/* if its a number, we're done */
if (sscanf(str, "%d", &opts->clockid) == 1)
- return 0;
+ return get_clockid_res(opts->clockid, &opts->clockid_res_ns);
/* allow a "CLOCK_" prefix to the name */
if (!strncasecmp(str, "CLOCK_", 6))
for (cm = clockids; cm->name; cm++) {
if (!strcasecmp(str, cm->name)) {
opts->clockid = cm->clockid;
- return 0;
+ return get_clockid_res(opts->clockid,
+ &opts->clockid_res_ns);
}
}
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
+#include <subcmd/pager.h>
#include "sane_ctype.h"
static int ip__fprintf_jump(uint64_t ip, struct branch_entry *en,
struct perf_insn *x, u8 *inbuf, int len,
- int insn, FILE *fp)
+ int insn, FILE *fp, int *total_cycles)
{
int printed = fprintf(fp, "\t%016" PRIx64 "\t%-30s\t#%s%s%s%s", ip,
dump_insn(x, ip, inbuf, len, NULL),
en->flags.in_tx ? " INTX" : "",
en->flags.abort ? " ABORT" : "");
if (en->flags.cycles) {
- printed += fprintf(fp, " %d cycles", en->flags.cycles);
+ *total_cycles += en->flags.cycles;
+ printed += fprintf(fp, " %d cycles [%d]", en->flags.cycles, *total_cycles);
if (insn)
printed += fprintf(fp, " %.2f IPC", (float)insn / en->flags.cycles);
}
u8 buffer[MAXBB];
unsigned off;
struct symbol *lastsym = NULL;
+ int total_cycles = 0;
if (!(br && br->nr))
return 0;
printed += ip__fprintf_sym(br->entries[nr - 1].from, thread,
x.cpumode, x.cpu, &lastsym, attr, fp);
printed += ip__fprintf_jump(br->entries[nr - 1].from, &br->entries[nr - 1],
- &x, buffer, len, 0, fp);
+ &x, buffer, len, 0, fp, &total_cycles);
}
/* Print all blocks */
printed += ip__fprintf_sym(ip, thread, x.cpumode, x.cpu, &lastsym, attr, fp);
if (ip == end) {
- printed += ip__fprintf_jump(ip, &br->entries[i], &x, buffer + off, len - off, insn, fp);
+ printed += ip__fprintf_jump(ip, &br->entries[i], &x, buffer + off, len - off, insn, fp,
+ &total_cycles);
break;
} else {
printed += fprintf(fp, "\t%016" PRIx64 "\t%s\n", ip,
return printed;
}
+static const char *resolve_branch_sym(struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct thread *thread,
+ struct addr_location *al,
+ u64 *ip)
+{
+ struct addr_location addr_al;
+ struct perf_event_attr *attr = &evsel->attr;
+ const char *name = NULL;
+
+ if (sample->flags & (PERF_IP_FLAG_CALL | PERF_IP_FLAG_TRACE_BEGIN)) {
+ if (sample_addr_correlates_sym(attr)) {
+ thread__resolve(thread, &addr_al, sample);
+ if (addr_al.sym)
+ name = addr_al.sym->name;
+ else
+ *ip = sample->addr;
+ } else {
+ *ip = sample->addr;
+ }
+ } else if (sample->flags & (PERF_IP_FLAG_RETURN | PERF_IP_FLAG_TRACE_END)) {
+ if (al->sym)
+ name = al->sym->name;
+ else
+ *ip = sample->ip;
+ }
+ return name;
+}
+
static int perf_sample__fprintf_callindent(struct perf_sample *sample,
struct perf_evsel *evsel,
struct thread *thread,
{
struct perf_event_attr *attr = &evsel->attr;
size_t depth = thread_stack__depth(thread);
- struct addr_location addr_al;
const char *name = NULL;
static int spacing;
int len = 0;
if (thread->ts && sample->flags & PERF_IP_FLAG_RETURN)
depth += 1;
- if (sample->flags & (PERF_IP_FLAG_CALL | PERF_IP_FLAG_TRACE_BEGIN)) {
- if (sample_addr_correlates_sym(attr)) {
- thread__resolve(thread, &addr_al, sample);
- if (addr_al.sym)
- name = addr_al.sym->name;
- else
- ip = sample->addr;
- } else {
- ip = sample->addr;
- }
- } else if (sample->flags & (PERF_IP_FLAG_RETURN | PERF_IP_FLAG_TRACE_END)) {
- if (al->sym)
- name = al->sym->name;
- else
- ip = sample->ip;
- }
+ name = resolve_branch_sym(sample, evsel, thread, al, &ip);
if (PRINT_FIELD(DSO) && !(PRINT_FIELD(IP) || PRINT_FIELD(ADDR))) {
dlen += fprintf(fp, "(");
}
}
+static bool show_event(struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct thread *thread,
+ struct addr_location *al)
+{
+ int depth = thread_stack__depth(thread);
+
+ if (!symbol_conf.graph_function)
+ return true;
+
+ if (thread->filter) {
+ if (depth <= thread->filter_entry_depth) {
+ thread->filter = false;
+ return false;
+ }
+ return true;
+ } else {
+ const char *s = symbol_conf.graph_function;
+ u64 ip;
+ const char *name = resolve_branch_sym(sample, evsel, thread, al,
+ &ip);
+ unsigned nlen;
+
+ if (!name)
+ return false;
+ nlen = strlen(name);
+ while (*s) {
+ unsigned len = strcspn(s, ",");
+ if (nlen == len && !strncmp(name, s, len)) {
+ thread->filter = true;
+ thread->filter_entry_depth = depth;
+ return true;
+ }
+ s += len;
+ if (*s == ',')
+ s++;
+ }
+ return false;
+ }
+}
+
static void process_event(struct perf_script *script,
struct perf_sample *sample, struct perf_evsel *evsel,
struct addr_location *al,
if (output[type].fields == 0)
return;
+ if (!show_event(sample, evsel, thread, al))
+ return;
+
++es->samples;
perf_sample__fprintf_start(sample, thread, evsel,
if (PRINT_FIELD(METRIC))
perf_sample__fprint_metric(script, thread, evsel, sample, fp);
+
+ if (verbose)
+ fflush(fp);
}
static struct scripting_ops *scripting_ops;
#define perf_script__process_auxtrace_info 0
#endif
+static int parse_insn_trace(const struct option *opt __maybe_unused,
+ const char *str __maybe_unused,
+ int unset __maybe_unused)
+{
+ parse_output_fields(NULL, "+insn,-event,-period", 0);
+ itrace_parse_synth_opts(opt, "i0ns", 0);
+ nanosecs = true;
+ return 0;
+}
+
+static int parse_xed(const struct option *opt __maybe_unused,
+ const char *str __maybe_unused,
+ int unset __maybe_unused)
+{
+ force_pager("xed -F insn: -A -64 | less");
+ return 0;
+}
+
+static int parse_call_trace(const struct option *opt __maybe_unused,
+ const char *str __maybe_unused,
+ int unset __maybe_unused)
+{
+ parse_output_fields(NULL, "-ip,-addr,-event,-period,+callindent", 0);
+ itrace_parse_synth_opts(opt, "cewp", 0);
+ nanosecs = true;
+ return 0;
+}
+
+static int parse_callret_trace(const struct option *opt __maybe_unused,
+ const char *str __maybe_unused,
+ int unset __maybe_unused)
+{
+ parse_output_fields(NULL, "-ip,-addr,-event,-period,+callindent,+flags", 0);
+ itrace_parse_synth_opts(opt, "crewp", 0);
+ nanosecs = true;
+ return 0;
+}
+
int cmd_script(int argc, const char **argv)
{
bool show_full_info = false;
char *rec_script_path = NULL;
char *rep_script_path = NULL;
struct perf_session *session;
- struct itrace_synth_opts itrace_synth_opts = { .set = false, };
+ struct itrace_synth_opts itrace_synth_opts = {
+ .set = false,
+ .default_no_sample = true,
+ };
char *script_path = NULL;
const char **__argv;
int i, j, err = 0;
"system-wide collection from all CPUs"),
OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
"only consider these symbols"),
+ OPT_CALLBACK_OPTARG(0, "insn-trace", &itrace_synth_opts, NULL, NULL,
+ "Decode instructions from itrace", parse_insn_trace),
+ OPT_CALLBACK_OPTARG(0, "xed", NULL, NULL, NULL,
+ "Run xed disassembler on output", parse_xed),
+ OPT_CALLBACK_OPTARG(0, "call-trace", &itrace_synth_opts, NULL, NULL,
+ "Decode calls from from itrace", parse_call_trace),
+ OPT_CALLBACK_OPTARG(0, "call-ret-trace", &itrace_synth_opts, NULL, NULL,
+ "Decode calls and returns from itrace", parse_callret_trace),
+ OPT_STRING(0, "graph-function", &symbol_conf.graph_function, "symbol[,symbol...]",
+ "Only print symbols and callees with --call-trace/--call-ret-trace"),
OPT_STRING(0, "stop-bt", &symbol_conf.bt_stop_list_str, "symbol[,symbol...]",
"Stop display of callgraph at these symbols"),
OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
exit(-1);
}
- if (!script_name)
+ if (!script_name) {
setup_pager();
+ use_browser = 0;
+ }
session = perf_session__new(&data, false, &script.tool);
if (session == NULL)
script.session = session;
script__setup_sample_type(&script);
- if (output[PERF_TYPE_HARDWARE].fields & PERF_OUTPUT_CALLINDENT)
+ if ((output[PERF_TYPE_HARDWARE].fields & PERF_OUTPUT_CALLINDENT) ||
+ symbol_conf.graph_function)
itrace_synth_opts.thread_stack = true;
session->itrace_synth_opts = &itrace_synth_opts;
return STAT_RECORD || counter->attr.read_format & PERF_FORMAT_ID;
}
-static struct perf_evsel *perf_evsel__reset_weak_group(struct perf_evsel *evsel)
+static bool is_target_alive(struct target *_target,
+ struct thread_map *threads)
{
- struct perf_evsel *c2, *leader;
- bool is_open = true;
+ struct stat st;
+ int i;
- leader = evsel->leader;
- pr_debug("Weak group for %s/%d failed\n",
- leader->name, leader->nr_members);
+ if (!target__has_task(_target))
+ return true;
- /*
- * for_each_group_member doesn't work here because it doesn't
- * include the first entry.
- */
- evlist__for_each_entry(evsel_list, c2) {
- if (c2 == evsel)
- is_open = false;
- if (c2->leader == leader) {
- if (is_open)
- perf_evsel__close(c2);
- c2->leader = c2;
- c2->nr_members = 0;
- }
+ for (i = 0; i < threads->nr; i++) {
+ char path[PATH_MAX];
+
+ scnprintf(path, PATH_MAX, "%s/%d", procfs__mountpoint(),
+ threads->map[i].pid);
+
+ if (!stat(path, &st))
+ return true;
}
- return leader;
+
+ return false;
}
static int __run_perf_stat(int argc, const char **argv, int run_idx)
if ((errno == EINVAL || errno == EBADF) &&
counter->leader != counter &&
counter->weak_group) {
- counter = perf_evsel__reset_weak_group(counter);
+ counter = perf_evlist__reset_weak_group(evsel_list, counter);
goto try_again;
}
enable_counters();
while (!done) {
nanosleep(&ts, NULL);
+ if (!is_target_alive(&target, evsel_list->threads))
+ break;
if (timeout)
break;
if (interval) {
if (!target__none(&opts->target))
perf_evlist__enable(top->evlist);
- /* Wait for a minimal set of events before starting the snapshot */
- perf_evlist__poll(top->evlist, 100);
-
- perf_top__mmap_read(top);
-
ret = -1;
if (pthread_create(&thread, NULL, (use_browser > 0 ? display_thread_tui :
display_thread), top)) {
}
}
+ /* Wait for a minimal set of events before starting the snapshot */
+ perf_evlist__poll(top->evlist, 100);
+
+ perf_top__mmap_read(top);
+
while (!done) {
u64 hits = top->samples;
.uses_mmap = true,
},
.proc_map_timeout = 500,
- .overwrite = 1,
+ /*
+ * FIXME: This will lose PERF_RECORD_MMAP and other metadata
+ * when we pause, fix that and reenable. Probably using a
+ * separate evlist with a dummy event, i.e. a non-overwrite
+ * ring buffer just for metadata events, while PERF_RECORD_SAMPLE
+ * stays in overwrite mode. -acme
+ * */
+ .overwrite = 0,
},
.max_stack = sysctl__max_stack(),
.annotation_opts = annotation__default_options,
"Show raw trace event output (do not use print fmt or plugins)"),
OPT_BOOLEAN(0, "hierarchy", &symbol_conf.report_hierarchy,
"Show entries in a hierarchy"),
+ OPT_BOOLEAN(0, "overwrite", &top.record_opts.overwrite,
+ "Use a backward ring buffer, default: no"),
OPT_BOOLEAN(0, "force", &symbol_conf.force, "don't complain, do it"),
OPT_UINTEGER(0, "num-thread-synthesize", &top.nr_threads_synthesize,
"number of thread to run event synthesize"),
}
}
+ if (opts->branch_stack && callchain_param.enabled)
+ symbol_conf.show_branchflag_count = true;
+
sort__mode = SORT_MODE__TOP;
/* display thread wants entries to be collapsed in a different tree */
perf_hpp_list.need_collapse = 1;
u64 base_time;
FILE *output;
unsigned long nr_events;
+ unsigned long nr_events_printed;
+ unsigned long max_events;
struct strlist *ev_qualifier;
struct {
size_t nr;
} stats;
unsigned int max_stack;
unsigned int min_stack;
+ bool raw_augmented_syscalls;
bool not_ev_qualifier;
bool live;
bool full_time;
struct syscall_arg_fmt {
size_t (*scnprintf)(char *bf, size_t size, struct syscall_arg *arg);
+ unsigned long (*mask_val)(struct syscall_arg *arg, unsigned long val);
void *parm;
const char *name;
bool show_zero;
.arg = { [0] = { .scnprintf = SCA_HEX, /* addr */ },
[2] = { .scnprintf = SCA_MMAP_PROT, /* prot */ },
[3] = { .scnprintf = SCA_MMAP_FLAGS, /* flags */ }, }, },
+ { .name = "mount",
+ .arg = { [0] = { .scnprintf = SCA_FILENAME, /* dev_name */ },
+ [3] = { .scnprintf = SCA_MOUNT_FLAGS, /* flags */
+ .mask_val = SCAMV_MOUNT_FLAGS, /* flags */ }, }, },
{ .name = "mprotect",
.arg = { [0] = { .scnprintf = SCA_HEX, /* start */ },
[2] = { .scnprintf = SCA_MMAP_PROT, /* prot */ }, }, },
.arg = { [2] = { .scnprintf = SCA_SIGNUM, /* sig */ }, }, },
{ .name = "tkill",
.arg = { [1] = { .scnprintf = SCA_SIGNUM, /* sig */ }, }, },
- { .name = "umount2", .alias = "umount", },
+ { .name = "umount2", .alias = "umount",
+ .arg = { [0] = { .scnprintf = SCA_FILENAME, /* name */ }, }, },
{ .name = "uname", .alias = "newuname", },
{ .name = "unlinkat",
.arg = { [0] = { .scnprintf = SCA_FDAT, /* dfd */ }, }, },
return bsearch(name, syscall_fmts, nmemb, sizeof(struct syscall_fmt), syscall_fmt__cmp);
}
+static struct syscall_fmt *syscall_fmt__find_by_alias(const char *alias)
+{
+ int i, nmemb = ARRAY_SIZE(syscall_fmts);
+
+ for (i = 0; i < nmemb; ++i) {
+ if (syscall_fmts[i].alias && strcmp(syscall_fmts[i].alias, alias) == 0)
+ return &syscall_fmts[i];
+ }
+
+ return NULL;
+}
+
/*
* is_exit: is this "exit" or "exit_group"?
* is_open: is this "open" or "openat"? To associate the fd returned in sys_exit with the pathname in sys_enter.
return scnprintf(bf, size, "arg%d: ", arg->idx);
}
+/*
+ * Check if the value is in fact zero, i.e. mask whatever needs masking, such
+ * as mount 'flags' argument that needs ignoring some magic flag, see comment
+ * in tools/perf/trace/beauty/mount_flags.c
+ */
+static unsigned long syscall__mask_val(struct syscall *sc, struct syscall_arg *arg, unsigned long val)
+{
+ if (sc->arg_fmt && sc->arg_fmt[arg->idx].mask_val)
+ return sc->arg_fmt[arg->idx].mask_val(arg, val);
+
+ return val;
+}
+
static size_t syscall__scnprintf_val(struct syscall *sc, char *bf, size_t size,
struct syscall_arg *arg, unsigned long val)
{
continue;
val = syscall_arg__val(&arg, arg.idx);
+ /*
+ * Some syscall args need some mask, most don't and
+ * return val untouched.
+ */
+ val = syscall__mask_val(sc, &arg, val);
/*
* Suppress this argument if its value is zero and
printed += fprintf(trace->output, "%-70s) ...\n", ttrace->entry_str);
ttrace->entry_pending = false;
+ ++trace->nr_events_printed;
+
return printed;
}
return printed;
}
-static void *syscall__augmented_args(struct syscall *sc, struct perf_sample *sample, int *augmented_args_size)
+static void *syscall__augmented_args(struct syscall *sc, struct perf_sample *sample, int *augmented_args_size, bool raw_augmented)
{
void *augmented_args = NULL;
+ /*
+ * For now with BPF raw_augmented we hook into raw_syscalls:sys_enter
+ * and there we get all 6 syscall args plus the tracepoint common
+ * fields (sizeof(long)) and the syscall_nr (another long). So we check
+ * if that is the case and if so don't look after the sc->args_size,
+ * but always after the full raw_syscalls:sys_enter payload, which is
+ * fixed.
+ *
+ * We'll revisit this later to pass s->args_size to the BPF augmenter
+ * (now tools/perf/examples/bpf/augmented_raw_syscalls.c, so that it
+ * copies only what we need for each syscall, like what happens when we
+ * use syscalls:sys_enter_NAME, so that we reduce the kernel/userspace
+ * traffic to just what is needed for each syscall.
+ */
+ int args_size = raw_augmented ? (8 * (int)sizeof(long)) : sc->args_size;
- *augmented_args_size = sample->raw_size - sc->args_size;
+ *augmented_args_size = sample->raw_size - args_size;
if (*augmented_args_size > 0)
- augmented_args = sample->raw_data + sc->args_size;
+ augmented_args = sample->raw_data + args_size;
return augmented_args;
}
* here and avoid using augmented syscalls when the evsel is the raw_syscalls one.
*/
if (evsel != trace->syscalls.events.sys_enter)
- augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size);
+ augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size, trace->raw_augmented_syscalls);
ttrace->entry_time = sample->time;
msg = ttrace->entry_str;
printed += scnprintf(msg + printed, trace__entry_str_size - printed, "%s(", sc->name);
goto out_put;
args = perf_evsel__sc_tp_ptr(evsel, args, sample);
- augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size);
+ augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size, trace->raw_augmented_syscalls);
syscall__scnprintf_args(sc, msg, sizeof(msg), args, augmented_args, augmented_args_size, trace, thread);
fprintf(trace->output, "%s", msg);
err = 0;
int max_stack = evsel->attr.sample_max_stack ?
evsel->attr.sample_max_stack :
trace->max_stack;
+ int err;
- if (machine__resolve(trace->host, &al, sample) < 0 ||
- thread__resolve_callchain(al.thread, cursor, evsel, sample, NULL, NULL, max_stack))
+ if (machine__resolve(trace->host, &al, sample) < 0)
return -1;
- return 0;
+ err = thread__resolve_callchain(al.thread, cursor, evsel, sample, NULL, NULL, max_stack);
+ addr_location__put(&al);
+ return err;
}
static int trace__fprintf_callchain(struct trace *trace, struct perf_sample *sample)
fputc('\n', trace->output);
+ /*
+ * We only consider an 'event' for the sake of --max-events a non-filtered
+ * sys_enter + sys_exit and other tracepoint events.
+ */
+ if (++trace->nr_events_printed == trace->max_events && trace->max_events != ULONG_MAX)
+ interrupted = true;
+
if (callchain_ret > 0)
trace__fprintf_callchain(trace, sample);
else if (callchain_ret < 0)
{
binary__fprintf(sample->raw_data, sample->raw_size, 8,
bpf_output__printer, NULL, trace->output);
+ ++trace->nr_events_printed;
}
static int trace__event_handler(struct trace *trace, struct perf_evsel *evsel,
union perf_event *event __maybe_unused,
struct perf_sample *sample)
{
- struct thread *thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
+ struct thread *thread;
int callchain_ret = 0;
+ /*
+ * Check if we called perf_evsel__disable(evsel) due to, for instance,
+ * this event's max_events having been hit and this is an entry coming
+ * from the ring buffer that we should discard, since the max events
+ * have already been considered/printed.
+ */
+ if (evsel->disabled)
+ return 0;
+
+ thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
if (sample->callchain) {
callchain_ret = trace__resolve_callchain(trace, evsel, sample, &callchain_cursor);
event_format__fprintf(evsel->tp_format, sample->cpu,
sample->raw_data, sample->raw_size,
trace->output);
+ ++trace->nr_events_printed;
+
+ if (evsel->max_events != ULONG_MAX && ++evsel->nr_events_printed == evsel->max_events) {
+ perf_evsel__disable(evsel);
+ perf_evsel__close(evsel);
+ }
}
}
trace__fprintf_callchain(trace, sample);
else if (callchain_ret < 0)
pr_err("Problem processing %s callchain, skipping...\n", perf_evsel__name(evsel));
- thread__put(thread);
out:
+ thread__put(thread);
return 0;
}
trace__fprintf_callchain(trace, sample);
else if (callchain_ret < 0)
pr_err("Problem processing %s callchain, skipping...\n", perf_evsel__name(evsel));
+
+ ++trace->nr_events_printed;
out:
err = 0;
out_put:
tracepoint_handler handler = evsel->handler;
handler(trace, evsel, event, sample);
}
+
+ if (trace->nr_events_printed >= trace->max_events && trace->max_events != ULONG_MAX)
+ interrupted = true;
}
static int trace__add_syscall_newtp(struct trace *trace)
int timeout = done ? 100 : -1;
if (!draining && perf_evlist__poll(evlist, timeout) > 0) {
- if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0)
+ if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP | POLLNVAL) == 0)
draining = true;
goto again;
int len = strlen(str) + 1, err = -1, list, idx;
char *strace_groups_dir = system_path(STRACE_GROUPS_DIR);
char group_name[PATH_MAX];
+ struct syscall_fmt *fmt;
if (strace_groups_dir == NULL)
return -1;
if (syscalltbl__id(trace->sctbl, s) >= 0 ||
syscalltbl__strglobmatch_first(trace->sctbl, s, &idx) >= 0) {
list = 1;
+ goto do_concat;
+ }
+
+ fmt = syscall_fmt__find_by_alias(s);
+ if (fmt != NULL) {
+ list = 1;
+ s = fmt->name;
} else {
path__join(group_name, sizeof(group_name), strace_groups_dir, s);
if (access(group_name, R_OK) == 0)
list = 1;
}
-
+do_concat:
if (lists[list]) {
sprintf(lists[list] + strlen(lists[list]), ",%s", s);
} else {
.trace_syscalls = false,
.kernel_syscallchains = false,
.max_stack = UINT_MAX,
+ .max_events = ULONG_MAX,
};
const char *output_name = NULL;
const struct option trace_options[] = {
&record_parse_callchain_opt),
OPT_BOOLEAN(0, "kernel-syscall-graph", &trace.kernel_syscallchains,
"Show the kernel callchains on the syscall exit path"),
+ OPT_ULONG(0, "max-events", &trace.max_events,
+ "Set the maximum number of events to print, exit after that is reached. "),
OPT_UINTEGER(0, "min-stack", &trace.min_stack,
"Set the minimum stack depth when parsing the callchain, "
"anything below the specified depth will be ignored."),
evsel->handler = trace__sys_enter;
evlist__for_each_entry(trace.evlist, evsel) {
+ bool raw_syscalls_sys_exit = strcmp(perf_evsel__name(evsel), "raw_syscalls:sys_exit") == 0;
+
+ if (raw_syscalls_sys_exit) {
+ trace.raw_augmented_syscalls = true;
+ goto init_augmented_syscall_tp;
+ }
+
if (strstarts(perf_evsel__name(evsel), "syscalls:sys_exit_")) {
+init_augmented_syscall_tp:
perf_evsel__init_augmented_syscall_tp(evsel);
perf_evsel__init_augmented_syscall_tp_ret(evsel);
evsel->handler = trace__sys_exit;
include/uapi/drm/drm.h
include/uapi/drm/i915_drm.h
include/uapi/linux/fcntl.h
+include/uapi/linux/fs.h
include/uapi/linux/kcmp.h
include/uapi/linux/kvm.h
include/uapi/linux/in.h
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Augment the raw_syscalls tracepoints with the contents of the pointer arguments.
+ *
+ * Test it with:
+ *
+ * perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c cat /etc/passwd > /dev/null
+ *
+ * This exactly matches what is marshalled into the raw_syscall:sys_enter
+ * payload expected by the 'perf trace' beautifiers.
+ *
+ * For now it just uses the existing tracepoint augmentation code in 'perf
+ * trace', in the next csets we'll hook up these with the sys_enter/sys_exit
+ * code that will combine entry/exit in a strace like way.
+ */
+
+#include <stdio.h>
+#include <linux/socket.h>
+
+/* bpf-output associated map */
+struct bpf_map SEC("maps") __augmented_syscalls__ = {
+ .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
+ .key_size = sizeof(int),
+ .value_size = sizeof(u32),
+ .max_entries = __NR_CPUS__,
+};
+
+struct syscall_enter_args {
+ unsigned long long common_tp_fields;
+ long syscall_nr;
+ unsigned long args[6];
+};
+
+struct syscall_exit_args {
+ unsigned long long common_tp_fields;
+ long syscall_nr;
+ long ret;
+};
+
+struct augmented_filename {
+ unsigned int size;
+ int reserved;
+ char value[256];
+};
+
+#define SYS_OPEN 2
+#define SYS_OPENAT 257
+
+SEC("raw_syscalls:sys_enter")
+int sys_enter(struct syscall_enter_args *args)
+{
+ struct {
+ struct syscall_enter_args args;
+ struct augmented_filename filename;
+ } augmented_args;
+ unsigned int len = sizeof(augmented_args);
+ const void *filename_arg = NULL;
+
+ probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
+ /*
+ * Yonghong and Edward Cree sayz:
+ *
+ * https://www.spinics.net/lists/netdev/msg531645.html
+ *
+ * >> R0=inv(id=0) R1=inv2 R6=ctx(id=0,off=0,imm=0) R7=inv64 R10=fp0,call_-1
+ * >> 10: (bf) r1 = r6
+ * >> 11: (07) r1 += 16
+ * >> 12: (05) goto pc+2
+ * >> 15: (79) r3 = *(u64 *)(r1 +0)
+ * >> dereference of modified ctx ptr R1 off=16 disallowed
+ * > Aha, we at least got a different error message this time.
+ * > And indeed llvm has done that optimisation, rather than the more obvious
+ * > 11: r3 = *(u64 *)(r1 +16)
+ * > because it wants to have lots of reads share a single insn. You may be able
+ * > to defeat that optimisation by adding compiler barriers, idk. Maybe someone
+ * > with llvm knowledge can figure out how to stop it (ideally, llvm would know
+ * > when it's generating for bpf backend and not do that). -O0? ¯\_(ツ)_/¯
+ *
+ * The optimization mostly likes below:
+ *
+ * br1:
+ * ...
+ * r1 += 16
+ * goto merge
+ * br2:
+ * ...
+ * r1 += 20
+ * goto merge
+ * merge:
+ * *(u64 *)(r1 + 0)
+ *
+ * The compiler tries to merge common loads. There is no easy way to
+ * stop this compiler optimization without turning off a lot of other
+ * optimizations. The easiest way is to add barriers:
+ *
+ * __asm__ __volatile__("": : :"memory")
+ *
+ * after the ctx memory access to prevent their down stream merging.
+ */
+ switch (augmented_args.args.syscall_nr) {
+ case SYS_OPEN: filename_arg = (const void *)args->args[0];
+ __asm__ __volatile__("": : :"memory");
+ break;
+ case SYS_OPENAT: filename_arg = (const void *)args->args[1];
+ break;
+ }
+
+ if (filename_arg != NULL) {
+ augmented_args.filename.reserved = 0;
+ augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
+ sizeof(augmented_args.filename.value),
+ filename_arg);
+ if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) {
+ len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size;
+ len &= sizeof(augmented_args.filename.value) - 1;
+ }
+ } else {
+ len = sizeof(augmented_args.args);
+ }
+
+ perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, &augmented_args, len);
+ return 0;
+}
+
+SEC("raw_syscalls:sys_exit")
+int sys_exit(struct syscall_exit_args *args)
+{
+ return 1; /* 0 as soon as we start copying data returned by the kernel, e.g. 'read' */
+}
+
+license(GPL);
}
static int
-debug_cache_init(void)
+create_jit_cache_dir(void)
{
char str[32];
char *base, *p;
strftime(str, sizeof(str), JIT_LANG"-jit-%Y%m%d", &tm);
- snprintf(jit_path, PATH_MAX - 1, "%s/.debug/", base);
-
+ ret = snprintf(jit_path, PATH_MAX, "%s/.debug/", base);
+ if (ret >= PATH_MAX) {
+ warnx("jvmti: cannot generate jit cache dir because %s/.debug/"
+ " is too long, please check the cwd, JITDUMPDIR, and"
+ " HOME variables", base);
+ return -1;
+ }
ret = mkdir(jit_path, 0755);
if (ret == -1) {
if (errno != EEXIST) {
}
}
- snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit", base);
+ ret = snprintf(jit_path, PATH_MAX, "%s/.debug/jit", base);
+ if (ret >= PATH_MAX) {
+ warnx("jvmti: cannot generate jit cache dir because"
+ " %s/.debug/jit is too long, please check the cwd,"
+ " JITDUMPDIR, and HOME variables", base);
+ return -1;
+ }
ret = mkdir(jit_path, 0755);
if (ret == -1) {
if (errno != EEXIST) {
- warn("cannot create jit cache dir %s", jit_path);
+ warn("jvmti: cannot create jit cache dir %s", jit_path);
return -1;
}
}
- snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit/%s.XXXXXXXX", base, str);
-
+ ret = snprintf(jit_path, PATH_MAX, "%s/.debug/jit/%s.XXXXXXXX", base, str);
+ if (ret >= PATH_MAX) {
+ warnx("jvmti: cannot generate jit cache dir because"
+ " %s/.debug/jit/%s.XXXXXXXX is too long, please check"
+ " the cwd, JITDUMPDIR, and HOME variables",
+ base, str);
+ return -1;
+ }
p = mkdtemp(jit_path);
if (p != jit_path) {
- warn("cannot create jit cache dir %s", jit_path);
+ warn("jvmti: cannot create jit cache dir %s", jit_path);
return -1;
}
{
char dump_path[PATH_MAX];
struct jitheader header;
- int fd;
+ int fd, ret;
FILE *fp;
init_arch_timestamp();
memset(&header, 0, sizeof(header));
- debug_cache_init();
+ /*
+ * jitdump file dir
+ */
+ if (create_jit_cache_dir() < 0)
+ return NULL;
/*
* jitdump file name
*/
- scnprintf(dump_path, PATH_MAX, "%s/jit-%i.dump", jit_path, getpid());
+ ret = snprintf(dump_path, PATH_MAX, "%s/jit-%i.dump", jit_path, getpid());
+ if (ret >= PATH_MAX) {
+ warnx("jvmti: cannot generate jitdump file full path because"
+ " %s/jit-%i.dump is too long, please check the cwd,"
+ " JITDUMPDIR, and HOME variables", jit_path, getpid());
+ return NULL;
+ }
fd = open(dump_path, O_CREAT|O_TRUNC|O_RDWR, 0666);
if (fd == -1)
unsigned initial_delay;
bool use_clockid;
clockid_t clockid;
+ u64 clockid_res_ns;
unsigned int proc_map_timeout;
};
+++ /dev/null
-#!/usr/bin/python2
-# call-graph-from-sql.py: create call-graph from sql database
-# Copyright (c) 2014-2017, Intel Corporation.
-#
-# This program is free software; you can redistribute it and/or modify it
-# under the terms and conditions of the GNU General Public License,
-# version 2, as published by the Free Software Foundation.
-#
-# This program is distributed in the hope it will be useful, but WITHOUT
-# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
-# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
-# more details.
-
-# To use this script you will need to have exported data using either the
-# export-to-sqlite.py or the export-to-postgresql.py script. Refer to those
-# scripts for details.
-#
-# Following on from the example in the export scripts, a
-# call-graph can be displayed for the pt_example database like this:
-#
-# python tools/perf/scripts/python/call-graph-from-sql.py pt_example
-#
-# Note that for PostgreSQL, this script supports connecting to remote databases
-# by setting hostname, port, username, password, and dbname e.g.
-#
-# python tools/perf/scripts/python/call-graph-from-sql.py "hostname=myhost username=myuser password=mypassword dbname=pt_example"
-#
-# The result is a GUI window with a tree representing a context-sensitive
-# call-graph. Expanding a couple of levels of the tree and adjusting column
-# widths to suit will display something like:
-#
-# Call Graph: pt_example
-# Call Path Object Count Time(ns) Time(%) Branch Count Branch Count(%)
-# v- ls
-# v- 2638:2638
-# v- _start ld-2.19.so 1 10074071 100.0 211135 100.0
-# |- unknown unknown 1 13198 0.1 1 0.0
-# >- _dl_start ld-2.19.so 1 1400980 13.9 19637 9.3
-# >- _d_linit_internal ld-2.19.so 1 448152 4.4 11094 5.3
-# v-__libc_start_main@plt ls 1 8211741 81.5 180397 85.4
-# >- _dl_fixup ld-2.19.so 1 7607 0.1 108 0.1
-# >- __cxa_atexit libc-2.19.so 1 11737 0.1 10 0.0
-# >- __libc_csu_init ls 1 10354 0.1 10 0.0
-# |- _setjmp libc-2.19.so 1 0 0.0 4 0.0
-# v- main ls 1 8182043 99.6 180254 99.9
-#
-# Points to note:
-# The top level is a command name (comm)
-# The next level is a thread (pid:tid)
-# Subsequent levels are functions
-# 'Count' is the number of calls
-# 'Time' is the elapsed time until the function returns
-# Percentages are relative to the level above
-# 'Branch Count' is the total number of branches for that function and all
-# functions that it calls
-
-import sys
-from PySide.QtCore import *
-from PySide.QtGui import *
-from PySide.QtSql import *
-from decimal import *
-
-class TreeItem():
-
- def __init__(self, db, row, parent_item):
- self.db = db
- self.row = row
- self.parent_item = parent_item
- self.query_done = False;
- self.child_count = 0
- self.child_items = []
- self.data = ["", "", "", "", "", "", ""]
- self.comm_id = 0
- self.thread_id = 0
- self.call_path_id = 1
- self.branch_count = 0
- self.time = 0
- if not parent_item:
- self.setUpRoot()
-
- def setUpRoot(self):
- self.query_done = True
- query = QSqlQuery(self.db)
- ret = query.exec_('SELECT id, comm FROM comms')
- if not ret:
- raise Exception("Query failed: " + query.lastError().text())
- while query.next():
- if not query.value(0):
- continue
- child_item = TreeItem(self.db, self.child_count, self)
- self.child_items.append(child_item)
- self.child_count += 1
- child_item.setUpLevel1(query.value(0), query.value(1))
-
- def setUpLevel1(self, comm_id, comm):
- self.query_done = True;
- self.comm_id = comm_id
- self.data[0] = comm
- self.child_items = []
- self.child_count = 0
- query = QSqlQuery(self.db)
- ret = query.exec_('SELECT thread_id, ( SELECT pid FROM threads WHERE id = thread_id ), ( SELECT tid FROM threads WHERE id = thread_id ) FROM comm_threads WHERE comm_id = ' + str(comm_id))
- if not ret:
- raise Exception("Query failed: " + query.lastError().text())
- while query.next():
- child_item = TreeItem(self.db, self.child_count, self)
- self.child_items.append(child_item)
- self.child_count += 1
- child_item.setUpLevel2(comm_id, query.value(0), query.value(1), query.value(2))
-
- def setUpLevel2(self, comm_id, thread_id, pid, tid):
- self.comm_id = comm_id
- self.thread_id = thread_id
- self.data[0] = str(pid) + ":" + str(tid)
-
- def getChildItem(self, row):
- return self.child_items[row]
-
- def getParentItem(self):
- return self.parent_item
-
- def getRow(self):
- return self.row
-
- def timePercent(self, b):
- if not self.time:
- return "0.0"
- x = (b * Decimal(100)) / self.time
- return str(x.quantize(Decimal('.1'), rounding=ROUND_HALF_UP))
-
- def branchPercent(self, b):
- if not self.branch_count:
- return "0.0"
- x = (b * Decimal(100)) / self.branch_count
- return str(x.quantize(Decimal('.1'), rounding=ROUND_HALF_UP))
-
- def addChild(self, call_path_id, name, dso, count, time, branch_count):
- child_item = TreeItem(self.db, self.child_count, self)
- child_item.comm_id = self.comm_id
- child_item.thread_id = self.thread_id
- child_item.call_path_id = call_path_id
- child_item.branch_count = branch_count
- child_item.time = time
- child_item.data[0] = name
- if dso == "[kernel.kallsyms]":
- dso = "[kernel]"
- child_item.data[1] = dso
- child_item.data[2] = str(count)
- child_item.data[3] = str(time)
- child_item.data[4] = self.timePercent(time)
- child_item.data[5] = str(branch_count)
- child_item.data[6] = self.branchPercent(branch_count)
- self.child_items.append(child_item)
- self.child_count += 1
-
- def selectCalls(self):
- self.query_done = True;
- query = QSqlQuery(self.db)
- ret = query.exec_('SELECT id, call_path_id, branch_count, call_time, return_time, '
- '( SELECT name FROM symbols WHERE id = ( SELECT symbol_id FROM call_paths WHERE id = call_path_id ) ), '
- '( SELECT short_name FROM dsos WHERE id = ( SELECT dso_id FROM symbols WHERE id = ( SELECT symbol_id FROM call_paths WHERE id = call_path_id ) ) ), '
- '( SELECT ip FROM call_paths where id = call_path_id ) '
- 'FROM calls WHERE parent_call_path_id = ' + str(self.call_path_id) + ' AND comm_id = ' + str(self.comm_id) + ' AND thread_id = ' + str(self.thread_id) +
- ' ORDER BY call_path_id')
- if not ret:
- raise Exception("Query failed: " + query.lastError().text())
- last_call_path_id = 0
- name = ""
- dso = ""
- count = 0
- branch_count = 0
- total_branch_count = 0
- time = 0
- total_time = 0
- while query.next():
- if query.value(1) == last_call_path_id:
- count += 1
- branch_count += query.value(2)
- time += query.value(4) - query.value(3)
- else:
- if count:
- self.addChild(last_call_path_id, name, dso, count, time, branch_count)
- last_call_path_id = query.value(1)
- name = query.value(5)
- dso = query.value(6)
- count = 1
- total_branch_count += branch_count
- total_time += time
- branch_count = query.value(2)
- time = query.value(4) - query.value(3)
- if count:
- self.addChild(last_call_path_id, name, dso, count, time, branch_count)
- total_branch_count += branch_count
- total_time += time
- # Top level does not have time or branch count, so fix that here
- if total_branch_count > self.branch_count:
- self.branch_count = total_branch_count
- if self.branch_count:
- for child_item in self.child_items:
- child_item.data[6] = self.branchPercent(child_item.branch_count)
- if total_time > self.time:
- self.time = total_time
- if self.time:
- for child_item in self.child_items:
- child_item.data[4] = self.timePercent(child_item.time)
-
- def childCount(self):
- if not self.query_done:
- self.selectCalls()
- return self.child_count
-
- def columnCount(self):
- return 7
-
- def columnHeader(self, column):
- headers = ["Call Path", "Object", "Count ", "Time (ns) ", "Time (%) ", "Branch Count ", "Branch Count (%) "]
- return headers[column]
-
- def getData(self, column):
- return self.data[column]
-
-class TreeModel(QAbstractItemModel):
-
- def __init__(self, db, parent=None):
- super(TreeModel, self).__init__(parent)
- self.db = db
- self.root = TreeItem(db, 0, None)
-
- def columnCount(self, parent):
- return self.root.columnCount()
-
- def rowCount(self, parent):
- if parent.isValid():
- parent_item = parent.internalPointer()
- else:
- parent_item = self.root
- return parent_item.childCount()
-
- def headerData(self, section, orientation, role):
- if role == Qt.TextAlignmentRole:
- if section > 1:
- return Qt.AlignRight
- if role != Qt.DisplayRole:
- return None
- if orientation != Qt.Horizontal:
- return None
- return self.root.columnHeader(section)
-
- def parent(self, child):
- child_item = child.internalPointer()
- if child_item is self.root:
- return QModelIndex()
- parent_item = child_item.getParentItem()
- return self.createIndex(parent_item.getRow(), 0, parent_item)
-
- def index(self, row, column, parent):
- if parent.isValid():
- parent_item = parent.internalPointer()
- else:
- parent_item = self.root
- child_item = parent_item.getChildItem(row)
- return self.createIndex(row, column, child_item)
-
- def data(self, index, role):
- if role == Qt.TextAlignmentRole:
- if index.column() > 1:
- return Qt.AlignRight
- if role != Qt.DisplayRole:
- return None
- index_item = index.internalPointer()
- return index_item.getData(index.column())
-
-class MainWindow(QMainWindow):
-
- def __init__(self, db, dbname, parent=None):
- super(MainWindow, self).__init__(parent)
-
- self.setObjectName("MainWindow")
- self.setWindowTitle("Call Graph: " + dbname)
- self.move(100, 100)
- self.resize(800, 600)
- style = self.style()
- icon = style.standardIcon(QStyle.SP_MessageBoxInformation)
- self.setWindowIcon(icon);
-
- self.model = TreeModel(db)
-
- self.view = QTreeView()
- self.view.setModel(self.model)
-
- self.setCentralWidget(self.view)
-
-if __name__ == '__main__':
- if (len(sys.argv) < 2):
- print >> sys.stderr, "Usage is: call-graph-from-sql.py <database name>"
- raise Exception("Too few arguments")
-
- dbname = sys.argv[1]
-
- is_sqlite3 = False
- try:
- f = open(dbname)
- if f.read(15) == "SQLite format 3":
- is_sqlite3 = True
- f.close()
- except:
- pass
-
- if is_sqlite3:
- db = QSqlDatabase.addDatabase('QSQLITE')
- else:
- db = QSqlDatabase.addDatabase('QPSQL')
- opts = dbname.split()
- for opt in opts:
- if '=' in opt:
- opt = opt.split('=')
- if opt[0] == 'hostname':
- db.setHostName(opt[1])
- elif opt[0] == 'port':
- db.setPort(int(opt[1]))
- elif opt[0] == 'username':
- db.setUserName(opt[1])
- elif opt[0] == 'password':
- db.setPassword(opt[1])
- elif opt[0] == 'dbname':
- dbname = opt[1]
- else:
- dbname = opt
-
- db.setDatabaseName(dbname)
- if not db.open():
- raise Exception("Failed to open database " + dbname + " error: " + db.lastError().text())
-
- app = QApplication(sys.argv)
- window = MainWindow(db, dbname)
- window.show()
- err = app.exec_()
- db.close()
- sys.exit(err)
# pt_example=# \q
#
# An example of using the database is provided by the script
-# call-graph-from-sql.py. Refer to that script for details.
+# exported-sql-viewer.py. Refer to that script for details.
#
# Tables:
#
# sqlite> .quit
#
# An example of using the database is provided by the script
-# call-graph-from-sql.py. Refer to that script for details.
+# exported-sql-viewer.py. Refer to that script for details.
#
# The database structure is practically the same as created by the script
# export-to-postgresql.py. Refer to that script for details. A notable
--- /dev/null
+#!/usr/bin/python2
+# SPDX-License-Identifier: GPL-2.0
+# exported-sql-viewer.py: view data from sql database
+# Copyright (c) 2014-2018, Intel Corporation.
+
+# To use this script you will need to have exported data using either the
+# export-to-sqlite.py or the export-to-postgresql.py script. Refer to those
+# scripts for details.
+#
+# Following on from the example in the export scripts, a
+# call-graph can be displayed for the pt_example database like this:
+#
+# python tools/perf/scripts/python/exported-sql-viewer.py pt_example
+#
+# Note that for PostgreSQL, this script supports connecting to remote databases
+# by setting hostname, port, username, password, and dbname e.g.
+#
+# python tools/perf/scripts/python/exported-sql-viewer.py "hostname=myhost username=myuser password=mypassword dbname=pt_example"
+#
+# The result is a GUI window with a tree representing a context-sensitive
+# call-graph. Expanding a couple of levels of the tree and adjusting column
+# widths to suit will display something like:
+#
+# Call Graph: pt_example
+# Call Path Object Count Time(ns) Time(%) Branch Count Branch Count(%)
+# v- ls
+# v- 2638:2638
+# v- _start ld-2.19.so 1 10074071 100.0 211135 100.0
+# |- unknown unknown 1 13198 0.1 1 0.0
+# >- _dl_start ld-2.19.so 1 1400980 13.9 19637 9.3
+# >- _d_linit_internal ld-2.19.so 1 448152 4.4 11094 5.3
+# v-__libc_start_main@plt ls 1 8211741 81.5 180397 85.4
+# >- _dl_fixup ld-2.19.so 1 7607 0.1 108 0.1
+# >- __cxa_atexit libc-2.19.so 1 11737 0.1 10 0.0
+# >- __libc_csu_init ls 1 10354 0.1 10 0.0
+# |- _setjmp libc-2.19.so 1 0 0.0 4 0.0
+# v- main ls 1 8182043 99.6 180254 99.9
+#
+# Points to note:
+# The top level is a command name (comm)
+# The next level is a thread (pid:tid)
+# Subsequent levels are functions
+# 'Count' is the number of calls
+# 'Time' is the elapsed time until the function returns
+# Percentages are relative to the level above
+# 'Branch Count' is the total number of branches for that function and all
+# functions that it calls
+
+# There is also a "All branches" report, which displays branches and
+# possibly disassembly. However, presently, the only supported disassembler is
+# Intel XED, and additionally the object code must be present in perf build ID
+# cache. To use Intel XED, libxed.so must be present. To build and install
+# libxed.so:
+# git clone https://github.com/intelxed/mbuild.git mbuild
+# git clone https://github.com/intelxed/xed
+# cd xed
+# ./mfile.py --share
+# sudo ./mfile.py --prefix=/usr/local install
+# sudo ldconfig
+#
+# Example report:
+#
+# Time CPU Command PID TID Branch Type In Tx Branch
+# 8107675239590 2 ls 22011 22011 return from interrupt No ffffffff86a00a67 native_irq_return_iret ([kernel]) -> 7fab593ea260 _start (ld-2.19.so)
+# 7fab593ea260 48 89 e7 mov %rsp, %rdi
+# 8107675239899 2 ls 22011 22011 hardware interrupt No 7fab593ea260 _start (ld-2.19.so) -> ffffffff86a012e0 page_fault ([kernel])
+# 8107675241900 2 ls 22011 22011 return from interrupt No ffffffff86a00a67 native_irq_return_iret ([kernel]) -> 7fab593ea260 _start (ld-2.19.so)
+# 7fab593ea260 48 89 e7 mov %rsp, %rdi
+# 7fab593ea263 e8 c8 06 00 00 callq 0x7fab593ea930
+# 8107675241900 2 ls 22011 22011 call No 7fab593ea263 _start+0x3 (ld-2.19.so) -> 7fab593ea930 _dl_start (ld-2.19.so)
+# 7fab593ea930 55 pushq %rbp
+# 7fab593ea931 48 89 e5 mov %rsp, %rbp
+# 7fab593ea934 41 57 pushq %r15
+# 7fab593ea936 41 56 pushq %r14
+# 7fab593ea938 41 55 pushq %r13
+# 7fab593ea93a 41 54 pushq %r12
+# 7fab593ea93c 53 pushq %rbx
+# 7fab593ea93d 48 89 fb mov %rdi, %rbx
+# 7fab593ea940 48 83 ec 68 sub $0x68, %rsp
+# 7fab593ea944 0f 31 rdtsc
+# 7fab593ea946 48 c1 e2 20 shl $0x20, %rdx
+# 7fab593ea94a 89 c0 mov %eax, %eax
+# 7fab593ea94c 48 09 c2 or %rax, %rdx
+# 7fab593ea94f 48 8b 05 1a 15 22 00 movq 0x22151a(%rip), %rax
+# 8107675242232 2 ls 22011 22011 hardware interrupt No 7fab593ea94f _dl_start+0x1f (ld-2.19.so) -> ffffffff86a012e0 page_fault ([kernel])
+# 8107675242900 2 ls 22011 22011 return from interrupt No ffffffff86a00a67 native_irq_return_iret ([kernel]) -> 7fab593ea94f _dl_start+0x1f (ld-2.19.so)
+# 7fab593ea94f 48 8b 05 1a 15 22 00 movq 0x22151a(%rip), %rax
+# 7fab593ea956 48 89 15 3b 13 22 00 movq %rdx, 0x22133b(%rip)
+# 8107675243232 2 ls 22011 22011 hardware interrupt No 7fab593ea956 _dl_start+0x26 (ld-2.19.so) -> ffffffff86a012e0 page_fault ([kernel])
+
+import sys
+import weakref
+import threading
+import string
+import cPickle
+import re
+import os
+from PySide.QtCore import *
+from PySide.QtGui import *
+from PySide.QtSql import *
+from decimal import *
+from ctypes import *
+from multiprocessing import Process, Array, Value, Event
+
+# Data formatting helpers
+
+def tohex(ip):
+ if ip < 0:
+ ip += 1 << 64
+ return "%x" % ip
+
+def offstr(offset):
+ if offset:
+ return "+0x%x" % offset
+ return ""
+
+def dsoname(name):
+ if name == "[kernel.kallsyms]":
+ return "[kernel]"
+ return name
+
+def findnth(s, sub, n, offs=0):
+ pos = s.find(sub)
+ if pos < 0:
+ return pos
+ if n <= 1:
+ return offs + pos
+ return findnth(s[pos + 1:], sub, n - 1, offs + pos + 1)
+
+# Percent to one decimal place
+
+def PercentToOneDP(n, d):
+ if not d:
+ return "0.0"
+ x = (n * Decimal(100)) / d
+ return str(x.quantize(Decimal(".1"), rounding=ROUND_HALF_UP))
+
+# Helper for queries that must not fail
+
+def QueryExec(query, stmt):
+ ret = query.exec_(stmt)
+ if not ret:
+ raise Exception("Query failed: " + query.lastError().text())
+
+# Background thread
+
+class Thread(QThread):
+
+ done = Signal(object)
+
+ def __init__(self, task, param=None, parent=None):
+ super(Thread, self).__init__(parent)
+ self.task = task
+ self.param = param
+
+ def run(self):
+ while True:
+ if self.param is None:
+ done, result = self.task()
+ else:
+ done, result = self.task(self.param)
+ self.done.emit(result)
+ if done:
+ break
+
+# Tree data model
+
+class TreeModel(QAbstractItemModel):
+
+ def __init__(self, root, parent=None):
+ super(TreeModel, self).__init__(parent)
+ self.root = root
+ self.last_row_read = 0
+
+ def Item(self, parent):
+ if parent.isValid():
+ return parent.internalPointer()
+ else:
+ return self.root
+
+ def rowCount(self, parent):
+ result = self.Item(parent).childCount()
+ if result < 0:
+ result = 0
+ self.dataChanged.emit(parent, parent)
+ return result
+
+ def hasChildren(self, parent):
+ return self.Item(parent).hasChildren()
+
+ def headerData(self, section, orientation, role):
+ if role == Qt.TextAlignmentRole:
+ return self.columnAlignment(section)
+ if role != Qt.DisplayRole:
+ return None
+ if orientation != Qt.Horizontal:
+ return None
+ return self.columnHeader(section)
+
+ def parent(self, child):
+ child_item = child.internalPointer()
+ if child_item is self.root:
+ return QModelIndex()
+ parent_item = child_item.getParentItem()
+ return self.createIndex(parent_item.getRow(), 0, parent_item)
+
+ def index(self, row, column, parent):
+ child_item = self.Item(parent).getChildItem(row)
+ return self.createIndex(row, column, child_item)
+
+ def DisplayData(self, item, index):
+ return item.getData(index.column())
+
+ def FetchIfNeeded(self, row):
+ if row > self.last_row_read:
+ self.last_row_read = row
+ if row + 10 >= self.root.child_count:
+ self.fetcher.Fetch(glb_chunk_sz)
+
+ def columnAlignment(self, column):
+ return Qt.AlignLeft
+
+ def columnFont(self, column):
+ return None
+
+ def data(self, index, role):
+ if role == Qt.TextAlignmentRole:
+ return self.columnAlignment(index.column())
+ if role == Qt.FontRole:
+ return self.columnFont(index.column())
+ if role != Qt.DisplayRole:
+ return None
+ item = index.internalPointer()
+ return self.DisplayData(item, index)
+
+# Table data model
+
+class TableModel(QAbstractTableModel):
+
+ def __init__(self, parent=None):
+ super(TableModel, self).__init__(parent)
+ self.child_count = 0
+ self.child_items = []
+ self.last_row_read = 0
+
+ def Item(self, parent):
+ if parent.isValid():
+ return parent.internalPointer()
+ else:
+ return self
+
+ def rowCount(self, parent):
+ return self.child_count
+
+ def headerData(self, section, orientation, role):
+ if role == Qt.TextAlignmentRole:
+ return self.columnAlignment(section)
+ if role != Qt.DisplayRole:
+ return None
+ if orientation != Qt.Horizontal:
+ return None
+ return self.columnHeader(section)
+
+ def index(self, row, column, parent):
+ return self.createIndex(row, column, self.child_items[row])
+
+ def DisplayData(self, item, index):
+ return item.getData(index.column())
+
+ def FetchIfNeeded(self, row):
+ if row > self.last_row_read:
+ self.last_row_read = row
+ if row + 10 >= self.child_count:
+ self.fetcher.Fetch(glb_chunk_sz)
+
+ def columnAlignment(self, column):
+ return Qt.AlignLeft
+
+ def columnFont(self, column):
+ return None
+
+ def data(self, index, role):
+ if role == Qt.TextAlignmentRole:
+ return self.columnAlignment(index.column())
+ if role == Qt.FontRole:
+ return self.columnFont(index.column())
+ if role != Qt.DisplayRole:
+ return None
+ item = index.internalPointer()
+ return self.DisplayData(item, index)
+
+# Model cache
+
+model_cache = weakref.WeakValueDictionary()
+model_cache_lock = threading.Lock()
+
+def LookupCreateModel(model_name, create_fn):
+ model_cache_lock.acquire()
+ try:
+ model = model_cache[model_name]
+ except:
+ model = None
+ if model is None:
+ model = create_fn()
+ model_cache[model_name] = model
+ model_cache_lock.release()
+ return model
+
+# Find bar
+
+class FindBar():
+
+ def __init__(self, parent, finder, is_reg_expr=False):
+ self.finder = finder
+ self.context = []
+ self.last_value = None
+ self.last_pattern = None
+
+ label = QLabel("Find:")
+ label.setSizePolicy(QSizePolicy.Fixed, QSizePolicy.Fixed)
+
+ self.textbox = QComboBox()
+ self.textbox.setEditable(True)
+ self.textbox.currentIndexChanged.connect(self.ValueChanged)
+
+ self.progress = QProgressBar()
+ self.progress.setRange(0, 0)
+ self.progress.hide()
+
+ if is_reg_expr:
+ self.pattern = QCheckBox("Regular Expression")
+ else:
+ self.pattern = QCheckBox("Pattern")
+ self.pattern.setSizePolicy(QSizePolicy.Fixed, QSizePolicy.Fixed)
+
+ self.next_button = QToolButton()
+ self.next_button.setIcon(parent.style().standardIcon(QStyle.SP_ArrowDown))
+ self.next_button.released.connect(lambda: self.NextPrev(1))
+
+ self.prev_button = QToolButton()
+ self.prev_button.setIcon(parent.style().standardIcon(QStyle.SP_ArrowUp))
+ self.prev_button.released.connect(lambda: self.NextPrev(-1))
+
+ self.close_button = QToolButton()
+ self.close_button.setIcon(parent.style().standardIcon(QStyle.SP_DockWidgetCloseButton))
+ self.close_button.released.connect(self.Deactivate)
+
+ self.hbox = QHBoxLayout()
+ self.hbox.setContentsMargins(0, 0, 0, 0)
+
+ self.hbox.addWidget(label)
+ self.hbox.addWidget(self.textbox)
+ self.hbox.addWidget(self.progress)
+ self.hbox.addWidget(self.pattern)
+ self.hbox.addWidget(self.next_button)
+ self.hbox.addWidget(self.prev_button)
+ self.hbox.addWidget(self.close_button)
+
+ self.bar = QWidget()
+ self.bar.setLayout(self.hbox);
+ self.bar.hide()
+
+ def Widget(self):
+ return self.bar
+
+ def Activate(self):
+ self.bar.show()
+ self.textbox.setFocus()
+
+ def Deactivate(self):
+ self.bar.hide()
+
+ def Busy(self):
+ self.textbox.setEnabled(False)
+ self.pattern.hide()
+ self.next_button.hide()
+ self.prev_button.hide()
+ self.progress.show()
+
+ def Idle(self):
+ self.textbox.setEnabled(True)
+ self.progress.hide()
+ self.pattern.show()
+ self.next_button.show()
+ self.prev_button.show()
+
+ def Find(self, direction):
+ value = self.textbox.currentText()
+ pattern = self.pattern.isChecked()
+ self.last_value = value
+ self.last_pattern = pattern
+ self.finder.Find(value, direction, pattern, self.context)
+
+ def ValueChanged(self):
+ value = self.textbox.currentText()
+ pattern = self.pattern.isChecked()
+ index = self.textbox.currentIndex()
+ data = self.textbox.itemData(index)
+ # Store the pattern in the combo box to keep it with the text value
+ if data == None:
+ self.textbox.setItemData(index, pattern)
+ else:
+ self.pattern.setChecked(data)
+ self.Find(0)
+
+ def NextPrev(self, direction):
+ value = self.textbox.currentText()
+ pattern = self.pattern.isChecked()
+ if value != self.last_value:
+ index = self.textbox.findText(value)
+ # Allow for a button press before the value has been added to the combo box
+ if index < 0:
+ index = self.textbox.count()
+ self.textbox.addItem(value, pattern)
+ self.textbox.setCurrentIndex(index)
+ return
+ else:
+ self.textbox.setItemData(index, pattern)
+ elif pattern != self.last_pattern:
+ # Keep the pattern recorded in the combo box up to date
+ index = self.textbox.currentIndex()
+ self.textbox.setItemData(index, pattern)
+ self.Find(direction)
+
+ def NotFound(self):
+ QMessageBox.information(self.bar, "Find", "'" + self.textbox.currentText() + "' not found")
+
+# Context-sensitive call graph data model item base
+
+class CallGraphLevelItemBase(object):
+
+ def __init__(self, glb, row, parent_item):
+ self.glb = glb
+ self.row = row
+ self.parent_item = parent_item
+ self.query_done = False;
+ self.child_count = 0
+ self.child_items = []
+
+ def getChildItem(self, row):
+ return self.child_items[row]
+
+ def getParentItem(self):
+ return self.parent_item
+
+ def getRow(self):
+ return self.row
+
+ def childCount(self):
+ if not self.query_done:
+ self.Select()
+ if not self.child_count:
+ return -1
+ return self.child_count
+
+ def hasChildren(self):
+ if not self.query_done:
+ return True
+ return self.child_count > 0
+
+ def getData(self, column):
+ return self.data[column]
+
+# Context-sensitive call graph data model level 2+ item base
+
+class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
+
+ def __init__(self, glb, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item):
+ super(CallGraphLevelTwoPlusItemBase, self).__init__(glb, row, parent_item)
+ self.comm_id = comm_id
+ self.thread_id = thread_id
+ self.call_path_id = call_path_id
+ self.branch_count = branch_count
+ self.time = time
+
+ def Select(self):
+ self.query_done = True;
+ query = QSqlQuery(self.glb.db)
+ QueryExec(query, "SELECT call_path_id, name, short_name, COUNT(calls.id), SUM(return_time - call_time), SUM(branch_count)"
+ " FROM calls"
+ " INNER JOIN call_paths ON calls.call_path_id = call_paths.id"
+ " INNER JOIN symbols ON call_paths.symbol_id = symbols.id"
+ " INNER JOIN dsos ON symbols.dso_id = dsos.id"
+ " WHERE parent_call_path_id = " + str(self.call_path_id) +
+ " AND comm_id = " + str(self.comm_id) +
+ " AND thread_id = " + str(self.thread_id) +
+ " GROUP BY call_path_id, name, short_name"
+ " ORDER BY call_path_id")
+ while query.next():
+ child_item = CallGraphLevelThreeItem(self.glb, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
+ self.child_items.append(child_item)
+ self.child_count += 1
+
+# Context-sensitive call graph data model level three item
+
+class CallGraphLevelThreeItem(CallGraphLevelTwoPlusItemBase):
+
+ def __init__(self, glb, row, comm_id, thread_id, call_path_id, name, dso, count, time, branch_count, parent_item):
+ super(CallGraphLevelThreeItem, self).__init__(glb, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item)
+ dso = dsoname(dso)
+ self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
+ self.dbid = call_path_id
+
+# Context-sensitive call graph data model level two item
+
+class CallGraphLevelTwoItem(CallGraphLevelTwoPlusItemBase):
+
+ def __init__(self, glb, row, comm_id, thread_id, pid, tid, parent_item):
+ super(CallGraphLevelTwoItem, self).__init__(glb, row, comm_id, thread_id, 1, 0, 0, parent_item)
+ self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
+ self.dbid = thread_id
+
+ def Select(self):
+ super(CallGraphLevelTwoItem, self).Select()
+ for child_item in self.child_items:
+ self.time += child_item.time
+ self.branch_count += child_item.branch_count
+ for child_item in self.child_items:
+ child_item.data[4] = PercentToOneDP(child_item.time, self.time)
+ child_item.data[6] = PercentToOneDP(child_item.branch_count, self.branch_count)
+
+# Context-sensitive call graph data model level one item
+
+class CallGraphLevelOneItem(CallGraphLevelItemBase):
+
+ def __init__(self, glb, row, comm_id, comm, parent_item):
+ super(CallGraphLevelOneItem, self).__init__(glb, row, parent_item)
+ self.data = [comm, "", "", "", "", "", ""]
+ self.dbid = comm_id
+
+ def Select(self):
+ self.query_done = True;
+ query = QSqlQuery(self.glb.db)
+ QueryExec(query, "SELECT thread_id, pid, tid"
+ " FROM comm_threads"
+ " INNER JOIN threads ON thread_id = threads.id"
+ " WHERE comm_id = " + str(self.dbid))
+ while query.next():
+ child_item = CallGraphLevelTwoItem(self.glb, self.child_count, self.dbid, query.value(0), query.value(1), query.value(2), self)
+ self.child_items.append(child_item)
+ self.child_count += 1
+
+# Context-sensitive call graph data model root item
+
+class CallGraphRootItem(CallGraphLevelItemBase):
+
+ def __init__(self, glb):
+ super(CallGraphRootItem, self).__init__(glb, 0, None)
+ self.dbid = 0
+ self.query_done = True;
+ query = QSqlQuery(glb.db)
+ QueryExec(query, "SELECT id, comm FROM comms")
+ while query.next():
+ if not query.value(0):
+ continue
+ child_item = CallGraphLevelOneItem(glb, self.child_count, query.value(0), query.value(1), self)
+ self.child_items.append(child_item)
+ self.child_count += 1
+
+# Context-sensitive call graph data model
+
+class CallGraphModel(TreeModel):
+
+ def __init__(self, glb, parent=None):
+ super(CallGraphModel, self).__init__(CallGraphRootItem(glb), parent)
+ self.glb = glb
+
+ def columnCount(self, parent=None):
+ return 7
+
+ def columnHeader(self, column):
+ headers = ["Call Path", "Object", "Count ", "Time (ns) ", "Time (%) ", "Branch Count ", "Branch Count (%) "]
+ return headers[column]
+
+ def columnAlignment(self, column):
+ alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
+ return alignment[column]
+
+ def FindSelect(self, value, pattern, query):
+ if pattern:
+ # postgresql and sqlite pattern patching differences:
+ # postgresql LIKE is case sensitive but sqlite LIKE is not
+ # postgresql LIKE allows % and _ to be escaped with \ but sqlite LIKE does not
+ # postgresql supports ILIKE which is case insensitive
+ # sqlite supports GLOB (text only) which uses * and ? and is case sensitive
+ if not self.glb.dbref.is_sqlite3:
+ # Escape % and _
+ s = value.replace("%", "\%")
+ s = s.replace("_", "\_")
+ # Translate * and ? into SQL LIKE pattern characters % and _
+ trans = string.maketrans("*?", "%_")
+ match = " LIKE '" + str(s).translate(trans) + "'"
+ else:
+ match = " GLOB '" + str(value) + "'"
+ else:
+ match = " = '" + str(value) + "'"
+ QueryExec(query, "SELECT call_path_id, comm_id, thread_id"
+ " FROM calls"
+ " INNER JOIN call_paths ON calls.call_path_id = call_paths.id"
+ " INNER JOIN symbols ON call_paths.symbol_id = symbols.id"
+ " WHERE symbols.name" + match +
+ " GROUP BY comm_id, thread_id, call_path_id"
+ " ORDER BY comm_id, thread_id, call_path_id")
+
+ def FindPath(self, query):
+ # Turn the query result into a list of ids that the tree view can walk
+ # to open the tree at the right place.
+ ids = []
+ parent_id = query.value(0)
+ while parent_id:
+ ids.insert(0, parent_id)
+ q2 = QSqlQuery(self.glb.db)
+ QueryExec(q2, "SELECT parent_id"
+ " FROM call_paths"
+ " WHERE id = " + str(parent_id))
+ if not q2.next():
+ break
+ parent_id = q2.value(0)
+ # The call path root is not used
+ if ids[0] == 1:
+ del ids[0]
+ ids.insert(0, query.value(2))
+ ids.insert(0, query.value(1))
+ return ids
+
+ def Found(self, query, found):
+ if found:
+ return self.FindPath(query)
+ return []
+
+ def FindValue(self, value, pattern, query, last_value, last_pattern):
+ if last_value == value and pattern == last_pattern:
+ found = query.first()
+ else:
+ self.FindSelect(value, pattern, query)
+ found = query.next()
+ return self.Found(query, found)
+
+ def FindNext(self, query):
+ found = query.next()
+ if not found:
+ found = query.first()
+ return self.Found(query, found)
+
+ def FindPrev(self, query):
+ found = query.previous()
+ if not found:
+ found = query.last()
+ return self.Found(query, found)
+
+ def FindThread(self, c):
+ if c.direction == 0 or c.value != c.last_value or c.pattern != c.last_pattern:
+ ids = self.FindValue(c.value, c.pattern, c.query, c.last_value, c.last_pattern)
+ elif c.direction > 0:
+ ids = self.FindNext(c.query)
+ else:
+ ids = self.FindPrev(c.query)
+ return (True, ids)
+
+ def Find(self, value, direction, pattern, context, callback):
+ class Context():
+ def __init__(self, *x):
+ self.value, self.direction, self.pattern, self.query, self.last_value, self.last_pattern = x
+ def Update(self, *x):
+ self.value, self.direction, self.pattern, self.last_value, self.last_pattern = x + (self.value, self.pattern)
+ if len(context):
+ context[0].Update(value, direction, pattern)
+ else:
+ context.append(Context(value, direction, pattern, QSqlQuery(self.glb.db), None, None))
+ # Use a thread so the UI is not blocked during the SELECT
+ thread = Thread(self.FindThread, context[0])
+ thread.done.connect(lambda ids, t=thread, c=callback: self.FindDone(t, c, ids), Qt.QueuedConnection)
+ thread.start()
+
+ def FindDone(self, thread, callback, ids):
+ callback(ids)
+
+# Vertical widget layout
+
+class VBox():
+
+ def __init__(self, w1, w2, w3=None):
+ self.vbox = QWidget()
+ self.vbox.setLayout(QVBoxLayout());
+
+ self.vbox.layout().setContentsMargins(0, 0, 0, 0)
+
+ self.vbox.layout().addWidget(w1)
+ self.vbox.layout().addWidget(w2)
+ if w3:
+ self.vbox.layout().addWidget(w3)
+
+ def Widget(self):
+ return self.vbox
+
+# Context-sensitive call graph window
+
+class CallGraphWindow(QMdiSubWindow):
+
+ def __init__(self, glb, parent=None):
+ super(CallGraphWindow, self).__init__(parent)
+
+ self.model = LookupCreateModel("Context-Sensitive Call Graph", lambda x=glb: CallGraphModel(x))
+
+ self.view = QTreeView()
+ self.view.setModel(self.model)
+
+ for c, w in ((0, 250), (1, 100), (2, 60), (3, 70), (4, 70), (5, 100)):
+ self.view.setColumnWidth(c, w)
+
+ self.find_bar = FindBar(self, self)
+
+ self.vbox = VBox(self.view, self.find_bar.Widget())
+
+ self.setWidget(self.vbox.Widget())
+
+ AddSubWindow(glb.mainwindow.mdi_area, self, "Context-Sensitive Call Graph")
+
+ def DisplayFound(self, ids):
+ if not len(ids):
+ return False
+ parent = QModelIndex()
+ for dbid in ids:
+ found = False
+ n = self.model.rowCount(parent)
+ for row in xrange(n):
+ child = self.model.index(row, 0, parent)
+ if child.internalPointer().dbid == dbid:
+ found = True
+ self.view.setCurrentIndex(child)
+ parent = child
+ break
+ if not found:
+ break
+ return found
+
+ def Find(self, value, direction, pattern, context):
+ self.view.setFocus()
+ self.find_bar.Busy()
+ self.model.Find(value, direction, pattern, context, self.FindDone)
+
+ def FindDone(self, ids):
+ found = True
+ if not self.DisplayFound(ids):
+ found = False
+ self.find_bar.Idle()
+ if not found:
+ self.find_bar.NotFound()
+
+# Child data item finder
+
+class ChildDataItemFinder():
+
+ def __init__(self, root):
+ self.root = root
+ self.value, self.direction, self.pattern, self.last_value, self.last_pattern = (None,) * 5
+ self.rows = []
+ self.pos = 0
+
+ def FindSelect(self):
+ self.rows = []
+ if self.pattern:
+ pattern = re.compile(self.value)
+ for child in self.root.child_items:
+ for column_data in child.data:
+ if re.search(pattern, str(column_data)) is not None:
+ self.rows.append(child.row)
+ break
+ else:
+ for child in self.root.child_items:
+ for column_data in child.data:
+ if self.value in str(column_data):
+ self.rows.append(child.row)
+ break
+
+ def FindValue(self):
+ self.pos = 0
+ if self.last_value != self.value or self.pattern != self.last_pattern:
+ self.FindSelect()
+ if not len(self.rows):
+ return -1
+ return self.rows[self.pos]
+
+ def FindThread(self):
+ if self.direction == 0 or self.value != self.last_value or self.pattern != self.last_pattern:
+ row = self.FindValue()
+ elif len(self.rows):
+ if self.direction > 0:
+ self.pos += 1
+ if self.pos >= len(self.rows):
+ self.pos = 0
+ else:
+ self.pos -= 1
+ if self.pos < 0:
+ self.pos = len(self.rows) - 1
+ row = self.rows[self.pos]
+ else:
+ row = -1
+ return (True, row)
+
+ def Find(self, value, direction, pattern, context, callback):
+ self.value, self.direction, self.pattern, self.last_value, self.last_pattern = (value, direction,pattern, self.value, self.pattern)
+ # Use a thread so the UI is not blocked
+ thread = Thread(self.FindThread)
+ thread.done.connect(lambda row, t=thread, c=callback: self.FindDone(t, c, row), Qt.QueuedConnection)
+ thread.start()
+
+ def FindDone(self, thread, callback, row):
+ callback(row)
+
+# Number of database records to fetch in one go
+
+glb_chunk_sz = 10000
+
+# size of pickled integer big enough for record size
+
+glb_nsz = 8
+
+# Background process for SQL data fetcher
+
+class SQLFetcherProcess():
+
+ def __init__(self, dbref, sql, buffer, head, tail, fetch_count, fetching_done, process_target, wait_event, fetched_event, prep):
+ # Need a unique connection name
+ conn_name = "SQLFetcher" + str(os.getpid())
+ self.db, dbname = dbref.Open(conn_name)
+ self.sql = sql
+ self.buffer = buffer
+ self.head = head
+ self.tail = tail
+ self.fetch_count = fetch_count
+ self.fetching_done = fetching_done
+ self.process_target = process_target
+ self.wait_event = wait_event
+ self.fetched_event = fetched_event
+ self.prep = prep
+ self.query = QSqlQuery(self.db)
+ self.query_limit = 0 if "$$last_id$$" in sql else 2
+ self.last_id = -1
+ self.fetched = 0
+ self.more = True
+ self.local_head = self.head.value
+ self.local_tail = self.tail.value
+
+ def Select(self):
+ if self.query_limit:
+ if self.query_limit == 1:
+ return
+ self.query_limit -= 1
+ stmt = self.sql.replace("$$last_id$$", str(self.last_id))
+ QueryExec(self.query, stmt)
+
+ def Next(self):
+ if not self.query.next():
+ self.Select()
+ if not self.query.next():
+ return None
+ self.last_id = self.query.value(0)
+ return self.prep(self.query)
+
+ def WaitForTarget(self):
+ while True:
+ self.wait_event.clear()
+ target = self.process_target.value
+ if target > self.fetched or target < 0:
+ break
+ self.wait_event.wait()
+ return target
+
+ def HasSpace(self, sz):
+ if self.local_tail <= self.local_head:
+ space = len(self.buffer) - self.local_head
+ if space > sz:
+ return True
+ if space >= glb_nsz:
+ # Use 0 (or space < glb_nsz) to mean there is no more at the top of the buffer
+ nd = cPickle.dumps(0, cPickle.HIGHEST_PROTOCOL)
+ self.buffer[self.local_head : self.local_head + len(nd)] = nd
+ self.local_head = 0
+ if self.local_tail - self.local_head > sz:
+ return True
+ return False
+
+ def WaitForSpace(self, sz):
+ if self.HasSpace(sz):
+ return
+ while True:
+ self.wait_event.clear()
+ self.local_tail = self.tail.value
+ if self.HasSpace(sz):
+ return
+ self.wait_event.wait()
+
+ def AddToBuffer(self, obj):
+ d = cPickle.dumps(obj, cPickle.HIGHEST_PROTOCOL)
+ n = len(d)
+ nd = cPickle.dumps(n, cPickle.HIGHEST_PROTOCOL)
+ sz = n + glb_nsz
+ self.WaitForSpace(sz)
+ pos = self.local_head
+ self.buffer[pos : pos + len(nd)] = nd
+ self.buffer[pos + glb_nsz : pos + sz] = d
+ self.local_head += sz
+
+ def FetchBatch(self, batch_size):
+ fetched = 0
+ while batch_size > fetched:
+ obj = self.Next()
+ if obj is None:
+ self.more = False
+ break
+ self.AddToBuffer(obj)
+ fetched += 1
+ if fetched:
+ self.fetched += fetched
+ with self.fetch_count.get_lock():
+ self.fetch_count.value += fetched
+ self.head.value = self.local_head
+ self.fetched_event.set()
+
+ def Run(self):
+ while self.more:
+ target = self.WaitForTarget()
+ if target < 0:
+ break
+ batch_size = min(glb_chunk_sz, target - self.fetched)
+ self.FetchBatch(batch_size)
+ self.fetching_done.value = True
+ self.fetched_event.set()
+
+def SQLFetcherFn(*x):
+ process = SQLFetcherProcess(*x)
+ process.Run()
+
+# SQL data fetcher
+
+class SQLFetcher(QObject):
+
+ done = Signal(object)
+
+ def __init__(self, glb, sql, prep, process_data, parent=None):
+ super(SQLFetcher, self).__init__(parent)
+ self.process_data = process_data
+ self.more = True
+ self.target = 0
+ self.last_target = 0
+ self.fetched = 0
+ self.buffer_size = 16 * 1024 * 1024
+ self.buffer = Array(c_char, self.buffer_size, lock=False)
+ self.head = Value(c_longlong)
+ self.tail = Value(c_longlong)
+ self.local_tail = 0
+ self.fetch_count = Value(c_longlong)
+ self.fetching_done = Value(c_bool)
+ self.last_count = 0
+ self.process_target = Value(c_longlong)
+ self.wait_event = Event()
+ self.fetched_event = Event()
+ glb.AddInstanceToShutdownOnExit(self)
+ self.process = Process(target=SQLFetcherFn, args=(glb.dbref, sql, self.buffer, self.head, self.tail, self.fetch_count, self.fetching_done, self.process_target, self.wait_event, self.fetched_event, prep))
+ self.process.start()
+ self.thread = Thread(self.Thread)
+ self.thread.done.connect(self.ProcessData, Qt.QueuedConnection)
+ self.thread.start()
+
+ def Shutdown(self):
+ # Tell the thread and process to exit
+ self.process_target.value = -1
+ self.wait_event.set()
+ self.more = False
+ self.fetching_done.value = True
+ self.fetched_event.set()
+
+ def Thread(self):
+ if not self.more:
+ return True, 0
+ while True:
+ self.fetched_event.clear()
+ fetch_count = self.fetch_count.value
+ if fetch_count != self.last_count:
+ break
+ if self.fetching_done.value:
+ self.more = False
+ return True, 0
+ self.fetched_event.wait()
+ count = fetch_count - self.last_count
+ self.last_count = fetch_count
+ self.fetched += count
+ return False, count
+
+ def Fetch(self, nr):
+ if not self.more:
+ # -1 inidcates there are no more
+ return -1
+ result = self.fetched
+ extra = result + nr - self.target
+ if extra > 0:
+ self.target += extra
+ # process_target < 0 indicates shutting down
+ if self.process_target.value >= 0:
+ self.process_target.value = self.target
+ self.wait_event.set()
+ return result
+
+ def RemoveFromBuffer(self):
+ pos = self.local_tail
+ if len(self.buffer) - pos < glb_nsz:
+ pos = 0
+ n = cPickle.loads(self.buffer[pos : pos + glb_nsz])
+ if n == 0:
+ pos = 0
+ n = cPickle.loads(self.buffer[0 : glb_nsz])
+ pos += glb_nsz
+ obj = cPickle.loads(self.buffer[pos : pos + n])
+ self.local_tail = pos + n
+ return obj
+
+ def ProcessData(self, count):
+ for i in xrange(count):
+ obj = self.RemoveFromBuffer()
+ self.process_data(obj)
+ self.tail.value = self.local_tail
+ self.wait_event.set()
+ self.done.emit(count)
+
+# Fetch more records bar
+
+class FetchMoreRecordsBar():
+
+ def __init__(self, model, parent):
+ self.model = model
+
+ self.label = QLabel("Number of records (x " + "{:,}".format(glb_chunk_sz) + ") to fetch:")
+ self.label.setSizePolicy(QSizePolicy.Fixed, QSizePolicy.Fixed)
+
+ self.fetch_count = QSpinBox()
+ self.fetch_count.setRange(1, 1000000)
+ self.fetch_count.setValue(10)
+ self.fetch_count.setSizePolicy(QSizePolicy.Fixed, QSizePolicy.Fixed)
+
+ self.fetch = QPushButton("Go!")
+ self.fetch.setSizePolicy(QSizePolicy.Fixed, QSizePolicy.Fixed)
+ self.fetch.released.connect(self.FetchMoreRecords)
+
+ self.progress = QProgressBar()
+ self.progress.setRange(0, 100)
+ self.progress.hide()
+
+ self.done_label = QLabel("All records fetched")
+ self.done_label.hide()
+
+ self.spacer = QLabel("")
+
+ self.close_button = QToolButton()
+ self.close_button.setIcon(parent.style().standardIcon(QStyle.SP_DockWidgetCloseButton))
+ self.close_button.released.connect(self.Deactivate)
+
+ self.hbox = QHBoxLayout()
+ self.hbox.setContentsMargins(0, 0, 0, 0)
+
+ self.hbox.addWidget(self.label)
+ self.hbox.addWidget(self.fetch_count)
+ self.hbox.addWidget(self.fetch)
+ self.hbox.addWidget(self.spacer)
+ self.hbox.addWidget(self.progress)
+ self.hbox.addWidget(self.done_label)
+ self.hbox.addWidget(self.close_button)
+
+ self.bar = QWidget()
+ self.bar.setLayout(self.hbox);
+ self.bar.show()
+
+ self.in_progress = False
+ self.model.progress.connect(self.Progress)
+
+ self.done = False
+
+ if not model.HasMoreRecords():
+ self.Done()
+
+ def Widget(self):
+ return self.bar
+
+ def Activate(self):
+ self.bar.show()
+ self.fetch.setFocus()
+
+ def Deactivate(self):
+ self.bar.hide()
+
+ def Enable(self, enable):
+ self.fetch.setEnabled(enable)
+ self.fetch_count.setEnabled(enable)
+
+ def Busy(self):
+ self.Enable(False)
+ self.fetch.hide()
+ self.spacer.hide()
+ self.progress.show()
+
+ def Idle(self):
+ self.in_progress = False
+ self.Enable(True)
+ self.progress.hide()
+ self.fetch.show()
+ self.spacer.show()
+
+ def Target(self):
+ return self.fetch_count.value() * glb_chunk_sz
+
+ def Done(self):
+ self.done = True
+ self.Idle()
+ self.label.hide()
+ self.fetch_count.hide()
+ self.fetch.hide()
+ self.spacer.hide()
+ self.done_label.show()
+
+ def Progress(self, count):
+ if self.in_progress:
+ if count:
+ percent = ((count - self.start) * 100) / self.Target()
+ if percent >= 100:
+ self.Idle()
+ else:
+ self.progress.setValue(percent)
+ if not count:
+ # Count value of zero means no more records
+ self.Done()
+
+ def FetchMoreRecords(self):
+ if self.done:
+ return
+ self.progress.setValue(0)
+ self.Busy()
+ self.in_progress = True
+ self.start = self.model.FetchMoreRecords(self.Target())
+
+# Brance data model level two item
+
+class BranchLevelTwoItem():
+
+ def __init__(self, row, text, parent_item):
+ self.row = row
+ self.parent_item = parent_item
+ self.data = [""] * 8
+ self.data[7] = text
+ self.level = 2
+
+ def getParentItem(self):
+ return self.parent_item
+
+ def getRow(self):
+ return self.row
+
+ def childCount(self):
+ return 0
+
+ def hasChildren(self):
+ return False
+
+ def getData(self, column):
+ return self.data[column]
+
+# Brance data model level one item
+
+class BranchLevelOneItem():
+
+ def __init__(self, glb, row, data, parent_item):
+ self.glb = glb
+ self.row = row
+ self.parent_item = parent_item
+ self.child_count = 0
+ self.child_items = []
+ self.data = data[1:]
+ self.dbid = data[0]
+ self.level = 1
+ self.query_done = False
+
+ def getChildItem(self, row):
+ return self.child_items[row]
+
+ def getParentItem(self):
+ return self.parent_item
+
+ def getRow(self):
+ return self.row
+
+ def Select(self):
+ self.query_done = True
+
+ if not self.glb.have_disassembler:
+ return
+
+ query = QSqlQuery(self.glb.db)
+
+ QueryExec(query, "SELECT cpu, to_dso_id, to_symbol_id, to_sym_offset, short_name, long_name, build_id, sym_start, to_ip"
+ " FROM samples"
+ " INNER JOIN dsos ON samples.to_dso_id = dsos.id"
+ " INNER JOIN symbols ON samples.to_symbol_id = symbols.id"
+ " WHERE samples.id = " + str(self.dbid))
+ if not query.next():
+ return
+ cpu = query.value(0)
+ dso = query.value(1)
+ sym = query.value(2)
+ if dso == 0 or sym == 0:
+ return
+ off = query.value(3)
+ short_name = query.value(4)
+ long_name = query.value(5)
+ build_id = query.value(6)
+ sym_start = query.value(7)
+ ip = query.value(8)
+
+ QueryExec(query, "SELECT samples.dso_id, symbol_id, sym_offset, sym_start"
+ " FROM samples"
+ " INNER JOIN symbols ON samples.symbol_id = symbols.id"
+ " WHERE samples.id > " + str(self.dbid) + " AND cpu = " + str(cpu) +
+ " ORDER BY samples.id"
+ " LIMIT 1")
+ if not query.next():
+ return
+ if query.value(0) != dso:
+ # Cannot disassemble from one dso to another
+ return
+ bsym = query.value(1)
+ boff = query.value(2)
+ bsym_start = query.value(3)
+ if bsym == 0:
+ return
+ tot = bsym_start + boff + 1 - sym_start - off
+ if tot <= 0 or tot > 16384:
+ return
+
+ inst = self.glb.disassembler.Instruction()
+ f = self.glb.FileFromNamesAndBuildId(short_name, long_name, build_id)
+ if not f:
+ return
+ mode = 0 if Is64Bit(f) else 1
+ self.glb.disassembler.SetMode(inst, mode)
+
+ buf_sz = tot + 16
+ buf = create_string_buffer(tot + 16)
+ f.seek(sym_start + off)
+ buf.value = f.read(buf_sz)
+ buf_ptr = addressof(buf)
+ i = 0
+ while tot > 0:
+ cnt, text = self.glb.disassembler.DisassembleOne(inst, buf_ptr, buf_sz, ip)
+ if cnt:
+ byte_str = tohex(ip).rjust(16)
+ for k in xrange(cnt):
+ byte_str += " %02x" % ord(buf[i])
+ i += 1
+ while k < 15:
+ byte_str += " "
+ k += 1
+ self.child_items.append(BranchLevelTwoItem(0, byte_str + " " + text, self))
+ self.child_count += 1
+ else:
+ return
+ buf_ptr += cnt
+ tot -= cnt
+ buf_sz -= cnt
+ ip += cnt
+
+ def childCount(self):
+ if not self.query_done:
+ self.Select()
+ if not self.child_count:
+ return -1
+ return self.child_count
+
+ def hasChildren(self):
+ if not self.query_done:
+ return True
+ return self.child_count > 0
+
+ def getData(self, column):
+ return self.data[column]
+
+# Brance data model root item
+
+class BranchRootItem():
+
+ def __init__(self):
+ self.child_count = 0
+ self.child_items = []
+ self.level = 0
+
+ def getChildItem(self, row):
+ return self.child_items[row]
+
+ def getParentItem(self):
+ return None
+
+ def getRow(self):
+ return 0
+
+ def childCount(self):
+ return self.child_count
+
+ def hasChildren(self):
+ return self.child_count > 0
+
+ def getData(self, column):
+ return ""
+
+# Branch data preparation
+
+def BranchDataPrep(query):
+ data = []
+ for i in xrange(0, 8):
+ data.append(query.value(i))
+ data.append(tohex(query.value(8)).rjust(16) + " " + query.value(9) + offstr(query.value(10)) +
+ " (" + dsoname(query.value(11)) + ")" + " -> " +
+ tohex(query.value(12)) + " " + query.value(13) + offstr(query.value(14)) +
+ " (" + dsoname(query.value(15)) + ")")
+ return data
+
+# Branch data model
+
+class BranchModel(TreeModel):
+
+ progress = Signal(object)
+
+ def __init__(self, glb, event_id, where_clause, parent=None):
+ super(BranchModel, self).__init__(BranchRootItem(), parent)
+ self.glb = glb
+ self.event_id = event_id
+ self.more = True
+ self.populated = 0
+ sql = ("SELECT samples.id, time, cpu, comm, pid, tid, branch_types.name,"
+ " CASE WHEN in_tx = '0' THEN 'No' ELSE 'Yes' END,"
+ " ip, symbols.name, sym_offset, dsos.short_name,"
+ " to_ip, to_symbols.name, to_sym_offset, to_dsos.short_name"
+ " FROM samples"
+ " INNER JOIN comms ON comm_id = comms.id"
+ " INNER JOIN threads ON thread_id = threads.id"
+ " INNER JOIN branch_types ON branch_type = branch_types.id"
+ " INNER JOIN symbols ON symbol_id = symbols.id"
+ " INNER JOIN symbols to_symbols ON to_symbol_id = to_symbols.id"
+ " INNER JOIN dsos ON samples.dso_id = dsos.id"
+ " INNER JOIN dsos AS to_dsos ON samples.to_dso_id = to_dsos.id"
+ " WHERE samples.id > $$last_id$$" + where_clause +
+ " AND evsel_id = " + str(self.event_id) +
+ " ORDER BY samples.id"
+ " LIMIT " + str(glb_chunk_sz))
+ self.fetcher = SQLFetcher(glb, sql, BranchDataPrep, self.AddSample)
+ self.fetcher.done.connect(self.Update)
+ self.fetcher.Fetch(glb_chunk_sz)
+
+ def columnCount(self, parent=None):
+ return 8
+
+ def columnHeader(self, column):
+ return ("Time", "CPU", "Command", "PID", "TID", "Branch Type", "In Tx", "Branch")[column]
+
+ def columnFont(self, column):
+ if column != 7:
+ return None
+ return QFont("Monospace")
+
+ def DisplayData(self, item, index):
+ if item.level == 1:
+ self.FetchIfNeeded(item.row)
+ return item.getData(index.column())
+
+ def AddSample(self, data):
+ child = BranchLevelOneItem(self.glb, self.populated, data, self.root)
+ self.root.child_items.append(child)
+ self.populated += 1
+
+ def Update(self, fetched):
+ if not fetched:
+ self.more = False
+ self.progress.emit(0)
+ child_count = self.root.child_count
+ count = self.populated - child_count
+ if count > 0:
+ parent = QModelIndex()
+ self.beginInsertRows(parent, child_count, child_count + count - 1)
+ self.insertRows(child_count, count, parent)
+ self.root.child_count += count
+ self.endInsertRows()
+ self.progress.emit(self.root.child_count)
+
+ def FetchMoreRecords(self, count):
+ current = self.root.child_count
+ if self.more:
+ self.fetcher.Fetch(count)
+ else:
+ self.progress.emit(0)
+ return current
+
+ def HasMoreRecords(self):
+ return self.more
+
+# Branch window
+
+class BranchWindow(QMdiSubWindow):
+
+ def __init__(self, glb, event_id, name, where_clause, parent=None):
+ super(BranchWindow, self).__init__(parent)
+
+ model_name = "Branch Events " + str(event_id)
+ if len(where_clause):
+ model_name = where_clause + " " + model_name
+
+ self.model = LookupCreateModel(model_name, lambda: BranchModel(glb, event_id, where_clause))
+
+ self.view = QTreeView()
+ self.view.setUniformRowHeights(True)
+ self.view.setModel(self.model)
+
+ self.ResizeColumnsToContents()
+
+ self.find_bar = FindBar(self, self, True)
+
+ self.finder = ChildDataItemFinder(self.model.root)
+
+ self.fetch_bar = FetchMoreRecordsBar(self.model, self)
+
+ self.vbox = VBox(self.view, self.find_bar.Widget(), self.fetch_bar.Widget())
+
+ self.setWidget(self.vbox.Widget())
+
+ AddSubWindow(glb.mainwindow.mdi_area, self, name + " Branch Events")
+
+ def ResizeColumnToContents(self, column, n):
+ # Using the view's resizeColumnToContents() here is extrememly slow
+ # so implement a crude alternative
+ mm = "MM" if column else "MMMM"
+ font = self.view.font()
+ metrics = QFontMetrics(font)
+ max = 0
+ for row in xrange(n):
+ val = self.model.root.child_items[row].data[column]
+ len = metrics.width(str(val) + mm)
+ max = len if len > max else max
+ val = self.model.columnHeader(column)
+ len = metrics.width(str(val) + mm)
+ max = len if len > max else max
+ self.view.setColumnWidth(column, max)
+
+ def ResizeColumnsToContents(self):
+ n = min(self.model.root.child_count, 100)
+ if n < 1:
+ # No data yet, so connect a signal to notify when there is
+ self.model.rowsInserted.connect(self.UpdateColumnWidths)
+ return
+ columns = self.model.columnCount()
+ for i in xrange(columns):
+ self.ResizeColumnToContents(i, n)
+
+ def UpdateColumnWidths(self, *x):
+ # This only needs to be done once, so disconnect the signal now
+ self.model.rowsInserted.disconnect(self.UpdateColumnWidths)
+ self.ResizeColumnsToContents()
+
+ def Find(self, value, direction, pattern, context):
+ self.view.setFocus()
+ self.find_bar.Busy()
+ self.finder.Find(value, direction, pattern, context, self.FindDone)
+
+ def FindDone(self, row):
+ self.find_bar.Idle()
+ if row >= 0:
+ self.view.setCurrentIndex(self.model.index(row, 0, QModelIndex()))
+ else:
+ self.find_bar.NotFound()
+
+# Dialog data item converted and validated using a SQL table
+
+class SQLTableDialogDataItem():
+
+ def __init__(self, glb, label, placeholder_text, table_name, match_column, column_name1, column_name2, parent):
+ self.glb = glb
+ self.label = label
+ self.placeholder_text = placeholder_text
+ self.table_name = table_name
+ self.match_column = match_column
+ self.column_name1 = column_name1
+ self.column_name2 = column_name2
+ self.parent = parent
+
+ self.value = ""
+
+ self.widget = QLineEdit()
+ self.widget.editingFinished.connect(self.Validate)
+ self.widget.textChanged.connect(self.Invalidate)
+ self.red = False
+ self.error = ""
+ self.validated = True
+
+ self.last_id = 0
+ self.first_time = 0
+ self.last_time = 2 ** 64
+ if self.table_name == "<timeranges>":
+ query = QSqlQuery(self.glb.db)
+ QueryExec(query, "SELECT id, time FROM samples ORDER BY id DESC LIMIT 1")
+ if query.next():
+ self.last_id = int(query.value(0))
+ self.last_time = int(query.value(1))
+ QueryExec(query, "SELECT time FROM samples WHERE time != 0 ORDER BY id LIMIT 1")
+ if query.next():
+ self.first_time = int(query.value(0))
+ if placeholder_text:
+ placeholder_text += ", between " + str(self.first_time) + " and " + str(self.last_time)
+
+ if placeholder_text:
+ self.widget.setPlaceholderText(placeholder_text)
+
+ def ValueToIds(self, value):
+ ids = []
+ query = QSqlQuery(self.glb.db)
+ stmt = "SELECT id FROM " + self.table_name + " WHERE " + self.match_column + " = '" + value + "'"
+ ret = query.exec_(stmt)
+ if ret:
+ while query.next():
+ ids.append(str(query.value(0)))
+ return ids
+
+ def IdBetween(self, query, lower_id, higher_id, order):
+ QueryExec(query, "SELECT id FROM samples WHERE id > " + str(lower_id) + " AND id < " + str(higher_id) + " ORDER BY id " + order + " LIMIT 1")
+ if query.next():
+ return True, int(query.value(0))
+ else:
+ return False, 0
+
+ def BinarySearchTime(self, lower_id, higher_id, target_time, get_floor):
+ query = QSqlQuery(self.glb.db)
+ while True:
+ next_id = int((lower_id + higher_id) / 2)
+ QueryExec(query, "SELECT time FROM samples WHERE id = " + str(next_id))
+ if not query.next():
+ ok, dbid = self.IdBetween(query, lower_id, next_id, "DESC")
+ if not ok:
+ ok, dbid = self.IdBetween(query, next_id, higher_id, "")
+ if not ok:
+ return str(higher_id)
+ next_id = dbid
+ QueryExec(query, "SELECT time FROM samples WHERE id = " + str(next_id))
+ next_time = int(query.value(0))
+ if get_floor:
+ if target_time > next_time:
+ lower_id = next_id
+ else:
+ higher_id = next_id
+ if higher_id <= lower_id + 1:
+ return str(higher_id)
+ else:
+ if target_time >= next_time:
+ lower_id = next_id
+ else:
+ higher_id = next_id
+ if higher_id <= lower_id + 1:
+ return str(lower_id)
+
+ def ConvertRelativeTime(self, val):
+ print "val ", val
+ mult = 1
+ suffix = val[-2:]
+ if suffix == "ms":
+ mult = 1000000
+ elif suffix == "us":
+ mult = 1000
+ elif suffix == "ns":
+ mult = 1
+ else:
+ return val
+ val = val[:-2].strip()
+ if not self.IsNumber(val):
+ return val
+ val = int(val) * mult
+ if val >= 0:
+ val += self.first_time
+ else:
+ val += self.last_time
+ return str(val)
+
+ def ConvertTimeRange(self, vrange):
+ print "vrange ", vrange
+ if vrange[0] == "":
+ vrange[0] = str(self.first_time)
+ if vrange[1] == "":
+ vrange[1] = str(self.last_time)
+ vrange[0] = self.ConvertRelativeTime(vrange[0])
+ vrange[1] = self.ConvertRelativeTime(vrange[1])
+ print "vrange2 ", vrange
+ if not self.IsNumber(vrange[0]) or not self.IsNumber(vrange[1]):
+ return False
+ print "ok1"
+ beg_range = max(int(vrange[0]), self.first_time)
+ end_range = min(int(vrange[1]), self.last_time)
+ if beg_range > self.last_time or end_range < self.first_time:
+ return False
+ print "ok2"
+ vrange[0] = self.BinarySearchTime(0, self.last_id, beg_range, True)
+ vrange[1] = self.BinarySearchTime(1, self.last_id + 1, end_range, False)
+ print "vrange3 ", vrange
+ return True
+
+ def AddTimeRange(self, value, ranges):
+ print "value ", value
+ n = value.count("-")
+ if n == 1:
+ pass
+ elif n == 2:
+ if value.split("-")[1].strip() == "":
+ n = 1
+ elif n == 3:
+ n = 2
+ else:
+ return False
+ pos = findnth(value, "-", n)
+ vrange = [value[:pos].strip() ,value[pos+1:].strip()]
+ if self.ConvertTimeRange(vrange):
+ ranges.append(vrange)
+ return True
+ return False
+
+ def InvalidValue(self, value):
+ self.value = ""
+ palette = QPalette()
+ palette.setColor(QPalette.Text,Qt.red)
+ self.widget.setPalette(palette)
+ self.red = True
+ self.error = self.label + " invalid value '" + value + "'"
+ self.parent.ShowMessage(self.error)
+
+ def IsNumber(self, value):
+ try:
+ x = int(value)
+ except:
+ x = 0
+ return str(x) == value
+
+ def Invalidate(self):
+ self.validated = False
+
+ def Validate(self):
+ input_string = self.widget.text()
+ self.validated = True
+ if self.red:
+ palette = QPalette()
+ self.widget.setPalette(palette)
+ self.red = False
+ if not len(input_string.strip()):
+ self.error = ""
+ self.value = ""
+ return
+ if self.table_name == "<timeranges>":
+ ranges = []
+ for value in [x.strip() for x in input_string.split(",")]:
+ if not self.AddTimeRange(value, ranges):
+ return self.InvalidValue(value)
+ ranges = [("(" + self.column_name1 + " >= " + r[0] + " AND " + self.column_name1 + " <= " + r[1] + ")") for r in ranges]
+ self.value = " OR ".join(ranges)
+ elif self.table_name == "<ranges>":
+ singles = []
+ ranges = []
+ for value in [x.strip() for x in input_string.split(",")]:
+ if "-" in value:
+ vrange = value.split("-")
+ if len(vrange) != 2 or not self.IsNumber(vrange[0]) or not self.IsNumber(vrange[1]):
+ return self.InvalidValue(value)
+ ranges.append(vrange)
+ else:
+ if not self.IsNumber(value):
+ return self.InvalidValue(value)
+ singles.append(value)
+ ranges = [("(" + self.column_name1 + " >= " + r[0] + " AND " + self.column_name1 + " <= " + r[1] + ")") for r in ranges]
+ if len(singles):
+ ranges.append(self.column_name1 + " IN (" + ",".join(singles) + ")")
+ self.value = " OR ".join(ranges)
+ elif self.table_name:
+ all_ids = []
+ for value in [x.strip() for x in input_string.split(",")]:
+ ids = self.ValueToIds(value)
+ if len(ids):
+ all_ids.extend(ids)
+ else:
+ return self.InvalidValue(value)
+ self.value = self.column_name1 + " IN (" + ",".join(all_ids) + ")"
+ if self.column_name2:
+ self.value = "( " + self.value + " OR " + self.column_name2 + " IN (" + ",".join(all_ids) + ") )"
+ else:
+ self.value = input_string.strip()
+ self.error = ""
+ self.parent.ClearMessage()
+
+ def IsValid(self):
+ if not self.validated:
+ self.Validate()
+ if len(self.error):
+ self.parent.ShowMessage(self.error)
+ return False
+ return True
+
+# Selected branch report creation dialog
+
+class SelectedBranchDialog(QDialog):
+
+ def __init__(self, glb, parent=None):
+ super(SelectedBranchDialog, self).__init__(parent)
+
+ self.glb = glb
+
+ self.name = ""
+ self.where_clause = ""
+
+ self.setWindowTitle("Selected Branches")
+ self.setMinimumWidth(600)
+
+ items = (
+ ("Report name:", "Enter a name to appear in the window title bar", "", "", "", ""),
+ ("Time ranges:", "Enter time ranges", "<timeranges>", "", "samples.id", ""),
+ ("CPUs:", "Enter CPUs or ranges e.g. 0,5-6", "<ranges>", "", "cpu", ""),
+ ("Commands:", "Only branches with these commands will be included", "comms", "comm", "comm_id", ""),
+ ("PIDs:", "Only branches with these process IDs will be included", "threads", "pid", "thread_id", ""),
+ ("TIDs:", "Only branches with these thread IDs will be included", "threads", "tid", "thread_id", ""),
+ ("DSOs:", "Only branches with these DSOs will be included", "dsos", "short_name", "samples.dso_id", "to_dso_id"),
+ ("Symbols:", "Only branches with these symbols will be included", "symbols", "name", "symbol_id", "to_symbol_id"),
+ ("Raw SQL clause: ", "Enter a raw SQL WHERE clause", "", "", "", ""),
+ )
+ self.data_items = [SQLTableDialogDataItem(glb, *x, parent=self) for x in items]
+
+ self.grid = QGridLayout()
+
+ for row in xrange(len(self.data_items)):
+ self.grid.addWidget(QLabel(self.data_items[row].label), row, 0)
+ self.grid.addWidget(self.data_items[row].widget, row, 1)
+
+ self.status = QLabel()
+
+ self.ok_button = QPushButton("Ok", self)
+ self.ok_button.setDefault(True)
+ self.ok_button.released.connect(self.Ok)
+ self.ok_button.setSizePolicy(QSizePolicy.Fixed, QSizePolicy.Fixed)
+
+ self.cancel_button = QPushButton("Cancel", self)
+ self.cancel_button.released.connect(self.reject)
+ self.cancel_button.setSizePolicy(QSizePolicy.Fixed, QSizePolicy.Fixed)
+
+ self.hbox = QHBoxLayout()
+ #self.hbox.addStretch()
+ self.hbox.addWidget(self.status)
+ self.hbox.addWidget(self.ok_button)
+ self.hbox.addWidget(self.cancel_button)
+
+ self.vbox = QVBoxLayout()
+ self.vbox.addLayout(self.grid)
+ self.vbox.addLayout(self.hbox)
+
+ self.setLayout(self.vbox);
+
+ def Ok(self):
+ self.name = self.data_items[0].value
+ if not self.name:
+ self.ShowMessage("Report name is required")
+ return
+ for d in self.data_items:
+ if not d.IsValid():
+ return
+ for d in self.data_items[1:]:
+ if len(d.value):
+ if len(self.where_clause):
+ self.where_clause += " AND "
+ self.where_clause += d.value
+ if len(self.where_clause):
+ self.where_clause = " AND ( " + self.where_clause + " ) "
+ else:
+ self.ShowMessage("No selection")
+ return
+ self.accept()
+
+ def ShowMessage(self, msg):
+ self.status.setText("<font color=#FF0000>" + msg)
+
+ def ClearMessage(self):
+ self.status.setText("")
+
+# Event list
+
+def GetEventList(db):
+ events = []
+ query = QSqlQuery(db)
+ QueryExec(query, "SELECT name FROM selected_events WHERE id > 0 ORDER BY id")
+ while query.next():
+ events.append(query.value(0))
+ return events
+
+# SQL data preparation
+
+def SQLTableDataPrep(query, count):
+ data = []
+ for i in xrange(count):
+ data.append(query.value(i))
+ return data
+
+# SQL table data model item
+
+class SQLTableItem():
+
+ def __init__(self, row, data):
+ self.row = row
+ self.data = data
+
+ def getData(self, column):
+ return self.data[column]
+
+# SQL table data model
+
+class SQLTableModel(TableModel):
+
+ progress = Signal(object)
+
+ def __init__(self, glb, sql, column_count, parent=None):
+ super(SQLTableModel, self).__init__(parent)
+ self.glb = glb
+ self.more = True
+ self.populated = 0
+ self.fetcher = SQLFetcher(glb, sql, lambda x, y=column_count: SQLTableDataPrep(x, y), self.AddSample)
+ self.fetcher.done.connect(self.Update)
+ self.fetcher.Fetch(glb_chunk_sz)
+
+ def DisplayData(self, item, index):
+ self.FetchIfNeeded(item.row)
+ return item.getData(index.column())
+
+ def AddSample(self, data):
+ child = SQLTableItem(self.populated, data)
+ self.child_items.append(child)
+ self.populated += 1
+
+ def Update(self, fetched):
+ if not fetched:
+ self.more = False
+ self.progress.emit(0)
+ child_count = self.child_count
+ count = self.populated - child_count
+ if count > 0:
+ parent = QModelIndex()
+ self.beginInsertRows(parent, child_count, child_count + count - 1)
+ self.insertRows(child_count, count, parent)
+ self.child_count += count
+ self.endInsertRows()
+ self.progress.emit(self.child_count)
+
+ def FetchMoreRecords(self, count):
+ current = self.child_count
+ if self.more:
+ self.fetcher.Fetch(count)
+ else:
+ self.progress.emit(0)
+ return current
+
+ def HasMoreRecords(self):
+ return self.more
+
+# SQL automatic table data model
+
+class SQLAutoTableModel(SQLTableModel):
+
+ def __init__(self, glb, table_name, parent=None):
+ sql = "SELECT * FROM " + table_name + " WHERE id > $$last_id$$ ORDER BY id LIMIT " + str(glb_chunk_sz)
+ if table_name == "comm_threads_view":
+ # For now, comm_threads_view has no id column
+ sql = "SELECT * FROM " + table_name + " WHERE comm_id > $$last_id$$ ORDER BY comm_id LIMIT " + str(glb_chunk_sz)
+ self.column_headers = []
+ query = QSqlQuery(glb.db)
+ if glb.dbref.is_sqlite3:
+ QueryExec(query, "PRAGMA table_info(" + table_name + ")")
+ while query.next():
+ self.column_headers.append(query.value(1))
+ if table_name == "sqlite_master":
+ sql = "SELECT * FROM " + table_name
+ else:
+ if table_name[:19] == "information_schema.":
+ sql = "SELECT * FROM " + table_name
+ select_table_name = table_name[19:]
+ schema = "information_schema"
+ else:
+ select_table_name = table_name
+ schema = "public"
+ QueryExec(query, "SELECT column_name FROM information_schema.columns WHERE table_schema = '" + schema + "' and table_name = '" + select_table_name + "'")
+ while query.next():
+ self.column_headers.append(query.value(0))
+ super(SQLAutoTableModel, self).__init__(glb, sql, len(self.column_headers), parent)
+
+ def columnCount(self, parent=None):
+ return len(self.column_headers)
+
+ def columnHeader(self, column):
+ return self.column_headers[column]
+
+# Base class for custom ResizeColumnsToContents
+
+class ResizeColumnsToContentsBase(QObject):
+
+ def __init__(self, parent=None):
+ super(ResizeColumnsToContentsBase, self).__init__(parent)
+
+ def ResizeColumnToContents(self, column, n):
+ # Using the view's resizeColumnToContents() here is extrememly slow
+ # so implement a crude alternative
+ font = self.view.font()
+ metrics = QFontMetrics(font)
+ max = 0
+ for row in xrange(n):
+ val = self.data_model.child_items[row].data[column]
+ len = metrics.width(str(val) + "MM")
+ max = len if len > max else max
+ val = self.data_model.columnHeader(column)
+ len = metrics.width(str(val) + "MM")
+ max = len if len > max else max
+ self.view.setColumnWidth(column, max)
+
+ def ResizeColumnsToContents(self):
+ n = min(self.data_model.child_count, 100)
+ if n < 1:
+ # No data yet, so connect a signal to notify when there is
+ self.data_model.rowsInserted.connect(self.UpdateColumnWidths)
+ return
+ columns = self.data_model.columnCount()
+ for i in xrange(columns):
+ self.ResizeColumnToContents(i, n)
+
+ def UpdateColumnWidths(self, *x):
+ # This only needs to be done once, so disconnect the signal now
+ self.data_model.rowsInserted.disconnect(self.UpdateColumnWidths)
+ self.ResizeColumnsToContents()
+
+# Table window
+
+class TableWindow(QMdiSubWindow, ResizeColumnsToContentsBase):
+
+ def __init__(self, glb, table_name, parent=None):
+ super(TableWindow, self).__init__(parent)
+
+ self.data_model = LookupCreateModel(table_name + " Table", lambda: SQLAutoTableModel(glb, table_name))
+
+ self.model = QSortFilterProxyModel()
+ self.model.setSourceModel(self.data_model)
+
+ self.view = QTableView()
+ self.view.setModel(self.model)
+ self.view.setEditTriggers(QAbstractItemView.NoEditTriggers)
+ self.view.verticalHeader().setVisible(False)
+ self.view.sortByColumn(-1, Qt.AscendingOrder)
+ self.view.setSortingEnabled(True)
+
+ self.ResizeColumnsToContents()
+
+ self.find_bar = FindBar(self, self, True)
+
+ self.finder = ChildDataItemFinder(self.data_model)
+
+ self.fetch_bar = FetchMoreRecordsBar(self.data_model, self)
+
+ self.vbox = VBox(self.view, self.find_bar.Widget(), self.fetch_bar.Widget())
+
+ self.setWidget(self.vbox.Widget())
+
+ AddSubWindow(glb.mainwindow.mdi_area, self, table_name + " Table")
+
+ def Find(self, value, direction, pattern, context):
+ self.view.setFocus()
+ self.find_bar.Busy()
+ self.finder.Find(value, direction, pattern, context, self.FindDone)
+
+ def FindDone(self, row):
+ self.find_bar.Idle()
+ if row >= 0:
+ self.view.setCurrentIndex(self.model.mapFromSource(self.data_model.index(row, 0, QModelIndex())))
+ else:
+ self.find_bar.NotFound()
+
+# Table list
+
+def GetTableList(glb):
+ tables = []
+ query = QSqlQuery(glb.db)
+ if glb.dbref.is_sqlite3:
+ QueryExec(query, "SELECT name FROM sqlite_master WHERE type IN ( 'table' , 'view' ) ORDER BY name")
+ else:
+ QueryExec(query, "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public' AND table_type IN ( 'BASE TABLE' , 'VIEW' ) ORDER BY table_name")
+ while query.next():
+ tables.append(query.value(0))
+ if glb.dbref.is_sqlite3:
+ tables.append("sqlite_master")
+ else:
+ tables.append("information_schema.tables")
+ tables.append("information_schema.views")
+ tables.append("information_schema.columns")
+ return tables
+
+# Action Definition
+
+def CreateAction(label, tip, callback, parent=None, shortcut=None):
+ action = QAction(label, parent)
+ if shortcut != None:
+ action.setShortcuts(shortcut)
+ action.setStatusTip(tip)
+ action.triggered.connect(callback)
+ return action
+
+# Typical application actions
+
+def CreateExitAction(app, parent=None):
+ return CreateAction("&Quit", "Exit the application", app.closeAllWindows, parent, QKeySequence.Quit)
+
+# Typical MDI actions
+
+def CreateCloseActiveWindowAction(mdi_area):
+ return CreateAction("Cl&ose", "Close the active window", mdi_area.closeActiveSubWindow, mdi_area)
+
+def CreateCloseAllWindowsAction(mdi_area):
+ return CreateAction("Close &All", "Close all the windows", mdi_area.closeAllSubWindows, mdi_area)
+
+def CreateTileWindowsAction(mdi_area):
+ return CreateAction("&Tile", "Tile the windows", mdi_area.tileSubWindows, mdi_area)
+
+def CreateCascadeWindowsAction(mdi_area):
+ return CreateAction("&Cascade", "Cascade the windows", mdi_area.cascadeSubWindows, mdi_area)
+
+def CreateNextWindowAction(mdi_area):
+ return CreateAction("Ne&xt", "Move the focus to the next window", mdi_area.activateNextSubWindow, mdi_area, QKeySequence.NextChild)
+
+def CreatePreviousWindowAction(mdi_area):
+ return CreateAction("Pre&vious", "Move the focus to the previous window", mdi_area.activatePreviousSubWindow, mdi_area, QKeySequence.PreviousChild)
+
+# Typical MDI window menu
+
+class WindowMenu():
+
+ def __init__(self, mdi_area, menu):
+ self.mdi_area = mdi_area
+ self.window_menu = menu.addMenu("&Windows")
+ self.close_active_window = CreateCloseActiveWindowAction(mdi_area)
+ self.close_all_windows = CreateCloseAllWindowsAction(mdi_area)
+ self.tile_windows = CreateTileWindowsAction(mdi_area)
+ self.cascade_windows = CreateCascadeWindowsAction(mdi_area)
+ self.next_window = CreateNextWindowAction(mdi_area)
+ self.previous_window = CreatePreviousWindowAction(mdi_area)
+ self.window_menu.aboutToShow.connect(self.Update)
+
+ def Update(self):
+ self.window_menu.clear()
+ sub_window_count = len(self.mdi_area.subWindowList())
+ have_sub_windows = sub_window_count != 0
+ self.close_active_window.setEnabled(have_sub_windows)
+ self.close_all_windows.setEnabled(have_sub_windows)
+ self.tile_windows.setEnabled(have_sub_windows)
+ self.cascade_windows.setEnabled(have_sub_windows)
+ self.next_window.setEnabled(have_sub_windows)
+ self.previous_window.setEnabled(have_sub_windows)
+ self.window_menu.addAction(self.close_active_window)
+ self.window_menu.addAction(self.close_all_windows)
+ self.window_menu.addSeparator()
+ self.window_menu.addAction(self.tile_windows)
+ self.window_menu.addAction(self.cascade_windows)
+ self.window_menu.addSeparator()
+ self.window_menu.addAction(self.next_window)
+ self.window_menu.addAction(self.previous_window)
+ if sub_window_count == 0:
+ return
+ self.window_menu.addSeparator()
+ nr = 1
+ for sub_window in self.mdi_area.subWindowList():
+ label = str(nr) + " " + sub_window.name
+ if nr < 10:
+ label = "&" + label
+ action = self.window_menu.addAction(label)
+ action.setCheckable(True)
+ action.setChecked(sub_window == self.mdi_area.activeSubWindow())
+ action.triggered.connect(lambda x=nr: self.setActiveSubWindow(x))
+ self.window_menu.addAction(action)
+ nr += 1
+
+ def setActiveSubWindow(self, nr):
+ self.mdi_area.setActiveSubWindow(self.mdi_area.subWindowList()[nr - 1])
+
+# Help text
+
+glb_help_text = """
+<h1>Contents</h1>
+<style>
+p.c1 {
+ text-indent: 40px;
+}
+p.c2 {
+ text-indent: 80px;
+}
+}
+</style>
+<p class=c1><a href=#reports>1. Reports</a></p>
+<p class=c2><a href=#callgraph>1.1 Context-Sensitive Call Graph</a></p>
+<p class=c2><a href=#allbranches>1.2 All branches</a></p>
+<p class=c2><a href=#selectedbranches>1.3 Selected branches</a></p>
+<p class=c1><a href=#tables>2. Tables</a></p>
+<h1 id=reports>1. Reports</h1>
+<h2 id=callgraph>1.1 Context-Sensitive Call Graph</h2>
+The result is a GUI window with a tree representing a context-sensitive
+call-graph. Expanding a couple of levels of the tree and adjusting column
+widths to suit will display something like:
+<pre>
+ Call Graph: pt_example
+Call Path Object Count Time(ns) Time(%) Branch Count Branch Count(%)
+v- ls
+ v- 2638:2638
+ v- _start ld-2.19.so 1 10074071 100.0 211135 100.0
+ |- unknown unknown 1 13198 0.1 1 0.0
+ >- _dl_start ld-2.19.so 1 1400980 13.9 19637 9.3
+ >- _d_linit_internal ld-2.19.so 1 448152 4.4 11094 5.3
+ v-__libc_start_main@plt ls 1 8211741 81.5 180397 85.4
+ >- _dl_fixup ld-2.19.so 1 7607 0.1 108 0.1
+ >- __cxa_atexit libc-2.19.so 1 11737 0.1 10 0.0
+ >- __libc_csu_init ls 1 10354 0.1 10 0.0
+ |- _setjmp libc-2.19.so 1 0 0.0 4 0.0
+ v- main ls 1 8182043 99.6 180254 99.9
+</pre>
+<h3>Points to note:</h3>
+<ul>
+<li>The top level is a command name (comm)</li>
+<li>The next level is a thread (pid:tid)</li>
+<li>Subsequent levels are functions</li>
+<li>'Count' is the number of calls</li>
+<li>'Time' is the elapsed time until the function returns</li>
+<li>Percentages are relative to the level above</li>
+<li>'Branch Count' is the total number of branches for that function and all functions that it calls
+</ul>
+<h3>Find</h3>
+Ctrl-F displays a Find bar which finds function names by either an exact match or a pattern match.
+The pattern matching symbols are ? for any character and * for zero or more characters.
+<h2 id=allbranches>1.2 All branches</h2>
+The All branches report displays all branches in chronological order.
+Not all data is fetched immediately. More records can be fetched using the Fetch bar provided.
+<h3>Disassembly</h3>
+Open a branch to display disassembly. This only works if:
+<ol>
+<li>The disassembler is available. Currently, only Intel XED is supported - see <a href=#xed>Intel XED Setup</a></li>
+<li>The object code is available. Currently, only the perf build ID cache is searched for object code.
+The default directory ~/.debug can be overridden by setting environment variable PERF_BUILDID_DIR.
+One exception is kcore where the DSO long name is used (refer dsos_view on the Tables menu),
+or alternatively, set environment variable PERF_KCORE to the kcore file name.</li>
+</ol>
+<h4 id=xed>Intel XED Setup</h4>
+To use Intel XED, libxed.so must be present. To build and install libxed.so:
+<pre>
+git clone https://github.com/intelxed/mbuild.git mbuild
+git clone https://github.com/intelxed/xed
+cd xed
+./mfile.py --share
+sudo ./mfile.py --prefix=/usr/local install
+sudo ldconfig
+</pre>
+<h3>Find</h3>
+Ctrl-F displays a Find bar which finds substrings by either an exact match or a regular expression match.
+Refer to Python documentation for the regular expression syntax.
+All columns are searched, but only currently fetched rows are searched.
+<h2 id=selectedbranches>1.3 Selected branches</h2>
+This is the same as the <a href=#allbranches>All branches</a> report but with the data reduced
+by various selection criteria. A dialog box displays available criteria which are AND'ed together.
+<h3>1.3.1 Time ranges</h3>
+The time ranges hint text shows the total time range. Relative time ranges can also be entered in
+ms, us or ns. Also, negative values are relative to the end of trace. Examples:
+<pre>
+ 81073085947329-81073085958238 From 81073085947329 to 81073085958238
+ 100us-200us From 100us to 200us
+ 10ms- From 10ms to the end
+ -100ns The first 100ns
+ -10ms- The last 10ms
+</pre>
+N.B. Due to the granularity of timestamps, there could be no branches in any given time range.
+<h1 id=tables>2. Tables</h1>
+The Tables menu shows all tables and views in the database. Most tables have an associated view
+which displays the information in a more friendly way. Not all data for large tables is fetched
+immediately. More records can be fetched using the Fetch bar provided. Columns can be sorted,
+but that can be slow for large tables.
+<p>There are also tables of database meta-information.
+For SQLite3 databases, the sqlite_master table is included.
+For PostgreSQL databases, information_schema.tables/views/columns are included.
+<h3>Find</h3>
+Ctrl-F displays a Find bar which finds substrings by either an exact match or a regular expression match.
+Refer to Python documentation for the regular expression syntax.
+All columns are searched, but only currently fetched rows are searched.
+<p>N.B. Results are found in id order, so if the table is re-ordered, find-next and find-previous
+will go to the next/previous result in id order, instead of display order.
+"""
+
+# Help window
+
+class HelpWindow(QMdiSubWindow):
+
+ def __init__(self, glb, parent=None):
+ super(HelpWindow, self).__init__(parent)
+
+ self.text = QTextBrowser()
+ self.text.setHtml(glb_help_text)
+ self.text.setReadOnly(True)
+ self.text.setOpenExternalLinks(True)
+
+ self.setWidget(self.text)
+
+ AddSubWindow(glb.mainwindow.mdi_area, self, "Exported SQL Viewer Help")
+
+# Main window that only displays the help text
+
+class HelpOnlyWindow(QMainWindow):
+
+ def __init__(self, parent=None):
+ super(HelpOnlyWindow, self).__init__(parent)
+
+ self.setMinimumSize(200, 100)
+ self.resize(800, 600)
+ self.setWindowTitle("Exported SQL Viewer Help")
+ self.setWindowIcon(self.style().standardIcon(QStyle.SP_MessageBoxInformation))
+
+ self.text = QTextBrowser()
+ self.text.setHtml(glb_help_text)
+ self.text.setReadOnly(True)
+ self.text.setOpenExternalLinks(True)
+
+ self.setCentralWidget(self.text)
+
+# Font resize
+
+def ResizeFont(widget, diff):
+ font = widget.font()
+ sz = font.pointSize()
+ font.setPointSize(sz + diff)
+ widget.setFont(font)
+
+def ShrinkFont(widget):
+ ResizeFont(widget, -1)
+
+def EnlargeFont(widget):
+ ResizeFont(widget, 1)
+
+# Unique name for sub-windows
+
+def NumberedWindowName(name, nr):
+ if nr > 1:
+ name += " <" + str(nr) + ">"
+ return name
+
+def UniqueSubWindowName(mdi_area, name):
+ nr = 1
+ while True:
+ unique_name = NumberedWindowName(name, nr)
+ ok = True
+ for sub_window in mdi_area.subWindowList():
+ if sub_window.name == unique_name:
+ ok = False
+ break
+ if ok:
+ return unique_name
+ nr += 1
+
+# Add a sub-window
+
+def AddSubWindow(mdi_area, sub_window, name):
+ unique_name = UniqueSubWindowName(mdi_area, name)
+ sub_window.setMinimumSize(200, 100)
+ sub_window.resize(800, 600)
+ sub_window.setWindowTitle(unique_name)
+ sub_window.setAttribute(Qt.WA_DeleteOnClose)
+ sub_window.setWindowIcon(sub_window.style().standardIcon(QStyle.SP_FileIcon))
+ sub_window.name = unique_name
+ mdi_area.addSubWindow(sub_window)
+ sub_window.show()
+
+# Main window
+
+class MainWindow(QMainWindow):
+
+ def __init__(self, glb, parent=None):
+ super(MainWindow, self).__init__(parent)
+
+ self.glb = glb
+
+ self.setWindowTitle("Exported SQL Viewer: " + glb.dbname)
+ self.setWindowIcon(self.style().standardIcon(QStyle.SP_ComputerIcon))
+ self.setMinimumSize(200, 100)
+
+ self.mdi_area = QMdiArea()
+ self.mdi_area.setHorizontalScrollBarPolicy(Qt.ScrollBarAsNeeded)
+ self.mdi_area.setVerticalScrollBarPolicy(Qt.ScrollBarAsNeeded)
+
+ self.setCentralWidget(self.mdi_area)
+
+ menu = self.menuBar()
+
+ file_menu = menu.addMenu("&File")
+ file_menu.addAction(CreateExitAction(glb.app, self))
+
+ edit_menu = menu.addMenu("&Edit")
+ edit_menu.addAction(CreateAction("&Find...", "Find items", self.Find, self, QKeySequence.Find))
+ edit_menu.addAction(CreateAction("Fetch &more records...", "Fetch more records", self.FetchMoreRecords, self, [QKeySequence(Qt.Key_F8)]))
+ edit_menu.addAction(CreateAction("&Shrink Font", "Make text smaller", self.ShrinkFont, self, [QKeySequence("Ctrl+-")]))
+ edit_menu.addAction(CreateAction("&Enlarge Font", "Make text bigger", self.EnlargeFont, self, [QKeySequence("Ctrl++")]))
+
+ reports_menu = menu.addMenu("&Reports")
+ reports_menu.addAction(CreateAction("Context-Sensitive Call &Graph", "Create a new window containing a context-sensitive call graph", self.NewCallGraph, self))
+
+ self.EventMenu(GetEventList(glb.db), reports_menu)
+
+ self.TableMenu(GetTableList(glb), menu)
+
+ self.window_menu = WindowMenu(self.mdi_area, menu)
+
+ help_menu = menu.addMenu("&Help")
+ help_menu.addAction(CreateAction("&Exported SQL Viewer Help", "Helpful information", self.Help, self, QKeySequence.HelpContents))
+
+ def Find(self):
+ win = self.mdi_area.activeSubWindow()
+ if win:
+ try:
+ win.find_bar.Activate()
+ except:
+ pass
+
+ def FetchMoreRecords(self):
+ win = self.mdi_area.activeSubWindow()
+ if win:
+ try:
+ win.fetch_bar.Activate()
+ except:
+ pass
+
+ def ShrinkFont(self):
+ win = self.mdi_area.activeSubWindow()
+ ShrinkFont(win.view)
+
+ def EnlargeFont(self):
+ win = self.mdi_area.activeSubWindow()
+ EnlargeFont(win.view)
+
+ def EventMenu(self, events, reports_menu):
+ branches_events = 0
+ for event in events:
+ event = event.split(":")[0]
+ if event == "branches":
+ branches_events += 1
+ dbid = 0
+ for event in events:
+ dbid += 1
+ event = event.split(":")[0]
+ if event == "branches":
+ label = "All branches" if branches_events == 1 else "All branches " + "(id=" + dbid + ")"
+ reports_menu.addAction(CreateAction(label, "Create a new window displaying branch events", lambda x=dbid: self.NewBranchView(x), self))
+ label = "Selected branches" if branches_events == 1 else "Selected branches " + "(id=" + dbid + ")"
+ reports_menu.addAction(CreateAction(label, "Create a new window displaying branch events", lambda x=dbid: self.NewSelectedBranchView(x), self))
+
+ def TableMenu(self, tables, menu):
+ table_menu = menu.addMenu("&Tables")
+ for table in tables:
+ table_menu.addAction(CreateAction(table, "Create a new window containing a table view", lambda t=table: self.NewTableView(t), self))
+
+ def NewCallGraph(self):
+ CallGraphWindow(self.glb, self)
+
+ def NewBranchView(self, event_id):
+ BranchWindow(self.glb, event_id, "", "", self)
+
+ def NewSelectedBranchView(self, event_id):
+ dialog = SelectedBranchDialog(self.glb, self)
+ ret = dialog.exec_()
+ if ret:
+ BranchWindow(self.glb, event_id, dialog.name, dialog.where_clause, self)
+
+ def NewTableView(self, table_name):
+ TableWindow(self.glb, table_name, self)
+
+ def Help(self):
+ HelpWindow(self.glb, self)
+
+# XED Disassembler
+
+class xed_state_t(Structure):
+
+ _fields_ = [
+ ("mode", c_int),
+ ("width", c_int)
+ ]
+
+class XEDInstruction():
+
+ def __init__(self, libxed):
+ # Current xed_decoded_inst_t structure is 192 bytes. Use 512 to allow for future expansion
+ xedd_t = c_byte * 512
+ self.xedd = xedd_t()
+ self.xedp = addressof(self.xedd)
+ libxed.xed_decoded_inst_zero(self.xedp)
+ self.state = xed_state_t()
+ self.statep = addressof(self.state)
+ # Buffer for disassembled instruction text
+ self.buffer = create_string_buffer(256)
+ self.bufferp = addressof(self.buffer)
+
+class LibXED():
+
+ def __init__(self):
+ try:
+ self.libxed = CDLL("libxed.so")
+ except:
+ self.libxed = None
+ if not self.libxed:
+ self.libxed = CDLL("/usr/local/lib/libxed.so")
+
+ self.xed_tables_init = self.libxed.xed_tables_init
+ self.xed_tables_init.restype = None
+ self.xed_tables_init.argtypes = []
+
+ self.xed_decoded_inst_zero = self.libxed.xed_decoded_inst_zero
+ self.xed_decoded_inst_zero.restype = None
+ self.xed_decoded_inst_zero.argtypes = [ c_void_p ]
+
+ self.xed_operand_values_set_mode = self.libxed.xed_operand_values_set_mode
+ self.xed_operand_values_set_mode.restype = None
+ self.xed_operand_values_set_mode.argtypes = [ c_void_p, c_void_p ]
+
+ self.xed_decoded_inst_zero_keep_mode = self.libxed.xed_decoded_inst_zero_keep_mode
+ self.xed_decoded_inst_zero_keep_mode.restype = None
+ self.xed_decoded_inst_zero_keep_mode.argtypes = [ c_void_p ]
+
+ self.xed_decode = self.libxed.xed_decode
+ self.xed_decode.restype = c_int
+ self.xed_decode.argtypes = [ c_void_p, c_void_p, c_uint ]
+
+ self.xed_format_context = self.libxed.xed_format_context
+ self.xed_format_context.restype = c_uint
+ self.xed_format_context.argtypes = [ c_int, c_void_p, c_void_p, c_int, c_ulonglong, c_void_p, c_void_p ]
+
+ self.xed_tables_init()
+
+ def Instruction(self):
+ return XEDInstruction(self)
+
+ def SetMode(self, inst, mode):
+ if mode:
+ inst.state.mode = 4 # 32-bit
+ inst.state.width = 4 # 4 bytes
+ else:
+ inst.state.mode = 1 # 64-bit
+ inst.state.width = 8 # 8 bytes
+ self.xed_operand_values_set_mode(inst.xedp, inst.statep)
+
+ def DisassembleOne(self, inst, bytes_ptr, bytes_cnt, ip):
+ self.xed_decoded_inst_zero_keep_mode(inst.xedp)
+ err = self.xed_decode(inst.xedp, bytes_ptr, bytes_cnt)
+ if err:
+ return 0, ""
+ # Use AT&T mode (2), alternative is Intel (3)
+ ok = self.xed_format_context(2, inst.xedp, inst.bufferp, sizeof(inst.buffer), ip, 0, 0)
+ if not ok:
+ return 0, ""
+ # Return instruction length and the disassembled instruction text
+ # For now, assume the length is in byte 166
+ return inst.xedd[166], inst.buffer.value
+
+def TryOpen(file_name):
+ try:
+ return open(file_name, "rb")
+ except:
+ return None
+
+def Is64Bit(f):
+ result = sizeof(c_void_p)
+ # ELF support only
+ pos = f.tell()
+ f.seek(0)
+ header = f.read(7)
+ f.seek(pos)
+ magic = header[0:4]
+ eclass = ord(header[4])
+ encoding = ord(header[5])
+ version = ord(header[6])
+ if magic == chr(127) + "ELF" and eclass > 0 and eclass < 3 and encoding > 0 and encoding < 3 and version == 1:
+ result = True if eclass == 2 else False
+ return result
+
+# Global data
+
+class Glb():
+
+ def __init__(self, dbref, db, dbname):
+ self.dbref = dbref
+ self.db = db
+ self.dbname = dbname
+ self.home_dir = os.path.expanduser("~")
+ self.buildid_dir = os.getenv("PERF_BUILDID_DIR")
+ if self.buildid_dir:
+ self.buildid_dir += "/.build-id/"
+ else:
+ self.buildid_dir = self.home_dir + "/.debug/.build-id/"
+ self.app = None
+ self.mainwindow = None
+ self.instances_to_shutdown_on_exit = weakref.WeakSet()
+ try:
+ self.disassembler = LibXED()
+ self.have_disassembler = True
+ except:
+ self.have_disassembler = False
+
+ def FileFromBuildId(self, build_id):
+ file_name = self.buildid_dir + build_id[0:2] + "/" + build_id[2:] + "/elf"
+ return TryOpen(file_name)
+
+ def FileFromNamesAndBuildId(self, short_name, long_name, build_id):
+ # Assume current machine i.e. no support for virtualization
+ if short_name[0:7] == "[kernel" and os.path.basename(long_name) == "kcore":
+ file_name = os.getenv("PERF_KCORE")
+ f = TryOpen(file_name) if file_name else None
+ if f:
+ return f
+ # For now, no special handling if long_name is /proc/kcore
+ f = TryOpen(long_name)
+ if f:
+ return f
+ f = self.FileFromBuildId(build_id)
+ if f:
+ return f
+ return None
+
+ def AddInstanceToShutdownOnExit(self, instance):
+ self.instances_to_shutdown_on_exit.add(instance)
+
+ # Shutdown any background processes or threads
+ def ShutdownInstances(self):
+ for x in self.instances_to_shutdown_on_exit:
+ try:
+ x.Shutdown()
+ except:
+ pass
+
+# Database reference
+
+class DBRef():
+
+ def __init__(self, is_sqlite3, dbname):
+ self.is_sqlite3 = is_sqlite3
+ self.dbname = dbname
+
+ def Open(self, connection_name):
+ dbname = self.dbname
+ if self.is_sqlite3:
+ db = QSqlDatabase.addDatabase("QSQLITE", connection_name)
+ else:
+ db = QSqlDatabase.addDatabase("QPSQL", connection_name)
+ opts = dbname.split()
+ for opt in opts:
+ if "=" in opt:
+ opt = opt.split("=")
+ if opt[0] == "hostname":
+ db.setHostName(opt[1])
+ elif opt[0] == "port":
+ db.setPort(int(opt[1]))
+ elif opt[0] == "username":
+ db.setUserName(opt[1])
+ elif opt[0] == "password":
+ db.setPassword(opt[1])
+ elif opt[0] == "dbname":
+ dbname = opt[1]
+ else:
+ dbname = opt
+
+ db.setDatabaseName(dbname)
+ if not db.open():
+ raise Exception("Failed to open database " + dbname + " error: " + db.lastError().text())
+ return db, dbname
+
+# Main
+
+def Main():
+ if (len(sys.argv) < 2):
+ print >> sys.stderr, "Usage is: exported-sql-viewer.py {<database name> | --help-only}"
+ raise Exception("Too few arguments")
+
+ dbname = sys.argv[1]
+ if dbname == "--help-only":
+ app = QApplication(sys.argv)
+ mainwindow = HelpOnlyWindow()
+ mainwindow.show()
+ err = app.exec_()
+ sys.exit(err)
+
+ is_sqlite3 = False
+ try:
+ f = open(dbname)
+ if f.read(15) == "SQLite format 3":
+ is_sqlite3 = True
+ f.close()
+ except:
+ pass
+
+ dbref = DBRef(is_sqlite3, dbname)
+ db, dbname = dbref.Open("main")
+ glb = Glb(dbref, db, dbname)
+ app = QApplication(sys.argv)
+ glb.app = app
+ mainwindow = MainWindow(glb)
+ glb.mainwindow = mainwindow
+ mainwindow.show()
+ err = app.exec_()
+ glb.ShutdownInstances()
+ db.close()
+ sys.exit(err)
+
+if __name__ == "__main__":
+ Main()
config=0
sample_period=*
sample_type=263
-read_format=0
+read_format=0|4
disabled=1
inherit=1
pinned=0
sample_period=0
freq=0
write_backward=0
-sample_id_all=0
libperf-y += ioctl.o
endif
libperf-y += kcmp.o
+libperf-y += mount_flags.o
libperf-y += pkey_alloc.o
libperf-y += prctl.o
libperf-y += sockaddr.o
}
size_t strarray__scnprintf(struct strarray *sa, char *bf, size_t size, const char *intfmt, int val);
+size_t strarray__scnprintf_flags(struct strarray *sa, char *bf, size_t size, unsigned long flags);
struct trace;
struct thread;
size_t syscall_arg__scnprintf_kcmp_idx(char *bf, size_t size, struct syscall_arg *arg);
#define SCA_KCMP_IDX syscall_arg__scnprintf_kcmp_idx
+unsigned long syscall_arg__mask_val_mount_flags(struct syscall_arg *arg, unsigned long flags);
+#define SCAMV_MOUNT_FLAGS syscall_arg__mask_val_mount_flags
+
+size_t syscall_arg__scnprintf_mount_flags(char *bf, size_t size, struct syscall_arg *arg);
+#define SCA_MOUNT_FLAGS syscall_arg__scnprintf_mount_flags
+
size_t syscall_arg__scnprintf_pkey_alloc_access_rights(char *bf, size_t size, struct syscall_arg *arg);
#define SCA_PKEY_ALLOC_ACCESS_RIGHTS syscall_arg__scnprintf_pkey_alloc_access_rights
+// SPDX-License-Identifier: LGPL-2.1
/*
* trace/beauty/cone.c
*
* Copyright (C) 2017, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
- *
- * Released under the GPL v2. (and only v2, not any later version)
*/
#include "trace/beauty/beauty.h"
#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/drm/
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#ifndef EFD_SEMAPHORE
#define EFD_SEMAPHORE 1
#endif
+// SPDX-License-Identifier: LGPL-2.1
/*
* trace/beauty/fcntl.c
*
* Copyright (C) 2017, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
- *
- * Released under the GPL v2. (and only v2, not any later version)
*/
#include "trace/beauty/beauty.h"
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#include "trace/beauty/beauty.h"
#include <linux/kernel.h>
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#include <linux/futex.h>
#ifndef FUTEX_WAIT_BITSET
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#include <linux/futex.h>
#ifndef FUTEX_BITSET_MATCH_ANY
+// SPDX-License-Identifier: LGPL-2.1
/*
* trace/beauty/ioctl.c
*
* Copyright (C) 2017, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
- *
- * Released under the GPL v2. (and only v2, not any later version)
*/
#include "trace/beauty/beauty.h"
"TCSETSW2", "TCSETSF2", "TIOCGRS48", "TIOCSRS485", "TIOCGPTN", "TIOCSPTLCK",
"TIOCGDEV", "TCSETX", "TCSETXF", "TCSETXW", "TIOCSIG", "TIOCVHANGUP", "TIOCGPKT",
"TIOCGPTLCK", [_IOC_NR(TIOCGEXCL)] = "TIOCGEXCL", "TIOCGPTPEER",
+ "TIOCGISO7816", "TIOCSISO7816",
[_IOC_NR(FIONCLEX)] = "FIONCLEX", "FIOCLEX", "FIOASYNC", "TIOCSERCONFIG",
"TIOCSERGWILD", "TIOCSERSWILD", "TIOCGLCKTRMIOS", "TIOCSLCKTRMIOS",
"TIOCSERGSTRUCT", "TIOCSERGETLSR", "TIOCSERGETMULTI", "TIOCSERSETMULTI",
+// SPDX-License-Identifier: LGPL-2.1
/*
* trace/beauty/kcmp.c
*
* Copyright (C) 2017, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
- *
- * Released under the GPL v2. (and only v2, not any later version)
*/
#include "trace/beauty/beauty.h"
#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/linux/
#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/linux/
#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/asm-generic/
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#include <uapi/linux/mman.h>
+#include <linux/log2.h>
static size_t syscall_arg__scnprintf_mmap_prot(char *bf, size_t size,
struct syscall_arg *arg)
#define SCA_MMAP_PROT syscall_arg__scnprintf_mmap_prot
+static size_t mmap__scnprintf_flags(unsigned long flags, char *bf, size_t size)
+{
+#include "trace/beauty/generated/mmap_flags_array.c"
+ static DEFINE_STRARRAY(mmap_flags);
+
+ return strarray__scnprintf_flags(&strarray__mmap_flags, bf, size, flags);
+}
+
static size_t syscall_arg__scnprintf_mmap_flags(char *bf, size_t size,
struct syscall_arg *arg)
{
- int printed = 0, flags = arg->val;
+ unsigned long flags = arg->val;
if (flags & MAP_ANONYMOUS)
arg->mask |= (1 << 4) | (1 << 5); /* Mask 4th ('fd') and 5th ('offset') args, ignored */
-#define P_MMAP_FLAG(n) \
- if (flags & MAP_##n) { \
- printed += scnprintf(bf + printed, size - printed, "%s%s", printed ? "|" : "", #n); \
- flags &= ~MAP_##n; \
- }
-
- P_MMAP_FLAG(SHARED);
- P_MMAP_FLAG(PRIVATE);
-#ifdef MAP_32BIT
- P_MMAP_FLAG(32BIT);
-#endif
- P_MMAP_FLAG(ANONYMOUS);
- P_MMAP_FLAG(DENYWRITE);
- P_MMAP_FLAG(EXECUTABLE);
- P_MMAP_FLAG(FILE);
- P_MMAP_FLAG(FIXED);
-#ifdef MAP_FIXED_NOREPLACE
- P_MMAP_FLAG(FIXED_NOREPLACE);
-#endif
- P_MMAP_FLAG(GROWSDOWN);
- P_MMAP_FLAG(HUGETLB);
- P_MMAP_FLAG(LOCKED);
- P_MMAP_FLAG(NONBLOCK);
- P_MMAP_FLAG(NORESERVE);
- P_MMAP_FLAG(POPULATE);
- P_MMAP_FLAG(STACK);
- P_MMAP_FLAG(UNINITIALIZED);
-#ifdef MAP_SYNC
- P_MMAP_FLAG(SYNC);
-#endif
-#undef P_MMAP_FLAG
-
- if (flags)
- printed += scnprintf(bf + printed, size - printed, "%s%#x", printed ? "|" : "", flags);
-
- return printed;
+ return mmap__scnprintf_flags(flags, bf, size);
}
#define SCA_MMAP_FLAGS syscall_arg__scnprintf_mmap_flags
--- /dev/null
+#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
+
+if [ $# -ne 2 ] ; then
+ [ $# -eq 1 ] && hostarch=$1 || hostarch=`uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/`
+ header_dir=tools/include/uapi/asm-generic
+ arch_header_dir=tools/arch/${hostarch}/include/uapi/asm
+else
+ header_dir=$1
+ arch_header_dir=$2
+fi
+
+arch_mman=${arch_header_dir}/mman.h
+
+# those in egrep -vw are flags, we want just the bits
+
+printf "static const char *mmap_flags[] = {\n"
+regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+MAP_([[:alnum:]_]+)[[:space:]]+(0x[[:xdigit:]]+)[[:space:]]*.*'
+egrep -q $regex ${arch_mman} && \
+(egrep $regex ${arch_mman} | \
+ sed -r "s/$regex/\2 \1/g" | \
+ xargs printf "\t[ilog2(%s) + 1] = \"%s\",\n")
+egrep -q '#[[:space:]]*include[[:space:]]+<uapi/asm-generic/mman.*' ${arch_mman} &&
+(egrep $regex ${header_dir}/mman-common.h | \
+ egrep -vw 'MAP_(UNINITIALIZED|TYPE|SHARED_VALIDATE)' | \
+ sed -r "s/$regex/\2 \1/g" | \
+ xargs printf "\t[ilog2(%s) + 1] = \"%s\",\n")
+egrep -q '#[[:space:]]*include[[:space:]]+<uapi/asm-generic/mman.h>.*' ${arch_mman} &&
+(egrep $regex ${header_dir}/mman.h | \
+ sed -r "s/$regex/\2 \1/g" | \
+ xargs printf "\t[ilog2(%s) + 1] = \"%s\",\n")
+printf "};\n"
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
--- /dev/null
+// SPDX-License-Identifier: LGPL-2.1
+/*
+ * trace/beauty/mount_flags.c
+ *
+ * Copyright (C) 2018, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
+ */
+
+#include "trace/beauty/beauty.h"
+#include <linux/compiler.h>
+#include <linux/kernel.h>
+#include <linux/log2.h>
+#include <sys/mount.h>
+
+static size_t mount__scnprintf_flags(unsigned long flags, char *bf, size_t size)
+{
+#include "trace/beauty/generated/mount_flags_array.c"
+ static DEFINE_STRARRAY(mount_flags);
+
+ return strarray__scnprintf_flags(&strarray__mount_flags, bf, size, flags);
+}
+
+unsigned long syscall_arg__mask_val_mount_flags(struct syscall_arg *arg __maybe_unused, unsigned long flags)
+{
+ // do_mount in fs/namespace.c:
+ /*
+ * Pre-0.97 versions of mount() didn't have a flags word. When the
+ * flags word was introduced its top half was required to have the
+ * magic value 0xC0ED, and this remained so until 2.4.0-test9.
+ * Therefore, if this magic number is present, it carries no
+ * information and must be discarded.
+ */
+ if ((flags & MS_MGC_MSK) == MS_MGC_VAL)
+ flags &= ~MS_MGC_MSK;
+
+ return flags;
+}
+
+size_t syscall_arg__scnprintf_mount_flags(char *bf, size_t size, struct syscall_arg *arg)
+{
+ unsigned long flags = arg->val;
+
+ return mount__scnprintf_flags(flags, bf, size);
+}
--- /dev/null
+#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
+
+[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/linux/
+
+printf "static const char *mount_flags[] = {\n"
+regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+MS_([[:alnum:]_]+)[[:space:]]+([[:digit:]]+)[[:space:]]*.*'
+egrep $regex ${header_dir}/fs.h | egrep -v '(MSK|VERBOSE|MGC_VAL)\>' | \
+ sed -r "s/$regex/\2 \2 \1/g" | sort -n | \
+ xargs printf "\t[%s ? (ilog2(%s) + 1) : 0] = \"%s\",\n"
+regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+MS_([[:alnum:]_]+)[[:space:]]+\(1<<([[:digit:]]+)\)[[:space:]]*.*'
+egrep $regex ${header_dir}/fs.h | \
+ sed -r "s/$regex/\2 \1/g" | \
+ xargs printf "\t[%s + 1] = \"%s\",\n"
+printf "};\n"
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#include <sys/types.h>
#include <sys/socket.h>
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#ifndef PERF_FLAG_FD_NO_GROUP
# define PERF_FLAG_FD_NO_GROUP (1UL << 0)
#endif
#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/linux/
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
+
size_t syscall_arg__scnprintf_pid(char *bf, size_t size, struct syscall_arg *arg)
{
int pid = arg->val;
+// SPDX-License-Identifier: LGPL-2.1
/*
* trace/beauty/pkey_alloc.c
*
* Copyright (C) 2017, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
- *
- * Released under the GPL v2. (and only v2, not any later version)
*/
#include "trace/beauty/beauty.h"
#include <linux/kernel.h>
#include <linux/log2.h>
-static size_t pkey_alloc__scnprintf_access_rights(int access_rights, char *bf, size_t size)
+size_t strarray__scnprintf_flags(struct strarray *sa, char *bf, size_t size, unsigned long flags)
{
int i, printed = 0;
-#include "trace/beauty/generated/pkey_alloc_access_rights_array.c"
- static DEFINE_STRARRAY(pkey_alloc_access_rights);
-
- if (access_rights == 0) {
- const char *s = strarray__pkey_alloc_access_rights.entries[0];
+ if (flags == 0) {
+ const char *s = sa->entries[0];
if (s)
return scnprintf(bf, size, "%s", s);
return scnprintf(bf, size, "%d", 0);
}
- for (i = 1; i < strarray__pkey_alloc_access_rights.nr_entries; ++i) {
- int bit = 1 << (i - 1);
+ for (i = 1; i < sa->nr_entries; ++i) {
+ unsigned long bit = 1UL << (i - 1);
- if (!(access_rights & bit))
+ if (!(flags & bit))
continue;
if (printed != 0)
printed += scnprintf(bf + printed, size - printed, "|");
- if (strarray__pkey_alloc_access_rights.entries[i] != NULL)
- printed += scnprintf(bf + printed, size - printed, "%s", strarray__pkey_alloc_access_rights.entries[i]);
+ if (sa->entries[i] != NULL)
+ printed += scnprintf(bf + printed, size - printed, "%s", sa->entries[i]);
else
printed += scnprintf(bf + printed, size - printed, "0x%#", bit);
}
return printed;
}
+static size_t pkey_alloc__scnprintf_access_rights(int access_rights, char *bf, size_t size)
+{
+#include "trace/beauty/generated/pkey_alloc_access_rights_array.c"
+ static DEFINE_STRARRAY(pkey_alloc_access_rights);
+
+ return strarray__scnprintf_flags(&strarray__pkey_alloc_access_rights, bf, size, access_rights);
+}
+
size_t syscall_arg__scnprintf_pkey_alloc_access_rights(char *bf, size_t size, struct syscall_arg *arg)
{
unsigned long cmd = arg->val;
#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/asm-generic/
+// SPDX-License-Identifier: LGPL-2.1
/*
* trace/beauty/prctl.c
*
* Copyright (C) 2017, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
- *
- * Released under the GPL v2. (and only v2, not any later version)
*/
#include "trace/beauty/beauty.h"
#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/linux/
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#include <sched.h>
/*
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#ifndef SECCOMP_SET_MODE_STRICT
#define SECCOMP_SET_MODE_STRICT 0
#endif
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#include <signal.h>
static size_t syscall_arg__scnprintf_signum(char *bf, size_t size, struct syscall_arg *arg)
#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/sound/
#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/sound/
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
// Copyright (C) 2018, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
#include "trace/beauty/beauty.h"
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
/*
* trace/beauty/socket.c
*
#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/linux/
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#include <sys/types.h>
#include <sys/socket.h>
+// SPDX-License-Identifier: LGPL-2.1
/*
* trace/beauty/statx.c
*
* Copyright (C) 2017, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
- *
- * Released under the GPL v2. (and only v2, not any later version)
*/
#include "trace/beauty/beauty.h"
#!/bin/sh
+# SPDX-License-Identifier: LGPL-2.1
[ $# -eq 1 ] && header_dir=$1 || header_dir=tools/include/uapi/linux/
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: LGPL-2.1
#include <sys/types.h>
#include <sys/wait.h>
libperf-y += evsel.o
libperf-y += evsel_fprintf.o
libperf-y += find_bit.o
+libperf-y += get_current_dir_name.o
libperf-y += kallsyms.o
libperf-y += levenshtein.o
libperf-y += llvm-utils.o
#include "arch/x86/annotate/instructions.c"
#include "arch/powerpc/annotate/instructions.c"
#include "arch/s390/annotate/instructions.c"
+#include "arch/sparc/annotate/instructions.c"
static struct arch architectures[] = {
{
.comment_char = '#',
},
},
+ {
+ .name = "sparc",
+ .init = sparc__annotate_init,
+ .objdump = {
+ .comment_char = '#',
+ },
+ },
};
static void ins__delete(struct ins_operands *ops)
#define PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ 64
#define PERF_ITRACE_MAX_LAST_BRANCH_SZ 1024
-void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts)
+void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts,
+ bool no_sample)
{
- synth_opts->instructions = true;
synth_opts->branches = true;
synth_opts->transactions = true;
synth_opts->ptwrites = true;
synth_opts->pwr_events = true;
synth_opts->errors = true;
- synth_opts->period_type = PERF_ITRACE_DEFAULT_PERIOD_TYPE;
- synth_opts->period = PERF_ITRACE_DEFAULT_PERIOD;
+ if (no_sample) {
+ synth_opts->period_type = PERF_ITRACE_PERIOD_INSTRUCTIONS;
+ synth_opts->period = 1;
+ synth_opts->calls = true;
+ } else {
+ synth_opts->instructions = true;
+ synth_opts->period_type = PERF_ITRACE_DEFAULT_PERIOD_TYPE;
+ synth_opts->period = PERF_ITRACE_DEFAULT_PERIOD;
+ }
synth_opts->callchain_sz = PERF_ITRACE_DEFAULT_CALLCHAIN_SZ;
synth_opts->last_branch_sz = PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ;
synth_opts->initial_skip = 0;
}
if (!str) {
- itrace_synth_opts__set_default(synth_opts);
+ itrace_synth_opts__set_default(synth_opts, false);
return 0;
}
/**
* struct itrace_synth_opts - AUX area tracing synthesis options.
* @set: indicates whether or not options have been set
+ * @default_no_sample: Default to no sampling.
* @inject: indicates the event (not just the sample) must be fully synthesized
* because 'perf inject' will write it out
* @instructions: whether to synthesize 'instructions' events
*/
struct itrace_synth_opts {
bool set;
+ bool default_no_sample;
bool inject;
bool instructions;
bool branches;
union perf_event *event);
int itrace_parse_synth_opts(const struct option *opt, const char *str,
int unset);
-void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts);
+void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts,
+ bool no_sample);
size_t perf_event__fprintf_auxtrace_error(union perf_event *event, FILE *fp);
void perf_session__auxtrace_error_inc(struct perf_session *session,
zfree(&aux);
}
+static u8 cs_etm__cpu_mode(struct cs_etm_queue *etmq, u64 address)
+{
+ struct machine *machine;
+
+ machine = etmq->etm->machine;
+
+ if (address >= etmq->etm->kernel_start) {
+ if (machine__is_host(machine))
+ return PERF_RECORD_MISC_KERNEL;
+ else
+ return PERF_RECORD_MISC_GUEST_KERNEL;
+ } else {
+ if (machine__is_host(machine))
+ return PERF_RECORD_MISC_USER;
+ else if (perf_guest)
+ return PERF_RECORD_MISC_GUEST_USER;
+ else
+ return PERF_RECORD_MISC_HYPERVISOR;
+ }
+}
+
static u32 cs_etm__mem_access(struct cs_etm_queue *etmq, u64 address,
size_t size, u8 *buffer)
{
return -1;
machine = etmq->etm->machine;
- if (address >= etmq->etm->kernel_start)
- cpumode = PERF_RECORD_MISC_KERNEL;
- else
- cpumode = PERF_RECORD_MISC_USER;
+ cpumode = cs_etm__cpu_mode(etmq, address);
thread = etmq->thread;
if (!thread) {
struct perf_sample sample = {.ip = 0,};
event->sample.header.type = PERF_RECORD_SAMPLE;
- event->sample.header.misc = PERF_RECORD_MISC_USER;
+ event->sample.header.misc = cs_etm__cpu_mode(etmq, addr);
event->sample.header.size = sizeof(struct perf_event_header);
sample.ip = addr;
sample.cpu = etmq->packet->cpu;
sample.flags = 0;
sample.insn_len = 1;
- sample.cpumode = event->header.misc;
+ sample.cpumode = event->sample.header.misc;
if (etm->synth_opts.last_branch) {
cs_etm__copy_last_branch_rb(etmq);
u64 nr;
struct branch_entry entries;
} dummy_bs;
+ u64 ip;
+
+ ip = cs_etm__last_executed_instr(etmq->prev_packet);
event->sample.header.type = PERF_RECORD_SAMPLE;
- event->sample.header.misc = PERF_RECORD_MISC_USER;
+ event->sample.header.misc = cs_etm__cpu_mode(etmq, ip);
event->sample.header.size = sizeof(struct perf_event_header);
- sample.ip = cs_etm__last_executed_instr(etmq->prev_packet);
+ sample.ip = ip;
sample.pid = etmq->pid;
sample.tid = etmq->tid;
sample.addr = cs_etm__first_executed_instr(etmq->packet);
sample.period = 1;
sample.cpu = etmq->packet->cpu;
sample.flags = 0;
- sample.cpumode = PERF_RECORD_MISC_USER;
+ sample.cpumode = event->sample.header.misc;
/*
* perf report cannot handle events without a branch stack
if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
etm->synth_opts = *session->itrace_synth_opts;
} else {
- itrace_synth_opts__set_default(&etm->synth_opts);
+ itrace_synth_opts__set_default(&etm->synth_opts,
+ session->itrace_synth_opts->default_no_sample);
etm->synth_opts.callchain = false;
}
struct numa_node *numa_nodes;
struct memory_node *memory_nodes;
unsigned long long memory_bsize;
+ u64 clockid_res_ns;
};
extern struct perf_env perf_env;
event->fork.pid = tgid;
event->fork.tid = pid;
event->fork.header.type = PERF_RECORD_FORK;
+ event->fork.header.misc = PERF_RECORD_MISC_FORK_EXEC;
event->fork.header.size = (sizeof(event->fork) + machine->id_hdr_size);
struct perf_evsel *pos;
evlist__for_each_entry(evlist, pos) {
- if (!perf_evsel__is_group_leader(pos) || !pos->fd)
+ if (pos->disabled || !perf_evsel__is_group_leader(pos) || !pos->fd)
continue;
perf_evsel__disable(pos);
}
leader->forced_leader = true;
}
}
+
+struct perf_evsel *perf_evlist__reset_weak_group(struct perf_evlist *evsel_list,
+ struct perf_evsel *evsel)
+{
+ struct perf_evsel *c2, *leader;
+ bool is_open = true;
+
+ leader = evsel->leader;
+ pr_debug("Weak group for %s/%d failed\n",
+ leader->name, leader->nr_members);
+
+ /*
+ * for_each_group_member doesn't work here because it doesn't
+ * include the first entry.
+ */
+ evlist__for_each_entry(evsel_list, c2) {
+ if (c2 == evsel)
+ is_open = false;
+ if (c2->leader == leader) {
+ if (is_open)
+ perf_evsel__close(c2);
+ c2->leader = c2;
+ c2->nr_members = 0;
+ }
+ }
+ return leader;
+}
void perf_evlist__force_leader(struct perf_evlist *evlist);
+struct perf_evsel *perf_evlist__reset_weak_group(struct perf_evlist *evlist,
+ struct perf_evsel *evsel);
+
#endif /* __PERF_EVLIST_H */
evsel->leader = evsel;
evsel->unit = "";
evsel->scale = 1.0;
+ evsel->max_events = ULONG_MAX;
evsel->evlist = NULL;
evsel->bpf_fd = -1;
INIT_LIST_HEAD(&evsel->node);
case PERF_EVSEL__CONFIG_TERM_MAX_STACK:
max_stack = term->val.max_stack;
break;
+ case PERF_EVSEL__CONFIG_TERM_MAX_EVENTS:
+ evsel->max_events = term->val.max_events;
+ break;
case PERF_EVSEL__CONFIG_TERM_INHERIT:
/*
* attr->inherit should has already been set by
attr->sample_freq = 0;
attr->sample_period = 0;
attr->write_backward = 0;
- attr->sample_id_all = 0;
}
if (opts->no_samples)
attr->exclude_user = 1;
}
- if (evsel->own_cpus)
+ if (evsel->own_cpus || evsel->unit)
evsel->attr.read_format |= PERF_FORMAT_ID;
/*
int perf_evsel__enable(struct perf_evsel *evsel)
{
- return perf_evsel__run_ioctl(evsel,
- PERF_EVENT_IOC_ENABLE,
- 0);
+ int err = perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_ENABLE, 0);
+
+ if (!err)
+ evsel->disabled = false;
+
+ return err;
}
int perf_evsel__disable(struct perf_evsel *evsel)
{
- return perf_evsel__run_ioctl(evsel,
- PERF_EVENT_IOC_DISABLE,
- 0);
+ int err = perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_DISABLE, 0);
+ /*
+ * We mark it disabled here so that tools that disable a event can
+ * ignore events after they disable it. I.e. the ring buffer may have
+ * already a few more events queued up before the kernel got the stop
+ * request.
+ */
+ if (!err)
+ evsel->disabled = true;
+
+ return err;
}
int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
PERF_EVSEL__CONFIG_TERM_STACK_USER,
PERF_EVSEL__CONFIG_TERM_INHERIT,
PERF_EVSEL__CONFIG_TERM_MAX_STACK,
+ PERF_EVSEL__CONFIG_TERM_MAX_EVENTS,
PERF_EVSEL__CONFIG_TERM_OVERWRITE,
PERF_EVSEL__CONFIG_TERM_DRV_CFG,
PERF_EVSEL__CONFIG_TERM_BRANCH,
bool inherit;
bool overwrite;
char *branch;
+ unsigned long max_events;
} val;
bool weak;
};
struct perf_counts *prev_raw_counts;
int idx;
u32 ids;
+ unsigned long max_events;
+ unsigned long nr_events_printed;
char *name;
double scale;
const char *unit;
bool snapshot;
bool supported;
bool needs_swap;
+ bool disabled;
bool no_aux_samples;
bool immediate;
bool system_wide;
#elif defined(__powerpc__)
#define GEN_ELF_ARCH EM_PPC
#define GEN_ELF_CLASS ELFCLASS32
+#elif defined(__sparc__) && defined(__arch64__)
+#define GEN_ELF_ARCH EM_SPARCV9
+#define GEN_ELF_CLASS ELFCLASS64
+#elif defined(__sparc__)
+#define GEN_ELF_ARCH EM_SPARC
+#define GEN_ELF_CLASS ELFCLASS32
#else
#error "unsupported architecture"
#endif
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
+//
+#ifndef HAVE_GET_CURRENT_DIR_NAME
+#include "util.h"
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdlib.h>
+
+/* Android's 'bionic' library, for one, doesn't have this */
+
+char *get_current_dir_name(void)
+{
+ char pwd[PATH_MAX];
+
+ return getcwd(pwd, sizeof(pwd)) == NULL ? NULL : strdup(pwd);
+}
+#endif // HAVE_GET_CURRENT_DIR_NAME
return err;
}
+static int write_clockid(struct feat_fd *ff,
+ struct perf_evlist *evlist __maybe_unused)
+{
+ return do_write(ff, &ff->ph->env.clockid_res_ns,
+ sizeof(ff->ph->env.clockid_res_ns));
+}
+
static int cpu_cache_level__sort(const void *a, const void *b)
{
struct cpu_cache_level *cache_a = (struct cpu_cache_level *)a;
fprintf(fp, "# Core ID and Socket ID information is not available\n");
}
+static void print_clockid(struct feat_fd *ff, FILE *fp)
+{
+ fprintf(fp, "# clockid frequency: %"PRIu64" MHz\n",
+ ff->ph->env.clockid_res_ns * 1000);
+}
+
static void free_event_desc(struct perf_evsel *events)
{
struct perf_evsel *evsel;
return ret;
}
+static int process_clockid(struct feat_fd *ff,
+ void *data __maybe_unused)
+{
+ if (do_read_u64(ff, &ff->ph->env.clockid_res_ns))
+ return -1;
+
+ return 0;
+}
+
struct feature_ops {
int (*write)(struct feat_fd *ff, struct perf_evlist *evlist);
void (*print)(struct feat_fd *ff, FILE *fp);
FEAT_OPN(CACHE, cache, true),
FEAT_OPR(SAMPLE_TIME, sample_time, false),
FEAT_OPR(MEM_TOPOLOGY, mem_topology, true),
+ FEAT_OPR(CLOCKID, clockid, false)
};
struct header_print_data {
HEADER_CACHE,
HEADER_SAMPLE_TIME,
HEADER_MEM_TOPOLOGY,
+ HEADER_CLOCKID,
HEADER_LAST_FEATURE,
HEADER_FEAT_BITS = 256,
};
return 0;
}
+static inline u8 intel_bts_cpumode(struct intel_bts *bts, uint64_t ip)
+{
+ return machine__kernel_ip(bts->machine, ip) ?
+ PERF_RECORD_MISC_KERNEL :
+ PERF_RECORD_MISC_USER;
+}
+
static int intel_bts_synth_branch_sample(struct intel_bts_queue *btsq,
struct branch *branch)
{
bts->num_events++ <= bts->synth_opts.initial_skip)
return 0;
- event.sample.header.type = PERF_RECORD_SAMPLE;
- event.sample.header.misc = PERF_RECORD_MISC_USER;
- event.sample.header.size = sizeof(struct perf_event_header);
-
- sample.cpumode = PERF_RECORD_MISC_USER;
sample.ip = le64_to_cpu(branch->from);
+ sample.cpumode = intel_bts_cpumode(bts, sample.ip);
sample.pid = btsq->pid;
sample.tid = btsq->tid;
sample.addr = le64_to_cpu(branch->to);
sample.insn_len = btsq->intel_pt_insn.length;
memcpy(sample.insn, btsq->intel_pt_insn.buf, INTEL_PT_INSN_BUF_SZ);
+ event.sample.header.type = PERF_RECORD_SAMPLE;
+ event.sample.header.misc = sample.cpumode;
+ event.sample.header.size = sizeof(struct perf_event_header);
+
if (bts->synth_opts.inject) {
event.sample.header.size = bts->branches_event_size;
ret = perf_event__synthesize_sample(&event,
if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
bts->synth_opts = *session->itrace_synth_opts;
} else {
- itrace_synth_opts__set_default(&bts->synth_opts);
+ itrace_synth_opts__set_default(&bts->synth_opts,
+ session->itrace_synth_opts->default_no_sample);
if (session->itrace_synth_opts)
bts->synth_opts.thread_stack =
session->itrace_synth_opts->thread_stack;
decoder->have_calc_cyc_to_tsc = false;
intel_pt_calc_cyc_to_tsc(decoder, true);
}
+
+ intel_pt_log_to("Setting timestamp", decoder->timestamp);
}
static void intel_pt_calc_cbr(struct intel_pt_decoder *decoder)
decoder->timestamp = timestamp;
decoder->timestamp_insn_cnt = 0;
+
+ intel_pt_log_to("Setting timestamp", decoder->timestamp);
}
/* Walk PSB+ packets when already in sync. */
static char log_name[MAX_LOG_NAME];
bool intel_pt_enable_logging;
+void *intel_pt_log_fp(void)
+{
+ return f;
+}
+
void intel_pt_log_enable(void)
{
intel_pt_enable_logging = true;
struct intel_pt_pkt;
+void *intel_pt_log_fp(void);
void intel_pt_log_enable(void);
void intel_pt_log_disable(void);
void intel_pt_log_set_name(const char *name);
intel_pt_dump(pt, buf, len);
}
+static void intel_pt_log_event(union perf_event *event)
+{
+ FILE *f = intel_pt_log_fp();
+
+ if (!intel_pt_enable_logging || !f)
+ return;
+
+ perf_event__fprintf(event, f);
+}
+
static int intel_pt_do_fix_overlap(struct intel_pt *pt, struct auxtrace_buffer *a,
struct auxtrace_buffer *b)
{
return auxtrace_cache__lookup(dso->auxtrace_cache, offset);
}
+static inline u8 intel_pt_cpumode(struct intel_pt *pt, uint64_t ip)
+{
+ return ip >= pt->kernel_start ?
+ PERF_RECORD_MISC_KERNEL :
+ PERF_RECORD_MISC_USER;
+}
+
static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
uint64_t *insn_cnt_ptr, uint64_t *ip,
uint64_t to_ip, uint64_t max_insn_cnt,
if (to_ip && *ip == to_ip)
goto out_no_cache;
- if (*ip >= ptq->pt->kernel_start)
- cpumode = PERF_RECORD_MISC_KERNEL;
- else
- cpumode = PERF_RECORD_MISC_USER;
+ cpumode = intel_pt_cpumode(ptq->pt, *ip);
thread = ptq->thread;
if (!thread) {
if (pt->synth_opts.callchain) {
size_t sz = sizeof(struct ip_callchain);
- sz += pt->synth_opts.callchain_sz * sizeof(u64);
+ /* Add 1 to callchain_sz for callchain context */
+ sz += (pt->synth_opts.callchain_sz + 1) * sizeof(u64);
ptq->chain = zalloc(sz);
if (!ptq->chain)
goto out_free;
union perf_event *event,
struct perf_sample *sample)
{
- event->sample.header.type = PERF_RECORD_SAMPLE;
- event->sample.header.misc = PERF_RECORD_MISC_USER;
- event->sample.header.size = sizeof(struct perf_event_header);
-
if (!pt->timeless_decoding)
sample->time = tsc_to_perf_time(ptq->timestamp, &pt->tc);
- sample->cpumode = PERF_RECORD_MISC_USER;
sample->ip = ptq->state->from_ip;
+ sample->cpumode = intel_pt_cpumode(pt, sample->ip);
sample->pid = ptq->pid;
sample->tid = ptq->tid;
sample->addr = ptq->state->to_ip;
sample->flags = ptq->flags;
sample->insn_len = ptq->insn_len;
memcpy(sample->insn, ptq->insn, INTEL_PT_INSN_BUF_SZ);
+
+ event->sample.header.type = PERF_RECORD_SAMPLE;
+ event->sample.header.misc = sample->cpumode;
+ event->sample.header.size = sizeof(struct perf_event_header);
}
static int intel_pt_inject_event(union perf_event *event,
if (pt->synth_opts.callchain) {
thread_stack__sample(ptq->thread, ptq->chain,
- pt->synth_opts.callchain_sz, sample->ip);
+ pt->synth_opts.callchain_sz + 1,
+ sample->ip, pt->kernel_start);
sample->callchain = ptq->chain;
}
event->header.type == PERF_RECORD_SWITCH_CPU_WIDE)
err = intel_pt_context_switch(pt, event, sample);
- intel_pt_log("event %s (%u): cpu %d time %"PRIu64" tsc %#"PRIx64"\n",
- perf_event__name(event->header.type), event->header.type,
- sample->cpu, sample->time, timestamp);
+ intel_pt_log("event %u: cpu %d time %"PRIu64" tsc %#"PRIx64" ",
+ event->header.type, sample->cpu, sample->time, timestamp);
+ intel_pt_log_event(event);
return err;
}
if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
pt->synth_opts = *session->itrace_synth_opts;
} else {
- itrace_synth_opts__set_default(&pt->synth_opts);
+ itrace_synth_opts__set_default(&pt->synth_opts,
+ session->itrace_synth_opts->default_no_sample);
if (use_browser != -1) {
pt->synth_opts.branches = false;
pt->synth_opts.callchain = true;
struct thread *parent = machine__findnew_thread(machine,
event->fork.ppid,
event->fork.ptid);
+ bool do_maps_clone = true;
int err = 0;
if (dump_trace)
thread = machine__findnew_thread(machine, event->fork.pid,
event->fork.tid);
+ /*
+ * When synthesizing FORK events, we are trying to create thread
+ * objects for the already running tasks on the machine.
+ *
+ * Normally, for a kernel FORK event, we want to clone the parent's
+ * maps because that is what the kernel just did.
+ *
+ * But when synthesizing, this should not be done. If we do, we end up
+ * with overlapping maps as we process the sythesized MMAP2 events that
+ * get delivered shortly thereafter.
+ *
+ * Use the FORK event misc flags in an internal way to signal this
+ * situation, so we can elide the map clone when appropriate.
+ */
+ if (event->fork.header.misc & PERF_RECORD_MISC_FORK_EXEC)
+ do_maps_clone = false;
if (thread == NULL || parent == NULL ||
- thread__fork(thread, parent, sample->time) < 0) {
+ thread__fork(thread, parent, sample->time, do_maps_clone) < 0) {
dump_printf("problem processing PERF_RECORD_FORK, skipping event.\n");
err = -1;
}
return 0;
}
+static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
+ struct callchain_cursor *cursor,
+ struct symbol **parent,
+ struct addr_location *root_al,
+ u8 *cpumode, int ent)
+{
+ int err = 0;
+
+ while (--ent >= 0) {
+ u64 ip = chain->ips[ent];
+
+ if (ip >= PERF_CONTEXT_MAX) {
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al, cpumode, ip,
+ false, NULL, NULL, 0);
+ break;
+ }
+ }
+ return err;
+}
+
static int thread__resolve_callchain_sample(struct thread *thread,
struct callchain_cursor *cursor,
struct perf_evsel *evsel,
}
check_calls:
+ if (callchain_param.order != ORDER_CALLEE) {
+ err = find_prev_cpumode(chain, thread, cursor, parent, root_al,
+ &cpumode, chain->nr - first_call);
+ if (err)
+ return (err < 0) ? err : 0;
+ }
for (i = first_call, nr_entries = 0;
i < chain_nr && nr_entries < max_stack; i++) {
u64 ip;
continue;
#endif
ip = chain->ips[j];
-
if (ip < PERF_CONTEXT_MAX)
++nr_entries;
+ else if (callchain_param.order != ORDER_CALLEE) {
+ err = find_prev_cpumode(chain, thread, cursor, parent,
+ root_al, &cpumode, j);
+ if (err)
+ return (err < 0) ? err : 0;
+ continue;
+ }
err = add_callchain_ip(thread, cursor, parent,
root_al, &cpumode, ip,
#include <stdio.h>
#include <string.h>
#include <unistd.h>
+#include <asm/bug.h>
struct namespaces *namespaces__new(struct namespaces_event *event)
{
char curpath[PATH_MAX];
int oldns = -1;
int newns = -1;
+ char *oldcwd = NULL;
if (nc == NULL)
return;
if (snprintf(curpath, PATH_MAX, "/proc/self/ns/mnt") >= PATH_MAX)
return;
+ oldcwd = get_current_dir_name();
+ if (!oldcwd)
+ return;
+
oldns = open(curpath, O_RDONLY);
if (oldns < 0)
- return;
+ goto errout;
newns = open(nsi->mntns_path, O_RDONLY);
if (newns < 0)
if (setns(newns, CLONE_NEWNS) < 0)
goto errout;
+ nc->oldcwd = oldcwd;
nc->oldns = oldns;
nc->newns = newns;
return;
errout:
+ free(oldcwd);
if (oldns > -1)
close(oldns);
if (newns > -1)
void nsinfo__mountns_exit(struct nscookie *nc)
{
- if (nc == NULL || nc->oldns == -1 || nc->newns == -1)
+ if (nc == NULL || nc->oldns == -1 || nc->newns == -1 || !nc->oldcwd)
return;
setns(nc->oldns, CLONE_NEWNS);
+ if (nc->oldcwd) {
+ WARN_ON_ONCE(chdir(nc->oldcwd));
+ zfree(&nc->oldcwd);
+ }
+
if (nc->oldns > -1) {
close(nc->oldns);
nc->oldns = -1;
struct nscookie {
int oldns;
int newns;
+ char *oldcwd;
};
int nsinfo__init(struct nsinfo *nsi);
[PARSE_EVENTS__TERM_TYPE_NOINHERIT] = "no-inherit",
[PARSE_EVENTS__TERM_TYPE_INHERIT] = "inherit",
[PARSE_EVENTS__TERM_TYPE_MAX_STACK] = "max-stack",
+ [PARSE_EVENTS__TERM_TYPE_MAX_EVENTS] = "nr",
[PARSE_EVENTS__TERM_TYPE_OVERWRITE] = "overwrite",
[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE] = "no-overwrite",
[PARSE_EVENTS__TERM_TYPE_DRV_CFG] = "driver-config",
case PARSE_EVENTS__TERM_TYPE_MAX_STACK:
CHECK_TYPE_VAL(NUM);
break;
+ case PARSE_EVENTS__TERM_TYPE_MAX_EVENTS:
+ CHECK_TYPE_VAL(NUM);
+ break;
default:
err->str = strdup("unknown term");
err->idx = term->err_term;
case PARSE_EVENTS__TERM_TYPE_INHERIT:
case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
case PARSE_EVENTS__TERM_TYPE_MAX_STACK:
+ case PARSE_EVENTS__TERM_TYPE_MAX_EVENTS:
case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
return config_term_common(attr, term, err);
case PARSE_EVENTS__TERM_TYPE_MAX_STACK:
ADD_CONFIG_TERM(MAX_STACK, max_stack, term->val.num);
break;
+ case PARSE_EVENTS__TERM_TYPE_MAX_EVENTS:
+ ADD_CONFIG_TERM(MAX_EVENTS, max_events, term->val.num);
+ break;
case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 1 : 0);
break;
PARSE_EVENTS__TERM_TYPE_NOINHERIT,
PARSE_EVENTS__TERM_TYPE_INHERIT,
PARSE_EVENTS__TERM_TYPE_MAX_STACK,
+ PARSE_EVENTS__TERM_TYPE_MAX_EVENTS,
PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
PARSE_EVENTS__TERM_TYPE_OVERWRITE,
PARSE_EVENTS__TERM_TYPE_DRV_CFG,
call-graph { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CALLGRAPH); }
stack-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
max-stack { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_MAX_STACK); }
+nr { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_MAX_EVENTS); }
inherit { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
no-inherit { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
overwrite { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
if (!is_arm_pmu_core(name)) {
pname = pe->pmu ? pe->pmu : "cpu";
- if (strncmp(pname, name, strlen(pname)))
+ if (strcmp(pname, name))
continue;
}
plt_entry_size = 16;
break;
- default: /* FIXME: s390/alpha/mips/parisc/poperpc/sh/sparc/xtensa need to be checked */
+ case EM_SPARC:
+ plt_header_size = 48;
+ plt_entry_size = 12;
+ break;
+
+ case EM_SPARCV9:
+ plt_header_size = 128;
+ plt_entry_size = 32;
+ break;
+
+ default: /* FIXME: s390/alpha/mips/parisc/poperpc/sh/xtensa need to be checked */
plt_header_size = shdr_plt.sh_entsize;
plt_entry_size = shdr_plt.sh_entsize;
break;
const char *vmlinux_name,
*kallsyms_name,
*source_prefix,
- *field_sep;
+ *field_sep,
+ *graph_function;
const char *default_guest_vmlinux_name,
*default_guest_kallsyms,
*default_guest_modules;
}
}
+static inline u64 callchain_context(u64 ip, u64 kernel_start)
+{
+ return ip < kernel_start ? PERF_CONTEXT_USER : PERF_CONTEXT_KERNEL;
+}
+
void thread_stack__sample(struct thread *thread, struct ip_callchain *chain,
- size_t sz, u64 ip)
+ size_t sz, u64 ip, u64 kernel_start)
{
- size_t i;
+ u64 context = callchain_context(ip, kernel_start);
+ u64 last_context;
+ size_t i, j;
- if (!thread || !thread->ts)
- chain->nr = 1;
- else
- chain->nr = min(sz, thread->ts->cnt + 1);
+ if (sz < 2) {
+ chain->nr = 0;
+ return;
+ }
- chain->ips[0] = ip;
+ chain->ips[0] = context;
+ chain->ips[1] = ip;
+
+ if (!thread || !thread->ts) {
+ chain->nr = 2;
+ return;
+ }
+
+ last_context = context;
+
+ for (i = 2, j = 1; i < sz && j <= thread->ts->cnt; i++, j++) {
+ ip = thread->ts->stack[thread->ts->cnt - j].ret_addr;
+ context = callchain_context(ip, kernel_start);
+ if (context != last_context) {
+ if (i >= sz - 1)
+ break;
+ chain->ips[i++] = context;
+ last_context = context;
+ }
+ chain->ips[i] = ip;
+ }
- for (i = 1; i < chain->nr; i++)
- chain->ips[i] = thread->ts->stack[thread->ts->cnt - i].ret_addr;
+ chain->nr = i;
}
struct call_return_processor *
u64 to_ip, u16 insn_len, u64 trace_nr);
void thread_stack__set_trace_nr(struct thread *thread, u64 trace_nr);
void thread_stack__sample(struct thread *thread, struct ip_callchain *chain,
- size_t sz, u64 ip);
+ size_t sz, u64 ip, u64 kernel_start);
int thread_stack__flush(struct thread *thread);
void thread_stack__free(struct thread *thread);
size_t thread_stack__depth(struct thread *thread);
}
static int thread__clone_map_groups(struct thread *thread,
- struct thread *parent)
+ struct thread *parent,
+ bool do_maps_clone)
{
/* This is new thread, we share map groups for process. */
if (thread->pid_ == parent->pid_)
thread->pid_, thread->tid, parent->pid_, parent->tid);
return 0;
}
-
/* But this one is new process, copy maps. */
- if (map_groups__clone(thread, parent->mg) < 0)
- return -ENOMEM;
-
- return 0;
+ return do_maps_clone ? map_groups__clone(thread, parent->mg) : 0;
}
-int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp)
+int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp, bool do_maps_clone)
{
if (parent->comm_set) {
const char *comm = thread__comm_str(parent);
}
thread->ppid = parent->tid;
- return thread__clone_map_groups(thread, parent);
+ return thread__clone_map_groups(thread, parent, do_maps_clone);
}
void thread__find_cpumode_addr_location(struct thread *thread, u64 addr,
void *addr_space;
struct unwind_libunwind_ops *unwind_libunwind_ops;
#endif
+ bool filter;
+ int filter_entry_depth;
};
struct machine;
struct comm *thread__exec_comm(const struct thread *thread);
const char *thread__comm_str(const struct thread *thread);
int thread__insert_map(struct thread *thread, struct map *map);
-int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp);
+int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp, bool do_maps_clone);
size_t thread__fprintf(struct thread *thread, FILE *fp);
struct thread *thread__main_thread(struct machine *machine, struct thread *thread);
Dwarf_Addr s;
dwfl_module_info(mod, NULL, &s, NULL, NULL, NULL, NULL, NULL);
- if (s != al->map->start)
+ if (s != al->map->start - al->map->pgoff)
mod = 0;
}
if (!mod)
mod = dwfl_report_elf(ui->dwfl, dso->short_name,
- (dso->symsrc_filename ? dso->symsrc_filename : dso->long_name), -1, al->map->start,
+ (dso->symsrc_filename ? dso->symsrc_filename : dso->long_name), -1, al->map->start - al->map->pgoff,
false);
return mod && dwfl_addrmodule(ui->dwfl, ip) == mod ? 0 : -1;
const char *perf_tip(const char *dirpath);
+#ifndef HAVE_GET_CURRENT_DIR_NAME
+char *get_current_dir_name(void);
+#endif
+
#ifndef HAVE_SCHED_GETCPU_SUPPORT
int sched_getcpu(void);
#endif
WARNINGS += $(call cc-supports,-Wdeclaration-after-statement)
WARNINGS += -Wshadow
-CFLAGS += -DVERSION=\"$(VERSION)\" -DPACKAGE=\"$(PACKAGE)\" \
+override CFLAGS += -DVERSION=\"$(VERSION)\" -DPACKAGE=\"$(PACKAGE)\" \
-DPACKAGE_BUGREPORT=\"$(PACKAGE_BUGREPORT)\" -D_GNU_SOURCE
UTIL_OBJS = utils/helpers/amd.o utils/helpers/msr.o \
LIB_OBJS = lib/cpufreq.o lib/cpupower.o lib/cpuidle.o
LIB_OBJS := $(addprefix $(OUTPUT),$(LIB_OBJS))
-CFLAGS += -pipe
+override CFLAGS += -pipe
ifeq ($(strip $(NLS)),true)
INSTALL_NLS += install-gmo
COMPILE_NLS += create-gmo
- CFLAGS += -DNLS
+ override CFLAGS += -DNLS
endif
ifeq ($(strip $(CPUFREQ_BENCH)),true)
UTIL_SRC += $(LIB_SRC)
endif
-CFLAGS += $(WARNINGS)
+override CFLAGS += $(WARNINGS)
ifeq ($(strip $(V)),false)
QUIET=@
# if DEBUG is enabled, then we do not strip or optimize
ifeq ($(strip $(DEBUG)),true)
- CFLAGS += -O1 -g -DDEBUG
+ override CFLAGS += -O1 -g -DDEBUG
STRIPCMD = /bin/true -Since_we_are_debugging
else
- CFLAGS += $(OPTIMIZATION) -fomit-frame-pointer
+ override CFLAGS += $(OPTIMIZATION) -fomit-frame-pointer
STRIPCMD = $(STRIP) -s --remove-section=.note --remove-section=.comment
endif
ifeq ($(strip $(STATIC)),true)
LIBS = -L../ -L$(OUTPUT) -lm
OBJS = $(OUTPUT)main.o $(OUTPUT)parse.o $(OUTPUT)system.o $(OUTPUT)benchmark.o \
- $(OUTPUT)../lib/cpufreq.o $(OUTPUT)../lib/sysfs.o
+ $(OUTPUT)../lib/cpufreq.o $(OUTPUT)../lib/cpupower.o
else
LIBS = -L../ -L$(OUTPUT) -lm -lcpupower
OBJS = $(OUTPUT)main.o $(OUTPUT)parse.o $(OUTPUT)system.o $(OUTPUT)benchmark.o
default: all
$(OUTPUT)centrino-decode: ../i386/centrino-decode.c
- $(CC) $(CFLAGS) -o $@ $<
+ $(CC) $(CFLAGS) -o $@ $(LDFLAGS) $<
$(OUTPUT)powernow-k8-decode: ../i386/powernow-k8-decode.c
- $(CC) $(CFLAGS) -o $@ $<
+ $(CC) $(CFLAGS) -o $@ $(LDFLAGS) $<
all: $(OUTPUT)centrino-decode $(OUTPUT)powernow-k8-decode
snprintf(path, sizeof(path), PATH_TO_CPU "cpu%u/cpufreq/%s",
cpu, fname);
- return sysfs_read_file(path, buf, buflen);
+ return cpupower_read_sysfs(path, buf, buflen);
}
/* helper function to write a new value to a /sys file */
snprintf(path, sizeof(path), PATH_TO_CPU "cpuidle/%s", fname);
- return sysfs_read_file(path, buf, buflen);
+ return cpupower_read_sysfs(path, buf, buflen);
}
#include "cpupower.h"
#include "cpupower_intern.h"
-unsigned int sysfs_read_file(const char *path, char *buf, size_t buflen)
+unsigned int cpupower_read_sysfs(const char *path, char *buf, size_t buflen)
{
int fd;
ssize_t numread;
snprintf(path, sizeof(path), PATH_TO_CPU "cpu%u/topology/%s",
cpu, fname);
- if (sysfs_read_file(path, linebuf, MAX_LINE_LEN) == 0)
+ if (cpupower_read_sysfs(path, linebuf, MAX_LINE_LEN) == 0)
return -1;
*result = strtol(linebuf, &endp, 0);
if (endp == linebuf || errno == ERANGE)
#define MAX_LINE_LEN 4096
#define SYSFS_PATH_MAX 255
-unsigned int sysfs_read_file(const char *path, char *buf, size_t buflen);
+unsigned int cpupower_read_sysfs(const char *path, char *buf, size_t buflen);
#include <linux/dma-mapping.h>
#include <linux/workqueue.h>
#include <linux/libnvdimm.h>
+#include <linux/genalloc.h>
#include <linux/vmalloc.h>
#include <linux/device.h>
#include <linux/module.h>
[6] = NFIT_DIMM_HANDLE(1, 0, 0, 0, 1),
};
-static unsigned long dimm_fail_cmd_flags[NUM_DCR];
-static int dimm_fail_cmd_code[NUM_DCR];
+static unsigned long dimm_fail_cmd_flags[ARRAY_SIZE(handle)];
+static int dimm_fail_cmd_code[ARRAY_SIZE(handle)];
static const struct nd_intel_smart smart_def = {
.flags = ND_INTEL_SMART_HEALTH_VALID
unsigned long deadline;
spinlock_t lock;
} ars_state;
- struct device *dimm_dev[NUM_DCR];
+ struct device *dimm_dev[ARRAY_SIZE(handle)];
struct nd_intel_smart *smart;
struct nd_intel_smart_threshold *smart_threshold;
struct badrange badrange;
static struct workqueue_struct *nfit_wq;
+static struct gen_pool *nfit_pool;
+
static struct nfit_test *to_nfit_test(struct device *dev)
{
struct platform_device *pdev = to_platform_device(dev);
list_del(&nfit_res->list);
spin_unlock(&nfit_test_lock);
+ if (resource_size(&nfit_res->res) >= DIMM_SIZE)
+ gen_pool_free(nfit_pool, nfit_res->res.start,
+ resource_size(&nfit_res->res));
vfree(nfit_res->buf);
kfree(nfit_res);
}
GFP_KERNEL);
int rc;
- if (!buf || !nfit_res)
+ if (!buf || !nfit_res || !*dma)
goto err;
rc = devm_add_action(dev, release_nfit_res, nfit_res);
if (rc)
return nfit_res->buf;
err:
+ if (*dma && size >= DIMM_SIZE)
+ gen_pool_free(nfit_pool, *dma, size);
if (buf)
vfree(buf);
kfree(nfit_res);
static void *test_alloc(struct nfit_test *t, size_t size, dma_addr_t *dma)
{
+ struct genpool_data_align data = {
+ .align = SZ_128M,
+ };
void *buf = vmalloc(size);
- *dma = (unsigned long) buf;
+ if (size >= DIMM_SIZE)
+ *dma = gen_pool_alloc_algo(nfit_pool, size,
+ gen_pool_first_fit_align, &data);
+ else
+ *dma = (unsigned long) buf;
return __test_alloc(t, size, dma, buf);
}
u32 nfit_handle = __to_nfit_memdev(nfit_mem)->device_handle;
int i;
- for (i = 0; i < NUM_DCR; i++)
+ for (i = 0; i < ARRAY_SIZE(handle); i++)
if (nfit_handle == handle[i])
dev_set_drvdata(nfit_test->dimm_dev[i],
nfit_mem);
goto err_register;
}
+ nfit_pool = gen_pool_create(ilog2(SZ_4M), NUMA_NO_NODE);
+ if (!nfit_pool) {
+ rc = -ENOMEM;
+ goto err_register;
+ }
+
+ if (gen_pool_add(nfit_pool, SZ_4G, SZ_4G, NUMA_NO_NODE)) {
+ rc = -ENOMEM;
+ goto err_register;
+ }
+
for (i = 0; i < NUM_NFITS; i++) {
struct nfit_test *nfit_test;
struct platform_device *pdev;
return 0;
err_register:
+ if (nfit_pool)
+ gen_pool_destroy(nfit_pool);
+
destroy_workqueue(nfit_wq);
for (i = 0; i < NUM_NFITS; i++)
if (instances[i])
platform_driver_unregister(&nfit_test_driver);
nfit_test_teardown();
+ gen_pool_destroy(nfit_pool);
+
for (i = 0; i < NUM_NFITS; i++)
put_device(&instances[i]->pdev.dev);
class_destroy(nfit_test_dimm);
TARGETS += mount
TARGETS += mqueue
TARGETS += net
+TARGETS += netfilter
TARGETS += nsfs
TARGETS += powerpc
TARGETS += proc
(void *) BPF_FUNC_skb_ancestor_cgroup_id;
static struct bpf_sock *(*bpf_sk_lookup_tcp)(void *ctx,
struct bpf_sock_tuple *tuple,
- int size, unsigned int netns_id,
+ int size, unsigned long long netns_id,
unsigned long long flags) =
(void *) BPF_FUNC_sk_lookup_tcp;
static struct bpf_sock *(*bpf_sk_lookup_udp)(void *ctx,
struct bpf_sock_tuple *tuple,
- int size, unsigned int netns_id,
+ int size, unsigned long long netns_id,
unsigned long long flags) =
(void *) BPF_FUNC_sk_lookup_udp;
static int (*bpf_sk_release)(struct bpf_sock *sk) =
#include <bpf/bpf.h>
#include <bpf/libbpf.h>
+#include "bpf_rlimit.h"
+
const char *cfg_pin_path = "/sys/fs/bpf/flow_dissector";
const char *cfg_map_name = "jmp_table";
bool cfg_attach = true;
/* const void* */ /* [3] */
BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 2),
/* typedef const void * const_void_ptr */
- BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 3),
- /* struct A { */ /* [4] */
+ BTF_TYPEDEF_ENC(NAME_TBD, 3), /* [4] */
+ /* struct A { */ /* [5] */
BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 1), sizeof(void *)),
/* const_void_ptr m; */
- BTF_MEMBER_ENC(NAME_TBD, 3, 0),
+ BTF_MEMBER_ENC(NAME_TBD, 4, 0),
/* } */
BTF_END_RAW,
},
BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_CONST, 0, 0), 0),
/* const void* */ /* [3] */
BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 2),
- /* typedef const void * const_void_ptr */ /* [4] */
- BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 3),
- /* const_void_ptr[4] */ /* [5] */
- BTF_TYPE_ARRAY_ENC(3, 1, 4),
+ /* typedef const void * const_void_ptr */
+ BTF_TYPEDEF_ENC(NAME_TBD, 3), /* [4] */
+ /* const_void_ptr[4] */
+ BTF_TYPE_ARRAY_ENC(4, 1, 4), /* [5] */
BTF_END_RAW,
},
.str_sec = "\0const_void_ptr",
.err_str = "type != 0",
},
+{
+ .descr = "typedef (invalid name, name_off = 0)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPEDEF_ENC(0, 1), /* [2] */
+ BTF_END_RAW,
+ },
+ .str_sec = "\0__int",
+ .str_sec_size = sizeof("\0__int"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "typedef_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "typedef (invalid name, invalid identifier)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPEDEF_ENC(NAME_TBD, 1), /* [2] */
+ BTF_END_RAW,
+ },
+ .str_sec = "\0__!int",
+ .str_sec_size = sizeof("\0__!int"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "typedef_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "ptr type (invalid name, name_off <> 0)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(NAME_TBD,
+ BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 1), /* [2] */
+ BTF_END_RAW,
+ },
+ .str_sec = "\0__int",
+ .str_sec_size = sizeof("\0__int"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "ptr_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "volatile type (invalid name, name_off <> 0)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(NAME_TBD,
+ BTF_INFO_ENC(BTF_KIND_VOLATILE, 0, 0), 1), /* [2] */
+ BTF_END_RAW,
+ },
+ .str_sec = "\0__int",
+ .str_sec_size = sizeof("\0__int"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "volatile_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "const type (invalid name, name_off <> 0)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(NAME_TBD,
+ BTF_INFO_ENC(BTF_KIND_CONST, 0, 0), 1), /* [2] */
+ BTF_END_RAW,
+ },
+ .str_sec = "\0__int",
+ .str_sec_size = sizeof("\0__int"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "const_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "restrict type (invalid name, name_off <> 0)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 1), /* [2] */
+ BTF_TYPE_ENC(NAME_TBD,
+ BTF_INFO_ENC(BTF_KIND_RESTRICT, 0, 0), 2), /* [3] */
+ BTF_END_RAW,
+ },
+ .str_sec = "\0__int",
+ .str_sec_size = sizeof("\0__int"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "restrict_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "fwd type (invalid name, name_off = 0)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FWD, 0, 0), 0), /* [2] */
+ BTF_END_RAW,
+ },
+ .str_sec = "\0__skb",
+ .str_sec_size = sizeof("\0__skb"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "fwd_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "fwd type (invalid name, invalid identifier)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(NAME_TBD,
+ BTF_INFO_ENC(BTF_KIND_FWD, 0, 0), 0), /* [2] */
+ BTF_END_RAW,
+ },
+ .str_sec = "\0__!skb",
+ .str_sec_size = sizeof("\0__!skb"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "fwd_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "array type (invalid name, name_off <> 0)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(NAME_TBD,
+ BTF_INFO_ENC(BTF_KIND_ARRAY, 0, 0), 0), /* [2] */
+ BTF_ARRAY_ENC(1, 1, 4),
+ BTF_END_RAW,
+ },
+ .str_sec = "\0__skb",
+ .str_sec_size = sizeof("\0__skb"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "array_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "struct type (name_off = 0)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(0,
+ BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 1), 4), /* [2] */
+ BTF_MEMBER_ENC(NAME_TBD, 1, 0),
+ BTF_END_RAW,
+ },
+ .str_sec = "\0A",
+ .str_sec_size = sizeof("\0A"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "struct_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+},
+
+{
+ .descr = "struct type (invalid name, invalid identifier)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(NAME_TBD,
+ BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 1), 4), /* [2] */
+ BTF_MEMBER_ENC(NAME_TBD, 1, 0),
+ BTF_END_RAW,
+ },
+ .str_sec = "\0A!\0B",
+ .str_sec_size = sizeof("\0A!\0B"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "struct_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "struct member (name_off = 0)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(0,
+ BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 1), 4), /* [2] */
+ BTF_MEMBER_ENC(NAME_TBD, 1, 0),
+ BTF_END_RAW,
+ },
+ .str_sec = "\0A",
+ .str_sec_size = sizeof("\0A"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "struct_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+},
+
+{
+ .descr = "struct member (invalid name, invalid identifier)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(NAME_TBD,
+ BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 1), 4), /* [2] */
+ BTF_MEMBER_ENC(NAME_TBD, 1, 0),
+ BTF_END_RAW,
+ },
+ .str_sec = "\0A\0B*",
+ .str_sec_size = sizeof("\0A\0B*"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "struct_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "enum type (name_off = 0)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(0,
+ BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1),
+ sizeof(int)), /* [2] */
+ BTF_ENUM_ENC(NAME_TBD, 0),
+ BTF_END_RAW,
+ },
+ .str_sec = "\0A\0B",
+ .str_sec_size = sizeof("\0A\0B"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "enum_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+},
+
+{
+ .descr = "enum type (invalid name, invalid identifier)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(NAME_TBD,
+ BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1),
+ sizeof(int)), /* [2] */
+ BTF_ENUM_ENC(NAME_TBD, 0),
+ BTF_END_RAW,
+ },
+ .str_sec = "\0A!\0B",
+ .str_sec_size = sizeof("\0A!\0B"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "enum_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "enum member (invalid name, name_off = 0)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(0,
+ BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1),
+ sizeof(int)), /* [2] */
+ BTF_ENUM_ENC(0, 0),
+ BTF_END_RAW,
+ },
+ .str_sec = "",
+ .str_sec_size = sizeof(""),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "enum_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
+
+{
+ .descr = "enum member (invalid name, invalid identifier)",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(0,
+ BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1),
+ sizeof(int)), /* [2] */
+ BTF_ENUM_ENC(NAME_TBD, 0),
+ BTF_END_RAW,
+ },
+ .str_sec = "\0A!",
+ .str_sec_size = sizeof("\0A!"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "enum_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = sizeof(int),
+ .key_type_id = 1,
+ .value_type_id = 1,
+ .max_entries = 4,
+ .btf_load_err = true,
+ .err_str = "Invalid name",
+},
{
.descr = "arraymap invalid btf key (a bit field)",
.raw_types = {
goto err;
}
- assert(system("ping localhost -6 -c 10000 -f -q > /dev/null") == 0);
+ if (system("which ping6 &>/dev/null") == 0)
+ assert(!system("ping6 localhost -c 10000 -f -q > /dev/null"));
+ else
+ assert(!system("ping -6 localhost -c 10000 -f -q > /dev/null"));
if (bpf_prog_query(cgroup_fd, BPF_CGROUP_INET_EGRESS, 0, NULL, NULL,
&prog_cnt)) {
return TC_ACT_SHOT;
tuple_len = ipv4 ? sizeof(tuple->ipv4) : sizeof(tuple->ipv6);
- sk = bpf_sk_lookup_tcp(skb, tuple, tuple_len, 0, 0);
+ sk = bpf_sk_lookup_tcp(skb, tuple, tuple_len, BPF_F_CURRENT_NETNS, 0);
if (sk)
bpf_sk_release(sk);
return sk ? TC_ACT_OK : TC_ACT_UNSPEC;
struct bpf_sock_tuple tuple = {};
struct bpf_sock *sk;
- sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), 0, 0);
+ sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), BPF_F_CURRENT_NETNS, 0);
if (sk)
bpf_sk_release(sk);
return 0;
struct bpf_sock *sk;
__u32 family = 0;
- sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), 0, 0);
+ sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), BPF_F_CURRENT_NETNS, 0);
if (sk) {
bpf_sk_release(sk);
family = sk->family;
struct bpf_sock *sk;
__u32 family;
- sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), 0, 0);
+ sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), BPF_F_CURRENT_NETNS, 0);
if (sk) {
sk += 1;
bpf_sk_release(sk);
struct bpf_sock *sk;
__u32 family;
- sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), 0, 0);
+ sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), BPF_F_CURRENT_NETNS, 0);
sk += 1;
if (sk)
bpf_sk_release(sk);
{
struct bpf_sock_tuple tuple = {};
- bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), 0, 0);
+ bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), BPF_F_CURRENT_NETNS, 0);
return 0;
}
struct bpf_sock_tuple tuple = {};
struct bpf_sock *sk;
- sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), 0, 0);
+ sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), BPF_F_CURRENT_NETNS, 0);
bpf_sk_release(sk);
bpf_sk_release(sk);
return 0;
struct bpf_sock_tuple tuple = {};
struct bpf_sock *sk;
- sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), 0, 0);
+ sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), BPF_F_CURRENT_NETNS, 0);
bpf_sk_release(sk);
return 0;
}
void lookup_no_release(struct __sk_buff *skb)
{
struct bpf_sock_tuple tuple = {};
- bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), 0, 0);
+ bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple), BPF_F_CURRENT_NETNS, 0);
}
SEC("fail_no_release_subcall")
echo -n "Wait for testing link-local IP to become available "
for _i in $(seq ${MAX_PING_TRIES}); do
echo -n "."
- if ping -6 -q -c 1 -W 1 ff02::1%${TEST_IF} >/dev/null 2>&1; then
+ if $PING6 -c 1 -W 1 ff02::1%${TEST_IF} >/dev/null 2>&1; then
echo " OK"
return
fi
BPF_PROG_SECTION="cgroup_id_logger"
BPF_PROG_ID=0
PROG="${DIR}/test_skb_cgroup_id_user"
+type ping6 >/dev/null 2>&1 && PING6="ping6" || PING6="ping -6"
main
ping_once()
{
- ping -${1} -q -c 1 -W 1 ${2%%/*} >/dev/null 2>&1
+ type ping${1} >/dev/null 2>&1 && PING="ping${1}" || PING="ping -${1}"
+ $PING -q -c 1 -W 1 ${2%%/*} >/dev/null 2>&1
}
wait_for_ip()
int fixup_percpu_cgroup_storage[MAX_FIXUPS];
const char *errstr;
const char *errstr_unpriv;
- uint32_t retval;
+ uint32_t retval, retval_unpriv;
enum {
UNDEF,
ACCEPT,
.fixup_prog1 = { 2 },
.result = ACCEPT,
.retval = 42,
+ /* Verifier rewrite for unpriv skips tail call here. */
+ .retval_unpriv = 2,
},
{
"stack pointer arithmetic",
.errstr = "R1 min value is negative",
.prog_type = BPF_PROG_TYPE_TRACEPOINT,
},
+ {
+ "map access: known scalar += value_ptr",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3),
+ BPF_MOV64_IMM(BPF_REG_1, 4),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_1, BPF_REG_0),
+ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = ACCEPT,
+ .retval = 1,
+ },
+ {
+ "map access: value_ptr += known scalar",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3),
+ BPF_MOV64_IMM(BPF_REG_1, 4),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = ACCEPT,
+ .retval = 1,
+ },
+ {
+ "map access: unknown scalar += value_ptr",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0xf),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_1, BPF_REG_0),
+ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = ACCEPT,
+ .retval = 1,
+ },
+ {
+ "map access: value_ptr += unknown scalar",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0xf),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = ACCEPT,
+ .retval = 1,
+ },
+ {
+ "map access: value_ptr += value_ptr",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_0),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = REJECT,
+ .errstr = "R0 pointer += pointer prohibited",
+ },
+ {
+ "map access: known scalar -= value_ptr",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3),
+ BPF_MOV64_IMM(BPF_REG_1, 4),
+ BPF_ALU64_REG(BPF_SUB, BPF_REG_1, BPF_REG_0),
+ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = REJECT,
+ .errstr = "R1 tried to subtract pointer from scalar",
+ },
+ {
+ "map access: value_ptr -= known scalar",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3),
+ BPF_MOV64_IMM(BPF_REG_1, 4),
+ BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = REJECT,
+ .errstr = "R0 min value is outside of the array range",
+ },
+ {
+ "map access: value_ptr -= known scalar, 2",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 5),
+ BPF_MOV64_IMM(BPF_REG_1, 6),
+ BPF_MOV64_IMM(BPF_REG_2, 4),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+ BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_2),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = ACCEPT,
+ .retval = 1,
+ },
+ {
+ "map access: unknown scalar -= value_ptr",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0xf),
+ BPF_ALU64_REG(BPF_SUB, BPF_REG_1, BPF_REG_0),
+ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = REJECT,
+ .errstr = "R1 tried to subtract pointer from scalar",
+ },
+ {
+ "map access: value_ptr -= unknown scalar",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0xf),
+ BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = REJECT,
+ .errstr = "R0 min value is negative",
+ },
+ {
+ "map access: value_ptr -= unknown scalar, 2",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0xf),
+ BPF_ALU64_IMM(BPF_OR, BPF_REG_1, 0x7),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0x7),
+ BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = ACCEPT,
+ .retval = 1,
+ },
+ {
+ "map access: value_ptr -= value_ptr",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
+ BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_0),
+ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = REJECT,
+ .errstr = "R0 invalid mem access 'inv'",
+ .errstr_unpriv = "R0 pointer -= pointer prohibited",
+ },
{
"map lookup helper access to map",
.insns = {
BPF_JMP_IMM(BPF_JA, 0, 0, -7),
},
.fixup_map_hash_8b = { 4 },
- .errstr = "R0 invalid mem access 'inv'",
+ .errstr = "unbounded min value",
.result = REJECT,
},
{
"check deducing bounds from const, 5",
.insns = {
BPF_MOV64_IMM(BPF_REG_0, 0),
- BPF_JMP_IMM(BPF_JSGE, BPF_REG_0, 0, 1),
+ BPF_JMP_IMM(BPF_JSGE, BPF_REG_0, 1, 1),
BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
.result = ACCEPT,
},
+ {
+ "calls: ctx read at start of subprog",
+ .insns = {
+ BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 5),
+ BPF_JMP_REG(BPF_JSGT, BPF_REG_0, BPF_REG_0, 0),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 2),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+ BPF_EXIT_INSN(),
+ BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_1, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
+ .errstr_unpriv = "function calls to other bpf functions are allowed for root only",
+ .result_unpriv = REJECT,
+ .result = ACCEPT,
+ },
};
static int probe_filter_length(const struct bpf_insn *fp)
}
}
+static int set_admin(bool admin)
+{
+ cap_t caps;
+ const cap_value_t cap_val = CAP_SYS_ADMIN;
+ int ret = -1;
+
+ caps = cap_get_proc();
+ if (!caps) {
+ perror("cap_get_proc");
+ return -1;
+ }
+ if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap_val,
+ admin ? CAP_SET : CAP_CLEAR)) {
+ perror("cap_set_flag");
+ goto out;
+ }
+ if (cap_set_proc(caps)) {
+ perror("cap_set_proc");
+ goto out;
+ }
+ ret = 0;
+out:
+ if (cap_free(caps))
+ perror("cap_free");
+ return ret;
+}
+
static void do_test_single(struct bpf_test *test, bool unpriv,
int *passes, int *errors)
{
struct bpf_insn *prog = test->insns;
int map_fds[MAX_NR_MAPS];
const char *expected_err;
+ uint32_t expected_val;
uint32_t retval;
int i, err;
test->result_unpriv : test->result;
expected_err = unpriv && test->errstr_unpriv ?
test->errstr_unpriv : test->errstr;
+ expected_val = unpriv && test->retval_unpriv ?
+ test->retval_unpriv : test->retval;
reject_from_alignment = fd_prog < 0 &&
(test->flags & F_NEEDS_EFFICIENT_UNALIGNED_ACCESS) &&
- strstr(bpf_vlog, "Unknown alignment.");
+ strstr(bpf_vlog, "misaligned");
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
if (reject_from_alignment) {
printf("FAIL\nFailed due to alignment despite having efficient unaligned access: '%s'!\n",
__u8 tmp[TEST_DATA_LEN << 2];
__u32 size_tmp = sizeof(tmp);
+ if (unpriv)
+ set_admin(true);
err = bpf_prog_test_run(fd_prog, 1, test->data,
sizeof(test->data), tmp, &size_tmp,
&retval, NULL);
+ if (unpriv)
+ set_admin(false);
if (err && errno != 524/*ENOTSUPP*/ && errno != EPERM) {
printf("Unexpected bpf_prog_test_run error\n");
goto fail_log;
}
- if (!err && retval != test->retval &&
- test->retval != POINTER_VALUE) {
- printf("FAIL retval %d != %d\n", retval, test->retval);
+ if (!err && retval != expected_val &&
+ expected_val != POINTER_VALUE) {
+ printf("FAIL retval %d != %d\n", retval, expected_val);
goto fail_log;
}
}
return (sysadmin == CAP_SET);
}
-static int set_admin(bool admin)
-{
- cap_t caps;
- const cap_value_t cap_val = CAP_SYS_ADMIN;
- int ret = -1;
-
- caps = cap_get_proc();
- if (!caps) {
- perror("cap_get_proc");
- return -1;
- }
- if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap_val,
- admin ? CAP_SET : CAP_CLEAR)) {
- perror("cap_set_flag");
- goto out;
- }
- if (cap_set_proc(caps)) {
- perror("cap_set_proc");
- goto out;
- }
- ret = 0;
-out:
- if (cap_free(caps))
- perror("cap_free");
- return ret;
-}
-
static void get_unpriv_disabled()
{
char buf[2];
# Thus we set MTU to 10K on all involved interfaces. Then both unicast and
# multicast traffic uses 8K frames.
#
-# +-----------------------+ +----------------------------------+
-# | H1 | | H2 |
-# | | | unicast --> + $h2.111 |
-# | | | traffic | 192.0.2.129/28 |
-# | multicast | | | e-qos-map 0:1 |
-# | traffic | | | |
-# | $h1 + <----- | | + $h2 |
-# +-----|-----------------+ +--------------|-------------------+
-# | |
-# +-----|-------------------------------------------------|-------------------+
-# | + $swp1 + $swp2 |
-# | | >1Gbps | >1Gbps |
-# | +---|----------------+ +----------|----------------+ |
-# | | + $swp1.1 | | + $swp2.111 | |
+# +---------------------------+ +----------------------------------+
+# | H1 | | H2 |
+# | | | unicast --> + $h2.111 |
+# | multicast | | traffic | 192.0.2.129/28 |
+# | traffic | | | e-qos-map 0:1 |
+# | $h1 + <----- | | | |
+# | 192.0.2.65/28 | | | + $h2 |
+# +---------------|-----------+ +--------------|-------------------+
+# | |
+# +---------------|---------------------------------------|-------------------+
+# | $swp1 + + $swp2 |
+# | >1Gbps | | >1Gbps |
+# | +-------------|------+ +----------|----------------+ |
+# | | $swp1.1 + | | + $swp2.111 | |
# | | BR1 | SW | BR111 | |
-# | | + $swp3.1 | | + $swp3.111 | |
-# | +---|----------------+ +----------|----------------+ |
-# | \_________________________________________________/ |
+# | | $swp3.1 + | | + $swp3.111 | |
+# | +-------------|------+ +----------|----------------+ |
+# | \_______________________________________/ |
# | | |
# | + $swp3 |
# | | 1Gbps bottleneck |
# |
# +--|-----------------+
# | + $h3 H3 |
+# | | 192.0.2.66/28 |
# | | |
# | + $h3.111 |
# | 192.0.2.130/28 |
ALL_TESTS="
ping_ipv4
test_mc_aware
+ test_uc_aware
"
lib_dir=$(dirname $0)/../../../net/forwarding
h1_create()
{
- simple_if_init $h1
+ simple_if_init $h1 192.0.2.65/28
mtu_set $h1 10000
}
h1_destroy()
{
mtu_restore $h1
- simple_if_fini $h1
+ simple_if_fini $h1 192.0.2.65/28
}
h2_create()
h3_create()
{
- simple_if_init $h3
+ simple_if_init $h3 192.0.2.66/28
mtu_set $h3 10000
vlan_create $h3 111 v$h3 192.0.2.130/28
vlan_destroy $h3 111
mtu_restore $h3
- simple_if_fini $h3
+ simple_if_fini $h3 192.0.2.66/28
}
switch_create()
# average ingress rate to somewhat mitigate this.
local min_ingress=2147483648
- mausezahn $h2.111 -p 8000 -A 192.0.2.129 -B 192.0.2.130 -c 0 \
+ $MZ $h2.111 -p 8000 -A 192.0.2.129 -B 192.0.2.130 -c 0 \
-a own -b $h3mac -t udp -q &
sleep 1
check_err $? "Could not get high enough UC-only ingress rate"
local ucth1=${uc_rate[1]}
- mausezahn $h1 -p 8000 -c 0 -a own -b bc -t udp -q &
+ $MZ $h1 -p 8000 -c 0 -a own -b bc -t udp -q &
local d0=$(date +%s)
local t0=$(ethtool_stats_get $h3 rx_octets_prio_0)
ret = 100 * ($ucth1 - $ucth2) / $ucth1
if (ret > 0) { ret } else { 0 }
")
- check_err $(bc <<< "$deg > 10")
+ check_err $(bc <<< "$deg > 25")
local interval=$((d1 - d0))
local mc_ir=$(rate $u0 $u1 $interval)
echo " egress UC throughput $(humanize ${uc_rate_2[1]})"
echo " ingress MC throughput $(humanize $mc_ir)"
echo " egress MC throughput $(humanize $mc_er)"
+ echo
+}
+
+test_uc_aware()
+{
+ RET=0
+
+ $MZ $h2.111 -p 8000 -A 192.0.2.129 -B 192.0.2.130 -c 0 \
+ -a own -b $h3mac -t udp -q &
+
+ local d0=$(date +%s)
+ local t0=$(ethtool_stats_get $h3 rx_octets_prio_1)
+ local u0=$(ethtool_stats_get $swp2 rx_octets_prio_1)
+ sleep 1
+
+ local attempts=50
+ local passes=0
+ local i
+
+ for ((i = 0; i < attempts; ++i)); do
+ if $ARPING -c 1 -I $h1 -b 192.0.2.66 -q -w 0.1; then
+ ((passes++))
+ fi
+
+ sleep 0.1
+ done
+
+ local d1=$(date +%s)
+ local t1=$(ethtool_stats_get $h3 rx_octets_prio_1)
+ local u1=$(ethtool_stats_get $swp2 rx_octets_prio_1)
+
+ local interval=$((d1 - d0))
+ local uc_ir=$(rate $u0 $u1 $interval)
+ local uc_er=$(rate $t0 $t1 $interval)
+
+ ((attempts == passes))
+ check_err $?
+
+ # Suppress noise from killing mausezahn.
+ { kill %% && wait; } 2>/dev/null
+
+ log_test "MC performace under UC overload"
+ echo " ingress UC throughput $(humanize ${uc_ir})"
+ echo " egress UC throughput $(humanize ${uc_er})"
+ echo " sent $attempts BC ARPs, got $passes responses"
}
trap cleanup EXIT
--- /dev/null
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for netfilter selftests
+
+TEST_PROGS := nft_trans_stress.sh
+
+include ../lib.mk
--- /dev/null
+CONFIG_NET_NS=y
+NF_TABLES_INET=y
--- /dev/null
+#!/bin/bash
+#
+# This test is for stress-testing the nf_tables config plane path vs.
+# packet path processing: Make sure we never release rules that are
+# still visible to other cpus.
+#
+# set -e
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+testns=testns1
+tables="foo bar baz quux"
+
+nft --version > /dev/null 2>&1
+if [ $? -ne 0 ];then
+ echo "SKIP: Could not run test without nft tool"
+ exit $ksft_skip
+fi
+
+ip -Version > /dev/null 2>&1
+if [ $? -ne 0 ];then
+ echo "SKIP: Could not run test without ip tool"
+ exit $ksft_skip
+fi
+
+tmp=$(mktemp)
+
+for table in $tables; do
+ echo add table inet "$table" >> "$tmp"
+ echo flush table inet "$table" >> "$tmp"
+
+ echo "add chain inet $table INPUT { type filter hook input priority 0; }" >> "$tmp"
+ echo "add chain inet $table OUTPUT { type filter hook output priority 0; }" >> "$tmp"
+ for c in $(seq 1 400); do
+ chain=$(printf "chain%03u" "$c")
+ echo "add chain inet $table $chain" >> "$tmp"
+ done
+
+ for c in $(seq 1 400); do
+ chain=$(printf "chain%03u" "$c")
+ for BASE in INPUT OUTPUT; do
+ echo "add rule inet $table $BASE counter jump $chain" >> "$tmp"
+ done
+ echo "add rule inet $table $chain counter return" >> "$tmp"
+ done
+done
+
+ip netns add "$testns"
+ip -netns "$testns" link set lo up
+
+lscpu | grep ^CPU\(s\): | ( read cpu cpunum ;
+cpunum=$((cpunum-1))
+for i in $(seq 0 $cpunum);do
+ mask=$(printf 0x%x $((1<<$i)))
+ ip netns exec "$testns" taskset $mask ping -4 127.0.0.1 -fq > /dev/null &
+ ip netns exec "$testns" taskset $mask ping -6 ::1 -fq > /dev/null &
+done)
+
+sleep 1
+
+for i in $(seq 1 10) ; do ip netns exec "$testns" nft -f "$tmp" & done
+
+for table in $tables;do
+ randsleep=$((RANDOM%10))
+ sleep $randsleep
+ ip netns exec "$testns" nft delete table inet $table 2>/dev/null
+done
+
+randsleep=$((RANDOM%10))
+sleep $randsleep
+
+pkill -9 ping
+
+wait
+
+rm -f "$tmp"
+ip netns del "$testns"
# SPDX-License-Identifier: GPL-2.0
-TEST_PROGS := cache_shape
-
-all: $(TEST_PROGS)
-
-$(TEST_PROGS): ../harness.c ../utils.c
+TEST_GEN_PROGS := cache_shape
top_srcdir = ../../../../..
include ../../lib.mk
-clean:
- rm -f $(TEST_PROGS) *.o
+$(TEST_GEN_PROGS): ../harness.c ../utils.c
return 0;
}
-#define REG_POISON 0x5a5aUL
-#define POISONED_REG(n) ((REG_POISON << 48) | ((n) << 32) | (REG_POISON << 16) | (n))
+#define REG_POISON 0x5a5a
+#define POISONED_REG(n) ((((unsigned long)REG_POISON) << 48) | ((n) << 32) | \
+ (((unsigned long)REG_POISON) << 16) | (n))
static inline void poison_regs(void)
{
}
}
+#ifdef _CALL_AIXDESC
+struct opd {
+ unsigned long ip;
+ unsigned long toc;
+ unsigned long env;
+};
+static struct opd bad_opd = {
+ .ip = BAD_NIP,
+};
+#define BAD_FUNC (&bad_opd)
+#else
+#define BAD_FUNC BAD_NIP
+#endif
+
int test_wild_bctr(void)
{
int (*func_ptr)(void);
poison_regs();
- func_ptr = (int (*)(void))BAD_NIP;
+ func_ptr = (int (*)(void))BAD_FUNC;
func_ptr();
FAIL_IF(1); /* we didn't segv? */
# The EBB handler is 64-bit code and everything links against it
CFLAGS += -m64
+# Toolchains may build PIE by default which breaks the assembly
+LDFLAGS += -no-pie
+
TEST_GEN_PROGS := reg_access_test event_attributes_test cycles_test \
cycles_with_freeze_test pmc56_overflow_test \
ebb_vs_cpu_event_test cpu_event_vs_ebb_test \
# SPDX-License-Identifier: GPL-2.0
-TEST_PROGS := ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
+TEST_GEN_PROGS := ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx ptrace-tm-vsx \
ptrace-tm-spd-vsx ptrace-tm-spr ptrace-hwbreak ptrace-pkey core-pkey \
perf-hwbreak ptrace-syscall
top_srcdir = ../../../../..
include ../../lib.mk
-all: $(TEST_PROGS)
-
CFLAGS += -m64 -I../../../../../usr/include -I../tm -mhtm -fno-pie
-ptrace-pkey core-pkey: child.h
-ptrace-pkey core-pkey: LDLIBS += -pthread
-
-$(TEST_PROGS): ../harness.c ../utils.c ../lib/reg.S ptrace.h
+$(OUTPUT)/ptrace-pkey $(OUTPUT)/core-pkey: child.h
+$(OUTPUT)/ptrace-pkey $(OUTPUT)/core-pkey: LDLIBS += -pthread
-clean:
- rm -f $(TEST_PROGS) *.o
+$(TEST_GEN_PROGS): ../harness.c ../utils.c ../lib/reg.S ptrace.h
"3: ;"
: [res] "=r" (result), [texasr] "=r" (texasr)
: [gpr_1]"i"(GPR_1), [gpr_2]"i"(GPR_2), [gpr_4]"i"(GPR_4),
- [sprn_texasr] "i" (SPRN_TEXASR), [flt_1] "r" (&a),
- [flt_2] "r" (&b), [flt_4] "r" (&d)
+ [sprn_texasr] "i" (SPRN_TEXASR), [flt_1] "b" (&a),
+ [flt_4] "b" (&d)
: "memory", "r5", "r6", "r7",
"r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
"r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23",
# SPDX-License-Identifier: GPL-2.0+
TEST_GEN_PROGS := rfi_flush
+top_srcdir = ../../../../..
CFLAGS += -I../../../../../usr/include
struct perf_event_read v;
__u64 l1d_misses_total = 0;
unsigned long iterations = 100000, zero_size = 24 * 1024;
+ unsigned long l1d_misses_expected;
int rfi_flush_org, rfi_flush;
SKIP_IF(geteuid() != 0);
iter = repetitions;
+ /*
+ * We expect to see l1d miss for each cacheline access when rfi_flush
+ * is set. Allow a small variation on this.
+ */
+ l1d_misses_expected = iterations * (zero_size / CACHELINE_SIZE - 2);
+
again:
FAIL_IF(perf_event_reset(fd));
FAIL_IF(read(fd, &v, sizeof(v)) != sizeof(v));
- /* Expect at least zero_size/CACHELINE_SIZE misses per iteration */
- if (v.l1d_misses >= (iterations * zero_size / CACHELINE_SIZE) && rfi_flush)
+ if (rfi_flush && v.l1d_misses >= l1d_misses_expected)
passes++;
- else if (v.l1d_misses < iterations && !rfi_flush)
+ else if (!rfi_flush && v.l1d_misses < (l1d_misses_expected / 2))
passes++;
l1d_misses_total += v.l1d_misses;
if (passes < repetitions) {
printf("FAIL (L1D misses with rfi_flush=%d: %llu %c %lu) [%d/%d failures]\n",
rfi_flush, l1d_misses_total, rfi_flush ? '<' : '>',
- rfi_flush ? (repetitions * iterations * zero_size / CACHELINE_SIZE) : iterations,
+ rfi_flush ? repetitions * l1d_misses_expected :
+ repetitions * l1d_misses_expected / 2,
repetitions - passes, repetitions);
rc = 1;
} else
printf("PASS (L1D misses with rfi_flush=%d: %llu %c %lu) [%d/%d pass]\n",
rfi_flush, l1d_misses_total, rfi_flush ? '>' : '<',
- rfi_flush ? (repetitions * iterations * zero_size / CACHELINE_SIZE) : iterations,
+ rfi_flush ? repetitions * l1d_misses_expected :
+ repetitions * l1d_misses_expected / 2,
passes, repetitions);
if (rfi_flush == rfi_flush_org) {
# SPDX-License-Identifier: GPL-2.0
-TEST_PROGS := signal signal_tm
-
-all: $(TEST_PROGS)
-
-$(TEST_PROGS): ../harness.c ../utils.c signal.S
+TEST_GEN_PROGS := signal signal_tm
CFLAGS += -maltivec
-signal_tm: CFLAGS += -mhtm
+$(OUTPUT)/signal_tm: CFLAGS += -mhtm
top_srcdir = ../../../../..
include ../../lib.mk
-clean:
- rm -f $(TEST_PROGS) *.o
+$(TEST_GEN_PROGS): ../harness.c ../utils.c signal.S
top_srcdir = ../../../../..
include ../../lib.mk
+$(OUTPUT)/switch_endian_test: ASFLAGS += -I $(OUTPUT)
$(OUTPUT)/switch_endian_test: $(OUTPUT)/check-reversed.S
$(OUTPUT)/check-reversed.o: $(OUTPUT)/check.o
#include "utils.h"
static char auxv[4096];
-extern unsigned int dscr_insn[];
int read_auxv(char *buf, ssize_t buf_size)
{
ucontext_t *ctx = (ucontext_t *)unused;
unsigned long *pc = &UCONTEXT_NIA(ctx);
- if (*pc == (unsigned long)&dscr_insn) {
+ /* mtspr 3,RS to check for move to DSCR below */
+ if ((*((unsigned int *)*pc) & 0xfc1fffff) == 0x7c0303a6) {
if (!warned++)
printf("WARNING: Skipping over dscr setup. Consider running 'ppc64_cpu --dscr=1' manually.\n");
*pc += 4;
init = 1;
}
- asm volatile("dscr_insn: mtspr %1,%0" : : "r" (val), "i" (SPRN_DSCR));
+ asm volatile("mtspr %1,%0" : : "r" (val), "i" (SPRN_DSCR));
}
* ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
* OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/
-/* Test readlink /proc/self/map_files/... with address 0. */
+/* Test readlink /proc/self/map_files/... with minimum address. */
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
int main(void)
{
const unsigned int PAGE_SIZE = sysconf(_SC_PAGESIZE);
+#ifdef __arm__
+ unsigned long va = 2 * PAGE_SIZE;
+#else
+ unsigned long va = 0;
+#endif
void *p;
int fd;
unsigned long a, b;
if (fd == -1)
return 1;
- p = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_PRIVATE|MAP_FILE|MAP_FIXED, fd, 0);
+ p = mmap((void *)va, PAGE_SIZE, PROT_NONE, MAP_PRIVATE|MAP_FILE|MAP_FIXED, fd, 0);
if (p == MAP_FAILED) {
if (errno == EPERM)
return 2;
(rawout, serr) = proc.communicate()
if proc.returncode != 0 and len(serr) > 0:
- foutput = serr.decode("utf-8")
+ foutput = serr.decode("utf-8", errors="ignore")
else:
- foutput = rawout.decode("utf-8")
+ foutput = rawout.decode("utf-8", errors="ignore")
proc.stdout.close()
proc.stderr.close()
file=sys.stderr)
print("\n{} *** Error message: \"{}\"".format(prefix, foutput),
file=sys.stderr)
+ print("returncode {}; expected {}".format(proc.returncode,
+ exit_codes))
print("\n{} *** Aborting test run.".format(prefix), file=sys.stderr)
print("\n\n{} *** stdout ***".format(proc.stdout), file=sys.stderr)
print("\n\n{} *** stderr ***".format(proc.stderr), file=sys.stderr)
print('-----> execute stage')
pm.call_pre_execute()
(p, procout) = exec_cmd(args, pm, 'execute', tidx["cmdUnderTest"])
- exit_code = p.returncode
+ if p:
+ exit_code = p.returncode
+ else:
+ exit_code = None
+
pm.call_post_execute()
- if (exit_code != int(tidx["expExitCode"])):
+ if (exit_code is None or exit_code != int(tidx["expExitCode"])):
result = False
- print("exit:", exit_code, int(tidx["expExitCode"]))
+ print("exit: {!r}".format(exit_code))
+ print("exit: {}".format(int(tidx["expExitCode"])))
+ #print("exit: {!r} {}".format(exit_code, int(tidx["expExitCode"])))
print(procout)
else:
if args.verbose > 0: