Diffstat (limited to 'Documentation')
136 files changed, 3149 insertions, 629 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-i2c-devices-turris-omnia-mcu b/Documentation/ABI/testing/sysfs-bus-i2c-devices-turris-omnia-mcu index 307a55f599cb..35a8f6dae5bf 100644 --- a/Documentation/ABI/testing/sysfs-bus-i2c-devices-turris-omnia-mcu +++ b/Documentation/ABI/testing/sysfs-bus-i2c-devices-turris-omnia-mcu @@ -32,9 +32,9 @@ Description: (RW) The front button on the Turris Omnia router can be interrupt. This file switches between these two modes: - - "mcu" makes the button press event be handled by the MCU to - change the LEDs panel intensity. - - "cpu" makes the button press event be handled by the CPU. + - ``mcu`` makes the button press event be handled by the MCU to + change the LEDs panel intensity. + - ``cpu`` makes the button press event be handled by the CPU. Format: %s. diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index 325873385b71..de725ca3be82 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -562,7 +562,8 @@ Description: Control Symmetric Multi Threading (SMT) ================ ========================================= If control status is "forceoff" or "notsupported" writes - are rejected. + are rejected. Note that enabling SMT on PowerPC skips + offline cores. What: /sys/devices/system/cpu/cpuX/power/energy_perf_bias Date: March 2019 diff --git a/Documentation/ABI/testing/sysfs-timecard b/Documentation/ABI/testing/sysfs-timecard index 220478156297..3ae41b7634ac 100644 --- a/Documentation/ABI/testing/sysfs-timecard +++ b/Documentation/ABI/testing/sysfs-timecard @@ -258,24 +258,29 @@ Description: (RW) When retrieving the PHC with the PTP SYS_OFFSET_EXTENDED the estimated point where the FPGA latches the PHC time. This value may be changed by writing an unsigned integer. 
-What: /sys/class/timecard/ocpN/ttyGNSS -What: /sys/class/timecard/ocpN/ttyGNSS2 -Date: September 2021 +What: /sys/class/timecard/ocpN/tty +Date: August 2024 +Contact: Vadim Fedorenko <vadim.fedorenko@linux.dev> +Description: (RO) Directory containing the sysfs nodes for TTY attributes + +What: /sys/class/timecard/ocpN/tty/ttyGNSS +What: /sys/class/timecard/ocpN/tty/ttyGNSS2 +Date: August 2024 Contact: Jonathan Lemon <jonathan.lemon@gmail.com> -Description: These optional attributes link to the TTY serial ports - associated with the GNSS devices. +Description: (RO) These optional attributes contain names of the TTY serial + ports associated with the GNSS devices. -What: /sys/class/timecard/ocpN/ttyMAC -Date: September 2021 +What: /sys/class/timecard/ocpN/tty/ttyMAC +Date: August 2024 Contact: Jonathan Lemon <jonathan.lemon@gmail.com> -Description: This optional attribute links to the TTY serial port - associated with the Miniature Atomic Clock. +Description: (RO) This optional attribute contains name of the TTY serial + port associated with the Miniature Atomic Clock. -What: /sys/class/timecard/ocpN/ttyNMEA -Date: September 2021 +What: /sys/class/timecard/ocpN/tty/ttyNMEA +Date: August 2024 Contact: Jonathan Lemon <jonathan.lemon@gmail.com> -Description: This optional attribute links to the TTY serial port - which outputs the PHC time in NMEA ZDA format. +Description: (RO) This optional attribute contains name of the TTY serial + port which outputs the PHC time in NMEA ZDA format. What: /sys/class/timecard/ocpN/utc_tai_offset Date: September 2021 diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst index 86311c2907cd..95c18bc17083 100644 --- a/Documentation/admin-guide/cgroup-v2.rst +++ b/Documentation/admin-guide/cgroup-v2.rst @@ -1717,9 +1717,10 @@ The following nested keys are defined. entries fault back in or are written out to disk. memory.zswap.writeback - A read-write single value file. The default value is "1". 
The - initial value of the root cgroup is 1, and when a new cgroup is - created, it inherits the current value of its parent. + A read-write single value file. The default value is "1". + Note that this setting is hierarchical, i.e. the writeback would be + implicitly disabled for child cgroups if the upper hierarchy + does so. When this is set to 0, all swapping attempts to swapping devices are disabled. This included both zswap writebacks, and swapping due diff --git a/Documentation/admin-guide/cifs/usage.rst b/Documentation/admin-guide/cifs/usage.rst index fd4b56c0996f..c09674a75a9e 100644 --- a/Documentation/admin-guide/cifs/usage.rst +++ b/Documentation/admin-guide/cifs/usage.rst @@ -742,7 +742,7 @@ SecurityFlags Flags which control security negotiation and may use NTLMSSP 0x00080 must use NTLMSSP 0x80080 seal (packet encryption) 0x00040 - must seal (not implemented yet) 0x40040 + must seal 0x40040 cifsFYI If set to non-zero value, additional debug information will be logged to the system error log. This field diff --git a/Documentation/admin-guide/device-mapper/dm-crypt.rst b/Documentation/admin-guide/device-mapper/dm-crypt.rst index e625830d335e..552c9155165d 100644 --- a/Documentation/admin-guide/device-mapper/dm-crypt.rst +++ b/Documentation/admin-guide/device-mapper/dm-crypt.rst @@ -162,13 +162,14 @@ iv_large_sectors Module parameters:: -max_read_size -max_write_size - Maximum size of read or write requests. When a request larger than this size - is received, dm-crypt will split the request. The splitting improves - concurrency (the split requests could be encrypted in parallel by multiple - cores), but it also causes overhead. The user should tune these parameters to - fit the actual workload. + + max_read_size + max_write_size + Maximum size of read or write requests. When a request larger than this size + is received, dm-crypt will split the request. 
The splitting improves + concurrency (the split requests could be encrypted in parallel by multiple + cores), but it also causes overhead. The user should tune these parameters to + fit the actual workload. Example scripts diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst index 4bd3ce3ba171..2ad1c05b8c88 100644 --- a/Documentation/admin-guide/hw-vuln/srso.rst +++ b/Documentation/admin-guide/hw-vuln/srso.rst @@ -158,3 +158,72 @@ poisoned BTB entry and using that safe one for all function returns. In older Zen1 and Zen2, this is accomplished using a reinterpretation technique similar to Retbleed one: srso_untrain_ret() and srso_safe_ret(). + +Checking the safe RET mitigation actually works +----------------------------------------------- + +In case one wants to validate whether the SRSO safe RET mitigation works +on a kernel, one could use two performance counters + +* PMC_0xc8 - Count of RET/RET lw retired +* PMC_0xc9 - Count of RET/RET lw retired mispredicted + +and compare the number of RETs retired properly vs those retired +mispredicted, in kernel mode. Another way of specifying those events +is:: + + # perf list ex_ret_near_ret + + List of pre-defined events (to be used in -e or -M): + + core: + ex_ret_near_ret + [Retired Near Returns] + ex_ret_near_ret_mispred + [Retired Near Returns Mispredicted] + +Either the command using the event mnemonics:: + + # perf stat -e ex_ret_near_ret:k -e ex_ret_near_ret_mispred:k sleep 10s + +or using the raw PMC numbers:: + + # perf stat -e cpu/event=0xc8,umask=0/k -e cpu/event=0xc9,umask=0/k sleep 10s + +should give the same amount. 
I.e., every RET retired should be +mispredicted:: + + [root@brent: ~/kernel/linux/tools/perf> ./perf stat -e cpu/event=0xc8,umask=0/k -e cpu/event=0xc9,umask=0/k sleep 10s + + Performance counter stats for 'sleep 10s': + + 137,167 cpu/event=0xc8,umask=0/k + 137,173 cpu/event=0xc9,umask=0/k + + 10.004110303 seconds time elapsed + + 0.000000000 seconds user + 0.004462000 seconds sys + +vs the case when the mitigation is disabled (spec_rstack_overflow=off) +or not functioning properly, showing usually a lot smaller number of +mispredicted retired RETs vs the overall count of retired RETs during +a workload:: + + [root@brent: ~/kernel/linux/tools/perf> ./perf stat -e cpu/event=0xc8,umask=0/k -e cpu/event=0xc9,umask=0/k sleep 10s + + Performance counter stats for 'sleep 10s': + + 201,627 cpu/event=0xc8,umask=0/k + 4,074 cpu/event=0xc9,umask=0/k + + 10.003267252 seconds time elapsed + + 0.002729000 seconds user + 0.000000000 seconds sys + +Also, there is a selftest which performs the above, go to +tools/testing/selftests/x86/ and do:: + + make srso + ./srso diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 8396e015aab3..be010fec7654 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -4808,11 +4808,9 @@ profile= [KNL] Enable kernel profiling via /proc/profile Format: [<profiletype>,]<number> - Param: <profiletype>: "schedule", "sleep", or "kvm" + Param: <profiletype>: "schedule" or "kvm" [defaults to kernel profiling] Param: "schedule" - profile schedule points. - Param: "sleep" - profile D-state sleeping (millisecs). - Requires CONFIG_SCHEDSTATS Param: "kvm" - profile VM exits. Param: <number> - step/bucket size as a power of 2 for statistical time based profiling. 
diff --git a/Documentation/admin-guide/perf/arm-ni.rst b/Documentation/admin-guide/perf/arm-ni.rst new file mode 100644 index 000000000000..d26a8f697c36 --- /dev/null +++ b/Documentation/admin-guide/perf/arm-ni.rst @@ -0,0 +1,17 @@ +==================================== +Arm Network-on Chip Interconnect PMU +==================================== + +NI-700 and friends implement a distinct PMU for each clock domain within the +interconnect. Correspondingly, the driver exposes multiple PMU devices named +arm_ni_<x>_cd_<y>, where <x> is an (arbitrary) instance identifier and <y> is +the clock domain ID within that particular instance. If multiple NI instances +exist within a system, the PMU devices can be correlated with the underlying +hardware instance via sysfs parentage. + +Each PMU exposes base event aliases for the interface types present in its clock +domain. These require qualifying with the "eventid" and "nodeid" parameters +to specify the event code to count and the interface at which to count it +(per the configured hardware ID as reflected in the xxNI_NODE_INFO register). +The exception is the "cycles" alias for the PMU cycle counter, which is encoded +with the PMU node type and needs no further qualification. diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst index d47cd229d710..39b8e1fdd0cd 100644 --- a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst +++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst @@ -46,16 +46,16 @@ Some of the events only exist for specific configurations. DesignWare Cores (DWC) PCIe PMU Driver ======================================= -This driver adds PMU devices for each PCIe Root Port named based on the BDF of +This driver adds PMU devices for each PCIe Root Port named based on the SBDF of the Root Port. 
For example, - 30:03.0 PCI bridge: Device 1ded:8000 (rev 01) + 0001:30:03.0 PCI bridge: Device 1ded:8000 (rev 01) -the PMU device name for this Root Port is dwc_rootport_3018. +the PMU device name for this Root Port is dwc_rootport_13018. The DWC PCIe PMU driver registers a perf PMU driver, which provides description of available events and configuration options in sysfs, see -/sys/bus/event_source/devices/dwc_rootport_{bdf}. +/sys/bus/event_source/devices/dwc_rootport_{sbdf}. The "format" directory describes format of the config fields of the perf_event_attr structure. The "events" directory provides configuration @@ -66,16 +66,16 @@ The "perf list" command shall list the available events from sysfs, e.g.:: $# perf list | grep dwc_rootport <...> - dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event] + dwc_rootport_13018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event] <...> - dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event] + dwc_rootport_13018/rx_memory_read,lane=?/ [Kernel PMU event] Time Based Analysis Event Usage ------------------------------- Example usage of counting PCIe RX TLP data payload (Units of bytes):: - $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ + $# perf stat -a -e dwc_rootport_13018/Rx_PCIe_TLP_Data_Payload/ The average RX/TX bandwidth can be calculated using the following formula: @@ -88,7 +88,7 @@ Lane Event Usage Each lane has the same event set and to avoid generating a list of hundreds of events, the user need to specify the lane ID explicitly, e.g.:: - $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/ + $# perf stat -a -e dwc_rootport_13018/rx_memory_read,lane=4/ The driver does not support sampling, therefore "perf record" will not work. Per-task (without "-a") perf sessions are not supported. 
diff --git a/Documentation/admin-guide/perf/hisi-pcie-pmu.rst b/Documentation/admin-guide/perf/hisi-pcie-pmu.rst index 5541ff40e06a..083ca50de896 100644 --- a/Documentation/admin-guide/perf/hisi-pcie-pmu.rst +++ b/Documentation/admin-guide/perf/hisi-pcie-pmu.rst @@ -28,7 +28,9 @@ The "identifier" sysfs file allows users to identify the version of the PMU hardware device. The "bus" sysfs file allows users to get the bus number of Root Ports -monitored by PMU. +monitored by PMU. Furthermore users can get the Root Ports range in +[bdf_min, bdf_max] from "bdf_min" and "bdf_max" sysfs attributes +respectively. Example usage of perf:: diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst index 7eb3dcd6f4da..8502bc174640 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -16,6 +16,7 @@ Performance monitor support starfive_starlink_pmu arm-ccn arm-cmn + arm-ni xgene-pmu arm_dsu_pmu thunderx2-pmu diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst index d0324d44f548..210a808b74ec 100644 --- a/Documentation/admin-guide/pm/amd-pstate.rst +++ b/Documentation/admin-guide/pm/amd-pstate.rst @@ -251,7 +251,9 @@ performance supported in `AMD CPPC Performance Capability <perf_cap_>`_). In some ASICs, the highest CPPC performance is not the one in the ``_CPC`` table, so we need to expose it to sysfs. If boost is not active, but still supported, this maximum frequency will be larger than the one in -``cpuinfo``. +``cpuinfo``. On systems that support preferred core, the driver will have +different values for some cores than others and this will reflect the values +advertised by the platform at bootup. This attribute is read-only. ``amd_pstate_lowest_nonlinear_freq`` @@ -262,6 +264,17 @@ lowest non-linear performance in `AMD CPPC Performance Capability <perf_cap_>`_.) This attribute is read-only. 
+``amd_pstate_hw_prefcore`` + +Whether the platform supports the preferred core feature and it has been +enabled. This attribute is read-only. + +``amd_pstate_prefcore_ranking`` + +The performance ranking of the core. This number doesn't have any unit, but +larger numbers are preferred at the time of reading. This can change at +runtime based on platform conditions. This attribute is read-only. + ``energy_performance_available_preferences`` A list of all the supported EPP preferences that could be used for diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst index 448c1664879b..694f67fa07d1 100644 --- a/Documentation/arch/arm64/elf_hwcaps.rst +++ b/Documentation/arch/arm64/elf_hwcaps.rst @@ -365,6 +365,8 @@ HWCAP2_SME_SF8DP2 HWCAP2_SME_SF8DP4 Functionality implied by ID_AA64SMFR0_EL1.SF8DP4 == 0b1. +HWCAP2_POE + Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001. 4. Unused AT_HWCAP bits ----------------------- diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst index bb83c5d8c675..9eb5e70b4888 100644 --- a/Documentation/arch/arm64/silicon-errata.rst +++ b/Documentation/arch/arm64/silicon-errata.rst @@ -55,6 +55,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | Ampere | AmpereOne | AC03_CPU_38 | AMPERE_ERRATUM_AC03_CPU_38 | +----------------+-----------------+-----------------+-----------------------------+ +| Ampere | AmpereOne AC04 | AC04_CPU_10 | AMPERE_ERRATUM_AC03_CPU_38 | ++----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A510 | #2457168 | ARM64_ERRATUM_2457168 | +----------------+-----------------+-----------------+-----------------------------+ @@ -122,10 +124,18 @@ stable kernels. 
+----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A76 | #1490853 | N/A | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Cortex-A76 | #3324349 | ARM64_ERRATUM_3194386 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A77 | #1491015 | N/A | +----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A77 | #1508412 | ARM64_ERRATUM_1508412 | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Cortex-A77 | #3324348 | ARM64_ERRATUM_3194386 | ++----------------+-----------------+-----------------+-----------------------------+ +| ARM | Cortex-A78 | #3324344 | ARM64_ERRATUM_3194386 | ++----------------+-----------------+-----------------+-----------------------------+ +| ARM | Cortex-A78C | #3324346,3324347| ARM64_ERRATUM_3194386 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A710 | #2119858 | ARM64_ERRATUM_2119858 | +----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A710 | #2054223 | ARM64_ERRATUM_2054223 | @@ -138,8 +148,14 @@ stable kernels. 
+----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A720 | #3456091 | ARM64_ERRATUM_3194386 | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Cortex-A725 | #3456106 | ARM64_ERRATUM_3194386 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-X1 | #1502854 | N/A | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Cortex-X1 | #3324344 | ARM64_ERRATUM_3194386 | ++----------------+-----------------+-----------------+-----------------------------+ +| ARM | Cortex-X1C | #3324346 | ARM64_ERRATUM_3194386 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-X2 | #2119858 | ARM64_ERRATUM_2119858 | +----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-X2 | #2224489 | ARM64_ERRATUM_2224489 | @@ -160,6 +176,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-N1 | #1542419 | ARM64_ERRATUM_1542419 | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Neoverse-N1 | #3324349 | ARM64_ERRATUM_3194386 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-N2 | #2139208 | ARM64_ERRATUM_2139208 | +----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-N2 | #2067961 | ARM64_ERRATUM_2067961 | @@ -170,6 +188,8 @@ stable kernels. 
+----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-V1 | #1619801 | N/A | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Neoverse-V1 | #3324341 | ARM64_ERRATUM_3194386 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-V2 | #3324336 | ARM64_ERRATUM_3194386 | +----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-V3 | #3312417 | ARM64_ERRATUM_3194386 | @@ -231,8 +251,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A | +----------------+-----------------+-----------------+-----------------------------+ -| Hisilicon | Hip08 SMMU PMCG | #162001900 | N/A | -| | Hip09 SMMU PMCG | | | +| Hisilicon | Hip{08,09,10,10C| #162001900 | N/A | +| | ,11} SMMU PMCG | | | +----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ | Qualcomm Tech. | Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 | diff --git a/Documentation/arch/riscv/hwprobe.rst b/Documentation/arch/riscv/hwprobe.rst index 3db60a0911df..85b709257918 100644 --- a/Documentation/arch/riscv/hwprobe.rst +++ b/Documentation/arch/riscv/hwprobe.rst @@ -239,25 +239,33 @@ The following keys are defined: ratified in commit 98918c844281 ("Merge pull request #1217 from riscv/zawrs") of riscv-isa-manual. -* :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: A bitmask that contains performance - information about the selected set of processors. +* :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: Deprecated. Returns similar values to + :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF`, but the key was + mistakenly classified as a bitmask rather than a value. 
- * :c:macro:`RISCV_HWPROBE_MISALIGNED_UNKNOWN`: The performance of misaligned - accesses is unknown. +* :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF`: An enum value describing + the performance of misaligned scalar native word accesses on the selected set + of processors. - * :c:macro:`RISCV_HWPROBE_MISALIGNED_EMULATED`: Misaligned accesses are - emulated via software, either in or below the kernel. These accesses are - always extremely slow. + * :c:macro:`RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN`: The performance of + misaligned scalar accesses is unknown. - * :c:macro:`RISCV_HWPROBE_MISALIGNED_SLOW`: Misaligned accesses are slower - than equivalent byte accesses. Misaligned accesses may be supported - directly in hardware, or trapped and emulated by software. + * :c:macro:`RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED`: Misaligned scalar + accesses are emulated via software, either in or below the kernel. These + accesses are always extremely slow. - * :c:macro:`RISCV_HWPROBE_MISALIGNED_FAST`: Misaligned accesses are faster - than equivalent byte accesses. + * :c:macro:`RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW`: Misaligned scalar native + word sized accesses are slower than the equivalent quantity of byte + accesses. Misaligned accesses may be supported directly in hardware, or + trapped and emulated by software. - * :c:macro:`RISCV_HWPROBE_MISALIGNED_UNSUPPORTED`: Misaligned accesses are - not supported at all and will generate a misaligned address fault. + * :c:macro:`RISCV_HWPROBE_MISALIGNED_SCALAR_FAST`: Misaligned scalar native + word sized accesses are faster than the equivalent quantity of byte + accesses. + + * :c:macro:`RISCV_HWPROBE_MISALIGNED_SCALAR_UNSUPPORTED`: Misaligned scalar + accesses are not supported at all and will generate a misaligned address + fault. * :c:macro:`RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE`: An unsigned int which represents the size of the Zicboz block in bytes. 
diff --git a/Documentation/arch/riscv/vm-layout.rst b/Documentation/arch/riscv/vm-layout.rst index 077b968dcc81..eabec99b5852 100644 --- a/Documentation/arch/riscv/vm-layout.rst +++ b/Documentation/arch/riscv/vm-layout.rst @@ -134,19 +134,3 @@ RISC-V Linux Kernel SV57 ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | modules, BPF ffffffff80000000 | -2 GB | ffffffffffffffff | 2 GB | kernel __________________|____________|__________________|_________|____________________________________________________________ - - -Userspace VAs --------------------- -To maintain compatibility with software that relies on the VA space with a -maximum of 48 bits the kernel will, by default, return virtual addresses to -userspace from a 48-bit range (sv48). This default behavior is achieved by -passing 0 into the hint address parameter of mmap. On CPUs with an address space -smaller than sv48, the CPU maximum supported address space will be the default. - -Software can "opt-in" to receiving VAs from another VA space by providing -a hint address to mmap. When a hint address is passed to mmap, the returned -address will never use more bits than the hint address. For example, if a hint -address of `1 << 40` is passed to mmap, a valid returned address will never use -bits 41 through 63. If no mappable addresses are available in that range, mmap -will return `MAP_FAILED`. diff --git a/Documentation/core-api/workqueue.rst b/Documentation/core-api/workqueue.rst index bcc370c876be..16f861c9791e 100644 --- a/Documentation/core-api/workqueue.rst +++ b/Documentation/core-api/workqueue.rst @@ -260,7 +260,7 @@ Some users depend on strict execution ordering where only one work item is in flight at any given time and the work items are processed in queueing order. While the combination of ``@max_active`` of 1 and ``WQ_UNBOUND`` used to achieve this behavior, this is no longer the -case. Use ``alloc_ordered_queue()`` instead. +case. Use alloc_ordered_workqueue() instead. 
Example Execution Scenarios diff --git a/Documentation/dev-tools/gcov.rst b/Documentation/dev-tools/gcov.rst index 5fce2b06f229..dbd26b02ff3c 100644 --- a/Documentation/dev-tools/gcov.rst +++ b/Documentation/dev-tools/gcov.rst @@ -75,6 +75,17 @@ Only files which are linked to the main kernel image or are compiled as kernel modules are supported by this mechanism. +Module specific configs +----------------------- + +Gcov kernel configs for specific modules are described below: + +CONFIG_GCOV_PROFILE_RDS: + Enables GCOV profiling on RDS for checking which functions or + lines are executed. This config is used by the rds selftest to + generate coverage reports. If left unset the report is omitted. + + Files ----- diff --git a/Documentation/devicetree/bindings/ata/rockchip,dwc-ahci.yaml b/Documentation/devicetree/bindings/ata/rockchip,dwc-ahci.yaml index b5e5767d8698..13eaa8d9a16e 100644 --- a/Documentation/devicetree/bindings/ata/rockchip,dwc-ahci.yaml +++ b/Documentation/devicetree/bindings/ata/rockchip,dwc-ahci.yaml @@ -35,6 +35,9 @@ properties: ports-implemented: const: 1 + power-domains: + maxItems: 1 + sata-port@0: $ref: /schemas/ata/snps,dwc-ahci-common.yaml#/$defs/dwc-ahci-port diff --git a/Documentation/devicetree/bindings/clock/qcom,dispcc-sm6350.yaml b/Documentation/devicetree/bindings/clock/qcom,dispcc-sm6350.yaml index a584b4953e68..46403b98411f 100644 --- a/Documentation/devicetree/bindings/clock/qcom,dispcc-sm6350.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,dispcc-sm6350.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Display Clock & Reset Controller on SM6350 maintainers: - - Konrad Dybcio <konrad.dybcio@somainline.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm display clock control module provides the clocks, resets and power diff --git a/Documentation/devicetree/bindings/clock/qcom,gcc-msm8994.yaml b/Documentation/devicetree/bindings/clock/qcom,gcc-msm8994.yaml index 
6b9c1d198b14..10afe984e2fb 100644 --- a/Documentation/devicetree/bindings/clock/qcom,gcc-msm8994.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,gcc-msm8994.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Global Clock & Reset Controller on MSM8994 maintainers: - - Konrad Dybcio <konrad.dybcio@somainline.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm global clock control module provides the clocks, resets and power diff --git a/Documentation/devicetree/bindings/clock/qcom,gcc-sm6125.yaml b/Documentation/devicetree/bindings/clock/qcom,gcc-sm6125.yaml index a5a29dc75ae1..1fe68e07a2b2 100644 --- a/Documentation/devicetree/bindings/clock/qcom,gcc-sm6125.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,gcc-sm6125.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Global Clock & Reset Controller on SM6125 maintainers: - - Konrad Dybcio <konrad.dybcio@somainline.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm global clock control module provides the clocks, resets and power diff --git a/Documentation/devicetree/bindings/clock/qcom,gcc-sm6350.yaml b/Documentation/devicetree/bindings/clock/qcom,gcc-sm6350.yaml index 2280b859b2ad..78e232fa95dc 100644 --- a/Documentation/devicetree/bindings/clock/qcom,gcc-sm6350.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,gcc-sm6350.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Global Clock & Reset Controller on SM6350 maintainers: - - Konrad Dybcio <konrad.dybcio@somainline.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm global clock control module provides the clocks, resets and power diff --git a/Documentation/devicetree/bindings/clock/qcom,sm6115-gpucc.yaml b/Documentation/devicetree/bindings/clock/qcom,sm6115-gpucc.yaml index cf19f44af774..4ff17a91344b 100644 --- 
a/Documentation/devicetree/bindings/clock/qcom,sm6115-gpucc.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,sm6115-gpucc.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Graphics Clock & Reset Controller on SM6115 maintainers: - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm graphics clock control module provides clocks, resets and power diff --git a/Documentation/devicetree/bindings/clock/qcom,sm6125-gpucc.yaml b/Documentation/devicetree/bindings/clock/qcom,sm6125-gpucc.yaml index 374a1844a159..10a9c96a97b6 100644 --- a/Documentation/devicetree/bindings/clock/qcom,sm6125-gpucc.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,sm6125-gpucc.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Graphics Clock & Reset Controller on SM6125 maintainers: - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm graphics clock control module provides clocks and power domains on diff --git a/Documentation/devicetree/bindings/clock/qcom,sm6350-camcc.yaml b/Documentation/devicetree/bindings/clock/qcom,sm6350-camcc.yaml index fd6658cb793d..c03b30f64f35 100644 --- a/Documentation/devicetree/bindings/clock/qcom,sm6350-camcc.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,sm6350-camcc.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Camera Clock & Reset Controller on SM6350 maintainers: - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm camera clock control module provides the clocks, resets and power diff --git a/Documentation/devicetree/bindings/clock/qcom,sm6375-dispcc.yaml b/Documentation/devicetree/bindings/clock/qcom,sm6375-dispcc.yaml index 183b1c75dbdf..3cd422a645fd 100644 --- a/Documentation/devicetree/bindings/clock/qcom,sm6375-dispcc.yaml +++ 
b/Documentation/devicetree/bindings/clock/qcom,sm6375-dispcc.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Display Clock & Reset Controller on SM6375 maintainers: - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm display clock control module provides the clocks, resets and power diff --git a/Documentation/devicetree/bindings/clock/qcom,sm6375-gcc.yaml b/Documentation/devicetree/bindings/clock/qcom,sm6375-gcc.yaml index 147b75a21508..de4e9066eeb8 100644 --- a/Documentation/devicetree/bindings/clock/qcom,sm6375-gcc.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,sm6375-gcc.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Global Clock & Reset Controller on SM6375 maintainers: - - Konrad Dybcio <konrad.dybcio@somainline.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm global clock control module provides the clocks, resets and power diff --git a/Documentation/devicetree/bindings/clock/qcom,sm6375-gpucc.yaml b/Documentation/devicetree/bindings/clock/qcom,sm6375-gpucc.yaml index cf4cad76f6c9..d9dd479c17bd 100644 --- a/Documentation/devicetree/bindings/clock/qcom,sm6375-gpucc.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,sm6375-gpucc.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Graphics Clock & Reset Controller on SM6375 maintainers: - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm graphics clock control module provides clocks, resets and power diff --git a/Documentation/devicetree/bindings/clock/qcom,sm8350-videocc.yaml b/Documentation/devicetree/bindings/clock/qcom,sm8350-videocc.yaml index 46d1d91e3a01..5c2ecec0624e 100644 --- a/Documentation/devicetree/bindings/clock/qcom,sm8350-videocc.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,sm8350-videocc.yaml @@ -7,7 
+7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm SM8350 Video Clock & Reset Controller maintainers: - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm video clock control module provides the clocks, resets and power diff --git a/Documentation/devicetree/bindings/clock/qcom,sm8450-gpucc.yaml b/Documentation/devicetree/bindings/clock/qcom,sm8450-gpucc.yaml index 3c2cac14e6c3..d10bb002906e 100644 --- a/Documentation/devicetree/bindings/clock/qcom,sm8450-gpucc.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,sm8450-gpucc.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Graphics Clock & Reset Controller on SM8450 maintainers: - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm graphics clock control module provides the clocks, resets and power diff --git a/Documentation/devicetree/bindings/crypto/fsl,sec-v4.0.yaml b/Documentation/devicetree/bindings/crypto/fsl,sec-v4.0.yaml index 0a9ed2848b7c..9c8c9991f29a 100644 --- a/Documentation/devicetree/bindings/crypto/fsl,sec-v4.0.yaml +++ b/Documentation/devicetree/bindings/crypto/fsl,sec-v4.0.yaml @@ -137,7 +137,10 @@ patternProperties: - const: fsl,sec-v4.0-rtic reg: - maxItems: 1 + items: + - description: RTIC control and status register space. + - description: RTIC recoverable error indication register space. 
+ minItems: 1 ranges: maxItems: 1 diff --git a/Documentation/devicetree/bindings/crypto/qcom,prng.yaml b/Documentation/devicetree/bindings/crypto/qcom,prng.yaml index 89c88004b41b..048b769a73c0 100644 --- a/Documentation/devicetree/bindings/crypto/qcom,prng.yaml +++ b/Documentation/devicetree/bindings/crypto/qcom,prng.yaml @@ -17,6 +17,7 @@ properties: - qcom,prng-ee # 8996 and later using EE - items: - enum: + - qcom,sa8255p-trng - qcom,sa8775p-trng - qcom,sc7280-trng - qcom,sm8450-trng diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sm6375-mdss.yaml b/Documentation/devicetree/bindings/display/msm/qcom,sm6375-mdss.yaml index 8e8a288d318c..e22b4c433fd0 100644 --- a/Documentation/devicetree/bindings/display/msm/qcom,sm6375-mdss.yaml +++ b/Documentation/devicetree/bindings/display/msm/qcom,sm6375-mdss.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm SM6375 Display MDSS maintainers: - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: SM6375 MSM Mobile Display Subsystem (MDSS), which encapsulates sub-blocks diff --git a/Documentation/devicetree/bindings/display/panel/wl-355608-a8.yaml b/Documentation/devicetree/bindings/display/panel/anbernic,rg35xx-plus-panel.yaml index e552d01b52b9..1d67492ebd3b 100644 --- a/Documentation/devicetree/bindings/display/panel/wl-355608-a8.yaml +++ b/Documentation/devicetree/bindings/display/panel/anbernic,rg35xx-plus-panel.yaml @@ -1,10 +1,10 @@ # SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) %YAML 1.2 --- -$id: http://devicetree.org/schemas/display/panel/wl-355608-a8.yaml# +$id: http://devicetree.org/schemas/display/panel/anbernic,rg35xx-plus-panel.yaml# $schema: http://devicetree.org/meta-schemas/core.yaml# -title: WL-355608-A8 3.5" (640x480 pixels) 24-bit IPS LCD panel +title: Anbernic RG35XX series (WL-355608-A8) 3.5" 640x480 24-bit IPS LCD panel maintainers: - Ryan Walklin <ryan@testtoast.com> @@ -15,7 +15,14 @@ 
allOf: properties: compatible: - const: wl-355608-a8 + oneOf: + - const: anbernic,rg35xx-plus-panel + - items: + - enum: + - anbernic,rg35xx-2024-panel + - anbernic,rg35xx-h-panel + - anbernic,rg35xx-sp-panel + - const: anbernic,rg35xx-plus-panel reg: maxItems: 1 @@ -40,7 +47,7 @@ examples: #size-cells = <0>; panel@0 { - compatible = "wl-355608-a8"; + compatible = "anbernic,rg35xx-plus-panel"; reg = <0>; spi-3wire; diff --git a/Documentation/devicetree/bindings/display/panel/asus,z00t-tm5p5-nt35596.yaml b/Documentation/devicetree/bindings/display/panel/asus,z00t-tm5p5-nt35596.yaml index 2399cabf044c..dd614e077bbf 100644 --- a/Documentation/devicetree/bindings/display/panel/asus,z00t-tm5p5-nt35596.yaml +++ b/Documentation/devicetree/bindings/display/panel/asus,z00t-tm5p5-nt35596.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ASUS Z00T TM5P5 NT35596 5.5" 1080×1920 LCD Panel maintainers: - - Konrad Dybcio <konradybcio@gmail.com> + - Konrad Dybcio <konradybcio@kernel.org> description: |+ This panel seems to only be found in the Asus Z00T diff --git a/Documentation/devicetree/bindings/display/panel/samsung,atna33xc20.yaml b/Documentation/devicetree/bindings/display/panel/samsung,atna33xc20.yaml index 5192c93fbd67..032f783eefc4 100644 --- a/Documentation/devicetree/bindings/display/panel/samsung,atna33xc20.yaml +++ b/Documentation/devicetree/bindings/display/panel/samsung,atna33xc20.yaml @@ -17,9 +17,12 @@ properties: oneOf: # Samsung 13.3" FHD (1920x1080 pixels) eDP AMOLED panel - const: samsung,atna33xc20 - # Samsung 14.5" WQXGA+ (2880x1800 pixels) eDP AMOLED panel - items: - - const: samsung,atna45af01 + - enum: + # Samsung 14.5" WQXGA+ (2880x1800 pixels) eDP AMOLED panel + - samsung,atna45af01 + # Samsung 14.5" 3K (2944x1840 pixels) eDP AMOLED panel + - samsung,atna45dc02 - const: samsung,atna33xc20 enable-gpios: true diff --git a/Documentation/devicetree/bindings/display/panel/sony,td4353-jdi.yaml 
b/Documentation/devicetree/bindings/display/panel/sony,td4353-jdi.yaml index 191b692125e1..032a989184ff 100644 --- a/Documentation/devicetree/bindings/display/panel/sony,td4353-jdi.yaml +++ b/Documentation/devicetree/bindings/display/panel/sony,td4353-jdi.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Sony TD4353 JDI 5 / 5.7" 2160x1080 MIPI-DSI Panel maintainers: - - Konrad Dybcio <konrad.dybcio@somainline.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | The Sony TD4353 JDI is a 5 (XZ2c) / 5.7 (XZ2) inch 2160x1080 diff --git a/Documentation/devicetree/bindings/eeprom/at25.yaml b/Documentation/devicetree/bindings/eeprom/at25.yaml index 1715b0c9feea..c31e5e719525 100644 --- a/Documentation/devicetree/bindings/eeprom/at25.yaml +++ b/Documentation/devicetree/bindings/eeprom/at25.yaml @@ -28,6 +28,7 @@ properties: - anvo,anv32e61w - atmel,at25256B - fujitsu,mb85rs1mt + - fujitsu,mb85rs256 - fujitsu,mb85rs64 - microchip,at25160bn - microchip,25lc040 diff --git a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.yaml b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.yaml index 379721027bf8..51d48d4130d3 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.yaml +++ b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.yaml @@ -42,6 +42,7 @@ properties: - focaltech,ft5426 - focaltech,ft5452 - focaltech,ft6236 + - focaltech,ft8201 - focaltech,ft8719 reg: diff --git a/Documentation/devicetree/bindings/interconnect/qcom,sc7280-rpmh.yaml b/Documentation/devicetree/bindings/interconnect/qcom,sc7280-rpmh.yaml index 9fce7203bd42..78210791496f 100644 --- a/Documentation/devicetree/bindings/interconnect/qcom,sc7280-rpmh.yaml +++ b/Documentation/devicetree/bindings/interconnect/qcom,sc7280-rpmh.yaml @@ -8,7 +8,7 @@ title: Qualcomm RPMh Network-On-Chip Interconnect on SC7280 maintainers: - Bjorn Andersson <andersson@kernel.org> - - Konrad Dybcio <konrad.dybcio@linaro.org> + 
- Konrad Dybcio <konradybcio@kernel.org> description: | RPMh interconnect providers support system bandwidth requirements through diff --git a/Documentation/devicetree/bindings/interconnect/qcom,sc8280xp-rpmh.yaml b/Documentation/devicetree/bindings/interconnect/qcom,sc8280xp-rpmh.yaml index 6c2da03f0cd2..100c68636909 100644 --- a/Documentation/devicetree/bindings/interconnect/qcom,sc8280xp-rpmh.yaml +++ b/Documentation/devicetree/bindings/interconnect/qcom,sc8280xp-rpmh.yaml @@ -8,7 +8,7 @@ title: Qualcomm RPMh Network-On-Chip Interconnect on SC8280XP maintainers: - Bjorn Andersson <andersson@kernel.org> - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | RPMh interconnect providers support system bandwidth requirements through diff --git a/Documentation/devicetree/bindings/interconnect/qcom,sm8450-rpmh.yaml b/Documentation/devicetree/bindings/interconnect/qcom,sm8450-rpmh.yaml index 3cff7e662255..300640a533dd 100644 --- a/Documentation/devicetree/bindings/interconnect/qcom,sm8450-rpmh.yaml +++ b/Documentation/devicetree/bindings/interconnect/qcom,sm8450-rpmh.yaml @@ -8,7 +8,7 @@ title: Qualcomm RPMh Network-On-Chip Interconnect on SM8450 maintainers: - Bjorn Andersson <andersson@kernel.org> - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | RPMh interconnect providers support system bandwidth requirements through diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.yaml b/Documentation/devicetree/bindings/iommu/qcom,iommu.yaml index 571e5746d177..f8cebc9e8cd9 100644 --- a/Documentation/devicetree/bindings/iommu/qcom,iommu.yaml +++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Technologies legacy IOMMU implementations maintainers: - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | Qualcomm "B" family 
devices which are not compatible with arm-smmu have diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml index ee7a65b528cd..d1e2bca3c503 100644 --- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml @@ -58,18 +58,18 @@ allOf: - const: timing-adjustment amlogic,tx-delay-ns: - $ref: /schemas/types.yaml#/definitions/uint32 + enum: [0, 2, 4, 6] + default: 2 description: - The internal RGMII TX clock delay (provided by this driver) in - nanoseconds. Allowed values are 0ns, 2ns, 4ns, 6ns. - When phy-mode is set to "rgmii" then the TX delay should be - explicitly configured. When not configured a fallback of 2ns is - used. When the phy-mode is set to either "rgmii-id" or "rgmii-txid" - the TX clock delay is already provided by the PHY. In that case - this property should be set to 0ns (which disables the TX clock - delay in the MAC to prevent the clock from going off because both - PHY and MAC are adding a delay). - Any configuration is ignored when the phy-mode is set to "rmii". + The internal RGMII TX clock delay (provided by this driver) + in nanoseconds. When phy-mode is set to "rgmii" then the TX + delay should be explicitly configured. When the phy-mode is + set to either "rgmii-id" or "rgmii-txid" the TX clock delay + is already provided by the PHY. In that case this property + should be set to 0ns (which disables the TX clock delay in + the MAC to prevent the clock from going off because both + PHY and MAC are adding a delay). Any configuration is + ignored when the phy-mode is set to "rmii". 
amlogic,rx-delay-ns: deprecated: true diff --git a/Documentation/devicetree/bindings/net/bluetooth/amlogic,w155s2-bt.yaml b/Documentation/devicetree/bindings/net/bluetooth/amlogic,w155s2-bt.yaml new file mode 100644 index 000000000000..6fd7557039d2 --- /dev/null +++ b/Documentation/devicetree/bindings/net/bluetooth/amlogic,w155s2-bt.yaml @@ -0,0 +1,63 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +# Copyright (C) 2024 Amlogic, Inc. All rights reserved +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/bluetooth/amlogic,w155s2-bt.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Amlogic Bluetooth chips + +description: + The W155S2 is an Amlogic Bluetooth and Wi-Fi combo chip. It works on + the standard H4 protocol via a 4-wire UART interface, with baud rates + up to 4 Mbps. + +maintainers: + - Yang Li <yang.li@amlogic.com> + +properties: + compatible: + oneOf: + - items: + - enum: + - amlogic,w265s1-bt + - amlogic,w265p1-bt + - const: amlogic,w155s2-bt + - enum: + - amlogic,w155s2-bt + - amlogic,w265s2-bt + + clocks: + maxItems: 1 + description: clock provided to the controller (32.768KHz) + + enable-gpios: + maxItems: 1 + + vddio-supply: + description: VDD_IO supply regulator handle + + firmware-name: + maxItems: 1 + description: specify the path of firmware bin to load + +required: + - compatible + - clocks + - enable-gpios + - vddio-supply + - firmware-name + +additionalProperties: false + +examples: + - | + #include <dt-bindings/gpio/gpio.h> + bluetooth { + compatible = "amlogic,w155s2-bt"; + clocks = <&extclk>; + enable-gpios = <&gpio 17 GPIO_ACTIVE_HIGH>; + vddio-supply = <&wcn_3v3>; + firmware-name = "amlogic/aml_w155s2_bt_uart.bin"; + }; + diff --git a/Documentation/devicetree/bindings/net/bluetooth/qualcomm-bluetooth.yaml b/Documentation/devicetree/bindings/net/bluetooth/qualcomm-bluetooth.yaml index 68c5ed111417..64a5c5004862 100644 --- a/Documentation/devicetree/bindings/net/bluetooth/qualcomm-bluetooth.yaml +++ 
b/Documentation/devicetree/bindings/net/bluetooth/qualcomm-bluetooth.yaml @@ -172,14 +172,14 @@ allOf: - qcom,wcn6855-bt then: required: - - enable-gpios - - swctrl-gpios - - vddio-supply - - vddbtcxmx-supply - vddrfacmn-supply + - vddaon-supply + - vddwlcx-supply + - vddwlmx-supply + - vddbtcmx-supply - vddrfa0p8-supply - vddrfa1p2-supply - - vddrfa1p7-supply + - vddrfa1p8-supply - if: properties: compatible: diff --git a/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml b/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml index f197d9b516bb..97dd1a7c5ed2 100644 --- a/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml +++ b/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml @@ -17,6 +17,7 @@ properties: compatible: oneOf: - enum: + - fsl,imx95-flexcan - fsl,imx93-flexcan - fsl,imx8qm-flexcan - fsl,imx8mp-flexcan @@ -39,9 +40,6 @@ properties: - fsl,imx6sx-flexcan - const: fsl,imx6q-flexcan - items: - - const: fsl,imx95-flexcan - - const: fsl,imx93-flexcan - - items: - enum: - fsl,ls1028ar1-flexcan - const: fsl,lx2160ar1-flexcan @@ -80,6 +78,10 @@ properties: node then controller is assumed to be little endian. If this property is present then controller is assumed to be big endian. + can-transceiver: + $ref: can-transceiver.yaml# + unevaluatedProperties: false + fsl,stop-mode: description: | Register bits of stop mode control. 
diff --git a/Documentation/devicetree/bindings/net/can/microchip,mcp2510.yaml b/Documentation/devicetree/bindings/net/can/microchip,mcp2510.yaml new file mode 100644 index 000000000000..db446dde6842 --- /dev/null +++ b/Documentation/devicetree/bindings/net/can/microchip,mcp2510.yaml @@ -0,0 +1,70 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/can/microchip,mcp2510.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Microchip MCP251X stand-alone CAN controller + +maintainers: + - Marc Kleine-Budde <mkl@pengutronix.de> + +properties: + compatible: + enum: + - microchip,mcp2510 + - microchip,mcp2515 + - microchip,mcp25625 + + reg: + maxItems: 1 + + clocks: + maxItems: 1 + + interrupts: + maxItems: 1 + + vdd-supply: + description: Regulator that powers the CAN controller. + + xceiver-supply: + description: Regulator that powers the CAN transceiver. + + gpio-controller: true + + "#gpio-cells": + const: 2 + +required: + - compatible + - reg + - clocks + - interrupts + +allOf: + - $ref: /schemas/spi/spi-peripheral-props.yaml# + +unevaluatedProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + + spi { + #address-cells = <1>; + #size-cells = <0>; + + can@1 { + compatible = "microchip,mcp2515"; + reg = <1>; + clocks = <&clk24m>; + interrupt-parent = <&gpio4>; + interrupts = <13 IRQ_TYPE_LEVEL_LOW>; + vdd-supply = <&reg5v0>; + xceiver-supply = <&reg5v0>; + gpio-controller; + #gpio-cells = <2>; + }; + }; + diff --git a/Documentation/devicetree/bindings/net/can/microchip,mcp251x.txt b/Documentation/devicetree/bindings/net/can/microchip,mcp251x.txt deleted file mode 100644 index 381f8fb3e865..000000000000 --- a/Documentation/devicetree/bindings/net/can/microchip,mcp251x.txt +++ /dev/null @@ -1,30 +0,0 @@ -* Microchip MCP251X stand-alone CAN controller device tree bindings - -Required properties: - - compatible: Should be one of the following: - -
"microchip,mcp2510" for MCP2510. - - "microchip,mcp2515" for MCP2515. - - "microchip,mcp25625" for MCP25625. - - reg: SPI chip select. - - clocks: The clock feeding the CAN controller. - - interrupts: Should contain IRQ line for the CAN controller. - -Optional properties: - - vdd-supply: Regulator that powers the CAN controller. - - xceiver-supply: Regulator that powers the CAN transceiver. - - gpio-controller: Indicates this device is a GPIO controller. - - #gpio-cells: Should be two. The first cell is the pin number and - the second cell is used to specify the gpio polarity. - -Example: - can0: can@1 { - compatible = "microchip,mcp2515"; - reg = <1>; - clocks = <&clk24m>; - interrupt-parent = <&gpio4>; - interrupts = <13 IRQ_TYPE_LEVEL_LOW>; - vdd-supply = <®5v0>; - xceiver-supply = <®5v0>; - gpio-controller; - #gpio-cells = <2>; - }; diff --git a/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml b/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml index d3f45d29fa0a..7c5ac5d2e880 100644 --- a/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml +++ b/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml @@ -32,6 +32,7 @@ properties: - enum: - renesas,r8a779a0-canfd # R-Car V3U - renesas,r8a779g0-canfd # R-Car V4H + - renesas,r8a779h0-canfd # R-Car V4M - const: renesas,rcar-gen4-canfd # R-Car Gen4 - items: @@ -163,14 +164,23 @@ allOf: maxItems: 1 - if: - not: - properties: - compatible: - contains: - const: renesas,rcar-gen4-canfd + properties: + compatible: + contains: + const: renesas,r8a779h0-canfd then: patternProperties: - "^channel[2-7]$": false + "^channel[5-7]$": false + else: + if: + not: + properties: + compatible: + contains: + const: renesas,rcar-gen4-canfd + then: + patternProperties: + "^channel[2-7]$": false unevaluatedProperties: false diff --git a/Documentation/devicetree/bindings/net/can/rockchip,rk3568v2-canfd.yaml b/Documentation/devicetree/bindings/net/can/rockchip,rk3568v2-canfd.yaml 
new file mode 100644 index 000000000000..a077c0330013 --- /dev/null +++ b/Documentation/devicetree/bindings/net/can/rockchip,rk3568v2-canfd.yaml @@ -0,0 +1,74 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/can/rockchip,rk3568v2-canfd.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: + Rockchip CAN-FD controller + +maintainers: + - Marc Kleine-Budde <mkl@pengutronix.de> + +allOf: + - $ref: can-controller.yaml# + +properties: + compatible: + oneOf: + - const: rockchip,rk3568v2-canfd + - items: + - const: rockchip,rk3568v3-canfd + - const: rockchip,rk3568v2-canfd + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + clocks: + maxItems: 2 + + clock-names: + items: + - const: baud + - const: pclk + + resets: + maxItems: 2 + + reset-names: + items: + - const: core + - const: apb + +required: + - compatible + - reg + - interrupts + - clocks + - resets + +additionalProperties: false + +examples: + - | + #include <dt-bindings/clock/rk3568-cru.h> + #include <dt-bindings/interrupt-controller/arm-gic.h> + #include <dt-bindings/interrupt-controller/irq.h> + + soc { + #address-cells = <2>; + #size-cells = <2>; + + can@fe570000 { + compatible = "rockchip,rk3568v2-canfd"; + reg = <0x0 0xfe570000 0x0 0x1000>; + interrupts = <GIC_SPI 1 IRQ_TYPE_LEVEL_HIGH>; + clocks = <&cru CLK_CAN0>, <&cru PCLK_CAN0>; + clock-names = "baud", "pclk"; + resets = <&cru SRST_CAN0>, <&cru SRST_P_CAN0>; + reset-names = "core", "apb"; + }; + }; diff --git a/Documentation/devicetree/bindings/net/dsa/mediatek,mt7530.yaml b/Documentation/devicetree/bindings/net/dsa/mediatek,mt7530.yaml index 7e405ad96eb2..ea979bcae1d6 100644 --- a/Documentation/devicetree/bindings/net/dsa/mediatek,mt7530.yaml +++ b/Documentation/devicetree/bindings/net/dsa/mediatek,mt7530.yaml @@ -92,6 +92,10 @@ properties: Built-in switch of the MT7988 SoC const: mediatek,mt7988-switch + - description: + Built-in switch of the Airoha EN7581 SoC + 
const: airoha,en7581-switch + reg: maxItems: 1 @@ -284,7 +288,9 @@ allOf: - if: properties: compatible: - const: mediatek,mt7988-switch + enum: + - mediatek,mt7988-switch + - airoha,en7581-switch then: $ref: "#/$defs/mt7530-dsa-port" properties: diff --git a/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml b/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml index 52acc15ebcbf..30c0c3e6f37a 100644 --- a/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml +++ b/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml @@ -22,7 +22,9 @@ properties: - microchip,ksz8794 - microchip,ksz8795 - microchip,ksz8863 + - microchip,ksz8864 # 4-port version of KSZ8895 family switch - microchip,ksz8873 + - microchip,ksz8895 # 5-port version of KSZ8895 family switch - microchip,ksz9477 - microchip,ksz9897 - microchip,ksz9896 @@ -51,6 +53,11 @@ properties: Set if the output SYNCLKO clock should be disabled. Do not mix with microchip,synclko-125. + microchip,pme-active-high: + $ref: /schemas/types.yaml#/definitions/flag + description: + Indicates if the PME pin polarity is active-high. 
+ microchip,io-drive-strength-microamp: description: IO Pad Drive Strength diff --git a/Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.yaml b/Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.yaml index b99d7a694b70..51cf574249be 100644 --- a/Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.yaml +++ b/Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.yaml @@ -52,6 +52,25 @@ properties: allOf: - $ref: dsa.yaml#/$defs/ethernet-ports +patternProperties: + "^(ethernet-)?ports$": + additionalProperties: true + patternProperties: + "^(ethernet-)?port@6$": + allOf: + - if: + properties: + phy-mode: + contains: + enum: + - rgmii + then: + properties: + rx-internal-delay-ps: + $ref: "#/$defs/internal-delay-ps" + tx-internal-delay-ps: + $ref: "#/$defs/internal-delay-ps" + # This checks if reg is a chipselect so the device is on an SPI # bus, the if-clause will fail if reg is a tuple such as for a # platform device. @@ -67,6 +86,15 @@ required: - compatible - reg +$defs: + internal-delay-ps: + description: + Disable tunable delay lines using 0 ps, or enable them and select + the phase between 1400 ps and 2000 ps in increments of 300 ps. 
+ default: 2000 + enum: + [0, 1400, 1700, 2000] + unevaluatedProperties: false examples: @@ -108,6 +136,8 @@ examples: reg = <6>; ethernet = <&gmac1>; phy-mode = "rgmii"; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <0>; fixed-link { speed = <1000>; full-duplex; @@ -150,6 +180,8 @@ examples: ethernet-port@6 { reg = <6>; ethernet = <&enet0>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <0>; phy-mode = "rgmii"; fixed-link { speed = <1000>; diff --git a/Documentation/devicetree/bindings/net/fsl,qoriq-mc-dpmac.yaml b/Documentation/devicetree/bindings/net/fsl,qoriq-mc-dpmac.yaml index a1b71b35319e..be8a2163b73e 100644 --- a/Documentation/devicetree/bindings/net/fsl,qoriq-mc-dpmac.yaml +++ b/Documentation/devicetree/bindings/net/fsl,qoriq-mc-dpmac.yaml @@ -24,24 +24,20 @@ properties: maxItems: 1 description: The DPMAC number - phy-handle: true - - phy-connection-type: true - - phy-mode: true - pcs-handle: maxItems: 1 description: A reference to a node representing a PCS PHY device found on the internal MDIO bus. - managed: true + phys: + description: A reference to the SerDes lane(s) + maxItems: 1 required: - reg -additionalProperties: false +unevaluatedProperties: false examples: - | diff --git a/Documentation/devicetree/bindings/net/mdio.yaml b/Documentation/devicetree/bindings/net/mdio.yaml index a266ade918ca..bed3987a8fbf 100644 --- a/Documentation/devicetree/bindings/net/mdio.yaml +++ b/Documentation/devicetree/bindings/net/mdio.yaml @@ -19,7 +19,7 @@ description: properties: $nodename: - pattern: "^mdio(@.*)?" 
+ pattern: '^mdio(-(bus|external))?(@.+|-([0-9]+))?$' "#address-cells": const: 1 diff --git a/Documentation/devicetree/bindings/net/mediatek,net.yaml b/Documentation/devicetree/bindings/net/mediatek,net.yaml index 686b5c2fae40..9e02fd80af83 100644 --- a/Documentation/devicetree/bindings/net/mediatek,net.yaml +++ b/Documentation/devicetree/bindings/net/mediatek,net.yaml @@ -30,8 +30,13 @@ properties: reg: maxItems: 1 - clocks: true - clock-names: true + clocks: + minItems: 2 + maxItems: 24 + + clock-names: + minItems: 2 + maxItems: 24 interrupts: minItems: 1 @@ -127,6 +132,7 @@ allOf: then: properties: interrupts: + minItems: 3 maxItems: 3 clocks: @@ -183,6 +189,7 @@ allOf: then: properties: interrupts: + minItems: 3 maxItems: 3 clocks: @@ -222,6 +229,7 @@ allOf: then: properties: interrupts: + minItems: 3 maxItems: 3 clocks: diff --git a/Documentation/devicetree/bindings/net/microchip,lan8650.yaml b/Documentation/devicetree/bindings/net/microchip,lan8650.yaml new file mode 100644 index 000000000000..61e11d4a07c4 --- /dev/null +++ b/Documentation/devicetree/bindings/net/microchip,lan8650.yaml @@ -0,0 +1,74 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/microchip,lan8650.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Microchip LAN8650/1 10BASE-T1S MACPHY Ethernet Controllers + +maintainers: + - Parthiban Veerasooran <parthiban.veerasooran@microchip.com> + +description: + The LAN8650/1 combines a Media Access Controller (MAC) and an Ethernet + PHY to enable 10BASE‑T1S networks. The Ethernet Media Access Controller + (MAC) module implements a 10 Mbps half duplex Ethernet MAC, compatible + with the IEEE 802.3 standard and a 10BASE-T1S physical layer transceiver + integrated into the LAN8650/1. The communication between the Host and + the MAC-PHY is specified in the OPEN Alliance 10BASE-T1x MACPHY Serial + Interface (TC6). 
+ +allOf: + - $ref: /schemas/net/ethernet-controller.yaml# + - $ref: /schemas/spi/spi-peripheral-props.yaml# + +properties: + compatible: + oneOf: + - const: microchip,lan8650 + - items: + - const: microchip,lan8651 + - const: microchip,lan8650 + + reg: + maxItems: 1 + + interrupts: + description: + Interrupt from MAC-PHY asserted in the event of Receive Chunks + Available, Transmit Chunk Credits Available and Extended Status + Event. + maxItems: 1 + + spi-max-frequency: + minimum: 15000000 + maximum: 25000000 + +required: + - compatible + - reg + - interrupts + - spi-max-frequency + +unevaluatedProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + #include <dt-bindings/gpio/gpio.h> + + spi { + #address-cells = <1>; + #size-cells = <0>; + + ethernet@0 { + compatible = "microchip,lan8651", "microchip,lan8650"; + reg = <0>; + pinctrl-names = "default"; + pinctrl-0 = <&eth0_pins>; + interrupt-parent = <&gpio>; + interrupts = <6 IRQ_TYPE_EDGE_FALLING>; + local-mac-address = [04 05 06 01 02 03]; + spi-max-frequency = <15000000>; + }; + }; diff --git a/Documentation/devicetree/bindings/net/nxp,tja11xx.yaml b/Documentation/devicetree/bindings/net/nxp,tja11xx.yaml index 85bfa45f5122..a754a61adc2d 100644 --- a/Documentation/devicetree/bindings/net/nxp,tja11xx.yaml +++ b/Documentation/devicetree/bindings/net/nxp,tja11xx.yaml @@ -14,8 +14,53 @@ maintainers: description: Bindings for NXP TJA11xx automotive PHYs +properties: + compatible: + enum: + - ethernet-phy-id0180.dc40 + - ethernet-phy-id0180.dc41 + - ethernet-phy-id0180.dc48 + - ethernet-phy-id0180.dd00 + - ethernet-phy-id0180.dd01 + - ethernet-phy-id0180.dd02 + - ethernet-phy-id0180.dc80 + - ethernet-phy-id0180.dc82 + - ethernet-phy-id001b.b010 + - ethernet-phy-id001b.b013 + - ethernet-phy-id001b.b030 + - ethernet-phy-id001b.b031 + allOf: - $ref: ethernet-phy.yaml# + - if: + properties: + compatible: + contains: + enum: + - ethernet-phy-id0180.dc40 + - ethernet-phy-id0180.dc41 + -
ethernet-phy-id0180.dc48 + - ethernet-phy-id0180.dd00 + - ethernet-phy-id0180.dd01 + - ethernet-phy-id0180.dd02 + + then: + properties: + nxp,rmii-refclk-in: + type: boolean + description: | + The REF_CLK is provided for both transmitted and received data + in RMII mode. This clock signal is provided by the PHY and is + typically derived from an external 25MHz crystal. Alternatively, + a 50MHz clock signal generated by an external oscillator can be + connected to pin REF_CLK. A third option is to connect a 25MHz + clock to pin CLK_IN_OUT. So, the REF_CLK should be configured + as input or output according to the actual circuit connection. + If present, indicates that the REF_CLK will be configured as + interface reference clock input when RMII mode enabled. + If not present, the REF_CLK will be configured as interface + reference clock output when RMII mode enabled. + Only supported on TJA1100 and TJA1101. patternProperties: "^ethernet-phy@[0-9a-f]+$": @@ -32,22 +77,6 @@ patternProperties: description: The ID number for the child PHY. Should be +1 of parent PHY. - nxp,rmii-refclk-in: - type: boolean - description: | - The REF_CLK is provided for both transmitted and received data - in RMII mode. This clock signal is provided by the PHY and is - typically derived from an external 25MHz crystal. Alternatively, - a 50MHz clock signal generated by an external oscillator can be - connected to pin REF_CLK. A third option is to connect a 25MHz - clock to pin CLK_IN_OUT. So, the REF_CLK should be configured - as input or output according to the actual circuit connection. - If present, indicates that the REF_CLK will be configured as - interface reference clock input when RMII mode enabled. - If not present, the REF_CLK will be configured as interface - reference clock output when RMII mode enabled. - Only supported on TJA1100 and TJA1101. 
- required: - reg @@ -60,6 +89,7 @@ examples: #size-cells = <0>; tja1101_phy0: ethernet-phy@4 { + compatible = "ethernet-phy-id0180.dc40"; reg = <0x4>; nxp,rmii-refclk-in; }; diff --git a/Documentation/devicetree/bindings/net/pse-pd/ti,tps23881.yaml b/Documentation/devicetree/bindings/net/pse-pd/ti,tps23881.yaml index 6992d56832bf..d08abcb01211 100644 --- a/Documentation/devicetree/bindings/net/pse-pd/ti,tps23881.yaml +++ b/Documentation/devicetree/bindings/net/pse-pd/ti,tps23881.yaml @@ -23,6 +23,9 @@ properties: '#pse-cells': const: 1 + reset-gpios: + maxItems: 1 + channels: description: each set of 8 ports can be assigned to one physical channels or two for PoE4. This parameter describes the configuration diff --git a/Documentation/devicetree/bindings/net/renesas,etheravb.yaml b/Documentation/devicetree/bindings/net/renesas,etheravb.yaml index 21a92f179093..1e00ef5b3acd 100644 --- a/Documentation/devicetree/bindings/net/renesas,etheravb.yaml +++ b/Documentation/devicetree/bindings/net/renesas,etheravb.yaml @@ -62,15 +62,27 @@ properties: - renesas,r9a08g045-gbeth # RZ/G3S - const: renesas,rzg2l-gbeth # RZ/{G2L,G2UL,V2L} family - reg: true + reg: + minItems: 1 + items: + - description: MAC register block + - description: Stream buffer - interrupts: true + interrupts: + minItems: 1 + maxItems: 29 - interrupt-names: true + interrupt-names: + minItems: 1 + maxItems: 29 - clocks: true + clocks: + minItems: 1 + maxItems: 3 - clock-names: true + clock-names: + minItems: 1 + maxItems: 3 iommus: maxItems: 1 @@ -150,14 +162,11 @@ allOf: then: properties: reg: - items: - - description: MAC register block - - description: Stream buffer + minItems: 2 else: properties: reg: - items: - - description: MAC register block + maxItems: 1 - if: properties: diff --git a/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml b/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml index 6bbe96e35250..f8a576611d6c 100644 --- 
a/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml +++ b/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml @@ -25,6 +25,7 @@ select: - rockchip,rk3368-gmac - rockchip,rk3399-gmac - rockchip,rk3568-gmac + - rockchip,rk3576-gmac - rockchip,rk3588-gmac - rockchip,rv1108-gmac - rockchip,rv1126-gmac @@ -52,6 +53,7 @@ properties: - items: - enum: - rockchip,rk3568-gmac + - rockchip,rk3576-gmac - rockchip,rk3588-gmac - rockchip,rv1126-gmac - const: snps,dwmac-4.20a diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml index 3eb65e63fdae..4e2ba1bf788c 100644 --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml @@ -80,6 +80,7 @@ properties: - rockchip,rk3328-gmac - rockchip,rk3366-gmac - rockchip,rk3368-gmac + - rockchip,rk3576-gmac - rockchip,rk3588-gmac - rockchip,rk3399-gmac - rockchip,rv1108-gmac diff --git a/Documentation/devicetree/bindings/net/socionext,uniphier-ave4.yaml b/Documentation/devicetree/bindings/net/socionext,uniphier-ave4.yaml index b0ebcef6801c..4eb63b303cff 100644 --- a/Documentation/devicetree/bindings/net/socionext,uniphier-ave4.yaml +++ b/Documentation/devicetree/bindings/net/socionext,uniphier-ave4.yaml @@ -41,13 +41,17 @@ properties: minItems: 1 maxItems: 4 - clock-names: true + clock-names: + minItems: 1 + maxItems: 4 resets: minItems: 1 maxItems: 2 - reset-names: true + reset-names: + minItems: 1 + maxItems: 2 socionext,syscon-phy-mode: $ref: /schemas/types.yaml#/definitions/phandle-array diff --git a/Documentation/devicetree/bindings/net/wireless/marvell,sd8787.yaml b/Documentation/devicetree/bindings/net/wireless/marvell,sd8787.yaml new file mode 100644 index 000000000000..1715b22e0dcf --- /dev/null +++ b/Documentation/devicetree/bindings/net/wireless/marvell,sd8787.yaml @@ -0,0 +1,93 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: 
http://devicetree.org/schemas/net/wireless/marvell,sd8787.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Marvell 8787/8897/8978/8997 (sd8787/sd8897/sd8978/sd8997/pcie8997) SDIO/PCIE devices + +maintainers: + - Brian Norris <briannorris@chromium.org> + - Frank Li <Frank.Li@nxp.com> + +description: + This node provides properties for describing the Marvell SDIO/PCIE wireless device. + The node is expected to be specified as a child node to the SDIO/PCIE controller that + connects the device to the system. + +properties: + compatible: + enum: + - marvell,sd8787 + - marvell,sd8897 + - marvell,sd8978 + - marvell,sd8997 + - nxp,iw416 + - pci11ab,2b42 + - pci1b4b,2b42 + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + wakeup-source: true + + marvell,caldata-txpwrlimit-2g: + $ref: /schemas/types.yaml#/definitions/uint8-array + description: Calibration data for the 2GHz band. + maxItems: 566 + + marvell,caldata-txpwrlimit-5g-sub0: + $ref: /schemas/types.yaml#/definitions/uint8-array + description: Calibration data for sub-band 0 in the 5GHz band. + maxItems: 502 + + marvell,caldata-txpwrlimit-5g-sub1: + $ref: /schemas/types.yaml#/definitions/uint8-array + description: Calibration data for sub-band 1 in the 5GHz band. + maxItems: 688 + + marvell,caldata-txpwrlimit-5g-sub2: + $ref: /schemas/types.yaml#/definitions/uint8-array + description: Calibration data for sub-band 2 in the 5GHz band. + maxItems: 750 + + marvell,caldata-txpwrlimit-5g-sub3: + $ref: /schemas/types.yaml#/definitions/uint8-array + description: Calibration data for sub-band 3 in the 5GHz band. + maxItems: 502 + + marvell,wakeup-pin: + $ref: /schemas/types.yaml#/definitions/uint32 + description: + Provides the pin number for the wakeup pin from the device's point of + view. The wakeup pin is used for the device to wake the host system + from sleep. 
This property is only necessary if the wakeup pin is + wired in a non-standard way, such that the default pin assignments + are invalid. + +required: + - compatible + - reg + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + + mmc { + #address-cells = <1>; + #size-cells = <0>; + + wifi@1 { + compatible = "marvell,sd8897"; + reg = <1>; + interrupt-parent = <&pio>; + interrupts = <38 IRQ_TYPE_LEVEL_LOW>; + marvell,wakeup-pin = <3>; + }; + }; + diff --git a/Documentation/devicetree/bindings/net/wireless/marvell-8xxx.txt b/Documentation/devicetree/bindings/net/wireless/marvell-8xxx.txt deleted file mode 100644 index cdc303caf5f4..000000000000 --- a/Documentation/devicetree/bindings/net/wireless/marvell-8xxx.txt +++ /dev/null @@ -1,70 +0,0 @@ -Marvell 8787/8897/8978/8997 (sd8787/sd8897/sd8978/sd8997/pcie8997) SDIO/PCIE devices ------- - -This node provides properties for controlling the Marvell SDIO/PCIE wireless device. -The node is expected to be specified as a child node to the SDIO/PCIE controller that -connects the device to the system. - -Required properties: - - - compatible : should be one of the following: - * "marvell,sd8787" - * "marvell,sd8897" - * "marvell,sd8978" - * "marvell,sd8997" - * "nxp,iw416" - * "pci11ab,2b42" - * "pci1b4b,2b42" - -Optional properties: - - - marvell,caldata* : A series of properties with marvell,caldata prefix, - represent calibration data downloaded to the device during - initialization. This is an array of unsigned 8-bit values. - the properties should follow below property name and - corresponding array length: - "marvell,caldata-txpwrlimit-2g" (length = 566). - "marvell,caldata-txpwrlimit-5g-sub0" (length = 502). - "marvell,caldata-txpwrlimit-5g-sub1" (length = 688). - "marvell,caldata-txpwrlimit-5g-sub2" (length = 750). - "marvell,caldata-txpwrlimit-5g-sub3" (length = 502). - - marvell,wakeup-pin : a wakeup pin number of wifi chip which will be configured - to firmware. 
Firmware will wakeup the host using this pin - during suspend/resume. - - interrupts : interrupt pin number to the cpu. driver will request an irq based on - this interrupt number. during system suspend, the irq will be enabled - so that the wifi chip can wakeup host platform under certain condition. - during system resume, the irq will be disabled to make sure - unnecessary interrupt is not received. - - vmmc-supply: a phandle of a regulator, supplying VCC to the card - - mmc-pwrseq: phandle to the MMC power sequence node. See "mmc-pwrseq-*" - for documentation of MMC power sequence bindings. - -Example: - -Tx power limit calibration data is configured in below example. -The calibration data is an array of unsigned values, the length -can vary between hw versions. -IRQ pin 38 is used as system wakeup source interrupt. wakeup pin 3 is configured -so that firmware can wakeup host using this device side pin. - -&mmc3 { - vmmc-supply = <&wlan_en_reg>; - mmc-pwrseq = <&wifi_pwrseq>; - bus-width = <4>; - cap-power-off-card; - keep-power-in-suspend; - - #address-cells = <1>; - #size-cells = <0>; - mwifiex: wifi@1 { - compatible = "marvell,sd8897"; - reg = <1>; - interrupt-parent = <&pio>; - interrupts = <38 IRQ_TYPE_LEVEL_LOW>; - - marvell,caldata_00_txpwrlimit_2g_cfg_set = /bits/ 8 < - 0x01 0x00 0x06 0x00 0x08 0x02 0x89 0x01>; - marvell,wakeup-pin = <3>; - }; -}; diff --git a/Documentation/devicetree/bindings/nvmem/xlnx,zynqmp-nvmem.yaml b/Documentation/devicetree/bindings/nvmem/xlnx,zynqmp-nvmem.yaml index 917c40d5c382..1cbe44ab23b1 100644 --- a/Documentation/devicetree/bindings/nvmem/xlnx,zynqmp-nvmem.yaml +++ b/Documentation/devicetree/bindings/nvmem/xlnx,zynqmp-nvmem.yaml @@ -28,7 +28,7 @@ unevaluatedProperties: false examples: - | - nvmem { + soc-nvmem { compatible = "xlnx,zynqmp-nvmem-fw"; nvmem-layout { compatible = "fixed-layout"; diff --git a/Documentation/devicetree/bindings/opp/operating-points-v2-ti-cpu.yaml 
b/Documentation/devicetree/bindings/opp/operating-points-v2-ti-cpu.yaml index 02d1d2c17129..fd0c8d5c5f3e 100644 --- a/Documentation/devicetree/bindings/opp/operating-points-v2-ti-cpu.yaml +++ b/Documentation/devicetree/bindings/opp/operating-points-v2-ti-cpu.yaml @@ -19,7 +19,7 @@ description: the hardware description for the scheme mentioned above. maintainers: - - Nishanth Menon <nm@ti.com> + - Dhruva Gole <d-gole@ti.com> allOf: - $ref: opp-v2-base.yaml# diff --git a/Documentation/devicetree/bindings/perf/arm,cmn.yaml b/Documentation/devicetree/bindings/perf/arm,cmn.yaml index 2e51072e794a..0e9d665584e6 100644 --- a/Documentation/devicetree/bindings/perf/arm,cmn.yaml +++ b/Documentation/devicetree/bindings/perf/arm,cmn.yaml @@ -16,6 +16,7 @@ properties: - arm,cmn-600 - arm,cmn-650 - arm,cmn-700 + - arm,cmn-s3 - arm,ci-700 reg: diff --git a/Documentation/devicetree/bindings/perf/arm,ni.yaml b/Documentation/devicetree/bindings/perf/arm,ni.yaml new file mode 100644 index 000000000000..d66fffa256d5 --- /dev/null +++ b/Documentation/devicetree/bindings/perf/arm,ni.yaml @@ -0,0 +1,30 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/perf/arm,ni.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Arm NI (Network-on-Chip Interconnect) Performance Monitors + +maintainers: + - Robin Murphy <robin.murphy@arm.com> + +properties: + compatible: + const: arm,ni-700 + + reg: + items: + - description: Complete configuration register space + + interrupts: + minItems: 1 + maxItems: 32 + description: Overflow interrupts, one per clock domain, in order of domain ID + +required: + - compatible + - reg + - interrupts + +additionalProperties: false diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,mdm9607-tlmm.yaml b/Documentation/devicetree/bindings/pinctrl/qcom,mdm9607-tlmm.yaml index bd3cbb44c99a..e75393b3d196 100644 --- a/Documentation/devicetree/bindings/pinctrl/qcom,mdm9607-tlmm.yaml +++ 
b/Documentation/devicetree/bindings/pinctrl/qcom,mdm9607-tlmm.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Technologies, Inc. MDM9607 TLMM block maintainers: - - Konrad Dybcio <konrad.dybcio@somainline.org> + - Konrad Dybcio <konradybcio@kernel.org> description: Top Level Mode Multiplexer pin controller in Qualcomm MDM9607 SoC. diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,sm6350-tlmm.yaml b/Documentation/devicetree/bindings/pinctrl/qcom,sm6350-tlmm.yaml index a4771f87d936..b262af6be97d 100644 --- a/Documentation/devicetree/bindings/pinctrl/qcom,sm6350-tlmm.yaml +++ b/Documentation/devicetree/bindings/pinctrl/qcom,sm6350-tlmm.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Technologies, Inc. SM6350 TLMM block maintainers: - - Konrad Dybcio <konrad.dybcio@somainline.org> + - Konrad Dybcio <konradybcio@kernel.org> description: Top Level Mode Multiplexer pin controller in Qualcomm SM6350 SoC. diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,sm6375-tlmm.yaml b/Documentation/devicetree/bindings/pinctrl/qcom,sm6375-tlmm.yaml index 047f82863f9b..c11af09c3f5b 100644 --- a/Documentation/devicetree/bindings/pinctrl/qcom,sm6375-tlmm.yaml +++ b/Documentation/devicetree/bindings/pinctrl/qcom,sm6375-tlmm.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Technologies, Inc. SM6375 TLMM block maintainers: - - Konrad Dybcio <konrad.dybcio@somainline.org> + - Konrad Dybcio <konradybcio@kernel.org> description: Top Level Mode Multiplexer pin controller in Qualcomm SM6375 SoC. 
diff --git a/Documentation/devicetree/bindings/ptp/fsl,ptp.yaml b/Documentation/devicetree/bindings/ptp/fsl,ptp.yaml index 3bb8615e3e91..42ca895f3c4e 100644 --- a/Documentation/devicetree/bindings/ptp/fsl,ptp.yaml +++ b/Documentation/devicetree/bindings/ptp/fsl,ptp.yaml @@ -11,11 +11,14 @@ maintainers: properties: compatible: - enum: - - fsl,etsec-ptp - - fsl,fman-ptp-timer - - fsl,dpaa2-ptp - - fsl,enetc-ptp + oneOf: + - enum: + - fsl,etsec-ptp + - fsl,fman-ptp-timer + - fsl,dpaa2-ptp + - items: + - const: pci1957,ee02 + - const: fsl,enetc-ptp reg: maxItems: 1 @@ -123,6 +126,15 @@ required: - compatible - reg +allOf: + - if: + properties: + compatible: + contains: + const: fsl,enetc-ptp + then: + $ref: /schemas/pci/pci-device.yaml + additionalProperties: false examples: diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,rpm-proc.yaml b/Documentation/devicetree/bindings/remoteproc/qcom,rpm-proc.yaml index 7afafde17a38..61cf4fe19ca5 100644 --- a/Documentation/devicetree/bindings/remoteproc/qcom,rpm-proc.yaml +++ b/Documentation/devicetree/bindings/remoteproc/qcom,rpm-proc.yaml @@ -8,7 +8,7 @@ title: Qualcomm Resource Power Manager (RPM) Processor/Subsystem maintainers: - Bjorn Andersson <andersson@kernel.org> - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> - Stephan Gerhold <stephan@gerhold.net> description: | diff --git a/Documentation/devicetree/bindings/rng/rockchip,rk3568-rng.yaml b/Documentation/devicetree/bindings/rng/rockchip,rk3568-rng.yaml new file mode 100644 index 000000000000..e0595814a6d9 --- /dev/null +++ b/Documentation/devicetree/bindings/rng/rockchip,rk3568-rng.yaml @@ -0,0 +1,61 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/rng/rockchip,rk3568-rng.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Rockchip RK3568 TRNG + +description: True Random Number Generator on Rockchip RK3568 SoC + +maintainers: + - 
Aurelien Jarno <aurelien@aurel32.net> + - Daniel Golle <daniel@makrotopia.org> + +properties: + compatible: + enum: + - rockchip,rk3568-rng + + reg: + maxItems: 1 + + clocks: + items: + - description: TRNG clock + - description: TRNG AHB clock + + clock-names: + items: + - const: core + - const: ahb + + resets: + maxItems: 1 + +required: + - compatible + - reg + - clocks + - clock-names + - resets + +additionalProperties: false + +examples: + - | + #include <dt-bindings/clock/rk3568-cru.h> + bus { + #address-cells = <2>; + #size-cells = <2>; + + rng@fe388000 { + compatible = "rockchip,rk3568-rng"; + reg = <0x0 0xfe388000 0x0 0x4000>; + clocks = <&cru CLK_TRNG_NS>, <&cru HCLK_TRNG_NS>; + clock-names = "core", "ahb"; + resets = <&cru SRST_TRNG_NS>; + }; + }; + +... diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,rpm-master-stats.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,rpm-master-stats.yaml index 9410404f87f1..ad2dcc39a5f5 100644 --- a/Documentation/devicetree/bindings/soc/qcom/qcom,rpm-master-stats.yaml +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,rpm-master-stats.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm Technologies, Inc. 
(QTI) RPM Master Stats maintainers: - - Konrad Dybcio <konrad.dybcio@linaro.org> + - Konrad Dybcio <konradybcio@kernel.org> description: | The Qualcomm RPM (Resource Power Manager) architecture includes a concept diff --git a/Documentation/devicetree/bindings/soc/rockchip/grf.yaml b/Documentation/devicetree/bindings/soc/rockchip/grf.yaml index 78c6d5b64138..35b20e53b513 100644 --- a/Documentation/devicetree/bindings/soc/rockchip/grf.yaml +++ b/Documentation/devicetree/bindings/soc/rockchip/grf.yaml @@ -31,11 +31,17 @@ properties: - rockchip,rk3588-pcie3-pipe-grf - rockchip,rk3588-usb-grf - rockchip,rk3588-usbdpphy-grf - - rockchip,rk3588-vo-grf + - rockchip,rk3588-vo0-grf + - rockchip,rk3588-vo1-grf - rockchip,rk3588-vop-grf - rockchip,rv1108-usbgrf - const: syscon - items: + - const: rockchip,rk3588-vo-grf + - const: syscon + deprecated: true + description: Use rockchip,rk3588-vo{0,1}-grf instead. + - items: - enum: - rockchip,px30-grf - rockchip,px30-pmugrf @@ -262,6 +268,8 @@ allOf: contains: enum: - rockchip,rk3588-vo-grf + - rockchip,rk3588-vo0-grf + - rockchip,rk3588-vo1-grf then: required: diff --git a/Documentation/devicetree/bindings/soc/ti/ti,pruss.yaml b/Documentation/devicetree/bindings/soc/ti/ti,pruss.yaml index c402cb2928e8..3cb1471cc6b6 100644 --- a/Documentation/devicetree/bindings/soc/ti/ti,pruss.yaml +++ b/Documentation/devicetree/bindings/soc/ti/ti,pruss.yaml @@ -278,6 +278,26 @@ patternProperties: additionalProperties: false + ^pa-stats@[a-f0-9]+$: + description: | + PA-STATS sub-module represented as a SysCon. PA_STATS is a set of + registers where different statistics related to ICSSG, are dumped by + ICSSG firmware. This syscon sub-module will help the device to + access/read/write those statistics. + + type: object + + additionalProperties: false + + properties: + compatible: + items: + - const: ti,pruss-pa-st + - const: syscon + + reg: + maxItems: 1 + interrupt-controller@[a-f0-9]+$: description: | PRUSS INTC Node. 
Each PRUSS has a single interrupt controller instance diff --git a/Documentation/devicetree/bindings/sound/qcom,wcd934x.yaml b/Documentation/devicetree/bindings/sound/qcom,wcd934x.yaml index beb0ff0245b0..a65b1d1d5fdd 100644 --- a/Documentation/devicetree/bindings/sound/qcom,wcd934x.yaml +++ b/Documentation/devicetree/bindings/sound/qcom,wcd934x.yaml @@ -199,10 +199,11 @@ additionalProperties: false examples: - | + #include <dt-bindings/gpio/gpio.h> codec@1,0{ compatible = "slim217,250"; reg = <1 0>; - reset-gpios = <&tlmm 64 0>; + reset-gpios = <&tlmm 64 GPIO_ACTIVE_LOW>; slim-ifc-dev = <&wcd9340_ifd>; #sound-dai-cells = <1>; interrupt-parent = <&tlmm>; diff --git a/Documentation/devicetree/bindings/sound/qcom,wcd937x.yaml b/Documentation/devicetree/bindings/sound/qcom,wcd937x.yaml index de397d879acc..f94203798f24 100644 --- a/Documentation/devicetree/bindings/sound/qcom,wcd937x.yaml +++ b/Documentation/devicetree/bindings/sound/qcom,wcd937x.yaml @@ -42,7 +42,7 @@ examples: pinctrl-names = "default", "sleep"; pinctrl-0 = <&wcd_reset_n>; pinctrl-1 = <&wcd_reset_n_sleep>; - reset-gpios = <&tlmm 83 GPIO_ACTIVE_HIGH>; + reset-gpios = <&tlmm 83 GPIO_ACTIVE_LOW>; vdd-buck-supply = <&vreg_l17b_1p8>; vdd-rxtx-supply = <&vreg_l18b_1p8>; vdd-px-supply = <&vreg_l18b_1p8>; diff --git a/Documentation/devicetree/bindings/sound/qcom,wcd938x.yaml b/Documentation/devicetree/bindings/sound/qcom,wcd938x.yaml index cf6c3787adfe..10531350c336 100644 --- a/Documentation/devicetree/bindings/sound/qcom,wcd938x.yaml +++ b/Documentation/devicetree/bindings/sound/qcom,wcd938x.yaml @@ -34,9 +34,10 @@ unevaluatedProperties: false examples: - | + #include <dt-bindings/gpio/gpio.h> codec { compatible = "qcom,wcd9380-codec"; - reset-gpios = <&tlmm 32 0>; + reset-gpios = <&tlmm 32 GPIO_ACTIVE_LOW>; #sound-dai-cells = <1>; qcom,tx-device = <&wcd938x_tx>; qcom,rx-device = <&wcd938x_rx>; diff --git a/Documentation/devicetree/bindings/sound/qcom,wcd939x.yaml 
b/Documentation/devicetree/bindings/sound/qcom,wcd939x.yaml index 6e76f6a8634f..c69291f4d575 100644 --- a/Documentation/devicetree/bindings/sound/qcom,wcd939x.yaml +++ b/Documentation/devicetree/bindings/sound/qcom,wcd939x.yaml @@ -52,10 +52,10 @@ unevaluatedProperties: false examples: - | - #include <dt-bindings/interrupt-controller/irq.h> + #include <dt-bindings/gpio/gpio.h> codec { compatible = "qcom,wcd9390-codec"; - reset-gpios = <&tlmm 32 IRQ_TYPE_NONE>; + reset-gpios = <&tlmm 32 GPIO_ACTIVE_LOW>; #sound-dai-cells = <1>; qcom,tx-device = <&wcd939x_tx>; qcom,rx-device = <&wcd939x_rx>; diff --git a/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml b/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml index 725303e1a364..70b273271754 100644 --- a/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml +++ b/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml @@ -32,6 +32,9 @@ properties: clocks: maxItems: 1 + power-domains: + maxItems: 1 + amlogic,ao-secure: description: phandle to the ao-secure syscon $ref: /schemas/types.yaml#/definitions/phandle diff --git a/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml b/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml index 72048c5a0412..d45690d6a465 100644 --- a/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml +++ b/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml @@ -51,6 +51,7 @@ properties: - qcom,msm8996-tsens - qcom,msm8998-tsens - qcom,qcm2290-tsens + - qcom,sa8255p-tsens - qcom,sa8775p-tsens - qcom,sc7180-tsens - qcom,sc7280-tsens diff --git a/Documentation/devicetree/bindings/usb/microchip,usb2514.yaml b/Documentation/devicetree/bindings/usb/microchip,usb2514.yaml index 783c27591e56..b14e6f37b298 100644 --- a/Documentation/devicetree/bindings/usb/microchip,usb2514.yaml +++ b/Documentation/devicetree/bindings/usb/microchip,usb2514.yaml @@ -10,7 +10,7 @@ maintainers: - Fabio Estevam <festevam@gmail.com> allOf: - - $ref: usb-hcd.yaml# + - 
$ref: usb-device.yaml# properties: compatible: @@ -18,6 +18,7 @@ properties: - usb424,2412 - usb424,2417 - usb424,2514 + - usb424,2517 reg: true @@ -35,6 +36,13 @@ required: - compatible - reg +patternProperties: + "^.*@[0-9a-f]{1,2}$": + description: The hard wired USB devices + type: object + $ref: /schemas/usb/usb-device.yaml + additionalProperties: true + unevaluatedProperties: false examples: diff --git a/Documentation/driver-api/dpll.rst b/Documentation/driver-api/dpll.rst index ea8d16600e16..e6855cd37e85 100644 --- a/Documentation/driver-api/dpll.rst +++ b/Documentation/driver-api/dpll.rst @@ -214,6 +214,27 @@ offset values are fractional with 3-digit decimal places and shell be divided with ``DPLL_PIN_PHASE_OFFSET_DIVIDER`` to get integer part and modulo divided to get fractional part. +Embedded SYNC +============= + +Device may provide ability to use Embedded SYNC feature. It allows +to embed additional SYNC signal into the base frequency of a pin - a one +special pulse of base frequency signal every time SYNC signal pulse +happens. The user can configure the frequency of Embedded SYNC. +The Embedded SYNC capability is always related to a given base frequency +and HW capabilities. The user is provided a range of Embedded SYNC +frequencies supported, depending on current base frequency configured for +the pin. 
+ + ========================================= ================================= + ``DPLL_A_PIN_ESYNC_FREQUENCY`` current Embedded SYNC frequency + ``DPLL_A_PIN_ESYNC_FREQUENCY_SUPPORTED`` nest available Embedded SYNC + frequency ranges + ``DPLL_A_PIN_FREQUENCY_MIN`` attr minimum value of frequency + ``DPLL_A_PIN_FREQUENCY_MAX`` attr maximum value of frequency + ``DPLL_A_PIN_ESYNC_PULSE`` pulse type of Embedded SYNC + ========================================= ================================= + Configuration commands group ============================ diff --git a/Documentation/driver-api/thermal/sysfs-api.rst b/Documentation/driver-api/thermal/sysfs-api.rst index 6c1175c6afba..c803b89b7248 100644 --- a/Documentation/driver-api/thermal/sysfs-api.rst +++ b/Documentation/driver-api/thermal/sysfs-api.rst @@ -4,8 +4,6 @@ Generic Thermal Sysfs driver How To Written by Sujith Thomas <sujith.thomas@intel.com>, Zhang Rui <rui.zhang@intel.com> -Updated: 2 January 2008 - Copyright (c) 2008 Intel Corporation @@ -38,61 +36,57 @@ temperature) and throttle appropriate devices. :: - struct thermal_zone_device - *thermal_zone_device_register(char *type, - int trips, int mask, void *devdata, - struct thermal_zone_device_ops *ops, - const struct thermal_zone_params *tzp, - int passive_delay, int polling_delay)) + struct thermal_zone_device * + thermal_zone_device_register_with_trips(const char *type, + const struct thermal_trip *trips, + int num_trips, void *devdata, + const struct thermal_zone_device_ops *ops, + const struct thermal_zone_params *tzp, + unsigned int passive_delay, + unsigned int polling_delay) - This interface function adds a new thermal zone device (sensor) to + This interface function adds a new thermal zone device (sensor) to the /sys/class/thermal folder as `thermal_zone[0-*]`. It tries to bind all the - thermal cooling devices registered at the same time. + thermal cooling devices registered to it at the same time. type: the thermal zone type. 
trips: - the total number of trip points this thermal zone supports. - mask: - Bit string: If 'n'th bit is set, then trip point 'n' is writable. + the table of trip points for this thermal zone. devdata: device private data ops: thermal zone device call-backs. - .bind: - bind the thermal zone device with a thermal cooling device. - .unbind: - unbind the thermal zone device with a thermal cooling device. + .should_bind: + check whether or not a given cooling device should be bound to + a given trip point in this thermal zone. .get_temp: get the current temperature of the thermal zone. .set_trips: - set the trip points window. Whenever the current temperature - is updated, the trip points immediately below and above the - current temperature are found. - .get_mode: - get the current mode (enabled/disabled) of the thermal zone. - - - "enabled" means the kernel thermal management is - enabled. - - "disabled" will prevent kernel thermal driver action - upon trip points so that user applications can take - charge of thermal management. - .set_mode: - set the mode (enabled/disabled) of the thermal zone. - .get_trip_type: - get the type of certain trip point. - .get_trip_temp: - get the temperature above which the certain trip point - will be fired. + set the trip points window. Whenever the current temperature + is updated, the trip points immediately below and above the + current temperature are found. + .change_mode: + change the mode (enabled/disabled) of the thermal zone. + .set_trip_temp: + set the temperature of a given trip point. + .get_crit_temp: + get the critical temperature for this thermal zone. .set_emul_temp: - set the emulation temperature which helps in debugging - different threshold temperature points. + set the emulation temperature which helps in debugging + different threshold temperature points. + .get_trend: + get the trend of most recent zone temperature changes. + .hot: + hot trip point crossing handler. 
+ .critical: + critical trip point crossing handler. tzp: thermal zone platform parameters. passive_delay: - number of milliseconds to wait between polls when - performing passive cooling. + number of milliseconds to wait between polls when performing passive + cooling. polling_delay: number of milliseconds to wait between polls when checking whether trip points have been crossed (0 for interrupt driven systems). @@ -251,56 +245,6 @@ temperature) and throttle appropriate devices. It deletes the corresponding entry from /sys/class/thermal folder and unbinds itself from all the thermal zone devices using it. -1.3 interface for binding a thermal zone device with a thermal cooling device ------------------------------------------------------------------------------ - - :: - - int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz, - int trip, struct thermal_cooling_device *cdev, - unsigned long upper, unsigned long lower, unsigned int weight); - - This interface function binds a thermal cooling device to a particular trip - point of a thermal zone device. - - This function is usually called in the thermal zone device .bind callback. - - tz: - the thermal zone device - cdev: - thermal cooling device - trip: - indicates which trip point in this thermal zone the cooling device - is associated with. - upper: - the Maximum cooling state for this trip point. - THERMAL_NO_LIMIT means no upper limit, - and the cooling device can be in max_state. - lower: - the Minimum cooling state can be used for this trip point. - THERMAL_NO_LIMIT means no lower limit, - and the cooling device can be in cooling state 0. - weight: - the influence of this cooling device in this thermal - zone. See 1.4.1 below for more information. - - :: - - int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz, - int trip, struct thermal_cooling_device *cdev); - - This interface function unbinds a thermal cooling device from a particular - trip point of a thermal zone device. 
This function is usually called in - the thermal zone device .unbind callback. - - tz: - the thermal zone device - cdev: - thermal cooling device - trip: - indicates which trip point in this thermal zone the cooling device - is associated with. - 1.4 Thermal Zone Parameters --------------------------- @@ -371,8 +315,6 @@ Thermal cooling device sys I/F, created once it's registered:: Then next two dynamic attributes are created/removed in pairs. They represent the relationship between a thermal zone and its associated cooling device. -They are created/removed for each successful execution of -thermal_zone_bind_cooling_device/thermal_zone_unbind_cooling_device. :: @@ -464,14 +406,7 @@ are supposed to implement the callback. If they don't, the thermal framework calculated the trend by comparing the previous and the current temperature values. -4.2. get_thermal_instance -------------------------- - -This function returns the thermal_instance corresponding to a given -{thermal_zone, cooling_device, trip_point} combination. Returns NULL -if such an instance does not exist. - -4.3. thermal_cdev_update +4.2. 
thermal_cdev_update ------------------------ This function serves as an arbitrator to set the state of a cooling diff --git a/Documentation/filesystems/caching/fscache.rst b/Documentation/filesystems/caching/fscache.rst index a74d7b052dc1..de1f32526cc1 100644 --- a/Documentation/filesystems/caching/fscache.rst +++ b/Documentation/filesystems/caching/fscache.rst @@ -318,10 +318,10 @@ where the columns are: Debugging ========= -If CONFIG_FSCACHE_DEBUG is enabled, the FS-Cache facility can have runtime -debugging enabled by adjusting the value in:: +If CONFIG_NETFS_DEBUG is enabled, the FS-Cache facility and NETFS support can +have runtime debugging enabled by adjusting the value in:: - /sys/module/fscache/parameters/debug + /sys/module/netfs/parameters/debug This is a bitmask of debugging streams to enable: @@ -343,6 +343,6 @@ This is a bitmask of debugging streams to enable: The appropriate set of values should be OR'd together and the result written to the control file. For example:: - echo $((1|8|512)) >/sys/module/fscache/parameters/debug + echo $((1|8|512)) >/sys/module/netfs/parameters/debug will turn on all function entry debugging. diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst index cc4626d6ee4f..c293f8e37468 100644 --- a/Documentation/filesystems/erofs.rst +++ b/Documentation/filesystems/erofs.rst @@ -75,7 +75,7 @@ Here are the main features of EROFS: - Support merging tail-end data into a special inode as fragments. - - Support large folios for uncompressed files. + - Support large folios to make use of THPs (Transparent Hugepages); - Support direct I/O on uncompressed files to avoid double caching for loop devices; diff --git a/Documentation/filesystems/idmappings.rst b/Documentation/filesystems/idmappings.rst index ac0af679e61e..77930c77fcfe 100644 --- a/Documentation/filesystems/idmappings.rst +++ b/Documentation/filesystems/idmappings.rst @@ -821,7 +821,7 @@ the same idmapping to the mount. 
We now perform three steps: /* Map the userspace id down into a kernel id in the filesystem's idmapping. */ make_kuid(u0:k20000:r10000, u1000) = k21000 -2. Verify that the caller's kernel ids can be mapped to userspace ids in the +3. Verify that the caller's kernel ids can be mapped to userspace ids in the filesystem's idmapping:: from_kuid(u0:k20000:r10000, k21000) = u1000 @@ -854,10 +854,10 @@ The same translation algorithm works with the third example. /* Map the userspace id down into a kernel id in the filesystem's idmapping. */ make_kuid(u0:k0:r4294967295, u1000) = k1000 -2. Verify that the caller's kernel ids can be mapped to userspace ids in the +3. Verify that the caller's kernel ids can be mapped to userspace ids in the filesystem's idmapping:: - from_kuid(u0:k0:r4294967295, k21000) = u1000 + from_kuid(u0:k0:r4294967295, k1000) = u1000 So the ownership that lands on disk will be ``u1000``. @@ -994,7 +994,7 @@ from above::: /* Map the userspace id down into a kernel id in the filesystem's idmapping. */ make_kuid(u0:k0:r4294967295, u1000) = k1000 -2. Verify that the caller's filesystem ids can be mapped to userspace ids in the +3. Verify that the caller's filesystem ids can be mapped to userspace ids in the filesystem's idmapping:: from_kuid(u0:k0:r4294967295, k1000) = u1000 diff --git a/Documentation/filesystems/iomap/design.rst b/Documentation/filesystems/iomap/design.rst index f8ee3427bc1a..37594e1c5914 100644 --- a/Documentation/filesystems/iomap/design.rst +++ b/Documentation/filesystems/iomap/design.rst @@ -142,9 +142,9 @@ Definitions * **pure overwrite**: A write operation that does not require any metadata or zeroing operations to perform during either submission or completion. - This implies that the fileystem must have already allocated space + This implies that the filesystem must have already allocated space on disk as ``IOMAP_MAPPED`` and the filesystem must not place any - constaints on IO alignment or size. 
+ constraints on IO alignment or size. The only constraints on I/O alignment are device level (minimum I/O size and alignment, typically sector size). @@ -394,7 +394,7 @@ iomap is concerned: * The **upper** level primitive is provided by the filesystem to coordinate access to different iomap operations. - The exact primitive is specifc to the filesystem and operation, + The exact primitive is specific to the filesystem and operation, but is often a VFS inode, pagecache invalidation, or folio lock. For example, a filesystem might take ``i_rwsem`` before calling ``iomap_file_buffered_write`` and ``iomap_file_unshare`` to prevent diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst index e664061ed55d..f5e3676db954 100644 --- a/Documentation/filesystems/locking.rst +++ b/Documentation/filesystems/locking.rst @@ -251,10 +251,10 @@ prototypes:: void (*readahead)(struct readahead_control *); int (*write_begin)(struct file *, struct address_space *mapping, loff_t pos, unsigned len, - struct page **pagep, void **fsdata); + struct folio **foliop, void **fsdata); int (*write_end)(struct file *, struct address_space *mapping, loff_t pos, unsigned len, unsigned copied, - struct page *page, void *fsdata); + struct folio *folio, void *fsdata); sector_t (*bmap)(struct address_space *, sector_t); void (*invalidate_folio) (struct folio *, size_t start, size_t len); bool (*release_folio)(struct folio *, gfp_t); @@ -280,7 +280,7 @@ read_folio: yes, unlocks shared writepages: dirty_folio: maybe readahead: yes, unlocks shared -write_begin: locks the page exclusive +write_begin: locks the folio exclusive write_end: yes, unlocks exclusive bmap: invalidate_folio: yes exclusive diff --git a/Documentation/filesystems/smb/ksmbd.rst b/Documentation/filesystems/smb/ksmbd.rst index 6b30e43a0d11..67cb68ea6e68 100644 --- a/Documentation/filesystems/smb/ksmbd.rst +++ b/Documentation/filesystems/smb/ksmbd.rst @@ -13,7 +13,7 @@ KSMBD architecture The subset of 
performance related operations belong in kernelspace and the other subset which belong to operations which are not really related with performance in userspace. So, DCE/RPC management that has historically resulted -into number of buffer overflow issues and dangerous security bugs and user +into a number of buffer overflow issues and dangerous security bugs and user account management are implemented in user space as ksmbd.mountd. File operations that are related with performance (open/read/write/close etc.) in kernel space (ksmbd). This also allows for easier integration with VFS @@ -24,8 +24,8 @@ ksmbd (kernel daemon) When the server daemon is started, It starts up a forker thread (ksmbd/interface name) at initialization time and open a dedicated port 445 -for listening to SMB requests. Whenever new clients make request, Forker -thread will accept the client connection and fork a new thread for dedicated +for listening to SMB requests. Whenever new clients make a request, the Forker +thread will accept the client connection and fork a new thread for a dedicated communication channel between the client and the server. It allows for parallel processing of SMB requests(commands) from clients as well as allowing for new clients to make new connections. Each instance is named ksmbd/1~n(port number) @@ -34,12 +34,12 @@ thread can decide to pass through the commands to the user space (ksmbd.mountd), currently DCE/RPC commands are identified to be handled through the user space. To further utilize the linux kernel, it has been chosen to process the commands as workitems and to be executed in the handlers of the ksmbd-io kworker threads. -It allows for multiplexing of the handlers as the kernel take care of initiating +It allows for multiplexing of the handlers as the kernel takes care of initiating extra worker threads if the load is increased and vice versa, if the load is -decreased it destroys the extra worker threads. So, after connection is -established with client. 
Dedicated ksmbd/1..n(port number) takes complete +decreased it destroys the extra worker threads. So, after the connection is +established with the client. Dedicated ksmbd/1..n(port number) takes complete ownership of receiving/parsing of SMB commands. Each received command is worked -in parallel i.e., There can be multiple clients commands which are worked in +in parallel i.e., there can be multiple client commands which are worked in parallel. After receiving each command a separated kernel workitem is prepared for each command which is further queued to be handled by ksmbd-io kworkers. So, each SMB workitem is queued to the kworkers. This allows the benefit of load @@ -49,9 +49,9 @@ performance by handling client commands in parallel. ksmbd.mountd (user space daemon) -------------------------------- -ksmbd.mountd is userspace process to, transfer user account and password that +ksmbd.mountd is a userspace process to, transfer the user account and password that are registered using ksmbd.adduser (part of utils for user space). Further it -allows sharing information parameters that parsed from smb.conf to ksmbd in +allows sharing information parameters that are parsed from smb.conf to ksmbd in kernel. For the execution part it has a daemon which is continuously running and connected to the kernel interface using netlink socket, it waits for the requests (dcerpc and share/user info). It handles RPC calls (at a minimum few @@ -124,7 +124,7 @@ How to run 1. Download ksmbd-tools(https://github.com/cifsd-team/ksmbd-tools/releases) and compile them. - - Refer README(https://github.com/cifsd-team/ksmbd-tools/blob/master/README.md) + - Refer to README(https://github.com/cifsd-team/ksmbd-tools/blob/master/README.md) to know how to use ksmbd.mountd/adduser/addshare/control utils $ ./autogen.sh @@ -133,7 +133,7 @@ How to run 2. Create /usr/local/etc/ksmbd/ksmbd.conf file, add SMB share in ksmbd.conf file. 
- - Refer ksmbd.conf.example in ksmbd-utils, See ksmbd.conf manpage + - Refer to ksmbd.conf.example in ksmbd-utils, See ksmbd.conf manpage for details to configure shares. $ man ksmbd.conf @@ -145,7 +145,7 @@ How to run $ man ksmbd.adduser $ sudo ksmbd.adduser -a <Enter USERNAME for SMB share access> -4. Insert ksmbd.ko module after build your kernel. No need to load module +4. Insert the ksmbd.ko module after you build your kernel. No need to load the module if ksmbd is built into the kernel. - Set ksmbd in menuconfig(e.g. $ make menuconfig) @@ -175,7 +175,7 @@ Each layer 1. Enable all component prints # sudo ksmbd.control -d "all" -2. Enable one of components (smb, auth, vfs, oplock, ipc, conn, rdma) +2. Enable one of the components (smb, auth, vfs, oplock, ipc, conn, rdma) # sudo ksmbd.control -d "smb" 3. Show what prints are enabled. diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index 6e903a903f8f..4f67b5ea0568 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -810,7 +810,7 @@ cache in your filesystem. The following members are defined: struct page **pagep, void **fsdata); int (*write_end)(struct file *, struct address_space *mapping, loff_t pos, unsigned len, unsigned copied, - struct page *page, void *fsdata); + struct folio *folio, void *fsdata); sector_t (*bmap)(struct address_space *, sector_t); void (*invalidate_folio) (struct folio *, size_t start, size_t len); bool (*release_folio)(struct folio *, gfp_t); @@ -926,12 +926,12 @@ cache in your filesystem. The following members are defined: (if they haven't been read already) so that the updated blocks can be written out properly. - The filesystem must return the locked pagecache page for the - specified offset, in ``*pagep``, for the caller to write into. + The filesystem must return the locked pagecache folio for the + specified offset, in ``*foliop``, for the caller to write into. 
It must be able to cope with short writes (where the length passed to write_begin is greater than the number of bytes copied - into the page). + into the folio). A void * may be returned in fsdata, which then gets passed into write_end. @@ -944,8 +944,8 @@ cache in your filesystem. The following members are defined: called. len is the original len passed to write_begin, and copied is the amount that was able to be copied. - The filesystem must take care of unlocking the page and - releasing it refcount, and updating i_size. + The filesystem must take care of unlocking the folio, + decrementing its refcount, and updating i_size. Returns < 0 on failure, otherwise the number of bytes (<= 'copied') that were able to be copied into pagecache. diff --git a/Documentation/kbuild/llvm.rst b/Documentation/kbuild/llvm.rst index bb5c44f8bd1c..6dc66b4f31a7 100644 --- a/Documentation/kbuild/llvm.rst +++ b/Documentation/kbuild/llvm.rst @@ -126,7 +126,7 @@ Ccache ``ccache`` can be used with ``clang`` to improve subsequent builds, (though KBUILD_BUILD_TIMESTAMP_ should be set to a deterministic value between builds -in order to avoid 100% cache misses, see Reproducible_builds_ for more info): +in order to avoid 100% cache misses, see Reproducible_builds_ for more info):: KBUILD_BUILD_TIMESTAMP='' make LLVM=1 CC="ccache clang" diff --git a/Documentation/netlink/specs/dpll.yaml b/Documentation/netlink/specs/dpll.yaml index 94132d30e0e0..f2894ca35de8 100644 --- a/Documentation/netlink/specs/dpll.yaml +++ b/Documentation/netlink/specs/dpll.yaml @@ -345,6 +345,26 @@ attribute-sets: Value is in PPM (parts per million). This may be implemented for example for pin of type PIN_TYPE_SYNCE_ETH_PORT. + - + name: esync-frequency + type: u64 + doc: | + Frequency of Embedded SYNC signal. If provided, the pin is configured + with a SYNC signal embedded into its base clock frequency. 
+ - + name: esync-frequency-supported + type: nest + multi-attr: true + nested-attributes: frequency-range + doc: | + If provided a pin is capable of embedding a SYNC signal (within given + range) into its base frequency signal. + - + name: esync-pulse + type: u32 + doc: | + A ratio of high to low state of a SYNC signal pulse embedded + into base clock frequency. Value is in percents. - name: pin-parent-device subset-of: pin @@ -510,6 +530,9 @@ operations: - phase-adjust-max - phase-adjust - fractional-frequency-offset + - esync-frequency + - esync-frequency-supported + - esync-pulse dump: request: @@ -536,6 +559,7 @@ operations: - parent-device - parent-pin - phase-adjust + - esync-frequency - name: pin-create-ntf doc: Notification about pin appearing diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml index 495e35fcfb21..6a050d755b9c 100644 --- a/Documentation/netlink/specs/ethtool.yaml +++ b/Documentation/netlink/specs/ethtool.yaml @@ -39,6 +39,11 @@ definitions: - ovld-detected - power-not-available - short-detected + - + name: phy-upstream-type + enum-name: + type: enum + entries: [ mac, phy ] attribute-sets: - @@ -54,6 +59,9 @@ attribute-sets: name: flags type: u32 enum: header-flags + - + name: phy-index + type: u32 - name: bitset-bit @@ -659,6 +667,9 @@ attribute-sets: - name: code type: u8 + - + name: src + type: u32 - name: cable-fault-length attributes: @@ -668,6 +679,9 @@ attribute-sets: - name: cm type: u32 + - + name: src + type: u32 - name: cable-nest attributes: @@ -1022,12 +1036,16 @@ attribute-sets: - name: indir type: binary + sub-type: u32 - name: hkey type: binary - name: input_xfrm type: u32 + - + name: start-context + type: u32 - name: plca attributes: @@ -1085,6 +1103,35 @@ attribute-sets: - name: total type: uint + - + name: phy + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: index + type: u32 + - + name: drvname + type: string + - + name: name + type: string 
+ - + name: upstream-type + type: u32 + enum: phy-upstream-type + - + name: upstream-index + type: u32 + - + name: upstream-sfp-name + type: string + - + name: downstream-sfp-name + type: string operations: enum-model: directional @@ -1749,11 +1796,12 @@ operations: attribute-set: rss - do: &rss-get-op + do: request: attributes: - header - reply: + - context + reply: &rss-reply attributes: - header - context @@ -1761,7 +1809,12 @@ operations: - indir - hkey - input_xfrm - dump: *rss-get-op + dump: + request: + attributes: + - header + - start-context + reply: *rss-reply - name: plca-get-cfg doc: Get PLCA params. @@ -1877,3 +1930,24 @@ operations: - status-msg - done - total + - + name: phy-get + doc: Get PHY devices attached to an interface + + attribute-set: phy + + do: &phy-get-op + request: + attributes: + - header + reply: + attributes: + - header + - index + - drvname + - name + - upstream-type + - upstream-index + - upstream-sfp-name + - downstream-sfp-name + dump: *phy-get-op diff --git a/Documentation/netlink/specs/mptcp_pm.yaml b/Documentation/netlink/specs/mptcp_pm.yaml index af525ed29792..30d8342cacc8 100644 --- a/Documentation/netlink/specs/mptcp_pm.yaml +++ b/Documentation/netlink/specs/mptcp_pm.yaml @@ -109,7 +109,6 @@ attribute-sets: - name: port type: u16 - byte-order: big-endian - name: flags type: u32 diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 959755be4d7f..08412c279297 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -167,6 +167,10 @@ attribute-sets: "re-attached", they are just waiting to disappear. Attribute is absent if Page Pool has not been detached, and can still be used to allocate new memory. + - + name: dmabuf + doc: ID of the dmabuf this page-pool is attached to. + type: u32 - name: page-pool-info subset-of: page-pool @@ -268,6 +272,10 @@ attribute-sets: name: napi-id doc: ID of the NAPI instance which services this queue. 
type: u32 + - + name: dmabuf + doc: ID of the dmabuf attached to this queue, if any. + type: u32 - name: qstats @@ -457,6 +465,39 @@ attribute-sets: Number of times driver re-started accepting send requests to this queue from the stack. type: uint + - + name: queue-id + subset-of: queue + attributes: + - + name: id + - + name: type + - + name: dmabuf + attributes: + - + name: ifindex + doc: netdev ifindex to bind the dmabuf to. + type: u32 + checks: + min: 1 + - + name: queues + doc: receive queues to bind the dmabuf to. + type: nest + nested-attributes: queue-id + multi-attr: true + - + name: fd + doc: dmabuf file descriptor to bind. + type: u32 + - + name: id + doc: id of the dmabuf binding + type: u32 + checks: + min: 1 operations: list: @@ -510,6 +551,7 @@ operations: - inflight - inflight-mem - detach-time + - dmabuf dump: reply: *pp-reply config-cond: page-pool @@ -574,6 +616,7 @@ operations: - type - napi-id - ifindex + - dmabuf dump: request: attributes: @@ -619,6 +662,24 @@ operations: - rx-bytes - tx-packets - tx-bytes + - + name: bind-rx + doc: Bind dmabuf to netdev + attribute-set: dmabuf + flags: [ admin-perm ] + do: + request: + attributes: + - ifindex + - fd + - queues + reply: + attributes: + - id + +kernel-family: + headers: [ "linux/list.h"] + sock-priv: struct list_head mcast-groups: list: diff --git a/Documentation/netlink/specs/nftables.yaml b/Documentation/netlink/specs/nftables.yaml index dff2a18f3d90..bd938bd01b6b 100644 --- a/Documentation/netlink/specs/nftables.yaml +++ b/Documentation/netlink/specs/nftables.yaml @@ -63,6 +63,13 @@ definitions: - sdifname - bri-broute - + name: bitwise-ops + type: enum + entries: + - bool + - lshift + - rshift + - name: cmp-ops type: enum entries: @@ -125,6 +132,99 @@ definitions: - object - concat - expr + - + name: lookup-flags + type: flags + entries: + - invert + - + name: ct-keys + type: enum + entries: + - state + - direction + - status + - mark + - secmark + - expiration + - helper + - l3protocol + 
- src + - dst + - protocol + - proto-src + - proto-dst + - labels + - pkts + - bytes + - avgpkt + - zone + - eventmask + - src-ip + - dst-ip + - src-ip6 + - dst-ip6 + - ct-id + - + name: ct-direction + type: enum + entries: + - original + - reply + - + name: quota-flags + type: flags + entries: + - invert + - depleted + - + name: verdict-code + type: enum + entries: + - name: continue + value: 0xffffffff + - name: break + value: 0xfffffffe + - name: jump + value: 0xfffffffd + - name: goto + value: 0xfffffffc + - name: return + value: 0xfffffffb + - name: drop + value: 0 + - name: accept + value: 1 + - name: stolen + value: 2 + - name: queue + value: 3 + - name: repeat + value: 4 + - + name: fib-result + type: enum + entries: + - oif + - oifname + - addrtype + - + name: fib-flags + type: flags + entries: + - saddr + - daddr + - mark + - iif + - oif + - present + - + name: reject-types + type: enum + entries: + - icmp-unreach + - tcp-rst + - icmpx-unreach attribute-sets: - @@ -611,9 +711,10 @@ attribute-sets: type: u64 byte-order: big-endian - - name: flags # TODO + name: flags type: u32 byte-order: big-endian + enum: quota-flags - name: pad type: pad @@ -665,6 +766,38 @@ attribute-sets: type: nest nested-attributes: hook-dev-attrs - + name: expr-bitwise-attrs + attributes: + - + name: sreg + type: u32 + byte-order: big-endian + - + name: dreg + type: u32 + byte-order: big-endian + - + name: len + type: u32 + byte-order: big-endian + - + name: mask + type: nest + nested-attributes: data-attrs + - + name: xor + type: nest + nested-attributes: data-attrs + - + name: op + type: u32 + byte-order: big-endian + enum: bitwise-ops + - + name: data + type: nest + nested-attributes: data-attrs + - name: expr-cmp-attrs attributes: - @@ -698,6 +831,7 @@ attribute-sets: name: code type: u32 byte-order: big-endian + enum: verdict-code - name: chain type: string @@ -719,6 +853,43 @@ attribute-sets: name: pad type: pad - + name: expr-fib-attrs + attributes: + - + name: dreg + type: 
u32 + byte-order: big-endian + - + name: result + type: u32 + byte-order: big-endian + enum: fib-result + - + name: flags + type: u32 + byte-order: big-endian + enum: fib-flags + - + name: expr-ct-attrs + attributes: + - + name: dreg + type: u32 + byte-order: big-endian + - + name: key + type: u32 + byte-order: big-endian + enum: ct-keys + - + name: direction + type: u8 + enum: ct-direction + - + name: sreg + type: u32 + byte-order: big-endian + - name: expr-flow-offload-attrs attributes: - @@ -737,6 +908,31 @@ attribute-sets: type: nest nested-attributes: data-attrs - + name: expr-lookup-attrs + attributes: + - + name: set + type: string + doc: Name of set to use + - + name: set id + type: u32 + byte-order: big-endian + doc: ID of set to use + - + name: sreg + type: u32 + byte-order: big-endian + - + name: dreg + type: u32 + byte-order: big-endian + - + name: flags + type: u32 + byte-order: big-endian + enum: lookup-flags + - name: expr-meta-attrs attributes: - @@ -821,6 +1017,30 @@ attribute-sets: type: u32 byte-order: big-endian - + name: expr-reject-attrs + attributes: + - + name: type + type: u32 + byte-order: big-endian + enum: reject-types + - + name: icmp-code + type: u8 + - + name: expr-target-attrs + attributes: + - + name: name + type: string + - + name: rev + type: u32 + byte-order: big-endian + - + name: info + type: binary + - name: expr-tproxy-attrs attributes: - @@ -835,13 +1055,38 @@ attribute-sets: name: reg-port type: u32 byte-order: big-endian + - + name: expr-objref-attrs + attributes: + - + name: imm-type + type: u32 + byte-order: big-endian + - + name: imm-name + type: string + doc: object name + - + name: set-sreg + type: u32 + byte-order: big-endian + - + name: set-name + type: string + doc: name of object map + - + name: set-id + type: u32 + byte-order: big-endian + doc: id of object map sub-messages: - name: expr-ops formats: - - value: bitwise # TODO + value: bitwise + attribute-set: expr-bitwise-attrs - value: cmp attribute-set: 
expr-cmp-attrs @@ -849,7 +1094,11 @@ sub-messages: value: counter attribute-set: expr-counter-attrs - - value: ct # TODO + value: ct + attribute-set: expr-ct-attrs + - + value: fib + attribute-set: expr-fib-attrs - value: flow_offload attribute-set: expr-flow-offload-attrs @@ -857,7 +1106,8 @@ sub-messages: value: immediate attribute-set: expr-immediate-attrs - - value: lookup # TODO + value: lookup + attribute-set: expr-lookup-attrs - value: meta attribute-set: expr-meta-attrs @@ -865,9 +1115,21 @@ sub-messages: value: nat attribute-set: expr-nat-attrs - + value: objref + attribute-set: expr-objref-attrs + - value: payload attribute-set: expr-payload-attrs - + value: quota + attribute-set: quota-attrs + - + value: reject + attribute-set: expr-reject-attrs + - + value: target + attribute-set: expr-target-attrs + - value: tproxy attribute-set: expr-tproxy-attrs - diff --git a/Documentation/netlink/specs/rt_link.yaml b/Documentation/netlink/specs/rt_link.yaml index de08c12fd56f..0c4d5d40cae9 100644 --- a/Documentation/netlink/specs/rt_link.yaml +++ b/Documentation/netlink/specs/rt_link.yaml @@ -903,6 +903,22 @@ definitions: - cfm-config - cfm-status - mst + - + name: netkit-policy + type: enum + entries: + - + name: forward + value: 0 + - + name: blackhole + value: 2 + - + name: netkit-mode + type: enum + entries: + - name: l2 + - name: l3 attribute-sets: - @@ -2109,6 +2125,28 @@ attribute-sets: - name: id type: u32 + - + name: linkinfo-netkit-attrs + name-prefix: ifla-netkit- + attributes: + - + name: peer-info + type: binary + - + name: primary + type: u8 + - + name: policy + type: u32 + enum: netkit-policy + - + name: peer-policy + type: u32 + enum: netkit-policy + - + name: mode + type: u32 + enum: netkit-mode sub-messages: - @@ -2147,6 +2185,9 @@ sub-messages: - value: vrf attribute-set: linkinfo-vrf-attrs + - + value: netkit + attribute-set: linkinfo-netkit-attrs - name: linkinfo-member-data-msg formats: diff --git 
a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst index a4c7d0c65fd7..4561e8ab9e08 100644 --- a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst +++ b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst @@ -230,6 +230,11 @@ per-queue stats) from the device. In addition the driver logs the stats to syslog upon device reset. +On supported instance types, the statistics will also include the +ENA Express data (fields prefixed with `ena_srd`). For a complete +documentation of ENA Express data refer to +https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ena-express.html#ena-express-monitor + MTU === diff --git a/Documentation/networking/device_drivers/ethernet/index.rst b/Documentation/networking/device_drivers/ethernet/index.rst index 6932d8c043c2..6fc1961492b7 100644 --- a/Documentation/networking/device_drivers/ethernet/index.rst +++ b/Documentation/networking/device_drivers/ethernet/index.rst @@ -44,6 +44,7 @@ Contents: marvell/octeon_ep marvell/octeon_ep_vf mellanox/mlx5/index + meta/fbnic microsoft/netvsc neterion/s2io netronome/nfp diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst index 3bd72577af9a..99d95be4d159 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst @@ -218,6 +218,22 @@ the software port. [#accel]_. - Informative + * - `rx[i]_hds_nosplit_packets` + - Number of packets that were not split in header/data split mode. A + packet will not get split when the hardware does not support its + protocol splitting. An example such a protocol is ICMPv4/v6. Currently + TCP and UDP with IPv4/IPv6 are supported for header/data split + [#accel]_. 
+ - Informative + + * - `rx[i]_hds_nosplit_bytes` + - Number of bytes for packets that were not split in header/data split + mode. A packet will not get split when the hardware does not support its + protocol splitting. An example such a protocol is ICMPv4/v6. Currently + TCP and UDP with IPv4/IPv6 are supported for header/data split + [#accel]_. + - Informative + * - `rx[i]_lro_packets` - The number of LRO packets received on ring i [#accel]_. - Acceleration diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst index 20d3b7e87049..34e911480108 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst @@ -130,6 +130,9 @@ Enabling the driver and kconfig options | Build support for software-managed steering in the NIC. +**CONFIG_MLX5_HW_STEERING=(y/n)** + +| Build support for hardware-managed steering in the NIC. **CONFIG_MLX5_TC_CT=(y/n)** diff --git a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst new file mode 100644 index 000000000000..32ff114f5c26 --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst @@ -0,0 +1,29 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +===================================== +Meta Platforms Host Network Interface +===================================== + +Firmware Versions +----------------- + +fbnic has three components stored on the flash which are provided in one PLDM +image: + +1. fw - The control firmware used to view and modify firmware settings, request + firmware actions, and retrieve firmware counters outside of the data path. + This is the firmware which fbnic_fw.c interacts with. +2. 
bootloader - The firmware which validate firmware security and control basic + operations including loading and updating the firmware. This is also known + as the cmrt firmware. +3. undi - This is the UEFI driver which is based on the Linux driver. + +fbnic stores two copies of these three components on flash. This allows fbnic +to fall back to an older version of firmware automatically in case firmware +fails to boot. Version information for both is provided as running and stored. +The undi is only provided in stored as it is not actively running once the Linux +driver takes over. + +devlink dev info provides version information for all three components. In +addition to the version the hg commit hash of the build is included as a +separate entry. diff --git a/Documentation/networking/devmem.rst b/Documentation/networking/devmem.rst new file mode 100644 index 000000000000..a55bf21f671c --- /dev/null +++ b/Documentation/networking/devmem.rst @@ -0,0 +1,269 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================= +Device Memory TCP +================= + + +Intro +===== + +Device memory TCP (devmem TCP) enables receiving data directly into device +memory (dmabuf). The feature is currently implemented for TCP sockets. + + +Opportunity +----------- + +A large number of data transfers have device memory as the source and/or +destination. Accelerators drastically increased the prevalence of such +transfers. Some examples include: + +- Distributed training, where ML accelerators, such as GPUs on different hosts, + exchange data. + +- Distributed raw block storage applications transfer large amounts of data with + remote SSDs. Much of this data does not require host processing. + +Typically the Device-to-Device data transfers in the network are implemented as +the following low-level operations: Device-to-Host copy, Host-to-Host network +transfer, and Host-to-Device copy. 
+ +The flow involving host copies is suboptimal, especially for bulk data transfers, +and can put significant strains on system resources such as host memory +bandwidth and PCIe bandwidth. + +Devmem TCP optimizes this use case by implementing socket APIs that enable +the user to receive incoming network packets directly into device memory. + +Packet payloads go directly from the NIC to device memory. + +Packet headers go to host memory and are processed by the TCP/IP stack +normally. The NIC must support header split to achieve this. + +Advantages: + +- Alleviate host memory bandwidth pressure, compared to existing + network-transfer + device-copy semantics. + +- Alleviate PCIe bandwidth pressure, by limiting data transfer to the lowest + level of the PCIe tree, compared to the traditional path which sends data + through the root complex. + + +More Info +--------- + + slides, video + https://netdevconf.org/0x17/sessions/talk/device-memory-tcp.html + + patchset + [PATCH net-next v24 00/13] Device Memory TCP + https://lore.kernel.org/netdev/20240831004313.3713467-1-almasrymina@google.com/ + + +Interface +========= + + +Example +------- + +tools/testing/selftests/net/ncdevmem.c:do_server shows an example of setting up +the RX path of this API. + + +NIC Setup +--------- + +Header split, flow steering, & RSS are required features for devmem TCP. + +Header split is used to split incoming packets into a header buffer in host +memory, and a payload buffer in device memory. + +Flow steering & RSS are used to ensure that only flows targeting devmem land on +an RX queue bound to devmem. 
+ +Enable header split & flow steering:: + + # enable header split + ethtool -G eth1 tcp-data-split on + + + # enable flow steering + ethtool -K eth1 ntuple on + +Configure RSS to steer all traffic away from the target RX queue (queue 15 in +this example):: + + ethtool --set-rxfh-indir eth1 equal 15 + + +The user must bind a dmabuf to any number of RX queues on a given NIC using +the netlink API:: + + /* Bind dmabuf to NIC RX queue 15 */ + struct netdev_queue *queues; + queues = malloc(sizeof(*queues) * 1); + + queues[0]._present.type = 1; + queues[0]._present.idx = 1; + queues[0].type = NETDEV_RX_QUEUE_TYPE_RX; + queues[0].idx = 15; + + *ys = ynl_sock_create(&ynl_netdev_family, &yerr); + + req = netdev_bind_rx_req_alloc(); + netdev_bind_rx_req_set_ifindex(req, 1 /* ifindex */); + netdev_bind_rx_req_set_dmabuf_fd(req, dmabuf_fd); + __netdev_bind_rx_req_set_queues(req, queues, n_queue_index); + + rsp = netdev_bind_rx(*ys, req); + + dmabuf_id = rsp->dmabuf_id; + + +The netlink API returns a dmabuf_id: a unique ID that refers to this dmabuf +that has been bound. + +The user can unbind the dmabuf from the netdevice by closing the netlink socket +that established the binding. We do this so that the binding is automatically +unbound even if the userspace process crashes. + +Note that any reasonably well-behaved dmabuf from any exporter should work with +devmem TCP, even if the dmabuf is not actually backed by devmem. An example of +this is udmabuf, which wraps user memory (non-devmem) in a dmabuf. + + +Socket Setup +------------ + +The socket must be flow steered to the dmabuf bound RX queue:: + + ethtool -N eth1 flow-type tcp4 ... queue 15 + + +Receiving data +-------------- + +The user application must signal to the kernel that it is capable of receiving +devmem data by passing the MSG_SOCK_DEVMEM flag to recvmsg:: + + ret = recvmsg(fd, &msg, MSG_SOCK_DEVMEM); + +Applications that do not specify the MSG_SOCK_DEVMEM flag will receive an EFAULT +on devmem data. 
+ +Devmem data is received directly into the dmabuf bound to the NIC in 'NIC +Setup', and the kernel signals such to the user via the SCM_DEVMEM_* cmsgs:: + + for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) { + if (cm->cmsg_level != SOL_SOCKET || + (cm->cmsg_type != SCM_DEVMEM_DMABUF && + cm->cmsg_type != SCM_DEVMEM_LINEAR)) + continue; + + dmabuf_cmsg = (struct dmabuf_cmsg *)CMSG_DATA(cm); + + if (cm->cmsg_type == SCM_DEVMEM_DMABUF) { + /* Frag landed in dmabuf. + * + * dmabuf_cmsg->dmabuf_id is the dmabuf the + * frag landed on. + * + * dmabuf_cmsg->frag_offset is the offset into + * the dmabuf where the frag starts. + * + * dmabuf_cmsg->frag_size is the size of the + * frag. + * + * dmabuf_cmsg->frag_token is a token used to + * refer to this frag for later freeing. + */ + + struct dmabuf_token token; + token.token_start = dmabuf_cmsg->frag_token; + token.token_count = 1; + continue; + } + + if (cm->cmsg_type == SCM_DEVMEM_LINEAR) + /* Frag landed in linear buffer. + * + * dmabuf_cmsg->frag_size is the size of the + * frag. + */ + continue; + + } + +Applications may receive 2 cmsgs: + +- SCM_DEVMEM_DMABUF: this indicates the fragment landed in the dmabuf indicated + by dmabuf_id. + +- SCM_DEVMEM_LINEAR: this indicates the fragment landed in the linear buffer. + This typically happens when the NIC is unable to split the packet at the + header boundary, such that part (or all) of the payload landed in host + memory. + +Applications may receive no SO_DEVMEM_* cmsgs. That indicates non-devmem, +regular TCP data that landed on an RX queue not bound to a dmabuf. + + +Freeing frags +------------- + +Frags received via SCM_DEVMEM_DMABUF are pinned by the kernel while the user +processes the frag. The user must return the frag to the kernel via +SO_DEVMEM_DONTNEED:: + + ret = setsockopt(client_fd, SOL_SOCKET, SO_DEVMEM_DONTNEED, &token, + sizeof(token)); + +The user must ensure the tokens are returned to the kernel in a timely manner. 
+Failure to do so will exhaust the limited dmabuf that is bound to the RX queue +and will lead to packet drops. + + +Implementation & Caveats +======================== + +Unreadable skbs +--------------- + +Devmem payloads are inaccessible to the kernel processing the packets. This +results in a few quirks for payloads of devmem skbs: + +- Loopback is not functional. Loopback relies on copying the payload, which is + not possible with devmem skbs. + +- Software checksum calculation fails. + +- TCP Dump and bpf can't access devmem packet payloads. + + +Testing +======= + +More realistic example code can be found in the kernel source under +``tools/testing/selftests/net/ncdevmem.c`` + +ncdevmem is a devmem TCP netcat. It works very similarly to netcat, but +receives data directly into a udmabuf. + +To run ncdevmem, you need to run it on a server on the machine under test, and +you need to run netcat on a peer to provide the TX data. + +ncdevmem has a validation mode as well that expects a repeating pattern of +incoming data and validates it as such. 
For example, you can launch +ncdevmem on the server by:: + + ncdevmem -s <server IP> -c <client IP> -f eth1 -d 3 -n 0000:06:00.0 -l \ + -p 5201 -v 7 + +On client side, use regular netcat to send TX data to ncdevmem process +on the server:: + + yes $(echo -e \\x01\\x02\\x03\\x04\\x05\\x06) | \ + tr \\n \\0 | head -c 5G | nc <server IP> 5201 -p 5201 diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index 3ab423b80e91..295563e91082 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -57,6 +57,7 @@ Structure of this header is ``ETHTOOL_A_HEADER_DEV_INDEX`` u32 device ifindex ``ETHTOOL_A_HEADER_DEV_NAME`` string device name ``ETHTOOL_A_HEADER_FLAGS`` u32 flags common for all requests + ``ETHTOOL_A_HEADER_PHY_INDEX`` u32 phy device index ============================== ====== ============================= ``ETHTOOL_A_HEADER_DEV_INDEX`` and ``ETHTOOL_A_HEADER_DEV_NAME`` identify the @@ -81,6 +82,12 @@ the behaviour is backward compatible, i.e. requests from old clients not aware of the flag should be interpreted the way the client expects. A client must not set flags it does not understand. +``ETHTOOL_A_HEADER_PHY_INDEX`` identifies the Ethernet PHY the message relates to. +As there are numerous commands that are related to PHY configuration, and because +there may be more than one PHY on the link, the PHY index can be passed in the +request for the commands that needs it. It is, however, not mandatory, and if it +is not passed for commands that target a PHY, the net_device.phydev pointer +is used. Bit sets ======== @@ -934,18 +941,18 @@ Request contents: ==================================== ====== =========================== Kernel checks that requested ring sizes do not exceed limits reported by -driver. Driver may impose additional constraints and may not suspport all +driver. Driver may impose additional constraints and may not support all attributes. 
``ETHTOOL_A_RINGS_CQE_SIZE`` specifies the completion queue event size. -Completion queue events(CQE) are the events posted by NIC to indicate the -completion status of a packet when the packet is sent(like send success or -error) or received(like pointers to packet fragments). The CQE size parameter +Completion queue events (CQE) are the events posted by NIC to indicate the +completion status of a packet when the packet is sent (like send success or +error) or received (like pointers to packet fragments). The CQE size parameter enables to modify the CQE size other than default size if NIC supports it. -A bigger CQE can have more receive buffer pointers inturn NIC can transfer -a bigger frame from wire. Based on the NIC hardware, the overall completion -queue size can be adjusted in the driver if CQE size is modified. +A bigger CQE can have more receive buffer pointers, and in turn the NIC can +transfer a bigger frame from wire. Based on the NIC hardware, the overall +completion queue size can be adjusted in the driver if CQE size is modified. CHANNELS_GET ============ @@ -989,7 +996,7 @@ Request contents: ===================================== ====== ========================== Kernel checks that requested channel counts do not exceed limits reported by -driver. Driver may impose additional constraints and may not suspport all +driver. Driver may impose additional constraints and may not support all attributes. @@ -1307,12 +1314,17 @@ information. 
+-+-+-----------------------------------------+--------+---------------------+ | | | ``ETHTOOL_A_CABLE_RESULTS_CODE`` | u8 | result code | +-+-+-----------------------------------------+--------+---------------------+ + | | | ``ETHTOOL_A_CABLE_RESULT_SRC`` | u32 | information source | + +-+-+-----------------------------------------+--------+---------------------+ | | ``ETHTOOL_A_CABLE_NEST_FAULT_LENGTH`` | nested | cable length | +-+-+-----------------------------------------+--------+---------------------+ | | | ``ETHTOOL_A_CABLE_FAULT_LENGTH_PAIR`` | u8 | pair number | +-+-+-----------------------------------------+--------+---------------------+ | | | ``ETHTOOL_A_CABLE_FAULT_LENGTH_CM`` | u32 | length in cm | +-+-+-----------------------------------------+--------+---------------------+ + | | | ``ETHTOOL_A_CABLE_FAULT_LENGTH_SRC`` | u32 | information source | + +-+-+-----------------------------------------+--------+---------------------+ + CABLE_TEST TDR ============== @@ -1756,7 +1768,7 @@ Kernel response contents: When set, the optional ``ETHTOOL_A_PODL_PSE_ADMIN_STATE`` attribute identifies the operational state of the PoDL PSE functions. The operational state of the PSE function can be changed using the ``ETHTOOL_A_PODL_PSE_ADMIN_CONTROL`` -action. This option is corresponding to ``IEEE 802.3-2018`` 30.15.1.1.2 +action. This attribute corresponds to ``IEEE 802.3-2018`` 30.15.1.1.2 aPoDLPSEAdminState. Possible values are: .. kernel-doc:: include/uapi/linux/ethtool.h @@ -1770,8 +1782,8 @@ The same goes for ``ETHTOOL_A_C33_PSE_ADMIN_STATE`` implementing When set, the optional ``ETHTOOL_A_PODL_PSE_PW_D_STATUS`` attribute identifies the power detection status of the PoDL PSE. The status depend on internal PSE -state machine and automatic PD classification support. This option is -corresponding to ``IEEE 802.3-2018`` 30.15.1.1.3 aPoDLPSEPowerDetectionStatus. +state machine and automatic PD classification support. 
This attribute +corresponds to ``IEEE 802.3-2018`` 30.15.1.1.3 aPoDLPSEPowerDetectionStatus. Possible values are: .. kernel-doc:: include/uapi/linux/ethtool.h @@ -1785,12 +1797,13 @@ The same goes for ``ETHTOOL_A_C33_PSE_ADMIN_PW_D_STATUS`` implementing When set, the optional ``ETHTOOL_A_C33_PSE_PW_CLASS`` attribute identifies the power class of the C33 PSE. It depends on the class negotiated between -the PSE and the PD. This option is corresponding to ``IEEE 802.3-2022`` +the PSE and the PD. This attribute corresponds to ``IEEE 802.3-2022`` 30.9.1.1.8 aPSEPowerClassification. When set, the optional ``ETHTOOL_A_C33_PSE_ACTUAL_PW`` attribute identifies -This option is corresponding to ``IEEE 802.3-2022`` 30.9.1.1.23 aPSEActualPower. -Actual power is reported in mW. +the actual power drawn by the C33 PSE. This attribute corresponds to +``IEEE 802.3-2022`` 30.9.1.1.23 aPSEActualPower. Actual power is reported +in mW. When set, the optional ``ETHTOOL_A_C33_PSE_EXT_STATE`` attribute identifies the extended error state of the C33 PSE. Possible values are: @@ -1839,7 +1852,7 @@ Request contents: ====================================== ====== ============================= When set, the optional ``ETHTOOL_A_PODL_PSE_ADMIN_CONTROL`` attribute is used -to control PoDL PSE Admin functions. This option is implementing +to control PoDL PSE Admin functions. This option implements ``IEEE 802.3-2018`` 30.15.1.2.1 acPoDLPSEAdminControl. See ``ETHTOOL_A_PODL_PSE_ADMIN_STATE`` for supported values. @@ -1866,15 +1879,24 @@ RSS context of an interface similar to ``ETHTOOL_GRSSH`` ioctl request. 
Request contents: -===================================== ====== ========================== +===================================== ====== ============================ ``ETHTOOL_A_RSS_HEADER`` nested request header ``ETHTOOL_A_RSS_CONTEXT`` u32 context number -===================================== ====== ========================== + ``ETHTOOL_A_RSS_START_CONTEXT`` u32 start context number (dumps) +===================================== ====== ============================ + +``ETHTOOL_A_RSS_CONTEXT`` specifies which RSS context number to query, +if not set context 0 (the main context) is queried. Dumps can be filtered +by device (only listing contexts of a given netdev). Filtering single +context number is not supported but ``ETHTOOL_A_RSS_START_CONTEXT`` +can be used to start dumping context from the given number (primarily +used to ignore context 0s and only dump additional contexts). Kernel response contents: ===================================== ====== ========================== ``ETHTOOL_A_RSS_HEADER`` nested reply header + ``ETHTOOL_A_RSS_CONTEXT`` u32 context number ``ETHTOOL_A_RSS_HFUNC`` u32 RSS hash func ``ETHTOOL_A_RSS_INDIR`` binary Indir table bytes ``ETHTOOL_A_RSS_HKEY`` binary Hash key bytes @@ -1926,7 +1948,7 @@ When set, the optional ``ETHTOOL_A_PLCA_VERSION`` attribute indicates which standard and version the PLCA management interface complies to. When not set, the interface is vendor-specific and (possibly) supplied by the driver. The OPEN Alliance SIG specifies a standard register map for 10BASE-T1S PHYs -embedding the PLCA Reconcialiation Sublayer. See "10BASE-T1S PLCA Management +embedding the PLCA Reconciliation Sublayer. See "10BASE-T1S PLCA Management Registers" at https://www.opensig.org/about/specifications/. 
When set, the optional ``ETHTOOL_A_PLCA_ENABLED`` attribute indicates the @@ -1988,7 +2010,7 @@ Request contents: ``ETHTOOL_A_PLCA_ENABLED`` u8 PLCA Admin State ``ETHTOOL_A_PLCA_NODE_ID`` u8 PLCA unique local node ID ``ETHTOOL_A_PLCA_NODE_CNT`` u8 Number of PLCA nodes on the - netkork, including the + network, including the coordinator ``ETHTOOL_A_PLCA_TO_TMR`` u8 Transmit Opportunity Timer value in bit-times (BT) @@ -2175,6 +2197,49 @@ string. The ``ETHTOOL_A_MODULE_FW_FLASH_DONE`` and ``ETHTOOL_A_MODULE_FW_FLASH_TOTAL`` attributes encode the completed and total amount of work, respectively. +PHY_GET +======= + +Retrieve information about a given Ethernet PHY sitting on the link. The DO +operation returns all available information about dev->phydev. User can also +specify a PHY_INDEX, in which case the DO request returns information about that +specific PHY. + +As there can be more than one PHY, the DUMP operation can be used to list the PHYs +present on a given interface, by passing an interface index or name in +the dump request. 
+ +For more information, refer to :ref:`phy_link_topology` + +Request contents: + + ==================================== ====== ========================== + ``ETHTOOL_A_PHY_HEADER`` nested request header + ==================================== ====== ========================== + +Kernel response contents: + + ===================================== ====== =============================== + ``ETHTOOL_A_PHY_HEADER`` nested request header + ``ETHTOOL_A_PHY_INDEX`` u32 the phy's unique index, that can + be used for phy-specific + requests + ``ETHTOOL_A_PHY_DRVNAME`` string the phy driver name + ``ETHTOOL_A_PHY_NAME`` string the phy device name + ``ETHTOOL_A_PHY_UPSTREAM_TYPE`` u32 the type of device this phy is + connected to + ``ETHTOOL_A_PHY_UPSTREAM_INDEX`` u32 the PHY index of the upstream + PHY + ``ETHTOOL_A_PHY_UPSTREAM_SFP_NAME`` string if this PHY is connected to + its parent PHY through an SFP + bus, the name of this sfp bus + ``ETHTOOL_A_PHY_DOWNSTREAM_SFP_NAME`` string if the phy controls an sfp bus, + the name of the sfp bus + ===================================== ====== =============================== + +When ``ETHTOOL_A_PHY_UPSTREAM_TYPE`` is PHY_UPSTREAM_PHY, the PHY's parent is +another PHY. + Request translation =================== @@ -2282,4 +2347,5 @@ are netlink only. 
n/a ``ETHTOOL_MSG_MM_GET`` n/a ``ETHTOOL_MSG_MM_SET`` n/a ``ETHTOOL_MSG_MODULE_FW_FLASH_ACT`` + n/a ``ETHTOOL_MSG_PHY_GET`` =================================== ===================================== diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst index d1af04b952f8..803dfc1efb75 100644 --- a/Documentation/networking/index.rst +++ b/Documentation/networking/index.rst @@ -49,6 +49,7 @@ Contents: cdc_mbim dccp dctcp + devmem dns_resolver driver eql @@ -87,10 +88,12 @@ Contents: nexthop-group-resilient nf_conntrack-sysctl nf_flowtable + oa-tc6-framework openvswitch operstates packet_mmap phonet + phy-link-topology pktgen plip ppp_generic diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 3616389c8c2d..eacf8983e230 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -2362,6 +2362,20 @@ ra_honor_pio_life - BOOLEAN Default: 0 (disabled) +ra_honor_pio_pflag - BOOLEAN + The Prefix Information Option P-flag indicates the network can + allocate a unique IPv6 prefix per client using DHCPv6-PD. + This sysctl can be enabled when a userspace DHCPv6-PD client + is running to cause the P-flag to take effect: i.e. the + P-flag suppresses any effects of the A-flag within the same + PIO. For a given PIO, P=1 and A=1 is treated as A=0. + + - If disabled, the P-flag is ignored. + - If enabled, the P-flag will disable SLAAC autoconfiguration + for the given Prefix Information Option. + + Default: 0 (disabled) + accept_ra_rt_info_min_plen - INTEGER Minimum prefix length of Route Information in RA. diff --git a/Documentation/networking/l2tp.rst b/Documentation/networking/l2tp.rst index 8496b467dea4..e8cf8b3e60ac 100644 --- a/Documentation/networking/l2tp.rst +++ b/Documentation/networking/l2tp.rst @@ -638,9 +638,8 @@ Tunnels are identified by a unique tunnel id. The id is 16-bit for L2TPv2 and 32-bit for L2TPv3. Internally, the id is stored as a 32-bit value. 
-Tunnels are kept in a per-net list, indexed by tunnel id. The tunnel -id namespace is shared by L2TPv2 and L2TPv3. The tunnel context can be -derived from the socket's sk_user_data. +Tunnels are kept in a per-net list, indexed by tunnel id. The +tunnel id namespace is shared by L2TPv2 and L2TPv3. Handling tunnel socket close is perhaps the most tricky part of the L2TP implementation. If userspace closes a tunnel socket, the L2TP @@ -652,9 +651,7 @@ socket's encap_destroy handler is invoked, which L2TP uses to initiate its tunnel close actions. For L2TPIP sockets, the socket's close handler initiates the same tunnel close actions. All sessions are first closed. Each session drops its tunnel ref. When the tunnel ref -reaches zero, the tunnel puts its socket ref. When the socket is -eventually destroyed, its sk_destruct finally frees the L2TP tunnel -context. +reaches zero, the tunnel drops its socket ref. Sessions -------- @@ -667,10 +664,7 @@ pseudowire) or other data types such as PPP, ATM, HDLC or Frame Relay. Linux currently implements only Ethernet and PPP session types. Some L2TP session types also have a socket (PPP pseudowires) while -others do not (Ethernet pseudowires). We can't therefore use the -socket reference count as the reference count for session -contexts. The L2TP implementation therefore has its own internal -reference counts on the session contexts. +others do not (Ethernet pseudowires). Like tunnels, L2TP sessions are identified by a unique session id. Just as with tunnel ids, the session id is 16-bit for @@ -680,21 +674,19 @@ value. Sessions hold a ref on their parent tunnel to ensure that the tunnel stays extant while one or more sessions references it. -Sessions are kept in a per-tunnel list, indexed by session id. L2TPv3 -sessions are also kept in a per-net list indexed by session id, -because L2TPv3 session ids are unique across all tunnels and L2TPv3 -data packets do not contain a tunnel id in the header. 
This list is -therefore needed to find the session context associated with a -received data packet when the tunnel context cannot be derived from -the tunnel socket. +Sessions are kept in a per-net list. L2TPv2 sessions and L2TPv3 +sessions are stored in separate lists. L2TPv2 sessions are keyed +by a 32-bit key made up of the 16-bit tunnel ID and 16-bit +session ID. L2TPv3 sessions are keyed by the 32-bit session ID, since +L2TPv3 session ids are unique across all tunnels. Although the L2TPv3 RFC specifies that L2TPv3 session ids are not -scoped by the tunnel, the kernel does not police this for L2TPv3 UDP -tunnels and does not add sessions of L2TPv3 UDP tunnels into the -per-net session list. In the UDP receive code, we must trust that the -tunnel can be identified using the tunnel socket's sk_user_data and -lookup the session in the tunnel's session list instead of the per-net -session list. +scoped by the tunnel, the Linux implementation has historically +allowed this. Such session id collisions are supported using a per-net +hash table keyed by sk and session ID. When looking up L2TPv3 +sessions, the list entry may link to multiple sessions with that +session ID, in which case the session matching the given sk (tunnel) +is used. PPP --- @@ -714,10 +706,9 @@ The L2TP PPP implementation handles the closing of a PPPoL2TP socket by closing its corresponding L2TP session. This is complicated because it must consider racing with netlink session create/destroy requests and pppol2tp_connect trying to reconnect with a session that is in the -process of being closed. Unlike tunnels, PPP sessions do not hold a -ref on their associated socket, so code must be careful to sock_hold -the socket where necessary. For all the details, see commit -3d609342cc04129ff7568e19316ce3d7451a27e8. +process of being closed. PPP sessions hold a ref on their associated +socket in order that the socket remains extant while the session +references it.
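The Sessions hunk above says that L2TPv2 sessions are keyed by a 32-bit value built from the 16-bit tunnel ID and the 16-bit session ID. A minimal sketch of such a key follows. The function names are hypothetical, and the bit layout (tunnel ID in the upper half, session ID in the lower half) is an assumption, since the text does not state the order.

```c
#include <stdint.h>

/* Assumed layout: tunnel ID in bits 31..16, session ID in bits 15..0. */
static inline uint32_t l2tp_v2_key_sketch(uint16_t tunnel_id,
					   uint16_t session_id)
{
	return ((uint32_t)tunnel_id << 16) | session_id;
}

/* Recover the tunnel ID from a key built by l2tp_v2_key_sketch(). */
static inline uint16_t l2tp_v2_key_tunnel(uint32_t key)
{
	return (uint16_t)(key >> 16);
}

/* Recover the session ID from a key built by l2tp_v2_key_sketch(). */
static inline uint16_t l2tp_v2_key_session(uint32_t key)
{
	return (uint16_t)(key & 0xffff);
}
```

Because both IDs fit in 16 bits, two L2TPv2 sessions share a key only if they share both tunnel ID and session ID. That is why one per-net list can hold the sessions of every L2TPv2 tunnel.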
Ethernet -------- @@ -761,15 +752,10 @@ Limitations The current implementation has a number of limitations: - 1) Multiple UDP sockets with the same 5-tuple address cannot be - used. The kernel's tunnel context is identified using private - data associated with the socket so it is important that each - socket is uniquely identified by its address. - - 2) Interfacing with openvswitch is not yet implemented. It may be + 1) Interfacing with openvswitch is not yet implemented. It may be useful to map OVS Ethernet and VLAN ports into L2TPv3 tunnels. - 3) VLAN pseudowires are implemented using an ``l2tpethN`` interface + 2) VLAN pseudowires are implemented using an ``l2tpethN`` interface configured with a VLAN sub-interface. Since L2TPv3 VLAN pseudowires carry one and only one VLAN, it may be better to use a single netdevice rather than an ``l2tpethN`` and ``l2tpethN``:M diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst index fd514bba8c43..95598c21fc8e 100644 --- a/Documentation/networking/mptcp-sysctl.rst +++ b/Documentation/networking/mptcp-sysctl.rst @@ -34,6 +34,17 @@ available_schedulers - STRING Shows the available schedulers choices that are registered. More packet schedulers may be available, but not loaded. +blackhole_timeout - INTEGER (seconds) + Initial time period in second to disable MPTCP on active MPTCP sockets + when a MPTCP firewall blackhole issue happens. This time period will + grow exponentially when more blackhole issues get detected right after + MPTCP is re-enabled and will reset to the initial value when the + blackhole issue goes away. + + 0 to disable the blackhole detection. + + Default: 3600 + checksum_enabled - BOOLEAN Control whether DSS checksum can be enabled. 
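The ``blackhole_timeout`` entry above describes a disable period that grows exponentially with repeated blackhole detections and resets once the issue goes away. A rough sketch of that arithmetic follows. It assumes the period doubles per consecutive detection and is capped after a fixed number of doublings; the text states neither detail, and the function and macro names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical cap on the number of doublings; not taken from the text. */
#define BLACKHOLE_MAX_SHIFT_SKETCH 6

/* Return the disable period in seconds after 'detections' consecutive
 * blackhole detections. Returns 0 when detection is disabled
 * (initial_secs == 0) or when nothing has been detected, which also
 * models the reset to the initial state.
 */
static inline uint64_t blackhole_disable_secs(uint32_t initial_secs,
					       unsigned int detections)
{
	unsigned int shift;

	if (initial_secs == 0 || detections == 0)
		return 0;

	shift = detections - 1;
	if (shift > BLACKHOLE_MAX_SHIFT_SKETCH)
		shift = BLACKHOLE_MAX_SHIFT_SKETCH;

	return (uint64_t)initial_secs << shift;
}
```

With the default of 3600 seconds, this sketch yields 3600, 7200 and 14400 seconds for the first three consecutive detections.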
diff --git a/Documentation/networking/multi-pf-netdev.rst b/Documentation/networking/multi-pf-netdev.rst index 268819225866..2cd25d81aaa7 100644 --- a/Documentation/networking/multi-pf-netdev.rst +++ b/Documentation/networking/multi-pf-netdev.rst @@ -111,11 +111,11 @@ The relation between PF, irq, napi, and queue can be observed via netlink spec:: Here you can clearly observe our channels distribution policy:: $ ls /proc/irq/{36,39,40,41,42}/mlx5* -d -1 - /proc/irq/36/mlx5_comp1@pci:0000:08:00.0 - /proc/irq/39/mlx5_comp1@pci:0000:09:00.0 - /proc/irq/40/mlx5_comp2@pci:0000:08:00.0 - /proc/irq/41/mlx5_comp2@pci:0000:09:00.0 - /proc/irq/42/mlx5_comp3@pci:0000:08:00.0 + /proc/irq/36/mlx5_comp0@pci:0000:08:00.0 + /proc/irq/39/mlx5_comp0@pci:0000:09:00.0 + /proc/irq/40/mlx5_comp1@pci:0000:08:00.0 + /proc/irq/41/mlx5_comp1@pci:0000:09:00.0 + /proc/irq/42/mlx5_comp2@pci:0000:08:00.0 Steering ======== diff --git a/Documentation/networking/net_cachelines/net_device.rst b/Documentation/networking/net_cachelines/net_device.rst index 70c4fb9d4e5c..22b07c814f4a 100644 --- a/Documentation/networking/net_cachelines/net_device.rst +++ b/Documentation/networking/net_cachelines/net_device.rst @@ -7,6 +7,8 @@ net_device struct fast path usage breakdown Type Name fastpath_tx_access fastpath_rx_access Comments ..struct ..net_device +unsigned_long:32 priv_flags read_mostly - __dev_queue_xmit(tx) +unsigned_long:1 lltx read_mostly - HARD_TX_LOCK,HARD_TX_TRYLOCK,HARD_TX_UNLOCK(tx) char name[16] - - struct_netdev_name_node* name_node struct_dev_ifalias* ifalias @@ -23,7 +25,6 @@ struct_list_head ptype_specific struct adj_list unsigned_int flags read_mostly read_mostly __dev_queue_xmit,__dev_xmit_skb,ip6_output,__ip6_finish_output(tx);ip6_rcv_core(rx) xdp_features_t xdp_features -unsigned_long_long priv_flags read_mostly - __dev_queue_xmit(tx) struct_net_device_ops* netdev_ops read_mostly - netdev_core_pick_tx,netdev_start_xmit(tx) struct_xdp_metadata_ops* xdp_metadata_ops int ifindex - 
read_mostly ip6_rcv_core @@ -98,7 +99,7 @@ unsigned_int num_rx_queues unsigned_int real_num_rx_queues - read_mostly get_rps_cpu struct_bpf_prog* xdp_prog - read_mostly netif_elide_gro() unsigned_long gro_flush_timeout - read_mostly napi_complete_done -int napi_defer_hard_irqs - read_mostly napi_complete_done +u32 napi_defer_hard_irqs - read_mostly napi_complete_done unsigned_int gro_max_size - read_mostly skb_gro_receive unsigned_int gro_ipv4_max_size - read_mostly skb_gro_receive rx_handler_func_t* rx_handler read_mostly - __netif_receive_skb_core @@ -163,6 +164,10 @@ struct_lock_class_key* qdisc_tx_busylock bool proto_down unsigned:1 wol_enabled unsigned:1 threaded - - napi_poll(napi_enable,dev_set_threaded) +unsigned_long:1 see_all_hwtstamp_requests +unsigned_long:1 change_proto_down +unsigned_long:1 netns_local +unsigned_long:1 fcoe_mtu struct_list_head net_notifier_list struct_macsec_ops* macsec_ops struct_udp_tunnel_nic_info* udp_tunnel_nic_info @@ -176,3 +181,5 @@ netdevice_tracker dev_registered_tracker struct_rtnl_hw_stats64* offload_xstats_l3 struct_devlink_port* devlink_port struct_dpll_pin* dpll_pin +struct hlist_head page_pools +struct dim_irq_moder* irq_moder diff --git a/Documentation/networking/netdev-features.rst b/Documentation/networking/netdev-features.rst index d7b15bb64deb..5014f7cc1398 100644 --- a/Documentation/networking/netdev-features.rst +++ b/Documentation/networking/netdev-features.rst @@ -139,21 +139,6 @@ chained skbs (skb->next/prev list). Features contained in NETIF_F_SOFT_FEATURES are features of networking stack. Driver should not change behaviour based on them. - * LLTX driver (deprecated for hardware drivers) - -NETIF_F_LLTX is meant to be used by drivers that don't need locking at all, -e.g. software tunnels. - -This is also used in a few legacy drivers that implement their -own locking, don't use it for new (hardware) drivers. 
- - * netns-local device - -NETIF_F_NETNS_LOCAL is set for devices that are not allowed to move between -network namespaces (e.g. loopback). - -Don't use it in drivers. - * VLAN challenged NETIF_F_VLAN_CHALLENGED should be set for devices which can't cope with VLAN diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst index c2476917a6c3..857c9784f87e 100644 --- a/Documentation/networking/netdevices.rst +++ b/Documentation/networking/netdevices.rst @@ -258,11 +258,11 @@ ndo_get_stats: ndo_start_xmit: Synchronization: __netif_tx_lock spinlock. - When the driver sets NETIF_F_LLTX in dev->features this will be + When the driver sets dev->lltx this will be called without holding netif_tx_lock. In this case the driver has to lock by itself when needed. The locking there should also properly protect against - set_rx_mode. WARNING: use of NETIF_F_LLTX is deprecated. + set_rx_mode. WARNING: use of dev->lltx is deprecated. Don't use it for new drivers. Context: Process with BHs disabled or BH (timer), diff --git a/Documentation/networking/oa-tc6-framework.rst b/Documentation/networking/oa-tc6-framework.rst new file mode 100644 index 000000000000..fe2aabde923a --- /dev/null +++ b/Documentation/networking/oa-tc6-framework.rst @@ -0,0 +1,497 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +========================================================================= +OPEN Alliance 10BASE-T1x MAC-PHY Serial Interface (TC6) Framework Support +========================================================================= + +Introduction +------------ + +The IEEE 802.3cg project defines two 10 Mbit/s PHYs operating over a +single pair of conductors. The 10BASE-T1L (Clause 146) is a long reach +PHY supporting full duplex point-to-point operation over 1 km of single +balanced pair of conductors. 
The 10BASE-T1S (Clause 147) is a short reach +PHY supporting full / half duplex point-to-point operation over 15 m of +single balanced pair of conductors, or half duplex multidrop bus +operation over 25 m of single balanced pair of conductors. + +Furthermore, the IEEE 802.3cg project defines the new Physical Layer +Collision Avoidance (PLCA) Reconciliation Sublayer (Clause 148) meant to +provide improved determinism to the CSMA/CD media access method. PLCA +works in conjunction with the 10BASE-T1S PHY operating in multidrop mode. + +The aforementioned PHYs are intended to cover the low-speed / low-cost +applications in industrial and automotive environment. The large number +of pins (16) required by the MII interface, which is specified by the +IEEE 802.3 in Clause 22, is one of the major cost factors that need to be +addressed to fulfil this objective. + +The MAC-PHY solution integrates an IEEE Clause 4 MAC and a 10BASE-T1x PHY +exposing a low pin count Serial Peripheral Interface (SPI) to the host +microcontroller. This also enables the addition of Ethernet functionality +to existing low-end microcontrollers which do not integrate a MAC +controller. + +Overview +-------- + +The MAC-PHY is specified to carry both data (Ethernet frames) and control +(register access) transactions over a single full-duplex serial peripheral +interface. + +Protocol Overview +----------------- + +Two types of transactions are defined in the protocol: data transactions +for Ethernet frame transfers and control transactions for register +read/write transfers. A chunk is the basic element of data transactions +and is composed of 4 bytes of overhead plus 64 bytes of payload size for +each chunk. Ethernet frames are transferred over one or more data chunks. +Control transactions consist of one or more register read/write control +commands. + +SPI transactions are initiated by the SPI host with the assertion of CSn +low to the MAC-PHY and ends with the deassertion of CSn high. 
In between +each SPI transaction, the SPI host may need time for additional +processing and to setup the next SPI data or control transaction. + +SPI data transactions consist of an equal number of transmit (TX) and +receive (RX) chunks. Chunks in both transmit and receive directions may +or may not contain valid frame data independent from each other, allowing +for the simultaneous transmission and reception of different length +frames. + +Each transmit data chunk begins with a 32-bit data header followed by a +data chunk payload on MOSI. The data header indicates whether transmit +frame data is present and provides the information to determine which +bytes of the payload contain valid frame data. + +In parallel, receive data chunks are received on MISO. Each receive data +chunk consists of a data chunk payload ending with a 32-bit data footer. +The data footer indicates if there is receive frame data present within +the payload or not and provides the information to determine which bytes +of the payload contain valid frame data. + +Reference +--------- + +10BASE-T1x MAC-PHY Serial Interface Specification, + +Link: https://opensig.org/download/document/OPEN_Alliance_10BASET1x_MAC-PHY_Serial_Interface_V1.1.pdf + +Hardware Architecture +--------------------- + +.. code-block:: none + + +----------+ +-------------------------------------+ + | | | MAC-PHY | + | |<---->| +-----------+ +-------+ +-------+ | + | SPI Host | | | SPI Slave | | MAC | | PHY | | + | | | +-----------+ +-------+ +-------+ | + +----------+ +-------------------------------------+ + +Software Architecture +--------------------- + +.. 
code-block:: none + + +----------------------------------------------------------+ + | Networking Subsystem | + +----------------------------------------------------------+ + / \ / \ + | | + | | + \ / | + +----------------------+ +-----------------------------+ + | MAC Driver |<--->| OPEN Alliance TC6 Framework | + +----------------------+ +-----------------------------+ + / \ / \ + | | + | | + | \ / + +----------------------------------------------------------+ + | SPI Subsystem | + +----------------------------------------------------------+ + / \ + | + | + \ / + +----------------------------------------------------------+ + | 10BASE-T1x MAC-PHY Device | + +----------------------------------------------------------+ + +Implementation +-------------- + +MAC Driver +~~~~~~~~~~ + +- Probed by SPI subsystem. + +- Initializes OA TC6 framework for the MAC-PHY. + +- Registers and configures the network device. + +- Sends the tx ethernet frames from n/w subsystem to OA TC6 framework. + +OPEN Alliance TC6 Framework +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- Initializes PHYLIB interface. + +- Registers mac-phy interrupt. + +- Performs mac-phy register read/write operation using the control + transaction protocol specified in the OPEN Alliance 10BASE-T1x MAC-PHY + Serial Interface specification. + +- Performs Ethernet frames transaction using the data transaction protocol + for Ethernet frames specified in the OPEN Alliance 10BASE-T1x MAC-PHY + Serial Interface specification. + +- Forwards the received Ethernet frame from 10Base-T1x MAC-PHY to n/w + subsystem. + +Data Transaction +~~~~~~~~~~~~~~~~ + +The Ethernet frames that are typically transferred from the SPI host to +the MAC-PHY will be converted into multiple transmit data chunks. Each +transmit data chunk will have a 4 bytes header which contains the +information needed to determine the validity and the location of the +transmit frame data within the 64 bytes data chunk payload. + +.. 
code-block:: none + + +---------------------------------------------------+ + | Tx Chunk | + | +---------------------------+ +----------------+ | MOSI + | | 64 bytes chunk payload | | 4 bytes header | |------------> + | +---------------------------+ +----------------+ | + +---------------------------------------------------+ + +4 bytes header contains the below fields, + +DNC (Bit 31) - Data-Not-Control flag. This flag specifies the type of SPI + transaction. For TX data chunks, this bit shall be ’1’. + 0 - Control command + 1 - Data chunk + +SEQ (Bit 30) - Data Chunk Sequence. This bit is used to indicate an + even/odd transmit data chunk sequence to the MAC-PHY. + +NORX (Bit 29) - No Receive flag. The SPI host may set this bit to prevent + the MAC-PHY from conveying RX data on the MISO for the + current chunk (DV = 0 in the footer), indicating that the + host would not process it. Typically, the SPI host should + set NORX = 0 indicating that it will accept and process + any receive frame data within the current chunk. + +RSVD (Bit 28..24) - Reserved: All reserved bits shall be ‘0’. + +VS (Bit 23..22) - Vendor Specific. These bits are implementation specific. + If the MAC-PHY does not implement these bits, the host + shall set them to ‘0’. + +DV (Bit 21) - Data Valid flag. The SPI host uses this bit to indicate + whether the current chunk contains valid transmit frame data + (DV = 1) or not (DV = 0). When ‘0’, the MAC-PHY ignores the + chunk payload. Note that the receive path is unaffected by + the setting of the DV bit in the data header. + +SV (Bit 20) - Start Valid flag. The SPI host shall set this bit when the + beginning of an Ethernet frame is present in the current + transmit data chunk payload. Otherwise, this bit shall be + zero. This bit is not to be confused with the Start-of-Frame + Delimiter (SFD) byte described in IEEE 802.3 [2]. + +SWO (Bit 19..16) - Start Word Offset. 
When SV = 1, this field shall + contain the 32-bit word offset into the transmit data + chunk payload that points to the start of a new + Ethernet frame to be transmitted. The host shall write + this field as zero when SV = 0. + +RSVD (Bit 15) - Reserved: All reserved bits shall be ‘0’. + +EV (Bit 14) - End Valid flag. The SPI host shall set this bit when the end + of an Ethernet frame is present in the current transmit data + chunk payload. Otherwise, this bit shall be zero. + +EBO (Bit 13..8) - End Byte Offset. When EV = 1, this field shall contain + the byte offset into the transmit data chunk payload + that points to the last byte of the Ethernet frame to + transmit. This field shall be zero when EV = 0. + +TSC (Bit 7..6) - Timestamp Capture. Request a timestamp capture when the + frame is transmitted onto the network. + 00 - Do not capture a timestamp + 01 - Capture timestamp into timestamp capture register A + 10 - Capture timestamp into timestamp capture register B + 11 - Capture timestamp into timestamp capture register C + +RSVD (Bit 5..1) - Reserved: All reserved bits shall be ‘0’. + +P (Bit 0) - Parity. Parity bit calculated over the transmit data header. + Method used is odd parity. + +The number of buffers available in the MAC-PHY to store the incoming +transmit data chunk payloads is represented as transmit credits. The +available transmit credits in the MAC-PHY can be read either from the +Buffer Status Register or footer (Refer below for the footer info) +received from the MAC-PHY. The SPI host should not write more data chunks +than the available transmit credits as this will lead to transmit buffer +overflow error. + +In case the previous data footer had no transmit credits available and +once the transmit credits become available for transmitting transmit data +chunks, the MAC-PHY interrupt is asserted to SPI host. 
On reception of the +first data header this interrupt will be deasserted and the received +footer for the first data chunk will have the transmit credits available +information. + +The Ethernet frames that are typically transferred from MAC-PHY to SPI +host will be sent as multiple receive data chunks. Each receive data +chunk will have 64 bytes of data chunk payload followed by 4 bytes footer +which contains the information needed to determine the validity and the +location of the receive frame data within the 64 bytes data chunk payload. + +.. code-block:: none + + +---------------------------------------------------+ + | Rx Chunk | + | +----------------+ +---------------------------+ | MISO + | | 4 bytes footer | | 64 bytes chunk payload | |------------> + | +----------------+ +---------------------------+ | + +---------------------------------------------------+ + +4 bytes footer contains the below fields, + +EXST (Bit 31) - Extended Status. This bit is set when any bit in the + STATUS0 or STATUS1 registers is set and not masked. + +HDRB (Bit 30) - Received Header Bad. When set, indicates that the MAC-PHY + received a control or data header with a parity error. + +SYNC (Bit 29) - Configuration Synchronized flag. This bit reflects the + state of the SYNC bit in the CONFIG0 configuration + register (see Table 12). A zero indicates that the MAC-PHY + configuration may not be as expected by the SPI host. + Following configuration, the SPI host sets the + corresponding bit in the configuration register which is + reflected in this field. + +RCA (Bit 28..24) - Receive Chunks Available. The RCA field indicates to + the SPI host the minimum number of additional receive + data chunks of frame data that are available for + reading beyond the current receive data chunk. This + field is zero when there is no receive frame data + pending in the MAC-PHY’s buffer for reading. + +VS (Bit 23..22) - Vendor Specific. These bits are implementation specific.
+ If not implemented, the MAC-PHY shall set these bits to + ‘0’. + +DV (Bit 21) - Data Valid flag. The MAC-PHY uses this bit to indicate + whether the current receive data chunk contains valid + receive frame data (DV = 1) or not (DV = 0). When ‘0’, the + SPI host shall ignore the chunk payload. + +SV (Bit 20) - Start Valid flag. The MAC-PHY sets this bit when the current + chunk payload contains the start of an Ethernet frame. + Otherwise, this bit is zero. The SV bit is not to be + confused with the Start-of-Frame Delimiter (SFD) byte + described in IEEE 802.3 [2]. + +SWO (Bit 19..16) - Start Word Offset. When SV = 1, this field contains the + 32-bit word offset into the receive data chunk payload + containing the first byte of a new received Ethernet + frame. When a receive timestamp has been added to the + beginning of the received Ethernet frame (RTSA = 1) + then SWO points to the most significant byte of the + timestamp. This field will be zero when SV = 0. + +FD (Bit 15) - Frame Drop. When set, this bit indicates that the MAC has + detected a condition for which the SPI host should drop the + received Ethernet frame. This bit is only valid at the end + of a received Ethernet frame (EV = 1) and shall be zero at + all other times. + +EV (Bit 14) - End Valid flag. The MAC-PHY sets this bit when the end of a + received Ethernet frame is present in this receive data + chunk payload. + +EBO (Bit 13..8) - End Byte Offset: When EV = 1, this field contains the + byte offset into the receive data chunk payload that + locates the last byte of the received Ethernet frame. + This field is zero when EV = 0. + +RTSA (Bit 7) - Receive Timestamp Added. This bit is set when a 32-bit or + 64-bit timestamp has been added to the beginning of the + received Ethernet frame. The MAC-PHY shall set this bit to + zero when SV = 0. + +RTSP (Bit 6) - Receive Timestamp Parity. 
Parity bit calculated over the + 32-bit/64-bit timestamp added to the beginning of the + received Ethernet frame. Method used is odd parity. The + MAC-PHY shall set this bit to zero when RTSA = 0. + +TXC (Bit 5..1) - Transmit Credits. This field contains the minimum number + of transmit data chunks of frame data that the SPI host + can write in a single transaction without incurring a + transmit buffer overflow error. + +P (Bit 0) - Parity. Parity bit calculated over the receive data footer. + Method used is odd parity. + +SPI host will initiate the data receive transaction based on the receive +chunks available in the MAC-PHY which is provided in the receive chunk +footer (RCA - Receive Chunks Available). SPI host will create data invalid +transmit data chunks (empty chunks) or data valid transmit data chunks in +case there are valid Ethernet frames to transmit to the MAC-PHY. The +receive chunks available in MAC-PHY can be read either from the Buffer +Status Register or footer. + +In case the previous data footer had no receive data chunks available and +once the receive data chunks become available again for reading, the +MAC-PHY interrupt is asserted to SPI host. On reception of the first data +header this interrupt will be deasserted and the received footer for the +first data chunk will have the receive chunks available information. + +MAC-PHY Interrupt +~~~~~~~~~~~~~~~~~ + +The MAC-PHY interrupt is asserted when the following conditions are met. + +Receive chunks available - This interrupt is asserted when the previous +data footer had no receive data chunks available and once the receive +data chunks become available for reading. On reception of the first data +header this interrupt will be deasserted. + +Transmit chunk credits available - This interrupt is asserted when the +previous data footer indicated no transmit credits available and once the +transmit credits become available for transmitting transmit data chunks. 
+On reception of the first data header this interrupt will be deasserted. + +Extended status event - This interrupt is asserted when the previous data +footer indicated no extended status and once an extended status event becomes +available. In this case the host should read the status #0 register to know +the corresponding error/event. On reception of the first data header this +interrupt will be deasserted. + +Control Transaction +~~~~~~~~~~~~~~~~~~~ + +4 bytes control header contains the below fields, + +DNC (Bit 31) - Data-Not-Control flag. This flag specifies the type of SPI + transaction. For control commands, this bit shall be ‘0’. + 0 - Control command + 1 - Data chunk + +HDRB (Bit 30) - Received Header Bad. When set by the MAC-PHY, indicates + that a header was received with a parity error. The SPI + host should always clear this bit. The MAC-PHY ignores the + HDRB value sent by the SPI host on MOSI. + +WNR (Bit 29) - Write-Not-Read. This bit indicates if data is to be written + to registers (when set) or read from registers + (when clear). + +AID (Bit 28) - Address Increment Disable. When clear, the address will be + automatically post-incremented by one following each + register read or write. When set, address auto increment is + disabled allowing successive reads and writes to occur at + the same register address. + +MMS (Bit 27..24) - Memory Map Selector. This field selects the specific + register memory map to access. + +ADDR (Bit 23..8) - Address. Address of the first register within the + selected memory map to access. + +LEN (Bit 7..1) - Length. Specifies the number of registers to read/write. + This field is interpreted as the number of registers + minus 1 allowing for up to 128 consecutive registers read + or written starting at the address specified in ADDR. A + length of zero shall read or write a single register. + +P (Bit 0) - Parity. Parity bit calculated over the control command header. + Method used is odd parity.
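The control header layout listed above can be sketched in plain C as follows. This is an illustrative user-space sketch, not the kernel's oa_tc6 code: the names ``ctrl_hdr_build`` and ``ctrl_hdr_parity`` and the ``CTRL_HDR_*`` macros are hypothetical, and the sketch assumes that "odd parity" means the P bit is chosen so that the whole 32-bit header contains an odd number of set bits. Byte order on the SPI wire is not covered here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field positions taken from the control header field list above. */
#define CTRL_HDR_WNR         (1u << 29) /* Write-Not-Read */
#define CTRL_HDR_AID         (1u << 28) /* Address Increment Disable */
#define CTRL_HDR_MMS_SHIFT   24         /* MMS occupies bits 27..24 */
#define CTRL_HDR_ADDR_SHIFT  8          /* ADDR occupies bits 23..8 */
#define CTRL_HDR_LEN_SHIFT   1          /* LEN occupies bits 7..1 */

/* Return the P bit (bit 0) that gives the 32-bit header odd parity. */
static uint32_t ctrl_hdr_parity(uint32_t hdr)
{
	uint32_t ones = 0;

	/* Count the set bits in bits 31..1; bit 0 is the parity bit itself. */
	for (uint32_t v = hdr >> 1; v != 0; v >>= 1)
		ones += v & 1u;

	/* An even count needs P = 1 so that the total becomes odd. */
	return (ones & 1u) ? 0u : 1u;
}

/* Assemble a control command header for nregs consecutive registers. */
static uint32_t ctrl_hdr_build(bool write, bool no_addr_incr, uint8_t mms,
			       uint16_t addr, unsigned int nregs)
{
	uint32_t hdr = 0; /* DNC = 0 (control command), HDRB = 0 */

	assert(mms <= 0xf);
	assert(nregs >= 1 && nregs <= 128);

	if (write)
		hdr |= CTRL_HDR_WNR;
	if (no_addr_incr)
		hdr |= CTRL_HDR_AID;
	hdr |= (uint32_t)mms << CTRL_HDR_MMS_SHIFT;
	hdr |= (uint32_t)addr << CTRL_HDR_ADDR_SHIFT;
	/* LEN carries the number of registers minus 1. */
	hdr |= (uint32_t)(nregs - 1) << CTRL_HDR_LEN_SHIFT;

	return hdr | ctrl_hdr_parity(hdr);
}
```

For example, under these assumptions a single-register write to address 0x0004 in memory map 0 sets WNR and one ADDR bit, so two bits are set before parity and P must be 1 to make the total odd.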
+ +Control transactions consist of one or more control commands. Control +commands are used by the SPI host to read and write registers within the +MAC-PHY. Each control command is composed of a 4 bytes control command +header followed by register write data in case of control write command. + +The MAC-PHY ignores the final 4 bytes of data from the SPI host at the end +of the control write command. The control write command is also echoed +from the MAC-PHY back to the SPI host to identify which register write +failed in case of any bus errors. The echoed Control write command will +have the first 4 bytes of unused value to be ignored by the SPI host +followed by 4 bytes echoed control header followed by echoed register +write data. Control write commands can write either a single register or +multiple consecutive registers. When multiple consecutive registers are +written, the address is automatically post-incremented by the MAC-PHY. +Writing to any unimplemented or undefined registers shall be ignored and +yield no effect. + +The MAC-PHY ignores all data from the SPI host following the control +header for the remainder of the control read command. The control read +command is also echoed from the MAC-PHY back to the SPI host to identify +which register read failed in case of any bus errors. The echoed +Control read command will have the first 4 bytes of unused value to be +ignored by the SPI host followed by 4 bytes echoed control header followed +by register read data. Control read commands can read either a single +register or multiple consecutive registers. When multiple consecutive +registers are read, the address is automatically post-incremented by the +MAC-PHY. Reading any unimplemented or undefined registers shall return +zero. + +Device drivers API +================== + +The include/linux/oa_tc6.h defines the following functions: + +..
c:function:: struct oa_tc6 *oa_tc6_init(struct spi_device *spi, \ + struct net_device *netdev) + +Initialize OA TC6 lib. + +.. c:function:: void oa_tc6_exit(struct oa_tc6 *tc6) + +Free allocated OA TC6 lib. + +.. c:function:: int oa_tc6_write_register(struct oa_tc6 *tc6, u32 address, \ + u32 value) + +Write a single register in the MAC-PHY. + +.. c:function:: int oa_tc6_write_registers(struct oa_tc6 *tc6, u32 address, \ + u32 value[], u8 length) + +Writing multiple consecutive registers starting from @address in the MAC-PHY. +Maximum of 128 consecutive registers can be written starting at @address. + +.. c:function:: int oa_tc6_read_register(struct oa_tc6 *tc6, u32 address, \ + u32 *value) + +Read a single register in the MAC-PHY. + +.. c:function:: int oa_tc6_read_registers(struct oa_tc6 *tc6, u32 address, \ + u32 value[], u8 length) + +Reading multiple consecutive registers starting from @address in the MAC-PHY. +Maximum of 128 consecutive registers can be read starting at @address. + +.. c:function:: netdev_tx_t oa_tc6_start_xmit(struct oa_tc6 *tc6, \ + struct sk_buff *skb); + +The transmit Ethernet frame in the skb is going to be transmitted through +the MAC-PHY. + +.. c:function:: int oa_tc6_zero_align_receive_frame_enable(struct oa_tc6 *tc6); + +Zero align receive frame feature can be enabled to align all receive ethernet +frames data to start at the beginning of any receive data chunk payload with a +start word offset (SWO) of zero. diff --git a/Documentation/networking/phy-link-topology.rst b/Documentation/networking/phy-link-topology.rst new file mode 100644 index 000000000000..4dec5d7d6513 --- /dev/null +++ b/Documentation/networking/phy-link-topology.rst @@ -0,0 +1,121 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. _phy_link_topology: + +================= +PHY link topology +================= + +Overview +======== + +The PHY link topology representation in the networking stack aims at representing +the hardware layout for any given Ethernet link.
+ +An Ethernet interface from userspace's point of view is nothing but a +:c:type:`struct net_device <net_device>`, which exposes configuration options +through the legacy ioctls and the ethtool netlink commands. The base assumption +when designing these configuration APIs was that the link looks something like :: + + +-----------------------+ +----------+ +--------------+ + | Ethernet Controller / | | Ethernet | | Connector / | + | MAC | ------ | PHY | ---- | Port | ---... to LP + +-----------------------+ +----------+ +--------------+ + struct net_device struct phy_device + +Commands that need to configure the PHY will go through the net_device.phydev +field to reach the PHY and perform the relevant configuration. + +This assumption falls apart in more complex topologies that can arise when, +for example, using SFP transceivers (although that's not the only specific case). + +Here, we have 2 basic scenarios. Either the MAC is able to output a serialized +interface, that can directly be fed to an SFP cage, such as SGMII, 1000BaseX, +10GBaseR, etc. + +The link topology then looks like this (when an SFP module is inserted) :: + + +-----+ SGMII +------------+ + | MAC | ------- | SFP Module | + +-----+ +------------+ + +Knowing that some modules embed a PHY, the actual link is more like :: + + +-----+ SGMII +--------------+ + | MAC | -------- | PHY (on SFP) | + +-----+ +--------------+ + +In this case, the SFP PHY is handled by phylib, and registered by phylink through +its SFP upstream ops. + +Now some Ethernet controllers aren't able to output a serialized interface, so +we can't directly connect them to an SFP cage.
However, some PHYs can be used +as media-converters, to translate the non-serialized MAC MII interface to a +serialized MII interface fed to the SFP :: + + +-----+ RGMII +-----------------------+ SGMII +--------------+ + | MAC | ------- | PHY (media converter) | ------- | PHY (on SFP) | + +-----+ +-----------------------+ +--------------+ + +This is where the model of having a single net_device.phydev pointer shows its +limitations, as we now have 2 PHYs on the link. + +The phy_link topology framework aims at providing a way to keep track of every +PHY on the link, for use by both kernel drivers and subsystems, but also to +report the topology to userspace, allowing to target individual PHYs in configuration +commands. + +API +=== + +The :c:type:`struct phy_link_topology <phy_link_topology>` is a per-netdevice +resource, that gets initialized at netdevice creation. Once it's initialized, +it is then possible to register PHYs to the topology through : + +:c:func:`phy_link_topo_add_phy` + +Besides registering the PHY to the topology, this call will also assign a unique +index to the PHY, which can then be reported to userspace to refer to this PHY +(akin to the ifindex). This index is a u32, ranging from 1 to U32_MAX. The value +0 is reserved to indicate the PHY doesn't belong to any topology yet. + +The PHY can then be removed from the topology through + +:c:func:`phy_link_topo_del_phy` + +These functions are already hooked into the phylib subsystem, so all PHYs that +are linked to a net_device through :c:func:`phy_attach_direct` will automatically +join the netdev's topology. + +PHYs that are on a SFP module will also be automatically registered IF the SFP +upstream is phylink (so, no media-converter). + +PHY drivers that can be used as SFP upstream need to call :c:func:`phy_sfp_attach_phy` +and :c:func:`phy_sfp_detach_phy`, which can be used as a +.attach_phy / .detach_phy implementation for the +:c:type:`struct sfp_upstream_ops <sfp_upstream_ops>`.
+ +UAPI +==== + +There exists a set of netlink commands to query the link topology from userspace, +see ``Documentation/networking/ethtool-netlink.rst``. + +The whole point of having a topology representation is to assign the phyindex +field in :c:type:`struct phy_device <phy_device>`. This index is reported to +userspace using the ``ETHTOOL_MSG_PHY_GET`` ethnl command. Performing a DUMP operation +will result in all PHYs from all net_device being listed. The DUMP command +accepts either a ``ETHTOOL_A_HEADER_DEV_INDEX`` or ``ETHTOOL_A_HEADER_DEV_NAME`` +to be passed in the request to filter the DUMP to a single net_device. + +The retrieved index can then be passed as a request parameter using the +``ETHTOOL_A_HEADER_PHY_INDEX`` field in the following ethnl commands : + +* ``ETHTOOL_MSG_STRSET_GET`` to get the stats string set from a given PHY +* ``ETHTOOL_MSG_CABLE_TEST_ACT`` and ``ETHTOOL_MSG_CABLE_TEST_TDR_ACT``, to perform + cable testing on a given PHY on the link (most likely the outermost PHY) +* ``ETHTOOL_MSG_PSE_SET`` and ``ETHTOOL_MSG_PSE_GET`` for PHY-controlled PoE and PSE settings +* ``ETHTOOL_MSG_PLCA_GET_CFG``, ``ETHTOOL_MSG_PLCA_SET_CFG`` and ``ETHTOOL_MSG_PLCA_GET_STATUS`` + to set the PLCA (Physical Layer Collision Avoidance) parameters + +Note that the PHY index can be passed to other requests, which will silently +ignore it if present and irrelevant. diff --git a/Documentation/networking/switchdev.rst b/Documentation/networking/switchdev.rst index 758f1dae3fce..f355f0166f1b 100644 --- a/Documentation/networking/switchdev.rst +++ b/Documentation/networking/switchdev.rst @@ -137,10 +137,10 @@ would be sub-port 0 on port 1 on switch 1.
Port Features ^^^^^^^^^^^^^ -NETIF_F_NETNS_LOCAL +dev->netns_local If the switchdev driver (and device) only supports offloading of the default -network namespace (netns), the driver should set this feature flag to prevent +network namespace (netns), the driver should set this private flag to prevent the port netdev from being moved out of the default netns. A netns-aware driver/device would not set this flag and be responsible for partitioning hardware to preserve netns containment. This means hardware cannot forward diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst index 5e93cd71f99f..8199e6917671 100644 --- a/Documentation/networking/timestamping.rst +++ b/Documentation/networking/timestamping.rst @@ -158,7 +158,8 @@ SOF_TIMESTAMPING_SYS_HARDWARE: SOF_TIMESTAMPING_RAW_HARDWARE: Report hardware timestamps as generated by - SOF_TIMESTAMPING_TX_HARDWARE when available. + SOF_TIMESTAMPING_TX_HARDWARE or SOF_TIMESTAMPING_RX_HARDWARE + when available. 1.3.3 Timestamp Options @@ -266,6 +267,23 @@ SOF_TIMESTAMPING_OPT_TX_SWHW: two separate messages will be looped to the socket's error queue, each containing just one timestamp. +SOF_TIMESTAMPING_OPT_RX_FILTER: + Filter out spurious receive timestamps: report a receive timestamp + only if the matching timestamp generation flag is enabled. + + Receive timestamps are generated early in the ingress path, before a + packet's destination socket is known. If any socket enables receive + timestamps, packets for all sockets will be timestamped, + including those that request timestamp reporting with + SOF_TIMESTAMPING_SOFTWARE and/or SOF_TIMESTAMPING_RAW_HARDWARE, but + do not request receive timestamp generation. This can happen when + requesting transmit timestamps only. + + Receiving spurious timestamps is generally benign. A process can + ignore the unexpected non-zero value. But it makes behavior subtly + dependent on other sockets.
This flag isolates the socket for more + deterministic behavior. + New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate regardless of the setting of sysctl net.core.tstamp_allow_data. diff --git a/Documentation/power/pci.rst b/Documentation/power/pci.rst index e2c1fb8a569a..9ebecb7b00b2 100644 --- a/Documentation/power/pci.rst +++ b/Documentation/power/pci.rst @@ -979,18 +979,17 @@ subsections can be defined as a separate function, it often is convenient to point two or more members of struct dev_pm_ops to the same routine. There are a few convenience macros that can be used for this purpose. -The SIMPLE_DEV_PM_OPS macro declares a struct dev_pm_ops object with one +The DEFINE_SIMPLE_DEV_PM_OPS() declares a struct dev_pm_ops object with one suspend routine pointed to by the .suspend(), .freeze(), and .poweroff() members and one resume routine pointed to by the .resume(), .thaw(), and .restore() members. The other function pointers in this struct dev_pm_ops are unset. -The UNIVERSAL_DEV_PM_OPS macro is similar to SIMPLE_DEV_PM_OPS, but it -additionally sets the .runtime_resume() pointer to the same value as -.resume() (and .thaw(), and .restore()) and the .runtime_suspend() pointer to -the same value as .suspend() (and .freeze() and .poweroff()). +The DEFINE_RUNTIME_DEV_PM_OPS() is similar to DEFINE_SIMPLE_DEV_PM_OPS(), but it +additionally sets the .runtime_resume() pointer to pm_runtime_force_resume() +and the .runtime_suspend() pointer to pm_runtime_force_suspend(). -The SET_SYSTEM_SLEEP_PM_OPS can be used inside of a declaration of struct +The SYSTEM_SLEEP_PM_OPS() can be used inside of a declaration of struct dev_pm_ops to indicate that one suspend routine is to be pointed to by the .suspend(), .freeze(), and .poweroff() members and one resume routine is to be pointed to by the .resume(), .thaw(), and .restore() members. 
diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst index 5c4e730f38d0..53d1996460ab 100644 --- a/Documentation/power/runtime_pm.rst +++ b/Documentation/power/runtime_pm.rst @@ -811,8 +811,8 @@ subsystem-level dev_pm_ops structure. Device drivers that wish to use the same function as a system suspend, freeze, poweroff and runtime suspend callback, and similarly for system resume, thaw, -restore, and runtime resume, can achieve this with the help of the -UNIVERSAL_DEV_PM_OPS macro defined in include/linux/pm.h (possibly setting its +restore, and runtime resume, can achieve similar behaviour with the help of the +DEFINE_RUNTIME_DEV_PM_OPS() defined in include/linux/pm_runtime.h (possibly setting its last argument to NULL). 8. "No-Callback" Devices diff --git a/Documentation/process/coding-style.rst b/Documentation/process/coding-style.rst index 04f6aa377a5d..8e30c8f7697d 100644 --- a/Documentation/process/coding-style.rst +++ b/Documentation/process/coding-style.rst @@ -629,18 +629,6 @@ The preferred style for long (multi-line) comments is: * with beginning and ending almost-blank lines. */ -For files in net/ and drivers/net/ the preferred style for long (multi-line) -comments is a little different. - -.. code-block:: c - - /* The preferred comment style for files in net/ and drivers/net - * looks like this. - * - * It is nearly the same as the generally preferred comment style, - * but there is no initial almost-blank line. - */ - It's also important to comment data, whether they are basic types or derived types. To this end, use just one data declaration per line (no commas for multiple data declarations). 
This leaves you room for a small comment on each diff --git a/Documentation/process/embargoed-hardware-issues.rst b/Documentation/process/embargoed-hardware-issues.rst index 6e9a4597bf2c..daebce49cfdf 100644 --- a/Documentation/process/embargoed-hardware-issues.rst +++ b/Documentation/process/embargoed-hardware-issues.rst @@ -13,9 +13,9 @@ kernel. Hardware issues like Meltdown, Spectre, L1TF etc. must be treated differently because they usually affect all Operating Systems ("OS") and therefore need coordination across different OS vendors, distributions, -hardware vendors and other parties. For some of the issues, software -mitigations can depend on microcode or firmware updates, which need further -coordination. +silicon vendors, hardware integrators, and other parties. For some of the +issues, software mitigations can depend on microcode or firmware updates, +which need further coordination. .. _Contact: @@ -32,8 +32,8 @@ Linux kernel security team (:ref:`Documentation/admin-guide/ <securitybugs>`) instead. The team can be contacted by email at <hardware-security@kernel.org>. This -is a private list of security officers who will help you to coordinate a -fix according to our documented process. +is a private list of security officers who will help you coordinate a fix +according to our documented process. The list is encrypted and email to the list can be sent by either PGP or S/MIME encrypted and must be signed with the reporter's PGP key or S/MIME @@ -43,7 +43,7 @@ the following URLs: - PGP: https://www.kernel.org/static/files/hardware-security.asc - S/MIME: https://www.kernel.org/static/files/hardware-security.crt -While hardware security issues are often handled by the affected hardware +While hardware security issues are often handled by the affected silicon vendor, we welcome contact from researchers or individuals who have identified a potential hardware flaw. 
@@ -65,7 +65,7 @@ of Linux Foundation's IT operations personnel technically have the ability to access the embargoed information, but are obliged to confidentiality by their employment contract. Linux Foundation IT personnel are also responsible for operating and managing the rest of -kernel.org infrastructure. +kernel.org's infrastructure. The Linux Foundation's current director of IT Project infrastructure is Konstantin Ryabitsev. @@ -85,7 +85,7 @@ Memorandum of Understanding The Linux kernel community has a deep understanding of the requirement to keep hardware security issues under embargo for coordination between -different OS vendors, distributors, hardware vendors and other parties. +different OS vendors, distributors, silicon vendors, and other parties. The Linux kernel community has successfully handled hardware security issues in the past and has the necessary mechanisms in place to allow @@ -103,11 +103,11 @@ the issue in the best technical way. All involved developers pledge to adhere to the embargo rules and to keep the received information confidential. Violation of the pledge will lead to immediate exclusion from the current issue and removal from all related -mailing-lists. In addition, the hardware security team will also exclude +mailing lists. In addition, the hardware security team will also exclude the offender from future issues. The impact of this consequence is a highly effective deterrent in our community. In case a violation happens the hardware security team will inform the involved parties immediately. If you -or anyone becomes aware of a potential violation, please report it +or anyone else becomes aware of a potential violation, please report it immediately to the Hardware security officers. @@ -124,14 +124,16 @@ method for these types of issues. Start of Disclosure """"""""""""""""""" -Disclosure starts by contacting the Linux kernel hardware security team by -email. 
This initial contact should contain a description of the problem and -a list of any known affected hardware. If your organization builds or -distributes the affected hardware, we encourage you to also consider what -other hardware could be affected. +Disclosure starts by emailing the Linux kernel hardware security team per +the Contact section above. This initial contact should contain a +description of the problem and a list of any known affected silicon. If +your organization builds or distributes the affected hardware, we encourage +you to also consider what other hardware could be affected. The disclosing +party is responsible for contacting the affected silicon vendors in a +timely manner. The hardware security team will provide an incident-specific encrypted -mailing-list which will be used for initial discussion with the reporter, +mailing list which will be used for initial discussion with the reporter, further disclosure, and coordination of fixes. The hardware security team will provide the disclosing party a list of @@ -158,8 +160,8 @@ This serves several purposes: - The disclosed entities can be contacted to name experts who should participate in the mitigation development. - - If an expert which is required to handle an issue is employed by an - listed entity or member of an listed entity, then the response teams can + - If an expert who is required to handle an issue is employed by a listed + entity or a member of a listed entity, then the response teams can request the disclosure of that expert from that entity. This ensures that the expert is also part of the entity's response team. @@ -169,8 +171,8 @@ Disclosure The disclosing party provides detailed information to the initial response team via the specific encrypted mailing-list.
-From our experience the technical documentation of these issues is usually -a sufficient starting point and further technical clarification is best +From our experience, the technical documentation of these issues is usually +a sufficient starting point, and further technical clarification is best done via email. Mitigation development @@ -179,57 +181,93 @@ Mitigation development The initial response team sets up an encrypted mailing-list or repurposes an existing one if appropriate. -Using a mailing-list is close to the normal Linux development process and -has been successfully used in developing mitigations for various hardware +Using a mailing list is close to the normal Linux development process and +has been successfully used to develop mitigations for various hardware security issues in the past. -The mailing-list operates in the same way as normal Linux development. -Patches are posted, discussed and reviewed and if agreed on applied to a -non-public git repository which is only accessible to the participating +The mailing list operates in the same way as normal Linux development. +Patches are posted, discussed, and reviewed and if agreed upon, applied to +a non-public git repository which is only accessible to the participating developers via a secure connection. The repository contains the main development branch against the mainline kernel and backport branches for stable kernel versions as necessary. The initial response team will identify further experts from the Linux -kernel developer community as needed. Bringing in experts can happen at any -time of the development process and needs to be handled in a timely manner. +kernel developer community as needed. Any involved party can suggest +further experts to be included, each of which will be subject to the same +requirements outlined above. 
+ -If an expert is employed by or member of an entity on the disclosure list +Bringing in experts can happen at any time in the development process and +needs to be handled in a timely manner. + +If an expert is employed by or a member of an entity on the disclosure list provided by the disclosing party, then participation will be requested from the relevant entity. -If not, then the disclosing party will be informed about the experts +If not, then the disclosing party will be informed about the experts' participation. The experts are covered by the Memorandum of Understanding -and the disclosing party is requested to acknowledge the participation. In -case that the disclosing party has a compelling reason to object, then this -objection has to be raised within five work days and resolved with the -incident team immediately. If the disclosing party does not react within -five work days this is taken as silent acknowledgement. +and the disclosing party is requested to acknowledge their participation. +In the case where the disclosing party has a compelling reason to object, +any objection must be raised within five working days and resolved with +the incident team immediately. If the disclosing party does not react +within five working days this is taken as silent acknowledgment. -After acknowledgement or resolution of an objection the expert is disclosed -by the incident team and brought into the development process. +After the incident team acknowledges or resolves an objection, the expert +is disclosed and brought into the development process. List participants may not communicate about the issue outside of the private mailing list. List participants may not use any shared resources (e.g. employer build farms, CI systems, etc) when working on patches. +Early access +"""""""""""" + +The patches discussed and developed on the list can neither be distributed +to any individual who is not a member of the response team nor to any other +organization.
+ +To allow the affected silicon vendors to work with their internal teams and +industry partners on testing, validation, and logistics, the following +exception is provided: + + Designated representatives of the affected silicon vendors are + allowed to hand over the patches at any time to the silicon + vendor’s response team. The representative must notify the kernel + response team about the handover. The affected silicon vendor must + have and maintain their own documented security process for any + patches shared with their response team that is consistent with + this policy. + + The silicon vendor’s response team can distribute these patches to + their industry partners and to their internal teams under the + silicon vendor’s documented security process. Feedback from the + industry partners goes back to the silicon vendor and is + communicated by the silicon vendor to the kernel response team. + + The handover to the silicon vendor’s response team removes any + responsibility or liability from the kernel response team regarding + premature disclosure, which happens due to the involvement of the + silicon vendor’s internal teams or industry partners. The silicon + vendor guarantees this release of liability by agreeing to this + process. Coordinated release """"""""""""""""""" -The involved parties will negotiate the date and time where the embargo -ends. At that point the prepared mitigations are integrated into the -relevant kernel trees and published. There is no pre-notification process: -fixes are published in public and available to everyone at the same time. +The involved parties will negotiate the date and time when the embargo +ends. At that point, the prepared mitigations are published into the +relevant kernel trees. There is no pre-notification process: the +mitigations are published in public and available to everyone at the same +time. 
While we understand that hardware security issues need coordinated embargo -time, the embargo time should be constrained to the minimum time which is -required for all involved parties to develop, test and prepare the +time, the embargo time should be constrained to the minimum time that is +required for all involved parties to develop, test, and prepare their mitigations. Extending embargo time artificially to meet conference talk -dates or other non-technical reasons is creating more work and burden for -the involved developers and response teams as the patches need to be kept -up to date in order to follow the ongoing upstream kernel development, -which might create conflicting changes. +dates or other non-technical reasons creates more work and burden for the +involved developers and response teams as the patches need to be kept up to +date in order to follow the ongoing upstream kernel development, which +might create conflicting changes. CVE assignment """""""""""""" @@ -275,34 +313,35 @@ an involved disclosed party. The current ambassadors list: If you want your organization to be added to the ambassadors list, please contact the hardware security team. The nominated ambassador has to -understand and support our process fully and is ideally well connected in +understand and support our process fully and is ideally well-connected in the Linux kernel community. Encrypted mailing-lists ----------------------- -We use encrypted mailing-lists for communication. The operating principle +We use encrypted mailing lists for communication. The operating principle of these lists is that email sent to the list is encrypted either with the -list's PGP key or with the list's S/MIME certificate. The mailing-list +list's PGP key or with the list's S/MIME certificate. The mailing list software decrypts the email and re-encrypts it individually for each subscriber with the subscriber's PGP key or S/MIME certificate. 
Details -about the mailing-list software and the setup which is used to ensure the +about the mailing list software and the setup that is used to ensure the security of the lists and protection of the data can be found here: https://korg.wiki.kernel.org/userdoc/remail. List keys ^^^^^^^^^ -For initial contact see :ref:`Contact`. For incident specific mailing-lists -the key and S/MIME certificate are conveyed to the subscribers by email -sent from the specific list. +For initial contact see the :ref:`Contact` section above. For incident +specific mailing lists, the key and S/MIME certificate are conveyed to the +subscribers by email sent from the specific list. -Subscription to incident specific lists +Subscription to incident-specific lists ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Subscription is handled by the response teams. Disclosed parties who want -to participate in the communication send a list of potential subscribers to -the response team so the response team can validate subscription requests. +Subscription to incident-specific lists is handled by the response teams. +Disclosed parties who want to participate in the communication send a list +of potential experts to the response team so the response team can validate +subscription requests. Each subscriber needs to send a subscription request to the response team by email. The email must be signed with the subscriber's PGP key or S/MIME diff --git a/Documentation/process/maintainer-netdev.rst b/Documentation/process/maintainer-netdev.rst index fe8616397d63..c9edf9e7362d 100644 --- a/Documentation/process/maintainer-netdev.rst +++ b/Documentation/process/maintainer-netdev.rst @@ -355,23 +355,6 @@ just do it. As a result, a sequence of smaller series gets merged quicker and with better review coverage. Re-posting large series also increases the mailing list traffic. -Multi-line comments -~~~~~~~~~~~~~~~~~~~ - -Comment style convention is slightly different for networking and most of -the tree. 
Instead of this:: - - /* - * foobar blah blah blah - * another line of text - */ - -it is requested that you make it look like this:: - - /* foobar blah blah blah - * another line of text - */ - Local variable ordering ("reverse xmas tree", "RCS") ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -392,6 +375,22 @@ When working in existing code which uses nonstandard formatting make your code follow the most recent guidelines, so that eventually all code in the domain of netdev is in the preferred format. +Using device-managed and cleanup.h constructs +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Netdev remains skeptical about promises of all "auto-cleanup" APIs, +including even ``devm_`` helpers, historically. They are not the preferred +style of implementation, merely an acceptable one. + +Use of ``guard()`` is discouraged within any function longer than 20 lines, +``scoped_guard()`` is considered more readable. Using normal lock/unlock is +still (weakly) preferred. + +Low level cleanup constructs (such as ``__free()``) can be used when building +APIs and helpers, especially scoped iterators. However, direct use of +``__free()`` within networking core and drivers is discouraged. +Similar guidance applies to declaring variables mid-function. + Resending after review ~~~~~~~~~~~~~~~~~~~~~~ diff --git a/Documentation/rust/coding-guidelines.rst b/Documentation/rust/coding-guidelines.rst index 05542840b16c..329b070a1d47 100644 --- a/Documentation/rust/coding-guidelines.rst +++ b/Documentation/rust/coding-guidelines.rst @@ -145,32 +145,32 @@ This is how a well-documented Rust function may look like: This example showcases a few ``rustdoc`` features and some conventions followed in the kernel: - - The first paragraph must be a single sentence briefly describing what - the documented item does. Further explanations must go in extra paragraphs. +- The first paragraph must be a single sentence briefly describing what + the documented item does. 
Further explanations must go in extra paragraphs. - - Unsafe functions must document their safety preconditions under - a ``# Safety`` section. +- Unsafe functions must document their safety preconditions under + a ``# Safety`` section. - - While not shown here, if a function may panic, the conditions under which - that happens must be described under a ``# Panics`` section. +- While not shown here, if a function may panic, the conditions under which + that happens must be described under a ``# Panics`` section. - Please note that panicking should be very rare and used only with a good - reason. In almost all cases, a fallible approach should be used, typically - returning a ``Result``. + Please note that panicking should be very rare and used only with a good + reason. In almost all cases, a fallible approach should be used, typically + returning a ``Result``. - - If providing examples of usage would help readers, they must be written in - a section called ``# Examples``. +- If providing examples of usage would help readers, they must be written in + a section called ``# Examples``. - - Rust items (functions, types, constants...) must be linked appropriately - (``rustdoc`` will create a link automatically). +- Rust items (functions, types, constants...) must be linked appropriately + (``rustdoc`` will create a link automatically). - - Any ``unsafe`` block must be preceded by a ``// SAFETY:`` comment - describing why the code inside is sound. +- Any ``unsafe`` block must be preceded by a ``// SAFETY:`` comment + describing why the code inside is sound. - While sometimes the reason might look trivial and therefore unneeded, - writing these comments is not just a good way of documenting what has been - taken into account, but most importantly, it provides a way to know that - there are no *extra* implicit constraints. 
+ While sometimes the reason might look trivial and therefore unneeded, + writing these comments is not just a good way of documenting what has been + taken into account, but most importantly, it provides a way to know that + there are no *extra* implicit constraints. To learn more about how to write documentation for Rust and extra features, please take a look at the ``rustdoc`` book at: diff --git a/Documentation/rust/quick-start.rst b/Documentation/rust/quick-start.rst index d06a36106cd4..8e3ad9678719 100644 --- a/Documentation/rust/quick-start.rst +++ b/Documentation/rust/quick-start.rst @@ -305,7 +305,7 @@ If GDB/Binutils is used and Rust symbols are not getting demangled, the reason is the toolchain does not support Rust's new v0 mangling scheme yet. There are a few ways out: - - Install a newer release (GDB >= 10.2, Binutils >= 2.36). +- Install a newer release (GDB >= 10.2, Binutils >= 2.36). - - Some versions of GDB (e.g. vanilla GDB 10.1) are able to use - the pre-demangled names embedded in the debug info (``CONFIG_DEBUG_INFO``). +- Some versions of GDB (e.g. vanilla GDB 10.1) are able to use + the pre-demangled names embedded in the debug info (``CONFIG_DEBUG_INFO``). diff --git a/Documentation/userspace-api/media/v4l/pixfmt-yuv-luma.rst b/Documentation/userspace-api/media/v4l/pixfmt-yuv-luma.rst index f02e6cf3516a..74df19be91f6 100644 --- a/Documentation/userspace-api/media/v4l/pixfmt-yuv-luma.rst +++ b/Documentation/userspace-api/media/v4l/pixfmt-yuv-luma.rst @@ -21,9 +21,9 @@ are often referred to as greyscale formats. .. raw:: latex - \scriptsize + \tiny -.. tabularcolumns:: |p{3.6cm}|p{3.0cm}|p{1.3cm}|p{2.6cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}| +.. tabularcolumns:: |p{3.6cm}|p{2.4cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}| .. 
flat-table:: Luma-Only Image Formats :header-rows: 1 diff --git a/Documentation/virt/hyperv/coco.rst b/Documentation/virt/hyperv/coco.rst new file mode 100644 index 000000000000..c15d6fe34b4e --- /dev/null +++ b/Documentation/virt/hyperv/coco.rst @@ -0,0 +1,260 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Confidential Computing VMs +========================== +Hyper-V can create and run Linux guests that are Confidential Computing +(CoCo) VMs. Such VMs cooperate with the physical processor to better protect +the confidentiality and integrity of data in the VM's memory, even in the +face of a hypervisor/VMM that has been compromised and may behave maliciously. +CoCo VMs on Hyper-V share the generic CoCo VM threat model and security +objectives described in Documentation/security/snp-tdx-threat-model.rst. Note +that Hyper-V specific code in Linux refers to CoCo VMs as "isolated VMs" or +"isolation VMs". + +A Linux CoCo VM on Hyper-V requires the cooperation and interaction of the +following: + +* Physical hardware with a processor that supports CoCo VMs + +* The hardware runs a version of Windows/Hyper-V with support for CoCo VMs + +* The VM runs a version of Linux that supports being a CoCo VM + +The physical hardware requirements are as follows: + +* AMD processor with SEV-SNP. Hyper-V does not run guest VMs with AMD SME, + SEV, or SEV-ES encryption, and such encryption is not sufficient for a CoCo + VM on Hyper-V. + +* Intel processor with TDX + +To create a CoCo VM, the "Isolated VM" attribute must be specified to Hyper-V +when the VM is created. A VM cannot be changed from a CoCo VM to a normal VM, +or vice versa, after it is created. + +Operational Modes +----------------- +Hyper-V CoCo VMs can run in two modes. The mode is selected when the VM is +created and cannot be changed during the life of the VM. + +* Fully-enlightened mode. In this mode, the guest operating system is + enlightened to understand and manage all aspects of running as a CoCo VM. 
+ +* Paravisor mode. In this mode, a paravisor layer between the guest and the + host provides some operations needed to run as a CoCo VM. The guest operating + system can have fewer CoCo enlightenments than is required in the + fully-enlightened case. + +Conceptually, fully-enlightened mode and paravisor mode may be treated as +points on a spectrum spanning the degree of guest enlightenment needed to run +as a CoCo VM. Fully-enlightened mode is one end of the spectrum. A full +implementation of paravisor mode is the other end of the spectrum, where all +aspects of running as a CoCo VM are handled by the paravisor, and a normal +guest OS with no knowledge of memory encryption or other aspects of CoCo VMs +can run successfully. However, the Hyper-V implementation of paravisor mode +does not go this far, and is somewhere in the middle of the spectrum. Some +aspects of CoCo VMs are handled by the Hyper-V paravisor while the guest OS +must be enlightened for other aspects. Unfortunately, there is no +standardized enumeration of feature/functions that might be provided in the +paravisor, and there is no standardized mechanism for a guest OS to query the +paravisor for the feature/functions it provides. The understanding of what +the paravisor provides is hard-coded in the guest OS. + +Paravisor mode has similarities to the `Coconut project`_, which aims to provide +a limited paravisor to provide services to the guest such as a virtual TPM. +However, the Hyper-V paravisor generally handles more aspects of CoCo VMs +than is currently envisioned for Coconut, and so is further toward the "no +guest enlightenments required" end of the spectrum. + +.. _Coconut project: https://github.com/coconut-svsm/svsm + +In the CoCo VM threat model, the paravisor is in the guest security domain +and must be trusted by the guest OS. By implication, the hypervisor/VMM must +protect itself against a potentially malicious paravisor just like it +protects against a potentially malicious guest. 
+ +The hardware architectural approach to fully-enlightened vs. paravisor mode +varies depending on the underlying processor. + +* With AMD SEV-SNP processors, in fully-enlightened mode the guest OS runs in + VMPL 0 and has full control of the guest context. In paravisor mode, the + guest OS runs in VMPL 2 and the paravisor runs in VMPL 0. The paravisor + running in VMPL 0 has privileges that the guest OS in VMPL 2 does not have. + Certain operations require the guest to invoke the paravisor. Furthermore, in + paravisor mode the guest OS operates in "virtual Top Of Memory" (vTOM) mode + as defined by the SEV-SNP architecture. This mode simplifies guest management + of memory encryption when a paravisor is used. + +* With Intel TDX processor, in fully-enlightened mode the guest OS runs in an + L1 VM. In paravisor mode, TD partitioning is used. The paravisor runs in the + L1 VM, and the guest OS runs in a nested L2 VM. + +Hyper-V exposes a synthetic MSR to guests that describes the CoCo mode. This +MSR indicates if the underlying processor uses AMD SEV-SNP or Intel TDX, and +whether a paravisor is being used. It is straightforward to build a single +kernel image that can boot and run properly on either architecture, and in +either mode. + +Paravisor Effects +----------------- +Running in paravisor mode affects the following areas of generic Linux kernel +CoCo VM functionality: + +* Initial guest memory setup. When a new VM is created in paravisor mode, the + paravisor runs first and sets up the guest physical memory as encrypted. The + guest Linux does normal memory initialization, except for explicitly marking + appropriate ranges as decrypted (shared). In paravisor mode, Linux does not + perform the early boot memory setup steps that are particularly tricky with + AMD SEV-SNP in fully-enlightened mode. + +* #VC/#VE exception handling. 
In paravisor mode, Hyper-V configures the guest + CoCo VM to route #VC and #VE exceptions to VMPL 0 and the L1 VM, + respectively, and not the guest Linux. Consequently, these exception handlers + do not run in the guest Linux and are not a required enlightenment for a + Linux guest in paravisor mode. + +* CPUID flags. Both AMD SEV-SNP and Intel TDX provide a CPUID flag in the + guest indicating that the VM is operating with the respective hardware + support. While these CPUID flags are visible in fully-enlightened CoCo VMs, + the paravisor filters out these flags and the guest Linux does not see them. + Throughout the Linux kernel, explicitly testing these flags has mostly been + eliminated in favor of the cc_platform_has() function, with the goal of + abstracting the differences between SEV-SNP and TDX. But the + cc_platform_has() abstraction also allows the Hyper-V paravisor configuration + to selectively enable aspects of CoCo VM functionality even when the CPUID + flags are not set. The exception is early boot memory setup on SEV-SNP, which + tests the CPUID SEV-SNP flag. But not having the flag in Hyper-V paravisor + mode VM achieves the desired effect or not running SEV-SNP specific early + boot memory setup. + +* Device emulation. In paravisor mode, the Hyper-V paravisor provides + emulation of devices such as the IO-APIC and TPM. Because the emulation + happens in the paravisor in the guest context (instead of the hypervisor/VMM + context), MMIO accesses to these devices must be encrypted references instead + of the decrypted references that would be used in a fully-enlightened CoCo + VM. The __ioremap_caller() function has been enhanced to make a callback to + check whether a particular address range should be treated as encrypted + (private). See the "is_private_mmio" callback. + +* Encrypt/decrypt memory transitions. In a CoCo VM, transitioning guest + memory between encrypted and decrypted requires coordinating with the + hypervisor/VMM. 
This is done via callbacks invoked from + __set_memory_enc_pgtable(). In fully-enlightened mode, the normal SEV-SNP and + TDX implementations of these callbacks are used. In paravisor mode, a Hyper-V + specific set of callbacks is used. These callbacks invoke the paravisor so + that the paravisor can coordinate the transitions and inform the hypervisor + as necessary. See hv_vtom_init() where these callback are set up. + +* Interrupt injection. In fully enlightened mode, a malicious hypervisor + could inject interrupts into the guest OS at times that violate x86/x64 + architectural rules. For full protection, the guest OS should include + enlightenments that use the interrupt injection management features provided + by CoCo-capable processors. In paravisor mode, the paravisor mediates + interrupt injection into the guest OS, and ensures that the guest OS only + sees interrupts that are "legal". The paravisor uses the interrupt injection + management features provided by the CoCo-capable physical processor, thereby + masking these complexities from the guest OS. + +Hyper-V Hypercalls +------------------ +When in fully-enlightened mode, hypercalls made by the Linux guest are routed +directly to the hypervisor, just as in a non-CoCo VM. But in paravisor mode, +normal hypercalls trap to the paravisor first, which may in turn invoke the +hypervisor. But the paravisor is idiosyncratic in this regard, and a few +hypercalls made by the Linux guest must always be routed directly to the +hypervisor. These hypercall sites test for a paravisor being present, and use +a special invocation sequence. See hv_post_message(), for example. + +Guest communication with Hyper-V +-------------------------------- +Separate from the generic Linux kernel handling of memory encryption in Linux +CoCo VMs, Hyper-V has VMBus and VMBus devices that communicate using memory +shared between the Linux guest and the host. This shared memory must be +marked decrypted to enable communication. 
Furthermore, since the threat model +includes a compromised and potentially malicious host, the guest must guard +against leaking any unintended data to the host through this shared memory. + +These Hyper-V and VMBus memory pages are marked as decrypted: + +* VMBus monitor pages + +* Synthetic interrupt controller (synic) related pages (unless supplied by + the paravisor) + +* Per-cpu hypercall input and output pages (unless running with a paravisor) + +* VMBus ring buffers. The direct mapping is marked decrypted in + __vmbus_establish_gpadl(). The secondary mapping created in + hv_ringbuffer_init() must also include the "decrypted" attribute. + +When the guest writes data to memory that is shared with the host, it must +ensure that only the intended data is written. Padding or unused fields must +be initialized to zeros before copying into the shared memory so that random +kernel data is not inadvertently given to the host. + +Similarly, when the guest reads memory that is shared with the host, it must +validate the data before acting on it so that a malicious host cannot induce +the guest to expose unintended data. Doing such validation can be tricky +because the host can modify the shared memory areas even while or after +validation is performed. For messages passed from the host to the guest in a +VMBus ring buffer, the length of the message is validated, and the message is +copied into a temporary (encrypted) buffer for further validation and +processing. The copying adds a small amount of overhead, but is the only way +to protect against a malicious host. See hv_pkt_iter_first(). + +Many drivers for VMBus devices have been "hardened" by adding code to fully +validate messages received over VMBus, instead of assuming that Hyper-V is +acting cooperatively. Such drivers are marked as "allowed_in_isolated" in the +vmbus_devs[] table. 
Other drivers for VMBus devices that are not needed in a +CoCo VM have not been hardened, and they are not allowed to load in a CoCo +VM. See vmbus_is_valid_offer() where such devices are excluded. + +Two VMBus devices depend on the Hyper-V host to do DMA data transfers: +storvsc for disk I/O and netvsc for network I/O. storvsc uses the normal +Linux kernel DMA APIs, and so bounce buffering through decrypted swiotlb +memory is done implicitly. netvsc has two modes for data transfers. The first +mode goes through send and receive buffer space that is explicitly allocated +by the netvsc driver, and is used for most smaller packets. These send and +receive buffers are marked decrypted by __vmbus_establish_gpadl(). Because +the netvsc driver explicitly copies packets to/from these buffers, the +equivalent of bounce buffering between encrypted and decrypted memory is +already part of the data path. The second mode uses the normal Linux kernel +DMA APIs, and is bounce buffered through swiotlb memory implicitly like in +storvsc. + +Finally, the VMBus virtual PCI driver needs special handling in a CoCo VM. +Linux PCI device drivers access PCI config space using standard APIs provided +by the Linux PCI subsystem. On Hyper-V, these functions directly access MMIO +space, and the access traps to Hyper-V for emulation. But in CoCo VMs, memory +encryption prevents Hyper-V from reading the guest instruction stream to +emulate the access. So in a CoCo VM, these functions must make a hypercall +with arguments explicitly describing the access. See +_hv_pcifront_read_config() and _hv_pcifront_write_config() and the +"use_calls" flag indicating to use hypercalls. + +load_unaligned_zeropad() +------------------------ +When transitioning memory between encrypted and decrypted, the caller of +set_memory_encrypted() or set_memory_decrypted() is responsible for ensuring +the memory isn't in use and isn't referenced while the transition is in +progress. 
The transition has multiple steps, and includes interaction with +the Hyper-V host. The memory is in an inconsistent state until all steps are +complete. A reference while the state is inconsistent could result in an +exception that can't be cleanly fixed up. + +However, the kernel load_unaligned_zeropad() mechanism may make stray +references that can't be prevented by the caller of set_memory_encrypted() or +set_memory_decrypted(), so there's specific code in the #VC or #VE exception +handler to fixup this case. But a CoCo VM running on Hyper-V may be +configured to run with a paravisor, with the #VC or #VE exception routed to +the paravisor. There's no architectural way to forward the exceptions back to +the guest kernel, and in such a case, the load_unaligned_zeropad() fixup code +in the #VC/#VE handlers doesn't run. + +To avoid this problem, the Hyper-V specific functions for notifying the +hypervisor of the transition mark pages as "not present" while a transition +is in progress. If load_unaligned_zeropad() causes a stray reference, a +normal page fault is generated instead of #VC or #VE, and the page-fault- +based handlers for load_unaligned_zeropad() fixup the reference. When the +encrypted/decrypted transition is complete, the pages are marked as "present" +again. See hv_vtom_clear_present() and hv_vtom_set_host_visibility(). 
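The two shared-memory rules in the coco.rst text above (zero any padding before writing to host-visible memory, and copy host data into private memory before validating it) can be illustrated with a small sketch. This is not kernel code: the message layout, `SLOT_SIZE`, and both function names are hypothetical, and a `bytearray` stands in for a decrypted (shared) page.

```python
import struct
from typing import Tuple

HEADER = struct.Struct("<HH")  # hypothetical layout: msg_type, payload_len
SLOT_SIZE = 64                 # hypothetical fixed slot size in the shared page


def write_message(shared: bytearray, msg_type: int, payload: bytes) -> None:
    """Build the message in a zero-filled private buffer, then copy it out."""
    if len(payload) > SLOT_SIZE - HEADER.size:
        raise ValueError("payload too large for slot")
    private = bytearray(SLOT_SIZE)  # all zeros, so unused bytes leak nothing
    HEADER.pack_into(private, 0, msg_type, len(payload))
    private[HEADER.size:HEADER.size + len(payload)] = payload
    shared[:SLOT_SIZE] = private    # copy only fully initialized data


def read_message(shared: bytearray) -> Tuple[int, bytes]:
    """Snapshot the slot first, then validate only the private snapshot."""
    private = bytes(shared[:SLOT_SIZE])  # later host writes cannot affect this copy
    msg_type, payload_len = HEADER.unpack_from(private, 0)
    if payload_len > SLOT_SIZE - HEADER.size:
        raise ValueError("host supplied an out-of-range length")
    return msg_type, private[HEADER.size:HEADER.size + payload_len]
```

Validating the private copy rather than the shared slot is the point of the design: a length that was checked in shared memory could be changed by the host between the check and the use.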
diff --git a/Documentation/virt/hyperv/index.rst b/Documentation/virt/hyperv/index.rst
index de447e11b4a5..79bc4080329e 100644
--- a/Documentation/virt/hyperv/index.rst
+++ b/Documentation/virt/hyperv/index.rst
@@ -11,3 +11,4 @@ Hyper-V Enlightenments
    vmbus
    clocks
    vpci
+   coco
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index fe722c5dada9..b3be87489108 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -2592,7 +2592,7 @@ Specifically:
  0x6030 0000 0010 004a SPSR_ABT    64  spsr[KVM_SPSR_ABT]
  0x6030 0000 0010 004c SPSR_UND    64  spsr[KVM_SPSR_UND]
  0x6030 0000 0010 004e SPSR_IRQ    64  spsr[KVM_SPSR_IRQ]
- 0x6060 0000 0010 0050 SPSR_FIQ    64  spsr[KVM_SPSR_FIQ]
+ 0x6030 0000 0010 0050 SPSR_FIQ    64  spsr[KVM_SPSR_FIQ]
  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]    [1]_
  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]    [1]_
  ...
@@ -6368,7 +6368,7 @@ a single guest_memfd file, but the bound ranges must not overlap).
 See KVM_SET_USER_MEMORY_REGION2 for additional details.
 
 4.143 KVM_PRE_FAULT_MEMORY
-------------------------
+---------------------------
 
 :Capability: KVM_CAP_PRE_FAULT_MEMORY
 :Architectures: none
@@ -6405,6 +6405,12 @@ for the current vCPU state. KVM maps memory as if the vCPU generated a
 stage-2 read page fault, e.g. faults in memory as needed, but doesn't break
 CoW. However, KVM does not mark any newly created stage-2 PTE as Accessed.
 
+In the case of confidential VM types where there is an initial set up of
+private guest memory before the guest is 'finalized'/measured, this ioctl
+should only be issued after completing all the necessary setup to put the
+guest into a 'finalized' state so that the above semantics can be reliably
+ensured.
+
 In some cases, multiple vCPUs might share the page tables. In this case,
 the ioctl can be called in parallel.
 
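The `SPSR_FIQ` correction in the api.rst hunk above (`0x6060...` to `0x6030...`) can be checked by decomposing the register ID. The sketch below is only an illustration: the constants are copied here on the assumption that they match the arm64 KVM uapi headers, and the helper names `core_reg_id` and `reg_size_bits` are hypothetical.

```python
# Constants assumed to mirror the KVM uapi headers (copied here, not imported).
KVM_REG_ARM64 = 0x6000_0000_0000_0000
KVM_REG_SIZE_SHIFT = 52
KVM_REG_SIZE_MASK = 0x00F0_0000_0000_0000
KVM_REG_SIZE_U64 = 0x0030_0000_0000_0000
KVM_REG_ARM_CORE = 0x0010 << 16


def core_reg_id(size_field: int, index: int) -> int:
    """Compose an arm64 core register ID from its size field and index."""
    return KVM_REG_ARM64 | size_field | KVM_REG_ARM_CORE | index


def reg_size_bits(reg_id: int) -> int:
    """Decode the register width: the size field is log2 of the size in bytes."""
    return 8 << ((reg_id & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT)


spsr_fiq = core_reg_id(KVM_REG_SIZE_U64, 0x0050)
print(f"{spsr_fiq:#018x}", reg_size_bits(spsr_fiq))
```

Under these assumptions the old table value `0x6060 0000 0010 0050` decodes to a 512-bit register, which contradicts the 64 in the table's size column. The corrected value decodes to 64 bits, like the neighbouring `SPSR_*` entries.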
diff --git a/Documentation/virt/kvm/arm/hypercalls.rst b/Documentation/virt/kvm/arm/hypercalls.rst index 17be111f493f..af7bc2c2e0cb 100644 --- a/Documentation/virt/kvm/arm/hypercalls.rst +++ b/Documentation/virt/kvm/arm/hypercalls.rst @@ -44,3 +44,101 @@ Provides a discovery mechanism for other KVM/arm64 hypercalls. ---------------------------------------- See ptp_kvm.rst + +``ARM_SMCCC_KVM_FUNC_HYP_MEMINFO`` +---------------------------------- + +Query the memory protection parameters for a pKVM protected virtual machine. + ++---------------------+-------------------------------------------------------------+ +| Presence: | Optional; pKVM protected guests only. | ++---------------------+-------------------------------------------------------------+ +| Calling convention: | HVC64 | ++---------------------+----------+--------------------------------------------------+ +| Function ID: | (uint32) | 0xC6000002 | ++---------------------+----------+----+---------------------------------------------+ +| Arguments: | (uint64) | R1 | Reserved / Must be zero | +| +----------+----+---------------------------------------------+ +| | (uint64) | R2 | Reserved / Must be zero | +| +----------+----+---------------------------------------------+ +| | (uint64) | R3 | Reserved / Must be zero | ++---------------------+----------+----+---------------------------------------------+ +| Return Values: | (int64) | R0 | ``INVALID_PARAMETER (-3)`` on error, else | +| | | | memory protection granule in bytes | ++---------------------+----------+----+---------------------------------------------+ + +``ARM_SMCCC_KVM_FUNC_MEM_SHARE`` +-------------------------------- + +Share a region of memory with the KVM host, granting it read, write and execute +permissions. The size of the region is equal to the memory protection granule +advertised by ``ARM_SMCCC_KVM_FUNC_HYP_MEMINFO``. 
+
++---------------------+-------------------------------------------------------------+
+| Presence:           | Optional; pKVM protected guests only.                       |
++---------------------+-------------------------------------------------------------+
+| Calling convention: | HVC64                                                       |
++---------------------+----------+--------------------------------------------------+
+| Function ID:        | (uint32) | 0xC6000003                                       |
++---------------------+----------+----+---------------------------------------------+
+| Arguments:          | (uint64) | R1 | Base IPA of memory region to share          |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint64) | R2 | Reserved / Must be zero                     |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint64) | R3 | Reserved / Must be zero                     |
++---------------------+----------+----+---------------------------------------------+
+| Return Values:      | (int64)  | R0 | ``SUCCESS (0)``                             |
+|                     |          |    +---------------------------------------------+
+|                     |          |    | ``INVALID_PARAMETER (-3)``                  |
++---------------------+----------+----+---------------------------------------------+
+
+``ARM_SMCCC_KVM_FUNC_MEM_UNSHARE``
+----------------------------------
+
+Revoke access permission from the KVM host to a memory region previously shared
+with ``ARM_SMCCC_KVM_FUNC_MEM_SHARE``. The size of the region is equal to the
+memory protection granule advertised by ``ARM_SMCCC_KVM_FUNC_HYP_MEMINFO``.
+
++---------------------+-------------------------------------------------------------+
+| Presence:           | Optional; pKVM protected guests only.                       |
++---------------------+-------------------------------------------------------------+
+| Calling convention: | HVC64                                                       |
++---------------------+----------+--------------------------------------------------+
+| Function ID:        | (uint32) | 0xC6000004                                       |
++---------------------+----------+----+---------------------------------------------+
+| Arguments:          | (uint64) | R1 | Base IPA of memory region to unshare        |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint64) | R2 | Reserved / Must be zero                     |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint64) | R3 | Reserved / Must be zero                     |
++---------------------+----------+----+---------------------------------------------+
+| Return Values:      | (int64)  | R0 | ``SUCCESS (0)``                             |
+|                     |          |    +---------------------------------------------+
+|                     |          |    | ``INVALID_PARAMETER (-3)``                  |
++---------------------+----------+----+---------------------------------------------+
+
+``ARM_SMCCC_KVM_FUNC_MMIO_GUARD``
+----------------------------------
+
+Request that a given memory region is handled as MMIO by the hypervisor,
+allowing accesses to this region to be emulated by the KVM host. The size of the
+region is equal to the memory protection granule advertised by
+``ARM_SMCCC_KVM_FUNC_HYP_MEMINFO``.
+
++---------------------+-------------------------------------------------------------+
+| Presence:           | Optional; pKVM protected guests only.                       |
++---------------------+-------------------------------------------------------------+
+| Calling convention: | HVC64                                                       |
++---------------------+----------+--------------------------------------------------+
+| Function ID:        | (uint32) | 0xC6000007                                       |
++---------------------+----------+----+---------------------------------------------+
+| Arguments:          | (uint64) | R1 | Base IPA of MMIO memory region              |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint64) | R2 | Reserved / Must be zero                     |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint64) | R3 | Reserved / Must be zero                     |
++---------------------+----------+----+---------------------------------------------+
+| Return Values:      | (int64)  | R0 | ``SUCCESS (0)``                             |
+|                     |          |    +---------------------------------------------+
+|                     |          |    | ``INVALID_PARAMETER (-3)``                  |
++---------------------+----------+----+---------------------------------------------+
diff --git a/Documentation/wmi/devices/msi-wmi-platform.rst b/Documentation/wmi/devices/msi-wmi-platform.rst
index 29b1b2e6d42c..31a136942892 100644
--- a/Documentation/wmi/devices/msi-wmi-platform.rst
+++ b/Documentation/wmi/devices/msi-wmi-platform.rst
@@ -130,12 +130,12 @@ data using the `bmfdec <https://github.com/pali/bmfdec>`_ utility:
 
 Due to a peculiarity in how Windows handles the ``CreateByteField()`` ACPI operator (errors only
 happen when a invalid byte field is ultimately accessed), all methods require a 32 byte input
-buffer, even if the Binay MOF says otherwise.
+buffer, even if the Binary MOF says otherwise.
 
 The input buffer contains a single byte to select the subfeature to be accessed and 31 bytes of
 input data, the meaning of which depends on the subfeature being accessed.
 
-The output buffer contains a singe byte which signals success or failure (``0x00`` on failure)
+The output buffer contains a single byte which signals success or failure (``0x00`` on failure)
 and 31 bytes of output data, the meaning if which depends on the subfeature being accessed.
 
 WMI method Get_EC()
@@ -147,7 +147,7 @@ data contains a flag byte and a 28 byte controller firmware version string.
 The first 4 bits of the flag byte contain the minor version of the embedded controller interface,
 with the next 2 bits containing the major version of the embedded controller interface.
 
-The 7th bit signals if the embedded controller page chaged (exact meaning is unknown), and the
+The 7th bit signals if the embedded controller page changed (exact meaning is unknown), and the
 last bit signals if the platform is a Tigerlake platform.
 
 The MSI software seems to only use this interface when the last bit is set.
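
The flag byte layout described in the ``Get_EC()`` hunk above can be illustrated with a small decoder. This is only a sketch, not part of the patch or the driver: the names ``EcFlags`` and ``decode_ec_flags`` are hypothetical, and it assumes that "first" means least significant, so the minor version sits in bits 0-3, the major version in bits 4-5, the page-changed flag in bit 6 and the Tigerlake flag in bit 7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EcFlags:
    """Decoded view of the Get_EC() flag byte (hypothetical helper type)."""
    minor: int          # minor version of the embedded controller interface
    major: int          # major version of the embedded controller interface
    page_changed: bool  # EC page changed (exact meaning is unknown)
    is_tigerlake: bool  # platform is a Tigerlake platform


def decode_ec_flags(flag_byte: int) -> EcFlags:
    """Split the flag byte into its fields.

    Assumption: bit 0 is the least significant bit, so the "first 4 bits"
    are the low nibble and the "last bit" is bit 7.
    """
    if not 0 <= flag_byte <= 0xFF:
        raise ValueError("flag byte must fit in 8 bits")
    return EcFlags(
        minor=flag_byte & 0x0F,              # bits 0-3
        major=(flag_byte >> 4) & 0x03,       # bits 4-5
        page_changed=bool(flag_byte & 0x40),  # bit 6 (the 7th bit)
        is_tigerlake=bool(flag_byte & 0x80),  # bit 7 (the last bit)
    )
```

Under that bit-order assumption, ``decode_ec_flags(0x9A)`` yields minor version 10, major version 1, no page change, and the Tigerlake flag set.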
