aboutsummaryrefslogtreecommitdiffstats
path: root/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-09-19rtnetlink: add needed_{head,tail}room attributesAlasdair McWilliam1-1/+9
Various network interface types make use of needed_{head,tail}room values to efficiently reserve buffer space for additional encapsulation headers, such as VXLAN, Geneve, IPSec, etc. However, it is not currently possible to query these values in a generic way. Introduce ability to query the needed_{head,tail}room values of a network device via rtnetlink, such that applications that may wish to use these values can do so. For example, Cilium agent iterates over present devices based on user config (direct routing, vxlan, geneve, wireguard etc.) and in future will configure netkit in order to expose the needed_{head,tail}room into K8s pods. See b9ed315d3c4c ("netkit: Allow for configuring needed_{head,tail}room"). Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alasdair McWilliam <alasdair@mcwilliam.dev> Reviewed-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/20250917095543.14039-1-alasdair@mcwilliam.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-19psp: clarify checksum behavior of psp_dev_rcv()Daniel Zahka1-1/+2
psp_dev_rcv() decapsulates psp headers from a received frame. This will make any csum complete computed by the device inaccurate. Rather than attempt to patch up skb->csum in psp_dev_rcv() just make it clear to callers what they can expect regarding checksum complete. Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20250918212723.17495-1-daniel.zahka@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-19tcp: prefer sk_skb_reason_drop()Eric Dumazet2-2/+2
Replace two calls to kfree_skb_reason() with sk_skb_reason_drop(). Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250918132007.325299-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-19net/smc: fix warning in smc_rx_splice() when calling get_page()Sidraya Jayagond1-5/+9
smc_lo_register_dmb() allocates DMB buffers with kzalloc(), which are later passed to get_page() in smc_rx_splice(). Since kmalloc memory is not page-backed, this triggers WARN_ON_ONCE() in get_page() and prevents holding a refcount on the buffer. This can lead to use-after-free if the memory is released before splice_to_pipe() completes. Use folio_alloc() instead, ensuring DMBs are page-backed and safe for get_page(). WARNING: CPU: 18 PID: 12152 at ./include/linux/mm.h:1330 smc_rx_splice+0xaf8/0xe20 [smc] CPU: 18 UID: 0 PID: 12152 Comm: smcapp Kdump: loaded Not tainted 6.17.0-rc3-11705-g9cf4672ecfee #10 NONE Hardware name: IBM 3931 A01 704 (z/VM 7.4.0) Krnl PSW : 0704e00180000000 000793161032696c (smc_rx_splice+0xafc/0xe20 [smc]) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000000 001cee80007d3001 00077400000000f8 0000000000000005 0000000000000001 001cee80007d3006 0007740000001000 001c000000000000 000000009b0c99e0 0000000000001000 001c0000000000f8 001c000000000000 000003ffcc6f7c88 0007740003e98000 0007931600000005 000792969b2ff7b8 Krnl Code: 0007931610326960: af000000 mc 0,0 0007931610326964: a7f4ff43 brc 15,00079316103267ea #0007931610326968: af000000 mc 0,0 >000793161032696c: a7f4ff3f brc 15,00079316103267ea 0007931610326970: e320f1000004 lg %r2,256(%r15) 0007931610326976: c0e53fd1b5f5 brasl %r14,000793168fd5d560 000793161032697c: a7f4fbb5 brc 15,00079316103260e6 0007931610326980: b904002b lgr %r2,%r11 Call Trace: smc_rx_splice+0xafc/0xe20 [smc] smc_rx_splice+0x756/0xe20 [smc]) smc_rx_recvmsg+0xa74/0xe00 [smc] smc_splice_read+0x1ce/0x3b0 [smc] sock_splice_read+0xa2/0xf0 do_splice_read+0x198/0x240 splice_file_to_pipe+0x7e/0x110 do_splice+0x59e/0xde0 __do_splice+0x11a/0x2d0 __s390x_sys_splice+0x140/0x1f0 __do_syscall+0x122/0x280 system_call+0x6e/0x90 Last Breaking-Event-Address: smc_rx_splice+0x960/0xe20 [smc] ---[ end trace 0000000000000000 ]--- Fixes: f7a22071dbf3 ("net/smc: implement DMB-related operations of loopback-ism") Reviewed-by: Mahanta Jambigi <mjambigi@linux.ibm.com> Signed-off-by: Sidraya Jayagond <sidraya@linux.ibm.com> Link: https://patch.msgid.link/20250917184220.801066-1-sidraya@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-19can: raw: reorder struct raw_sock's members to optimise packingVincent Mailhol1-2/+2
struct raw_sock has several holes. Reorder the fields to save 8 bytes. Statistics before: $ pahole --class_name=raw_sock net/can/raw.o struct raw_sock { struct sock sk __attribute__((__aligned__(8))); /* 0 776 */ /* XXX last struct has 1 bit hole */ /* --- cacheline 12 boundary (768 bytes) was 8 bytes ago --- */ int ifindex; /* 776 4 */ /* XXX 4 bytes hole, try to pack */ struct net_device * dev; /* 784 8 */ netdevice_tracker dev_tracker; /* 792 0 */ struct list_head notifier; /* 792 16 */ unsigned int bound:1; /* 808: 0 4 */ unsigned int loopback:1; /* 808: 1 4 */ unsigned int recv_own_msgs:1; /* 808: 2 4 */ unsigned int fd_frames:1; /* 808: 3 4 */ unsigned int xl_frames:1; /* 808: 4 4 */ unsigned int join_filters:1; /* 808: 5 4 */ /* XXX 2 bits hole, try to pack */ /* Bitfield combined with next fields */ struct can_raw_vcid_options raw_vcid_opts; /* 809 4 */ /* XXX 3 bytes hole, try to pack */ canid_t tx_vcid_shifted; /* 816 4 */ canid_t rx_vcid_shifted; /* 820 4 */ canid_t rx_vcid_mask_shifted; /* 824 4 */ int count; /* 828 4 */ /* --- cacheline 13 boundary (832 bytes) --- */ struct can_filter dfilter; /* 832 8 */ struct can_filter * filter; /* 840 8 */ can_err_mask_t err_mask; /* 848 4 */ /* XXX 4 bytes hole, try to pack */ struct uniqframe * uniq; /* 856 8 */ /* size: 864, cachelines: 14, members: 20 */ /* sum members: 852, holes: 3, sum holes: 11 */ /* sum bitfield members: 6 bits, bit holes: 1, sum bit holes: 2 bits */ /* member types with bit holes: 1, total: 1 */ /* forced alignments: 1 */ /* last cacheline: 32 bytes */ } __attribute__((__aligned__(8))); ...and after: $ pahole --class_name=raw_sock net/can/raw.o struct raw_sock { struct sock sk __attribute__((__aligned__(8))); /* 0 776 */ /* XXX last struct has 1 bit hole */ /* --- cacheline 12 boundary (768 bytes) was 8 bytes ago --- */ struct net_device * dev; /* 776 8 */ netdevice_tracker dev_tracker; /* 784 0 */ struct list_head notifier; /* 784 16 */ int ifindex; /* 800 4 */ unsigned int bound:1; /* 804: 0 4 */ unsigned int loopback:1; /* 804: 1 4 */ unsigned int recv_own_msgs:1; /* 804: 2 4 */ unsigned int fd_frames:1; /* 804: 3 4 */ unsigned int xl_frames:1; /* 804: 4 4 */ unsigned int join_filters:1; /* 804: 5 4 */ /* XXX 2 bits hole, try to pack */ /* Bitfield combined with next fields */ struct can_raw_vcid_options raw_vcid_opts; /* 805 4 */ /* XXX 3 bytes hole, try to pack */ canid_t tx_vcid_shifted; /* 812 4 */ canid_t rx_vcid_shifted; /* 816 4 */ canid_t rx_vcid_mask_shifted; /* 820 4 */ can_err_mask_t err_mask; /* 824 4 */ int count; /* 828 4 */ /* --- cacheline 13 boundary (832 bytes) --- */ struct can_filter dfilter; /* 832 8 */ struct can_filter * filter; /* 840 8 */ struct uniqframe * uniq; /* 848 8 */ /* size: 856, cachelines: 14, members: 20 */ /* sum members: 852, holes: 1, sum holes: 3 */ /* sum bitfield members: 6 bits, bit holes: 1, sum bit holes: 2 bits */ /* member types with bit holes: 1, total: 1 */ /* forced alignments: 1 */ /* last cacheline: 24 bytes */ } __attribute__((__aligned__(8))); Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20250917-can-raw-repack-v2-3-395e8b3a4437@kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-09-19can: raw: use bitfields to store flags in struct raw_sockVincent Mailhol1-24/+35
The bound, loopback, recv_own_msgs, fd_frames, xl_frames and join_filters fields of struct raw_sock just need to store one bit of information. Declare all those members as a bitfields of type unsigned int and width one bit. Add a temporary variable to raw_setsockopt() and raw_getsockopt() to make the conversion between the stored bits and the socket interface. This reduces the size of struct raw_sock by sixteen bytes. Statistics before: $ pahole --class_name=raw_sock net/can/raw.o struct raw_sock { struct sock sk __attribute__((__aligned__(8))); /* 0 776 */ /* XXX last struct has 1 bit hole */ /* --- cacheline 12 boundary (768 bytes) was 8 bytes ago --- */ int bound; /* 776 4 */ int ifindex; /* 780 4 */ struct net_device * dev; /* 784 8 */ netdevice_tracker dev_tracker; /* 792 0 */ struct list_head notifier; /* 792 16 */ int loopback; /* 808 4 */ int recv_own_msgs; /* 812 4 */ int fd_frames; /* 816 4 */ int xl_frames; /* 820 4 */ struct can_raw_vcid_options raw_vcid_opts; /* 824 4 */ canid_t tx_vcid_shifted; /* 828 4 */ /* --- cacheline 13 boundary (832 bytes) --- */ canid_t rx_vcid_shifted; /* 832 4 */ canid_t rx_vcid_mask_shifted; /* 836 4 */ int join_filters; /* 840 4 */ int count; /* 844 4 */ struct can_filter dfilter; /* 848 8 */ struct can_filter * filter; /* 856 8 */ can_err_mask_t err_mask; /* 864 4 */ /* XXX 4 bytes hole, try to pack */ struct uniqframe * uniq; /* 872 8 */ /* size: 880, cachelines: 14, members: 20 */ /* sum members: 876, holes: 1, sum holes: 4 */ /* member types with bit holes: 1, total: 1 */ /* forced alignments: 1 */ /* last cacheline: 48 bytes */ } __attribute__((__aligned__(8))); ...and after: $ pahole --class_name=raw_sock net/can/raw.o struct raw_sock { struct sock sk __attribute__((__aligned__(8))); /* 0 776 */ /* XXX last struct has 1 bit hole */ /* --- cacheline 12 boundary (768 bytes) was 8 bytes ago --- */ int ifindex; /* 776 4 */ /* XXX 4 bytes hole, try to pack */ struct net_device * dev; /* 784 8 */ netdevice_tracker dev_tracker; /* 792 0 */ struct list_head notifier; /* 792 16 */ unsigned int bound:1; /* 808: 0 4 */ unsigned int loopback:1; /* 808: 1 4 */ unsigned int recv_own_msgs:1; /* 808: 2 4 */ unsigned int fd_frames:1; /* 808: 3 4 */ unsigned int xl_frames:1; /* 808: 4 4 */ unsigned int join_filters:1; /* 808: 5 4 */ /* XXX 2 bits hole, try to pack */ /* Bitfield combined with next fields */ struct can_raw_vcid_options raw_vcid_opts; /* 809 4 */ /* XXX 3 bytes hole, try to pack */ canid_t tx_vcid_shifted; /* 816 4 */ canid_t rx_vcid_shifted; /* 820 4 */ canid_t rx_vcid_mask_shifted; /* 824 4 */ int count; /* 828 4 */ /* --- cacheline 13 boundary (832 bytes) --- */ struct can_filter dfilter; /* 832 8 */ struct can_filter * filter; /* 840 8 */ can_err_mask_t err_mask; /* 848 4 */ /* XXX 4 bytes hole, try to pack */ struct uniqframe * uniq; /* 856 8 */ /* size: 864, cachelines: 14, members: 20 */ /* sum members: 852, holes: 3, sum holes: 11 */ /* sum bitfield members: 6 bits, bit holes: 1, sum bit holes: 2 bits */ /* member types with bit holes: 1, total: 1 */ /* forced alignments: 1 */ /* last cacheline: 32 bytes */ } __attribute__((__aligned__(8))); Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20250917-can-raw-repack-v2-2-395e8b3a4437@kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-09-19can: raw: reorder struct uniqframe's members to optimise packingVincent Mailhol1-1/+1
struct uniqframe has one hole. Reorder the fields to save 8 bytes. Statistics before: $ pahole --class_name=uniqframe net/can/raw.o struct uniqframe { int skbcnt; /* 0 4 */ /* XXX 4 bytes hole, try to pack */ const struct sk_buff * skb; /* 8 8 */ unsigned int join_rx_count; /* 16 4 */ /* size: 24, cachelines: 1, members: 3 */ /* sum members: 16, holes: 1, sum holes: 4 */ /* padding: 4 */ /* last cacheline: 24 bytes */ }; ...and after: $ pahole --class_name=uniqframe net/can/raw.o struct uniqframe { const struct sk_buff * skb; /* 0 8 */ int skbcnt; /* 8 4 */ unsigned int join_rx_count; /* 12 4 */ /* size: 16, cachelines: 1, members: 3 */ /* last cacheline: 16 bytes */ }; Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20250917-can-raw-repack-v2-1-395e8b3a4437@kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-09-19ipv4: use check_net()Christian Brauner2-3/+3
Don't directly acess the namespace count. There's even a dedicated helper for this. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19net: use check_net()Christian Brauner1-1/+1
Don't directly acess the namespace count. There's even a dedicated helper for this. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19net-sysfs: use check_net()Christian Brauner1-3/+3
Don't directly acess the namespace count. There's even a dedicated helper for this. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19ns: add ns_common_free()Christian Brauner1-2/+2
And drop ns_free_inum(). Anything common that can be wasted centrally should be wasted in the new common helper. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19mptcp: reset blackhole on success with non-loopback ifacesMatthieu Baerts (NGI0)1-1/+1
When a first MPTCP connection gets successfully established after a blackhole period, 'active_disable_times' was supposed to be reset when this connection was done via any non-loopback interfaces. Unfortunately, the opposite condition was checked: only reset when the connection was established via a loopback interface. Fixing this by simply looking at the opposite. This is similar to what is done with TCP FastOpen, see tcp_fastopen_active_disable_ofo_check(). This patch is a follow-up of a previous discussion linked to commit 893c49a78d9f ("mptcp: Use __sk_dst_get() and dst_dev_rcu() in mptcp_active_enable()."), see [1]. Fixes: 27069e7cb3d1 ("mptcp: disable active MPTCP in case of blackhole") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/4209a283-8822-47bd-95b7-87e96d9b7ea3@kernel.org [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250918-net-next-mptcp-blackhole-reset-loopback-v1-1-bf5818326639@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-19psp: do not use sk_dst_get() in psp_dev_get_for_sock()Eric Dumazet1-10/+7
Use __sk_dst_get() and dst_dev_rcu(), because dst->dev could be changed under us. Fixes: 6b46ca260e22 ("net: psp: add socket security association code") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Tested-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250918115238.237475-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-19nscommon: simplify initializationChristian Brauner1-1/+1
There's a lot of information that namespace implementers don't need to know about at all. Encapsulate this all in the initialization helper. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19net: centralize ns_common initializationChristian Brauner1-20/+3
Centralize ns_common initialization. Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19ns: add to_<type>_ns() to respective headersChristian Brauner1-5/+0
Every namespace type has a container_of(ns, <ns_type>, ns) static inline function that is currently not exposed in the header. So we have a bunch of places that open-code it via container_of(). Move it to the headers so we can use it directly. Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19net: support ns lookupChristian Brauner1-2/+6
Support the generic ns lookup infrastructure to support file handles for namespaces. The network namespace has a separate list with different lifetime rules which we can just leave in tact. We have a similar concept for mount namespaces as well where it is on two differenet lists for different purposes. Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19Merge branch 'no-rebase-mnt_ns_tree_remove'Christian Brauner61-263/+555
Bring in the fix for removing a mount namespace from the mount namespace rbtree and list. Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19net: use ns_common_init()Christian Brauner1-13/+33
Don't cargo-cult the same thing over and over. Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19wifi: mac80211: fix Rx packet handling when pubsta information is not availableAditya Kumar Singh1-6/+22
In ieee80211_rx_handle_packet(), if the caller does not provide pubsta information, an attempt is made to find the station using the address 2 (source address) field in the header. Since pubsta is missing, link information such as link_valid and link_id is also unavailable. Now if such a situation comes, and if a matching ML station entry is found based on the source address, currently the packet is dropped due to missing link ID in the status field which is not correct. Hence, to fix this issue, if link_valid is not set and the station is an ML station, make an attempt to find a link station entry using the source address. If a valid link station is found, derive the link ID and proceed with packet processing. Otherwise, drop the packet as per the existing flow. Fixes: ea9d807b5642 ("wifi: mac80211: add link information in ieee80211_rx_status") Suggested-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com> Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com> Link: https://patch.msgid.link/20250917-fix_data_packet_rx_with_mlo_and_no_pubsta-v1-1-8cf971a958ac@oss.qualcomm.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: cfg80211: remove ieee80211_s1g_channel_widthLachlan Hodges1-27/+0
With the introduction of proper S1G channel flags, this function is no longer used. Remove it. Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250918051913.500781-4-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: mac80211: correctly initialise S1G chandef for STALachlan Hodges5-22/+92
When moving to the APs channel, ensure we correctly initialise the chandef and perform the required validation. Additionally, if the AP is beaconing on a 2MHz primary, calculate the 2MHz primary center frequency by extracting the sibling 1MHz primary and averaging the frequencies to find the 2MHz primary center frequency. Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250918051913.500781-3-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: cfg80211: correctly implement and validate S1G chandefLachlan Hodges3-101/+115
Currently, the S1G channelisation implementation differs from that of VHT, which is the PHY that S1G is based on. The major difference between the clock rate is 1/10th of VHT. However how their channelisation is represented within cfg80211 and mac80211 vastly differ. To rectify this, remove the use of IEEE80211_CHAN_1/2/4.. flags that were previously used to indicate the control channel width, however it should be implied that the control channels are 1MHz in the case of S1G. Additionally, introduce the invert - being IEEE80211_CHAN_NO_4/8/16MHz - that imply the control channel may not be used for a certain bandwidth. With these new flags, we can perform regulatory and chandef validation just as we would for VHT. To deal with the notion that S1G PHYs may contain a 2MHz primary channel, introduce a new variable, s1g_primary_2mhz, which indicates whether we are operating on a 2MHz primary channel. In this case, the chandef::chan points to the 1MHz primary channel pointed to by the primary channel location. Alongside this, introduce some new helper routines that can extract the sibling 1MHz channel. The sibling being the alternate 1MHz primary subchannel within the 2MHz primary channel that is not pointed to by chandef::chan. Furthermore, due to unique restrictions imposed on S1G PHYs, introduce a new flag, IEEE80211_CHAN_S1G_NO_PRIMARY, which states that the 1MHz channel cannot be used as a primary channel. This is assumed to be set by vendors as it is hardware and regdom specific, When we validate a 2MHz primary channel, we need to ensure both 1MHz subchannels do not contain this flag. If one or both of the 1MHz subchannels contain this flag then the 2MHz primary is not permitted for use as a primary channel. Properly integrate S1G channel validation such that it is implemented according with other PHY types such as VHT. Additionally, implement a new S1G-specific regulatory flag to allow cfg80211 to understand specific vendor requirements for S1G PHYs. Signed-off-by: Arien Judge <arien.judge@morsemicro.com> Signed-off-by: Andrew Pope <andrew.pope@morsemicro.com> Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250918051913.500781-2-lachlan.hodges@morsemicro.com [remove redundant NL80211_ATTR_S1G_PRIMARY_2MHZ check] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: mac80211: remove tx_handlers_drop debugfs statsSarika Sharma3-4/+0
Commit 906a5a8c7152 ("wifi: mac80211: add tx_handlers_drop statistics to ethtool") added a tx_handlers_drop counter to ethtool stats. During review [1], Johannes noted that the existing debugfs counter is now redundant. Remove the debugfs stat to avoid duplication and streamline statistics reporting. Link: https://lore.kernel.org/linux-wireless/ce5f2bd899caa2de32f36ce554d9cada073979c0.camel@sipsolutions.net/ # [1] Signed-off-by: Sarika Sharma <quic_sarishar@quicinc.com> Link: https://patch.msgid.link/20250918040846.4032734-1-quic_sarishar@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: mac80211: Remove redundant rcu_read_lock/unlock() in spin_lockpengdonglin5-10/+0
Since commit a8bb74acd8efe ("rcu: Consolidate RCU-sched update-side function definitions") there is no difference between rcu_read_lock(), rcu_read_lock_bh() and rcu_read_lock_sched() in terms of RCU read section and the relevant grace period. That means that spin_lock(), which implies rcu_read_lock_sched(), also implies rcu_read_lock(). There is no need no explicitly start a RCU read section if one has already been started implicitly by spin_lock(). Simplify the code and remove the inner rcu_read_lock() invocation. Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: pengdonglin <pengdonglin@xiaomi.com> Signed-off-by: pengdonglin <dolinux.peng@gmail.com> Link: https://patch.msgid.link/20250916044735.2316171-11-dolinux.peng@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: mac80211: Extend support for changing NAN configurationIlan Peer1-23/+113
As 'struct cfg80211_nan_config' was updated, update the relevant logic to accommodate these changes. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.92b530ddaedf.I2b6d6f6074e25487303fde573ce764a64f87bdcd@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: mac80211: Export an API to check if NAN is startedIlan Peer1-0/+8
So it can be used by drivers to check if NAN Device interface is started or not. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.c69652f77eb6.Ie4f3d197e0706e742e3d97614fadc11b22adfbc6@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: mac80211: Get the correct interface for non-netdev skb statusIlan Peer3-19/+20
The function ieee80211_sdata_from_skb() always returned the P2P Device interface in case the skb was not associated with a netdev and didn't consider the possibility that an NAN Device interface is also enabled. To support configurations where both P2P Device and a NAN Device interface are active, extend the function to match the correct interface based on address 2 in the 802.11 MAC header. Since the 'p2p_sdata' field in struct ieee80211_local is no longer needed, remove it. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.5252d2579a49.Id4576531c6b2ad83c9498b708dc0ade6b0214fa8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: mac80211: Track NAN interface start/stopIlan Peer3-3/+28
In case that NAN is started, mark the device as non idle, and set LED triggering similar to scan and ROC. Set the device to idle once NAN is stopped. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.2711d62fce22.I9b9f826490e50967a66788d713b0eba985879873@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: mac80211: Accept management frames on NAN interfaceIlan Peer1-2/+10
Accept Public Action frames and Authentication frames on NAN Device interface to support flows that require these frames: - SDFs: For user space Discovery Engine (DE) implementation. - NAFs: For user space NAN Data Path (NDP) establishment. - Authentication frames: For NAN Pairing and Verification. Accept only frames from devices that are part of the NAN cluster. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.46528d69e881.Ifccd87fb2a49a3af05238f74f52fa6da8de28811@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: mac80211: Support Tx of action frame for NANIlan Peer4-4/+29
Add support for sending management frame over a NAN Device interface: - Declare support for the supported management frames types. - Since action frame transmissions over a NAN Device interface do not necessarily require a channel configuration, e.g., they can be transmitted during DW, modify the Tx path to avoid accessing channel information for NAN Device interface. - In addition modify the points in the Tx path logic to account for cases that a band is not specified in the Tx information. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.23b160089228.I65a58af753bcbcfb5c4ad8ef372d546f889725ba@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: cfg80211: Store the NAN cluster IDIlan Peer1-0/+2
When the driver indicates that the device has joined a cluster, store the cluster ID. This is needed for data path operations, e.g., filtering received frames etc. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.63e9fef2a3aa.I6c858185c9e71f84bd2c5174d7ee45902b4391c3@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: cfg80211: Support Tx/Rx of action frame for NANIlan Peer1-2/+6
Add support for sending and receiving action frames over a NAN Device interface: - For Synchronized NAN operation NAN Service Discovery Frames (SDFs) and NAN Action Frames (NAFs) transmissions over a NAN Device interface, a channel parameter is not mandatory as the frame can be transmitted based on the NAN Device schedule. - For Unsynchronized NAN Discovery (USD) operation the SDFs and NAFs could be transmitted using NL80211_CMD_FRAME where a specific channel and dwell time are configured. As Synchronized NAN Operation and USD can be done concurrently, both modes need to be supported. Thus, allow sending NAN action frames when user space handles the NAN Discovery Engine (DE) with and without providing a channel as a parameter. To support reception of NAN Action frames and Authentication frames (used for NAN paring and verification) allow to register for management frame reception of NAN Device interface when user space handles the NAN DE. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.71da2b062929.I0166d51dcf14393f628cd5da366c21114f518618@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: cfg80211: Advertise supported NAN capabilitiesIlan Peer1-0/+41
Allow drivers to specify the supported NAN capabilities and support advertising the NAN capabilities to user space. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.2976966556f5.Ic6e43b10049573180c909dad806f279cfb31143e@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: cfg80211: Add cluster joined notification APIsAndrei Otcheretianski2-0/+60
The drivers should notify upper layers and user space when a NAN device joins a cluster. This is needed, for example, to set the correct addr3 in SDF frames. Add API to report cluster join event. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.ad27b7b6e4d9.I70b213a2a49f18d1ba2ad325e67e8eff51cc7a1f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: nl80211: Add NAN Discovery Window (DW) notificationAndrei Otcheretianski2-0/+61
This notification will be used by the device to inform user space about upcoming DW. When received, user space will be able to prepare multicast Service Discovery Frames (SDFs) to be transmitted during the next DW using %NL80211_CMD_FRAME command on the NAN management interface. The device/driver will take care to transmit the frames in the correct timing. This allows to implement a synchronized Discovery Engine (DE) in user space, if the device doesn't support DE offload. Note that this notification can be sent before the actual DW starts as long as the driver/device handles the actual timing of the SDF transmission. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.0e1d15031bab.I5b1721e61b63910452b3c5cdcdc1e94cb094d4c9@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: nl80211: Add more configuration options for NAN commandsAndrei Otcheretianski1-35/+263
Current NAN APIs have only basic configuration for master preference and operating bands. Add and parse additional parameters which provide more control over NAN synchronization. The newly added attributes allow to publish additional NAN attributes and vendor elements in NAN beacons, control scan and discovery beacons periodicity, enable/disable DW notifications etc. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> tested: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.a4779492bf8e.I375feb919bd72358173766b9fe10010c40796b33@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-199p/trans_fd: p9_fd_request: kick rx thread if EPOLLINOleg Nesterov1-8/+1
p9_read_work() doesn't set Rworksched and doesn't do schedule_work(m->rq) if list_empty(&m->req_list). However, if the pipe is full, we need to read more data and this used to work prior to commit aaec5a95d59615 ("pipe_read: don't wake up the writer if the pipe is still full"). p9_read_work() does p9_fd_read() -> ... -> anon_pipe_read() which (before the commit above) triggered the unnecessary wakeup. This wakeup calls p9_pollwake() which kicks p9_poll_workfn() -> p9_poll_mux(), p9_poll_mux() will notice EPOLLIN and schedule_work(&m->rq). This no longer happens after the optimization above, change p9_fd_request() to use p9_poll_mux() instead of only checking for EPOLLOUT. Reported-by: syzbot+d1b5dace43896bc386c3@syzkaller.appspotmail.com Tested-by: syzbot+d1b5dace43896bc386c3@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68a2de8f.050a0220.e29e5.0097.GAE@google.com/ Link: https://lore.kernel.org/all/67dedd2f.050a0220.31a16b.003f.GAE@google.com/ Co-developed-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Message-ID: <20250819161013.GB11345@redhat.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-09-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski18-48/+114
Cross-merge networking fixes after downstream PR (net-6.17-rc7). No conflicts. Adjacent changes: drivers/net/ethernet/mellanox/mlx5/core/en/fs.h 9536fbe10c9d ("net/mlx5e: Add PSP steering in local NIC RX") 7601a0a46216 ("net/mlx5e: Add a miss level for ipsec crypto offload") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18Merge tag 'net-6.17-rc7' of ↵Linus Torvalds17-45/+110
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless. No known regressions at this point. Current release - fix to a fix: - eth: Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set" - wifi: iwlwifi: pcie: fix byte count table for 7000/8000 devices - net: clear sk->sk_ino in sk_set_socket(sk, NULL), fix CRIU Previous releases - regressions: - bonding: set random address only when slaves already exist - rxrpc: fix untrusted unsigned subtract - eth: - ice: fix Rx page leak on multi-buffer frames - mlx5: don't return mlx5_link_info table when speed is unknown Previous releases - always broken: - tls: make sure to abort the stream if headers are bogus - tcp: fix null-deref when using TCP-AO with TCP_REPAIR - dpll: fix skipping last entry in clock quality level reporting - eth: qed: don't collect too many protection override GRC elements, fix memory corruption" * tag 'net-6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits) octeontx2-pf: Fix use-after-free bugs in otx2_sync_tstamp() cnic: Fix use-after-free bugs in cnic_delete_task devlink rate: Remove unnecessary 'static' from a couple places MAINTAINERS: update sundance entry net: liquidio: fix overflow in octeon_init_instr_queue() net: clear sk->sk_ino in sk_set_socket(sk, NULL) Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set" selftests: tls: test skb copy under mem pressure and OOB tls: make sure to abort the stream if headers are bogus selftest: packetdrill: Add tcp_fastopen_server_reset-after-disconnect.pkt. tcp: Clear tcp_sk(sk)->fastopen_rsk in tcp_disconnect(). octeon_ep: fix VF MAC address lifecycle handling selftests: bonding: add vlan over bond testing bonding: don't set oif to bond dev when getting NS target destination net: rfkill: gpio: Fix crash due to dereferencering uninitialized pointer net/mlx5e: Add a miss level for ipsec crypto offload net/mlx5e: Harden uplink netdev access against device unbind MAINTAINERS: make the DPLL entry cover drivers doc/netlink: Fix typos in operation attributes igc: don't fail igc_probe() on LED setup error ...
2025-09-18devlink rate: Remove unnecessary 'static' from a couple placesCosmin Ratiu1-2/+2
devlink_rate_node_get_by_name() and devlink_rate_nodes_destroy() have a couple of unnecessary static variables for iterating over devlink rates. This could lead to races/corruption/unhappiness if two concurrent operations execute the same function. Remove 'static' from both. It's amazing this was missed for 4+ years. While at it, I confirmed there are no more examples of this mistake in net/ with 1, 2 or 3 levels of indentation. Fixes: a8ecb93ef03d ("devlink: Introduce rate nodes") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18net: ethtool: use the new helper in rss_set_prep_indir()Breno Leitao1-8/+7
Refactor rss_set_prep_indir() to utilize the new ethtool_get_rx_ring_count() helper for determining the number of RX rings, replacing the direct use of get_rxnfc with ETHTOOL_GRXRINGS. This ensures compatibility with both legacy and new ethtool_ops interfaces by transparently multiplexing between them. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-7-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18net: ethtool: update set_rxfh_indir to use ethtool_get_rx_ring_count helperBreno Leitao1-8/+8
Modify ethtool_set_rxfh() to use the new ethtool_get_rx_ring_count() helper function for retrieving the number of RX rings instead of directly calling get_rxnfc with ETHTOOL_GRXRINGS. This way, we can leverage the new helper if it is available in ethtool_ops. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-6-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18net: ethtool: update set_rxfh to use ethtool_get_rx_ring_count helperBreno Leitao1-7/+9
Modify ethtool_set_rxfh() to use the new ethtool_get_rx_ring_count() helper function for retrieving the number of RX rings instead of directly calling get_rxnfc with ETHTOOL_GRXRINGS. This way, we can leverage the new helper if it is available in ethtool_ops. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-5-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18net: ethtool: add get_rx_ring_count callback to optimize RX ring queriesBreno Leitao3-5/+25
Add a new optional get_rx_ring_count callback in ethtool_ops to allow drivers to provide the number of RX rings directly without going through the full get_rxnfc flow classification interface. Create ethtool_get_rx_ring_count() to use .get_rx_ring_count if available, falling back to get_rxnfc() otherwise. It needs to be non-static, given it will be called by other ethtool functions laters, as those calling get_rxfh(). Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-4-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18net: ethtool: remove the duplicated handling from ethtool_get_rxringsBreno Leitao1-23/+10
ethtool_get_rxrings() was a copy of ethtool_get_rxnfc(). Clean the code that will never be executed for GRXRINGS specifically. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-3-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18net: ethtool: add support for ETHTOOL_GRXRINGS ioctlBreno Leitao1-0/+40
This patch adds handling for the ETHTOOL_GRXRINGS ioctl command in the ethtool ioctl dispatcher. It introduces a new helper function ethtool_get_rxrings() that calls the driver's get_rxnfc() callback with appropriate parameters to retrieve the number of RX rings supported by the device. By explicitly handling ETHTOOL_GRXRINGS, userspace queries through ethtool can now obtain RX ring information in a structured manner. In this patch, ethtool_get_rxrings() is a simply copy of ethtool_get_rxnfc(). Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-2-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18net: ethtool: pass the num of RX rings directly to ethtool_copy_validate_indirBreno Leitao1-5/+5
Modify ethtool_copy_validate_indir() and callers to validate indirection table entries against the number of RX rings as an integer instead of accessing rx_rings->data. This will be useful in the future, given that struct ethtool_rxnfc might not exist for native GRXRINGS call. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-1-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18psp: rename our psp_dev_destroy()Eric Dumazet2-4/+4
psp_dev_destroy() was already used in drivers/crypto/ccp/psp-dev.c Use psp_dev_free() instead, to avoid a link error when CRYPTO_DEV_SP_CCP=y Fixes: 00c94ca2b99e ("psp: base PSP device support") Closes: https://lore.kernel.org/netdev/CANn89i+ZdBDEV6TE=Nw5gn9ycTzWw4mZOpPuCswgwEsrgOyNnw@mail.gmail.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20250918113546.177946-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-18tls: make sure to abort the stream if headers are bogusJakub Kicinski3-7/+11
Normally we wait for the socket to buffer up the whole record before we service it. If the socket has a tiny buffer, however, we read out the data sooner, to prevent connection stalls. Make sure that we abort the connection when we find out late that the record is actually invalid. Retrying the parsing is fine in itself but since we copy some more data each time before we parse we can overflow the allocated skb space. Constructing a scenario in which we're under pressure without enough data in the socket to parse the length upfront is quite hard. syzbot figured out a way to do this by serving us the header in small OOB sends, and then filling in the recvbuf with a large normal send. Make sure that tls_rx_msg_size() aborts strp, if we reach an invalid record there's really no way to recover. Reported-by: Lee Jones <lee@kernel.org> Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser") Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250917002814.1743558-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>