path: root/arch/s390/net
Age | Commit message | Author | Files | Lines
2025-09-23bpf, x86: Add support for signed arena loadsKumar Kartikeya Dwivedi1-0/+5
Currently, signed load instructions into arena memory are unsupported. The compiler is free to generate these, and on GCC-14 we see a corresponding error when it happens. The hurdle in supporting them is deciding which unused opcode to use to mark them for the JIT's own consumption. After much thinking, it appears 0xc0 / BPF_NOSPEC can be combined with load instructions to identify signed arena loads. Use this to recognize and JIT them appropriately, and remove the verifier-side limitation on such programs when the JIT supports them.

Co-developed-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20250923110157.18326-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
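The opcode trick is easy to express in plain C. A minimal decode sketch, assuming the standard BPF opcode layout (class in the low 3 bits, mode in the high 3 bits); the helper name is made up and the real verifier/JIT logic differs in detail:

    #include <stdbool.h>
    #include <stdint.h>

    #define BPF_CLASS(code) ((code) & 0x07)
    #define BPF_MODE(code)  ((code) & 0xe0)
    #define BPF_LDX         0x01
    #define BPF_NOSPEC      0xc0  /* reused here as the signed-arena-load marker */

    /* A load whose mode bits carry the otherwise-unused BPF_NOSPEC
     * pattern is recognized by the JIT as a signed arena load. */
    static bool is_signed_arena_load(uint8_t code)
    {
            return BPF_CLASS(code) == BPF_LDX && BPF_MODE(code) == BPF_NOSPEC;
    }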
2025-08-26s390/bpf: Add s390 JIT support for timed may_gotoIlya Leoshkevich3-5/+67
The verifier provides an architecture-independent implementation of the may_goto instruction, which is currently used on s390x, but it has a downside: there is no way to prevent progs using it from running for a very long time. The solution to this problem is an alternative timed implementation, which requires architecture-specific bits.

Its availability is signaled to the verifier by bpf_jit_supports_timed_may_goto() returning true. The verifier then emits a call to arch_bpf_timed_may_goto() using a non-standard calling convention. This function must act as a trampoline for bpf_check_timed_may_goto().

Implement bpf_jit_supports_timed_may_goto(), account for the special calling convention in the BPF_CALL implementation, and implement arch_bpf_timed_may_goto().

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20250821113339.292434-2-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
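The availability signal itself is a one-liner; a kernel-context sketch of the JIT-side hook described above (the accompanying arch_bpf_timed_may_goto() assembly trampoline is not reproduced here):

    /* Advertise the timed may_goto implementation, so the verifier
     * emits calls to arch_bpf_timed_may_goto(). */
    bool bpf_jit_supports_timed_may_goto(void)
    {
            return true;
    }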
2025-08-20s390/bpf: Use direct calls and jumps where possibleIlya Leoshkevich1-48/+32
After the V!=R rework (commit c98d2ecae08f ("s390/mm: Uncouple physical vs virtual address spaces")), all kernel code and related data are allocated within a 4G region, making it possible to use relative addressing in BPF code more extensively. Convert as many indirect calls and jumps to direct calls as possible, namely:

* BPF_CALL
* __bpf_tramp_enter()
* __bpf_tramp_exit()
* __bpf_prog_enter()
* __bpf_prog_exit()
* fentry
* fmod_ret
* fexit
* BPF_TRAMP_F_CALL_ORIG without BPF_TRAMP_F_ORIG_STACK
* Trampoline returns without BPF_TRAMP_F_SKIP_FRAME and BPF_TRAMP_F_ORIG_STACK

The following indirect calls and jumps remain:

* Prog returns
* Trampoline returns with BPF_TRAMP_F_SKIP_FRAME or BPF_TRAMP_F_ORIG_STACK
* BPF_TAIL_CALL
* BPF_TRAMP_F_CALL_ORIG with BPF_TRAMP_F_ORIG_STACK

As a result, only one usage of call_r1() remains, so inline it.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20250819102116.252203-1-iii@linux.ibm.com
2025-08-18s390/bpf: Write back tail call counter for BPF_TRAMP_F_CALL_ORIGIlya Leoshkevich1-0/+3
The tailcall_bpf2bpf_hierarchy_fentry test hangs on s390. Its call graph is as follows:

    entry()
      subprog_tail()
        trampoline()
          fentry()
          the rest of subprog_tail()  # via BPF_TRAMP_F_CALL_ORIG
      return to entry()

The problem is that the rest of subprog_tail() increments the tail call counter, but the trampoline discards the incremented value. This results in an astronomically large number of tail calls. Fix by making the trampoline write the incremented tail call counter back.

Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20250813121016.163375-4-iii@linux.ibm.com
2025-08-18s390/bpf: Write back tail call counter for BPF_PSEUDO_CALLIlya Leoshkevich1-7/+16
The tailcall_bpf2bpf_hierarchy_1 test hangs on s390. Its call graph is as follows:

    entry()
      subprog_tail()
        bpf_tail_call_static(0) -> entry + tail_call_start
          subprog_tail()
            bpf_tail_call_static(0) -> entry + tail_call_start

entry() copies its tail call counter to the subprog_tail()'s frame, which then increments it. However, the incremented result is discarded, leading to an astronomically large number of tail calls. Fix by writing the incremented counter back to the entry()'s frame, as modeled in the sketch below.

Fixes: dd691e847d28 ("s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20250813121016.163375-3-iii@linux.ibm.com
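The counter flow can be modeled in plain C (illustrative only, not JIT output; the struct and names are hypothetical):

    struct frame {
            int tail_call_cnt;
    };

    static void subprog_tail(struct frame *caller)
    {
            /* entry() copies its counter into the subprog's frame... */
            struct frame own = { .tail_call_cnt = caller->tail_call_cnt };

            own.tail_call_cnt++;    /* the tail call increments it */

            /* ...and the fix writes the incremented value back, so the
             * tail-call limit check actually makes progress. */
            caller->tail_call_cnt = own.tail_call_cnt;
    }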
2025-08-18s390/bpf: Do not write tail call counter into helper and kfunc framesIlya Leoshkevich1-4/+12
Only BPF functions make use of the tail call counter; helpers and kfuncs ignore and most likely also clobber it. Writing it into these functions' frames is pointless and misleading, so do not do it. Fixes: dd691e847d28 ("s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20250813121016.163375-2-iii@linux.ibm.com
2025-07-30Merge tag 'bpf-next-6.17' of ↵Linus Torvalds2-101/+67
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:

 - Remove usermode driver (UMD) framework (Thomas Weißschuh)
 - Introduce Strongly Connected Component (SCC) computation in the verifier to detect loops and refine register liveness (Eduard Zingerman)
 - Allow 'void *' casts using bpf_rdonly_cast() and a corresponding '__arg_untrusted' annotation for global function parameters (Eduard Zingerman)
 - Improve precision for BPF_ADD and BPF_SUB operations in the verifier (Harishankar Vishwanathan)
 - Teach the verifier that a constant pointer to a map cannot be NULL (Ihor Solodrai)
 - Introduce BPF streams for error reporting of various conditions detected by the BPF runtime (Kumar Kartikeya Dwivedi)
 - Teach the verifier to insert a runtime speculation barrier (lfence on x86) to mitigate speculative execution instead of rejecting the programs (Luis Gerhorst)
 - Various improvements for 'veristat' (Mykyta Yatsenko)
 - Warn on internal verifier errors when CONFIG_DEBUG_KERNEL is enabled, to improve bug detection by syzbot (Paul Chaignon)
 - Support BPF private stack on arm64 (Puranjay Mohan)
 - Introduce bpf_cgroup_read_xattr() kfunc to read xattrs of cgroup nodes (Song Liu)
 - Introduce kfuncs for read-only string operations (Viktor Malik)
 - Implement show_fdinfo() for bpf_links (Tao Chen)
 - Reduce the verifier's stack consumption (Yonghong Song)
 - Implement the mprog API for cgroup-bpf programs (Yonghong Song)

* tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (192 commits)
  selftests/bpf: Migrate fexit_noreturns case into tracing_failure test suite
  selftests/bpf: Add selftest for attaching tracing programs to functions in deny list
  bpf: Add log for attaching tracing programs to functions in deny list
  bpf: Show precise rejected function when attaching fexit/fmod_ret to __noreturn functions
  bpf: Fix various typos in verifier.c comments
  bpf: Add third round of bounds deduction
  selftests/bpf: Test invariants on JSLT crossing sign
  selftests/bpf: Test cross-sign 64bits range refinement
  selftests/bpf: Update reg_bound range refinement logic
  bpf: Improve bounds when s64 crosses sign boundary
  bpf: Simplify bounds refinement from s32
  selftests/bpf: Enable private stack tests for arm64
  bpf, arm64: JIT support for private stack
  bpf: Move bpf_jit_get_prog_name() to core.c
  bpf, arm64: Fix fp initialization for exception boundary
  umd: Remove usermode driver framework
  bpf/preload: Don't select USERMODE_DRIVER
  selftests/bpf: Fix test dynptr/test_dynptr_memset_xdp_chunks failure
  selftests/bpf: Fix test dynptr/test_dynptr_copy_xdp failure
  selftests/bpf: Increase xdp data size for arm64 64K page size
  ...
2025-07-29Merge tag 's390-6.17-1' of ↵Linus Torvalds2-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Alexander Gordeev:

 - Standardize on the __ASSEMBLER__ macro that is provided by the GCC and Clang compilers, and replace __ASSEMBLY__ with __ASSEMBLER__ in both uapi and non-uapi headers
 - Explicitly include <linux/export.h> in architecture and driver files which contain an EXPORT_SYMBOL(), and remove the include from the files which do not
 - Use the full title of the "z/Architecture Principles of Operation" manual and the name of the section where facility bits are listed
 - Use -D__DISABLE_EXPORTS for files in arch/s390/boot to avoid unnecessarily slowing down the build and confusing external kABI tools that process symtypes data
 - Print additional unrecoverable machine check information to make root cause analysis easier
 - Move cmpxchg_user_key() handling to uaccess library code, since the generated code is large anyway and there is no benefit if it is inlined
 - Fix a problem when cmpxchg_user_key() is executing code with a non-default key: if a system is IPL-ed with "LOAD NORMAL", the previous system used storage keys where the fetch-protection bit was set for some pages, and cmpxchg_user_key() is located within such a page, a protection exception happens
 - Either the external call or the emergency signal order is used to send an IPI to a remote CPU. Use the external call order only, since it is at least as good as, and sometimes even better than, the emergency signal
 - In case of an early crash the early program check handler prints a more or less random value as the last breaking event address, since it is not initialized properly. Copy the last breaking event address from the lowcore to pt_regs to address this
 - During the STP synchronization check, udelay() cannot be used, since the first CPU modifies tod_clock_base and get_tod_clock_monotonic() might return a non-monotonic time. Instead, busy-loop on the other CPUs while the first CPU actually handles the synchronization operation
 - When debugging the early kernel boot using QEMU with the -S flag and GDB attached, skip the decompressor and start directly in the kernel
 - Rename PAI Crypto event 4210 according to the z16 and z17 "z/Architecture Principles of Operation" manuals
 - Remove the in-kernel time steering support in favour of the new s390 PTP driver, which allows the kernel clock to be steered more precisely
 - Remove a possible false-positive warning in pte_free_defer(), which could be triggered in a valid case when a KVM guest process is initializing

* tag 's390-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (29 commits)
  s390/mm: Remove possible false-positive warning in pte_free_defer()
  s390/stp: Default to enabled
  s390/stp: Remove leap second support
  s390/time: Remove in-kernel time steering
  s390/sclp: Use monotonic clock in sclp_sync_wait()
  s390/smp: Use monotonic clock in smp_emergency_stop()
  s390/time: Use monotonic clock in get_cycles()
  s390/pai_crypto: Rename PAI Crypto event 4210
  scripts/gdb/symbols: make lx-symbols skip the s390 decompressor
  s390/boot: Introduce jump_to_kernel() function
  s390/stp: Remove udelay from stp_sync_clock()
  s390/early: Copy last breaking event address to pt_regs
  s390/smp: Remove conditional emergency signal order code usage
  s390/uaccess: Merge cmpxchg_user_key() inline assemblies
  s390/uaccess: Prevent kprobes on cmpxchg_user_key() functions
  s390/uaccess: Initialize code pages executed with non-default access key
  s390/skey: Provide infrastructure for executing with non-default access key
  s390/uaccess: Make cmpxchg_user_key() library code
  s390/page: Add memory clobber to page_set_storage_key()
  s390/page: Cleanup page_set_storage_key() inline assemblies
  ...
2025-07-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc6Alexei Starovoitov1-1/+9
Cross-merge BPF and other fixes after downstream PR. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-16s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL againIlya Leoshkevich1-1/+9
Commit 7ded842b356d ("s390/bpf: Fix bpf_plt pointer arithmetic") has accidentally removed the critical piece of commit c730fce7c70c ("s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL"), causing intermittent kernel panics in e.g. perf's on_switch() prog to reappear. Restore the fix and add a comment. Fixes: 7ded842b356d ("s390/bpf: Fix bpf_plt pointer arithmetic") Cc: stable@vger.kernel.org Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250716194524.48109-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-01s390/bpf: Describe the frame using a struct instead of constantsIlya Leoshkevich2-77/+47
Currently the caller-allocated portion of the stack frame is described using constants, hardcoded values, and an ASCII drawing, making it harder than necessary to ensure that everything is in sync. Declare a struct and use offsetof() and offsetofend() macros to refer to various values stored within the frame. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250624121501.50536-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
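A minimal sketch of the technique with hypothetical field names (the real struct in bpf_jit_comp.c is laid out differently):

    #include <stddef.h>
    #include <stdint.h>

    /* offsetofend() as defined in the kernel's <linux/stddef.h>. */
    #define offsetofend(TYPE, MEMBER) \
            (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

    /* The frame becomes a struct; offsets are derived, not hand-counted. */
    struct prog_frame {
            uint64_t backchain;
            uint64_t gprs[10];      /* %r6-%r15 save area (assumed) */
            uint32_t tail_call_cnt;
    };

    enum {
            BACKCHAIN_OFF = offsetof(struct prog_frame, backchain),
            TCCNT_OFF     = offsetof(struct prog_frame, tail_call_cnt),
            FRAME_USED    = offsetofend(struct prog_frame, tail_call_cnt),
    };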
2025-07-01s390/bpf: Centralize frame offset calculationsIlya Leoshkevich1-30/+26
The calculation of the distance from %r15 to the caller-allocated portion of the stack frame is copy-pasted into multiple places in the JIT code. Move it to bpf_jit_prog() and save the result into bpf_jit::frame_off, so that the other parts of the JIT can use it. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250624121501.50536-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-17s390: Explicitly include <linux/export.h>Heiko Carstens1-0/+1
Explicitly include <linux/export.h> in files which contain an EXPORT_SYMBOL(). See commit a934a57a42f6 ("scripts/misc-check: check missing #include <linux/export.h> when W=1") for more details. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-06-16s390: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-uapi headersThomas Huth1-2/+2
While the GCC and Clang compilers already define __ASSEMBLER__ automatically when compiling assembler code, __ASSEMBLY__ is a macro that only gets defined by the Makefiles in the kernel. This is bad since macros starting with two underscores are names that are reserved by the C language. It can also be very confusing for the developers when switching between userspace and kernelspace coding, or when dealing with uapi headers that rather should use __ASSEMBLER__ instead. So let's now standardize on the __ASSEMBLER__ macro that is provided by the compilers. This is a completely mechanical patch (done with a simple "sed -i" statement), with some manual fixups done later while rebasing the patch. Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20250611140046.137739-3-thuth@redhat.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-05-22s390/bpf: Use kernel's expoline thunksIlya Leoshkevich1-44/+17
Simplify the JIT code by replacing the custom expolines with the ones defined in the kernel text. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250519223646.66382-4-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-22s390/bpf: Add macros for calling external functionsIlya Leoshkevich1-18/+42
After the V!=R rework (commit c98d2ecae08f ("s390/mm: Uncouple physical vs virtual address spaces")), kernel and BPF programs are allocated within a 4G region, making it possible to use relative addressing to directly use kernel functions from BPF code. Add two new macros for calling kernel functions from BPF code: EMIT6_PCREL_RILB_PTR() and EMIT6_PCREL_RILC_PTR(). Factor out parts of the existing macros that are helpful for implementing the new ones. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250519223646.66382-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-14s390/bpf: Remove the orig_call NULL checkIlya Leoshkevich1-3/+2
Now that orig_call can never be NULL, remove the respective check. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250512221911.61314-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-14s390/bpf: Store backchain even for leaf progsIlya Leoshkevich1-7/+5
Currently a crash in a leaf prog (caused by a bug) produces the following call trace:

    [<000003ff600ebf00>] bpf_prog_6df0139e1fbf2789_fentry+0x20/0x78
    [<0000000000000000>] 0x0

This is because leaf progs do not store backchain. Fix by making all progs do it. This is what GCC and Clang-generated code does as well. Now the call trace looks like this:

    [<000003ff600eb0f2>] bpf_prog_6df0139e1fbf2789_fentry+0x2a/0x80
    [<000003ff600ed096>] bpf_trampoline_201863462940+0x96/0xf4
    [<000003ff600e3a40>] bpf_prog_05f379658fdd72f2_classifier_0+0x58/0xc0
    [<000003ffe0aef070>] bpf_test_run+0x210/0x390
    [<000003ffe0af0dc2>] bpf_prog_test_run_skb+0x25a/0x668
    [<000003ffe038a90e>] __sys_bpf+0xa46/0xdb0
    [<000003ffe038ad0c>] __s390x_sys_bpf+0x44/0x50
    [<000003ffe0defea8>] __do_syscall+0x150/0x280
    [<000003ffe0e01d5c>] system_call+0x74/0x98

Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20250512122717.54878-1-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: Introduce load-acquire and store-release instructionsPeilin Ye1-4/+10
Introduce BPF instructions with load-acquire and store-release semantics, as discussed in [1]. Define 2 new flags:

    #define BPF_LOAD_ACQ    0x100
    #define BPF_STORE_REL   0x110

A "load-acquire" is a BPF_STX | BPF_ATOMIC instruction with the 'imm' field set to BPF_LOAD_ACQ (0x100). Similarly, a "store-release" is a BPF_STX | BPF_ATOMIC instruction with the 'imm' field set to BPF_STORE_REL (0x110).

Unlike existing atomic read-modify-write operations that only support BPF_W (32-bit) and BPF_DW (64-bit) size modifiers, load-acquires and store-releases also support BPF_B (8-bit) and BPF_H (16-bit). As an exception, however, 64-bit load-acquires/store-releases are not supported on 32-bit architectures (to fix a build error reported by the kernel test robot). An 8- or 16-bit load-acquire zero-extends the value before writing it to a 32-bit register, just like ARM64 instruction LDARH and friends.

Similar to existing atomic read-modify-write operations, misaligned load-acquires/store-releases are not allowed (even if BPF_F_ANY_ALIGNMENT is set).

As an example, consider the following 64-bit load-acquire BPF instruction (assuming little-endian):

    db 10 00 00 00 01 00 00  r0 = load_acquire((u64 *)(r1 + 0x0))

    opcode (0xdb): BPF_ATOMIC | BPF_DW | BPF_STX
    imm (0x00000100): BPF_LOAD_ACQ

Similarly, a 16-bit BPF store-release:

    cb 21 00 00 10 01 00 00  store_release((u16 *)(r1 + 0x0), w2)

    opcode (0xcb): BPF_ATOMIC | BPF_H | BPF_STX
    imm (0x00000110): BPF_STORE_REL

In arch/{arm64,s390,x86}/net/bpf_jit_comp.c, have bpf_jit_supports_insn(..., /*in_arena=*/true) return false for the new instructions, until the corresponding JIT compiler supports them in arena.

[1] https://lore.kernel.org/all/20240729183246.4110549-1-yepeilin@google.com/

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Peilin Ye <yepeilin@google.com>
Link: https://lore.kernel.org/r/a217f46f0e445fbd573a1a024be5c6bf1d5fe716.1741049567.git.yepeilin@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
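A small decode sketch against the flag values quoted above; the struct is pared down to the two fields involved and the helper name is invented:

    #include <stdbool.h>

    #define BPF_LOAD_ACQ    0x100
    #define BPF_STORE_REL   0x110

    struct insn {
            unsigned char code;     /* opcode: class | size | mode */
            int imm;                /* selects the atomic operation */
    };

    /* For a BPF_STX | BPF_ATOMIC instruction, the imm field decides
     * whether this is one of the new acquire/release accesses. */
    static bool is_acquire_release(const struct insn *insn)
    {
            return insn->imm == BPF_LOAD_ACQ || insn->imm == BPF_STORE_REL;
    }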
2024-07-08s390/bpf: Implement exceptionsIlya Leoshkevich1-2/+53
Implement the following three pieces required from the JIT:

- A "top-level" BPF prog (exception_boundary) must save all non-volatile registers, and not only the ones that it clobbers.
- A "handler" BPF prog (exception_cb) must switch stack to that of exception_boundary, and restore the registers that exception_boundary saved.
- arch_bpf_stack_walk() must unwind the stack and provide the results in a way that satisfies both bpf_throw() and exception_cb.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703005047.40915-3-iii@linux.ibm.com
2024-07-08s390/bpf: Change seen_reg to a maskIlya Leoshkevich1-16/+16
Using a mask instead of an array saves a small amount of memory and allows marking multiple registers as seen with a simple "or". Another positive side-effect is that it speeds up verification with jitterbug. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240703005047.40915-2-iii@linux.ibm.com
2024-07-02s390/bpf: Support arena atomicsIlya Leoshkevich1-10/+94
s390x supports most BPF atomics using single instructions, which makes implementing arena support a matter of adding the arena address to the base register (unfortunately, atomics do not support index registers), and wrapping the respective native instruction in probing sequences.

An exception is BPF_XCHG, which is implemented using two different memory accesses and a loop. Make sure there are enough extable entries for both instructions. Compute the base address once for both memory accesses. Since on exception we need to land after the loop, emit the nops manually.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-10-iii@linux.ibm.com
2024-07-02s390/bpf: Enable arenaIlya Leoshkevich1-0/+5
Now that BPF_PROBE_MEM32 and address space cast instructions are implemented, tell the verifier that the JIT supports arena. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-9-iii@linux.ibm.com
2024-07-02s390/bpf: Support address space cast instructionIlya Leoshkevich1-0/+18
The new address cast instruction translates arena offsets to userspace addresses. NULL pointers must not be translated. The common code sets up the mappings in such a way that it's enough to replace the higher 32 bits to achieve the desired result. s390x has just an instruction for this: INSERT IMMEDIATE. Implement the sequence using 3 instructions: LOAD AND TEST, BRANCH RELATIVE ON CONDITION and INSERT IMMEDIATE.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-8-iii@linux.ibm.com
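The described semantics, modeled in plain C rather than JITed s390x code (user_vm_start stands for the userspace base of the arena mapping, an assumption for illustration):

    #include <stdint.h>

    static uint64_t addr_space_cast(uint64_t arena_ptr, uint64_t user_vm_start)
    {
            /* LOAD AND TEST + BRANCH RELATIVE ON CONDITION:
             * NULL must not be translated. */
            if (!arena_ptr)
                    return 0;
            /* INSERT IMMEDIATE: replace the higher 32 bits. */
            return (arena_ptr & 0xffffffffULL) | user_vm_start;
    }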
2024-07-02s390/bpf: Support BPF_PROBE_MEM32Ilya Leoshkevich1-27/+110
BPF_PROBE_MEM32 is a new mode for LDX, ST and STX instructions. The JIT is supposed to add the start address of the kernel arena mapping to the %dst register, and use a probing variant of the respective memory access. Reuse the existing probing infrastructure for that. Put the arena address into the literal pool, load it into %r1 and use that as an index register. Do not clear any registers in ex_handler_bpf() for failing ST and STX instructions. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-7-iii@linux.ibm.com
2024-07-02s390/bpf: Land on the next JITed instruction after exceptionIlya Leoshkevich1-3/+4
Currently we land on the nop, which is unnecessary: we can just as well begin executing the next instruction. Furthermore, the upcoming arena support for the loop-based BPF_XCHG implementation will require landing on an instruction that comes after the loop. So land on the next JITed instruction, which covers both cases. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-6-iii@linux.ibm.com
2024-07-02s390/bpf: Introduce pre- and post- probe functionsIlya Leoshkevich1-14/+44
Currently probe insns are handled by two "if" statements at the beginning and at the end of bpf_jit_insn(). The first one needs to be in sync with the huge insn->code statement that follows it, which was not a problem so far, since the check is small. The introduction of arena will make it significantly larger, and it will no longer be obvious whether it is in sync with the opcode switch. Move these statements to the new bpf_jit_probe_load_pre() and bpf_jit_probe_post() functions, and call them only from cases that need them. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-5-iii@linux.ibm.com
2024-07-02s390/bpf: Get rid of get_probe_mem_regno()Ilya Leoshkevich1-26/+7
Commit 7fc8c362e782 ("s390/bpf: encode register within extable entry") introduced explicit passing of the number of the register to be cleared to ex_handler_bpf(), which replaced deducing it from the respective native load instruction using get_probe_mem_regno(). Replace the second and last usage in the same manner, and remove this function. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-4-iii@linux.ibm.com
2024-07-02s390/bpf: Factor out emitting probe nopsIlya Leoshkevich1-22/+40
The upcoming arena support for the loop-based BPF_XCHG implementation requires emitting nop and extable entries separately. Move nop handling into a separate function, and keep track of the nop offset. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-3-iii@linux.ibm.com
2024-05-12s390/bpf: Emit a barrier for BPF_FETCH instructionsIlya Leoshkevich1-2/+6
BPF_ATOMIC_OP() macro documentation states that "BPF_ADD | BPF_FETCH" should be the same as atomic_fetch_add(), which is currently not the case on s390x: the serialization instruction "bcr 14,0" is missing. This applies to "and", "or" and "xor" variants too.

s390x is allowed to reorder stores with subsequent fetches from different addresses, so code relying on BPF_FETCH acting as a barrier, for example:

    stw [%r0], 1
    afadd [%r1], %r2
    ldxw %r3, [%r4]

may be broken. Fix it by emitting "bcr 14,0".

Note that a separate serialization instruction is not needed for BPF_XCHG and BPF_CMPXCHG, because COMPARE AND SWAP performs serialization itself.

Fixes: ba3b86b9cef0 ("s390/bpf: Implement new atomic ops")
Reported-by: Puranjay Mohan <puranjay12@gmail.com>
Closes: https://lore.kernel.org/bpf/mb61p34qvq3wf.fsf@kernel.org/
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20240507000557.12048-1-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-26/+20
Cross-merge networking fixes after downstream PR. No conflicts, or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-19s390/bpf: Fix bpf_plt pointer arithmeticIlya Leoshkevich1-26/+20
Kui-Feng Lee reported a crash on s390x triggered by the dummy_st_ops/dummy_init_ptr_arg test [1]:

    [<0000000000000002>] 0x2
    [<00000000009d5cde>] bpf_struct_ops_test_run+0x156/0x250
    [<000000000033145a>] __sys_bpf+0xa1a/0xd00
    [<00000000003319dc>] __s390x_sys_bpf+0x44/0x50
    [<0000000000c4382c>] __do_syscall+0x244/0x300
    [<0000000000c59a40>] system_call+0x70/0x98

This is caused by GCC moving memcpy() after assignments in bpf_jit_plt(), resulting in NULL pointers being written instead of the return and the target addresses.

Looking at the GCC internals, the reordering is allowed because the alias analysis thinks that the memcpy() destination and the assignments' left-hand sides are based on different objects: new_plt and bpf_plt_ret/bpf_plt_target respectively, and therefore they cannot alias.

This is in turn due to a violation of the C standard:

    When two pointers are subtracted, both shall point to elements of
    the same array object, or one past the last element of the array
    object ...

From C's perspective, bpf_plt_ret and bpf_plt are distinct objects and cannot be subtracted. In practical terms, doing so confuses GCC's alias analysis.

The code was written this way in order to let the C side know a few offsets defined in the assembly. While nice, this is by no means necessary. Fix the noncompliance by hardcoding these offsets.

[1] https://lore.kernel.org/bpf/c9923c1d-971d-4022-8dc8-1364e929d34c@gmail.com/

Fixes: f1d5df84cd8c ("s390/bpf: Implement bpf_arch_text_poke()")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240320015515.11883-1-iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
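The standard violation is easy to reproduce in isolation; a hypothetical example (the symbols stand in for the assembly-defined bpf_plt labels):

    /* Two distinct objects, e.g. labels defined in inline assembly. */
    extern char plt_start[];
    extern char plt_target[];

    static long plt_target_offset(void)
    {
            /* Undefined behaviour: the two pointers do not point into
             * the same array object, so GCC's alias analysis is free
             * to treat the result as meaningless. */
            return plt_target - plt_start;
    }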
2024-03-14bpf: Take return from set_memory_rox() into account with ↵Christophe Leroy1-1/+5
bpf_jit_binary_lock_ro() set_memory_rox() can fail, leaving memory unprotected. Check return and bail out when bpf_jit_binary_lock_ro() returns an error. Link: https://github.com/KSPP/linux/issues/7 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: linux-hardening@vger.kernel.org <linux-hardening@vger.kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Puranjay Mohan <puranjay12@gmail.com> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x Acked-by: Tiezhu Yang <yangtiezhu@loongson.cn> # LoongArch Reviewed-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> # MIPS Part Message-ID: <036b6393f23a2032ce75a1c92220b2afcb798d5d.1709850515.git.christophe.leroy@csgroup.eu> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-04s390/bpf: Fix gotol with large offsetsIlya Leoshkevich1-1/+1
The gotol implementation uses a wrong data type for the offset: it should be s32, not s16. Fixes: c690191e23d8 ("s390/bpf: Implement unconditional jump with 32-bit offset") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20240102193531.3169422-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-18s390/bpf: Fix indirect trampoline generationAlexei Starovoitov1-1/+2
The func_addr used to be NULL for indirect trampolines used by struct_ops. Now func_addr is a valid function pointer. Hence use BPF_TRAMP_F_INDIRECT flag to detect such condition. Fixes: 2cd3e3772e41 ("x86/cfi,bpf: Fix bpf_struct_ops CFI") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20231216004549.78355-1-alexei.starovoitov@gmail.com
2023-12-06bpf: Add arch_bpf_trampoline_size()Song Liu1-22/+34
This helper will be used to calculate the size of the trampoline before allocating the memory. arch_prepare_bpf_trampoline() for arm64 and riscv64 can use arch_bpf_trampoline_size() to check the trampoline fits in the image. OTOH, arch_prepare_bpf_trampoline() for s390 has to call the JIT process twice, so it cannot use arch_bpf_trampoline_size(). Signed-off-by: Song Liu <song@kernel.org> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # on s390x Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> # on riscv Link: https://lore.kernel.org/r/20231206224054.492250-6-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-16Merge tag 'for-netdev' of ↵Jakub Kicinski1-63/+202
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-10-16

We've added 90 non-merge commits during the last 25 day(s) which contain a total of 120 files changed, 3519 insertions(+), 895 deletions(-).

The main changes are:

1) Add missed stats for kprobes to retrieve the number of missed kprobe executions and subsequent executions of BPF programs, from Jiri Olsa.

2) Add cgroup BPF sockaddr hooks for unix sockets. The use case is for systemd to reimplement the LogNamespace feature which allows running multiple instances of systemd-journald to process the logs of different services, from Daan De Meyer.

3) Implement BPF CPUv4 support for the s390x BPF JIT, from Ilya Leoshkevich.

4) Improve BPF verifier log output for scalar registers to better disambiguate their internal state wrt defaults vs min/max values matching, from Andrii Nakryiko.

5) Extend the BPF fib lookup helpers for IPv4/IPv6 to support retrieving the source IP address with a new BPF_FIB_LOOKUP_SRC flag, from Martynas Pumputis.

6) Add support for open-coded task_vma iterator to help with symbolization for BPF-collected user stacks, from Dave Marchevsky.

7) Add libbpf getters for accessing individual BPF ring buffers, which is useful for polling them individually, for example, from Martin Kelly.

8) Extend AF_XDP selftests to validate the SHARED_UMEM feature, from Tushar Vyavahare.

9) Improve BPF selftests cross-building support for the riscv arch, from Björn Töpel.

10) Add the ability to pin a BPF timer to the calling CPU, from David Vernet.

11) Fix libbpf's bpf_tracing.h macros for riscv to use the generic implementation of PT_REGS_SYSCALL_REGS() to access syscall arguments, from Alexandre Ghiti.

12) Extend libbpf to support symbol versioning for uprobes, from Hengqi Chen.

13) Fix bpftool's skeleton code generation to guarantee that ELF data is 8 byte aligned, from Ian Rogers.

14) Inherit the system-wide cpu_mitigations_off() setting for Spectre v1/v4 security mitigations in the BPF verifier, from Yafang Shao.

15) Annotate struct bpf_stack_map with the __counted_by attribute to prepare the BPF side for upcoming __counted_by compiler support, from Kees Cook.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (90 commits)
  bpf: Ensure proper register state printing for cond jumps
  bpf: Disambiguate SCALAR register state output in verifier logs
  selftests/bpf: Make align selftests more robust
  selftests/bpf: Improve missed_kprobe_recursion test robustness
  selftests/bpf: Improve percpu_alloc test robustness
  selftests/bpf: Add tests for open-coded task_vma iter
  bpf: Introduce task_vma open-coded iterator kfuncs
  selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c
  bpf: Don't explicitly emit BTF for struct btf_iter_num
  bpf: Change syscall_nr type to int in struct syscall_tp_t
  net/bpf: Avoid unused "sin_addr_len" warning when CONFIG_CGROUP_BPF is not set
  bpf: Avoid unnecessary audit log for CPU security mitigations
  selftests/bpf: Add tests for cgroup unix socket address hooks
  selftests/bpf: Make sure mount directory exists
  documentation/bpf: Document cgroup unix socket address hooks
  bpftool: Add support for cgroup unix socket address hooks
  libbpf: Add support for cgroup unix socket address hooks
  bpf: Implement cgroup sockaddr hooks for unix sockets
  bpf: Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from bpf
  bpf: Propagate modified uaddrlen from cgroup sockaddr programs
  ...
====================

Link: https://lore.kernel.org/r/20231016204803.30153-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-5/+20
Cross-merge networking fixes after downstream PR. No conflicts.

Adjacent changes:

kernel/bpf/verifier.c
  829955981c55 ("bpf: Fix verifier log for async callback return values")
  a923819fb2c5 ("bpf: Treat first argument as return value for bpf_throw")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11s390/bpf: Fix unwinding past the trampolineIlya Leoshkevich1-3/+14
When functions called by the trampoline panic, the backtrace that is printed stops at the trampoline, because the trampoline does not store its caller's frame address (backchain) on stack; it also stores the return address at a wrong location. Store both the same way as is already done for the regular eBPF programs. Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231010203512.385819-3-iii@linux.ibm.com
2023-10-11s390/bpf: Fix clobbering the caller's backchain in the trampolineIlya Leoshkevich1-2/+6
One of the first things that s390x kernel functions do is storing the caller's frame address (backchain) on stack. This makes unwinding possible. The backchain is always stored at frame offset 152, which is inside the 160-byte stack area that the functions allocate for their callees. The callees must preserve the backchain; the remaining 152 bytes they may use as they please.

Currently the trampoline uses all 160 bytes, clobbering the backchain. This causes kernel panics when using __builtin_return_address() in functions called by the trampoline. Fix by reducing the usage of the caller-reserved stack area by 8 bytes in the trampoline.

Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()")
Reported-by: Song Liu <song@kernel.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231010203512.385819-2-iii@linux.ibm.com
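The convention from the text, expressed as constants; a sketch of the numbers involved, not the kernel's actual definitions:

    /* Each s390x function allocates a 160-byte area for its callees.
     * Offset 152 of that area holds the backchain and must be preserved;
     * only the first 152 bytes are callee scratch space. */
    #define STACK_FRAME_OVERHEAD    160
    #define BACKCHAIN_OFFSET        152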
2023-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+1
Cross-merge networking fixes after downstream PR. No conflicts (or adjacent changes of note). Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-09-21s390/bpf: Implement signed divisionIlya Leoshkevich1-47/+125
Implement the cpuv4 signed division. It is encoded as unsigned division, but with off field set to 1. s390x has the necessary instructions: dsgfr, dsgf and dsgr. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-9-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-21s390/bpf: Implement unconditional jump with 32-bit offsetIlya Leoshkevich1-3/+9
Implement the cpuv4 unconditional jump with 32-bit offset, which is encoded as BPF_JMP32 | BPF_JA and stores the offset in the imm field. Reuse the existing BPF_JMP | BPF_JA logic. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-8-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-21s390/bpf: Implement unconditional byte swapIlya Leoshkevich1-0/+1
Implement the cpuv4 unconditional byte swap, which is encoded as BPF_ALU64 | BPF_END | BPF_FROM_LE. Since s390x is big-endian, it's the same as the existing BPF_ALU | BPF_END | BPF_FROM_LE. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-7-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-21s390/bpf: Implement BPF_MEMSXIlya Leoshkevich1-5/+27
Implement the cpuv4 load with sign-extension, which is encoded as BPF_MEMSX (and, for internal use only, BPF_PROBE_MEMSX). This is the same as BPF_MEM and BPF_PROBE_MEM, but with sign extension instead of zero extension, and s390x has the necessary instructions: lgb, lgh and lgf.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230919101336.2223655-6-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
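The semantic difference in plain C, for the 8-bit case: what the sign-extending load implements, next to the zero-extending BPF_MEM behavior (on s390x, lgb versus the zero-extending llgc):

    #include <stdint.h>

    /* BPF_MEM, 8-bit load: zero-extend into the 64-bit register. */
    static uint64_t load_u8(const uint8_t *p)
    {
            return *p;
    }

    /* BPF_MEMSX, 8-bit load: sign-extend instead (s390x: lgb). */
    static uint64_t load_s8(const uint8_t *p)
    {
            return (uint64_t)(int64_t)(int8_t)*p;
    }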
2023-09-21s390/bpf: Implement BPF_MOV | BPF_X with sign-extensionIlya Leoshkevich1-8/+40
Implement the cpuv4 register-to-register move with sign extension. It is distinguished from the normal moves by non-zero values in insn->off, which determine the source size. s390x has instructions to deal with all of them: lbr, lhr, lgbr, lghr and lgfr. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-5-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-19s390/bpf: Let arch_prepare_bpf_trampoline return program sizeSong Liu1-1/+1
arch_prepare_bpf_trampoline() for s390 currently returns 0 on success. This is not a problem for a regular trampoline. However, struct_ops relies on the return value to advance the "image" pointer:

    bpf_struct_ops_map_update_elem()
    {
        ...
        for_each_member(i, t, member) {
            ...
            err = bpf_struct_ops_prepare_trampoline();
            ...
            image += err;
        }
    }

When arch_prepare_bpf_trampoline() returns 0 on success, all members of the struct_ops will point to the same trampoline (the last one). Fix this by returning the program size from arch_prepare_bpf_trampoline() (on success). This is the same behavior as on other architectures.

Signed-off-by: Song Liu <song@kernel.org>
Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()")
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230919060258.3237176-2-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-16bpf: Use bpf_is_subprog to check for subprogsKumar Kartikeya Dwivedi1-1/+1
We would like to know whether a bpf_prog corresponds to the main prog or one of the subprogs. The current JIT implementations simply check this using the func_idx in bpf_prog->aux->func_idx. When the index is 0, it belongs to the main program, otherwise it corresponds to some subprogram. This will also be necessary to halt exception propagation while walking the stack when an exception is thrown, so we add a simple helper function to check this, named bpf_is_subprog, and convert existing JIT implementations to also make use of it. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230912233214.1518551-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
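The helper boils down to a one-line predicate; a sketch matching the description above (the kernel's version lives in include/linux/bpf.h and relies on kernel types):

    /* func_idx 0 denotes the main program; anything else is a subprog. */
    static inline bool bpf_is_subprog(const struct bpf_prog *prog)
    {
            return prog->aux->func_idx != 0;
    }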
2023-09-06s390/bpf: Pass through tail call counter in trampolinesIlya Leoshkevich1-0/+10
s390x eBPF programs use the following extension to the s390x calling convention: the tail call counter is passed on stack at offset STK_OFF_TCCNT, which callees otherwise use as scratch space.

Currently the trampoline does not respect this and clobbers the tail call counter. This breaks enforcing tail call limits in eBPF programs which have trampolines attached to them. Fix by forwarding a copy of the tail call counter to the original eBPF program in the trampoline (for fexit), and by restoring it at the end of the trampoline (for fentry).

Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()")
Reported-by: Leon Hwang <hffilwlqm@gmail.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230906004448.111674-1-iii@linux.ibm.com
2023-06-28s390: consistently use .balign instead of .alignHeiko Carstens1-2/+2
The .align directive has inconsistent behavior across architectures. Use .balign instead everywhere. This is a no-op for s390, but with this there is no mix in using .align and .balign anymore. Future code is supposed to use only .balign. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>