| Age | Commit message (Collapse) | Author | Files | Lines |
|
The verifier in the kernel ensures that the struct_ops operators behave
correctly by checking that they access parameters and context
appropriately. The verifier will approve a program as long as it correctly
accesses the context/parameters, regardless of its function signature. In
contrast, libbpf should not verify the signature of function pointers and
functions to enable flexibility in loading various implementations of an
operator even if the signature of the function pointer does not match those
in the implementations or the kernel.
With this flexibility, user space applications can adapt to different
kernel versions by loading a specific implementation of an operator based
on feature detection.
This is a follow-up of the commit c911fc61a7ce ("libbpf: Skip zeroed or
null fields if not found in the kernel type.")
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240404232342.991414-1-thinker.li@gmail.com
|
|
A test is added for bpf_for_each_map_elem() with either an arraymap or a
hashmap.
$ tools/testing/selftests/bpf/test_progs -t for_each
#93/1 for_each/hash_map:OK
#93/2 for_each/array_map:OK
#93/3 for_each/write_map_key:OK
#93/4 for_each/multi_maps:OK
#93 for_each:OK
Summary: 1/4 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240405025536.18113-4-lulie@linux.alibaba.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add selftests validating that BPF verifier handles precision marking
for SCALAR registers derived from r10 (fp) register correctly.
Given `r0 = (s8)r10;` syntax is not supported by older Clang compilers,
use the raw BPF instruction syntax to maximize compatibility.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240404214536.3551295-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Introduce a test case to evaluate AF_XDP's robustness by pushing hardware
and software ring sizes to their limits. This test ensures AF_XDP's
reliability amidst potential producer/consumer throttling due to maximum
ring utilization. The testing strategy includes:
1. Configuring rings to their maximum allowable sizes.
2. Executing a series of tests across diverse batch sizes to assess
system's behavior under different configurations.
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20240402114529.545475-8-tushar.vyavahare@intel.com
|
|
Add a new test case that stresses AF_XDP and the driver by configuring
small hardware and software ring sizes. This verifies that AF_XDP continues
to function properly even with insufficient ring space that could lead
to frequent producer/consumer throttling. The test procedure involves:
1. Set the minimum possible ring configuration(tx 64 and rx 128).
2. Run tests with various batch sizes(1 and 63) to validate the system's
behavior under different configurations.
Update Makefile to include network_helpers.o in the build process for
xskxceiver.
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20240402114529.545475-7-tushar.vyavahare@intel.com
|
|
handling AF_XDP socket closures
Introduce a new function, set_ring_size(), to manage asynchronous AF_XDP
socket closure. Retry set_hw_ring_size up to SOCK_RECONF_CTR times if it
fails due to an active AF_XDP socket. Return an error immediately for
non-EBUSY errors. This enhances robustness against asynchronous AF_XDP
socket closures during ring size changes.
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20240402114529.545475-6-tushar.vyavahare@intel.com
|
|
ring size
Introduce a new function called set_hw_ring_size that allows for the
dynamic configuration of the ring size within the interface.
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20240402114529.545475-5-tushar.vyavahare@intel.com
|
|
max interface size
Introduce a new function called get_hw_size that retrieves both the
current and maximum size of the interface and stores this information
in the 'ethtool_ringparam' structure.
Remove ethtool_channels struct from xdp_hw_metadata.c due to redefinition
error. Remove unused linux/if.h include from flow_dissector BPF test to
address CI pipeline failure.
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20240402114529.545475-4-tushar.vyavahare@intel.com
|
|
Convert the constant BATCH_SIZE into a variable named batch_size to allow
dynamic modification at runtime. This is required for the forthcoming
changes to support testing different hardware ring sizes.
While running these tests, a bug was identified when the batch size is
roughly the same as the NIC ring size. This has now been addressed by
Maciej's fix in commit 913eda2b08cc ("i40e: xsk: remove count_mask").
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20240402114529.545475-3-tushar.vyavahare@intel.com
|
|
LLVM generates bpf_addr_space_cast instruction while translating
pointers between native (zero) address space and
__attribute__((address_space(N))). The addr_space=0 is reserved as
bpf_arena address space.
rY = addr_space_cast(rX, 0, 1) is processed by the verifier and
converted to normal 32-bit move: wX = wY.
rY = addr_space_cast(rX, 1, 0) : used to convert a bpf arena pointer to
a pointer in the userspace vma. This has to be converted by the JIT.
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Link: https://lore.kernel.org/r/20240325150716.4387-3-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In order to prevent mptcpify prog from affecting the running results
of other BPF tests, a pid limit was added to restrict it from only
modifying its own program.
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/8987e2938e15e8ec390b85b5dcbee704751359dc.1712054986.git.tanggeliang@kylinos.cn
|
|
When testing send_signal and stacktrace_build_id_nmi using the riscv sbi
pmu driver without the sscofpmf extension or the riscv legacy pmu driver,
then failures as follows are encountered:
test_send_signal_common:FAIL:perf_event_open unexpected perf_event_open: actual -1 < expected 0
#272/3 send_signal/send_signal_nmi:FAIL
test_stacktrace_build_id_nmi:FAIL:perf_event_open err -1 errno 95
#304 stacktrace_build_id_nmi:FAIL
The reason is that the above pmu driver or hardware does not support
sampling events, that is, PERF_PMU_CAP_NO_INTERRUPT is set to pmu
capabilities, and then perf_event_open returns EOPNOTSUPP. Since
PERF_PMU_CAP_NO_INTERRUPT is not only set in the riscv-related pmu driver,
it is better to skip testing when this capability is set.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240402073029.1299085-1-pulehui@huaweicloud.com
|
|
Currently, cond_break macro uses bytes to encode the may_goto insn.
Patch [1] in llvm implemented may_goto insn in BPF backend.
Replace byte-level encoding with llvm inline asm for better usability.
Using llvm may_goto insn is controlled by macro __BPF_FEATURE_MAY_GOTO.
[1] https://github.com/llvm/llvm-project/commit/0e0bfacff71859d1f9212205f8f873d47029d3fb
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240402025446.3215182-1-yonghong.song@linux.dev
|
|
When BPF selftests are built in RELEASE=1 mode with -O2 optimization
level, uprobe_multi binary, called from multi-uprobe tests is optimized
to the point that all the thousands of target uprobe_multi_func_XXX
functions are eliminated, breaking tests.
So ensure they are preserved by using weak attribute.
But, actually, compiling uprobe_multi binary with -O2 takes a really
long time, and is quite useless (it's not a benchmark). So in addition
to ensuring that uprobe_multi_func_XXX functions are preserved, opt-out
of -O2 explicitly in Makefile and stick to -O0. This saves a lot of
compilation time.
With -O2, just recompiling uprobe_multi:
$ touch uprobe_multi.c
$ time make RELEASE=1 -j90
make RELEASE=1 -j90 291.66s user 2.54s system 99% cpu 4:55.52 total
With -O0:
$ touch uprobe_multi.c
$ time make RELEASE=1 -j90
make RELEASE=1 -j90 22.40s user 1.91s system 99% cpu 24.355 total
5 minutes vs (still slow, but...) 24 seconds.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240329190410.4191353-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
settimeo is invoked in start_server() and in connect_fd_to_fd() already,
no need to invoke settimeo(lfd, 0) and settimeo(fd, 0) in do_test()
anymore. This patch drops them.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/dbc3613bee3b1c78f95ac9ff468bf47c92f106ea.1711447102.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
To simplify the code, use BPF selftests helper connect_fd_to_fd() in
bpf_tcp_ca.c instead of open-coding it. This helper is defined in
network_helpers.c, and exported in network_helpers.h, which is already
included in bpf_tcp_ca.c.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/e105d1f225c643bee838409378dd90fd9aabb6dc.1711447102.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Get addrs directly from available_filter_functions_addrs and
send to the kernel during kprobe_multi_attach. This avoids
consultation of /proc/kallsyms. But available_filter_functions_addrs
is introduced in 6.5, i.e., it is introduced recently,
so I skip the test if the kernel does not support it.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240326041523.1200301-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In my locally build clang LTO kernel (enabling CONFIG_LTO and
CONFIG_LTO_CLANG_THIN), kprobe_multi_bench_attach/kernel subtest
failed like:
test_kprobe_multi_bench_attach:PASS:get_syms 0 nsec
test_kprobe_multi_bench_attach:PASS:kprobe_multi_empty__open_and_load 0 nsec
libbpf: prog 'test_kprobe_empty': failed to attach: No such process
test_kprobe_multi_bench_attach:FAIL:bpf_program__attach_kprobe_multi_opts unexpected error: -3
#117/1 kprobe_multi_bench_attach/kernel:FAIL
There are multiple symbols in /sys/kernel/debug/tracing/available_filter_functions
are renamed in /proc/kallsyms due to cross file inlining. One example is for
static function __access_remote_vm in mm/memory.c.
In a non-LTO kernel, we have the following call stack:
ptrace_access_vm (global, kernel/ptrace.c)
access_remote_vm (global, mm/memory.c)
__access_remote_vm (static, mm/memory.c)
With LTO kernel, it is possible that access_remote_vm() is inlined by
ptrace_access_vm(). So we end up with the following call stack:
ptrace_access_vm (global, kernel/ptrace.c)
__access_remote_vm (static, mm/memory.c)
The compiler renames __access_remote_vm to __access_remote_vm.llvm.<hash>
to prevent potential name collision.
The kernel bpf_kprobe_multi_link_attach() and ftrace_lookup_symbols() try
to find addresses based on /proc/kallsyms, hence the current test failed
with LTO kenrel.
This patch consulted /proc/kallsyms to find the corresponding entries
for the ksym and this solved the issue.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240326041518.1199758-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
These two functions allow selftests to do loading/searching
kallsyms based on their specific compare functions.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240326041513.1199440-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Refactor trace helper function load_kallsyms_local() such that
it invokes a common function with a compare function as input.
The common function will be used later for other local functions.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240326041508.1199239-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Refactor some functions in kprobe_multi_test.c to extract
some helper functions who will be used in later patches
to avoid code duplication.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240326041503.1198982-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Replace CHECK with ASSERT macros for ksyms tests.
This test failed earlier with clang lto kernel, but the
issue is gone with latest code base. But replacing
CHECK with ASSERT still improves code as ASSERT is
preferred in selftests.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240326041448.1197812-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch adds a test to ensure all static tcp-cc kfuncs is visible to
the struct_ops bpf programs. It is checked by successfully loading
the struct_ops programs calling these tcp-cc kfuncs.
This patch needs to enable the CONFIG_TCP_CONG_DCTCP and
the CONFIG_TCP_CONG_BBR.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240322191433.4133280-2-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Utilize bpf_modify_return_test_tp() kfunc to have a fast way to trigger
tp/raw_tp/fmodret programs from another BPF program, which gives us
comparable batched benchmarks to (batched) kprobe/fentry benchmarks.
We don't switch kprobe/fentry batched benchmarks to this kfunc to make
bench tool usable on older kernels as well.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240326162151.3981687-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Instead of front-loading all possible benchmarking BPF programs for
trigger benchmarks, explicitly specify which BPF programs are used by
specific benchmark and load only it.
This allows to be more flexible in supporting older kernels, where some
program types might not be possible to load (e.g., those that rely on
newly added kfunc).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240326162151.3981687-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Remove "legacy" benchmarks triggered by syscalls in favor of newly added
in-kernel/batched benchmarks. Drop -batched suffix now as well.
Next patch will restore "feature parity" by adding back
tp/raw_tp/fmodret benchmarks based on in-kernel kfunc approach.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240326162151.3981687-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Existing kprobe/fentry triggering benchmarks have 1-to-1 mapping between
one syscall execution and BPF program run. While we use a fast
get_pgid() syscall, syscall overhead can still be non-trivial.
This patch adds kprobe/fentry set of benchmarks significantly amortizing
the cost of syscall vs actual BPF triggering overhead. We do this by
employing BPF_PROG_TEST_RUN command to trigger "driver" raw_tp program
which does a tight parameterized loop calling cheap BPF helper
(bpf_get_numa_node_id()), to which kprobe/fentry programs are
attached for benchmarking.
This way 1 bpf() syscall causes N executions of BPF program being
benchmarked. N defaults to 100, but can be adjusted with
--trig-batch-iters CLI argument.
For comparison we also implement a new baseline program that instead of
triggering another BPF program just does N atomic per-CPU counter
increments, establishing the limit for all other types of program within
this batched benchmarking setup.
Taking the final set of benchmarks added in this patch set (including
tp/raw_tp/fmodret, added in later patch), and keeping for now "legacy"
syscall-driven benchmarks, we can capture all triggering benchmarks in
one place for comparison, before we remove the legacy ones (and rename
xxx-batched into just xxx).
$ benchs/run_bench_trigger.sh
usermode-count : 79.500 ± 0.024M/s
kernel-count : 49.949 ± 0.081M/s
syscall-count : 9.009 ± 0.007M/s
fentry-batch : 31.002 ± 0.015M/s
fexit-batch : 20.372 ± 0.028M/s
fmodret-batch : 21.651 ± 0.659M/s
rawtp-batch : 36.775 ± 0.264M/s
tp-batch : 19.411 ± 0.248M/s
kprobe-batch : 12.949 ± 0.220M/s
kprobe-multi-batch : 15.400 ± 0.007M/s
kretprobe-batch : 5.559 ± 0.011M/s
kretprobe-multi-batch: 5.861 ± 0.003M/s
fentry-legacy : 8.329 ± 0.004M/s
fexit-legacy : 6.239 ± 0.003M/s
fmodret-legacy : 6.595 ± 0.001M/s
rawtp-legacy : 8.305 ± 0.004M/s
tp-legacy : 6.382 ± 0.001M/s
kprobe-legacy : 5.528 ± 0.003M/s
kprobe-multi-legacy : 5.864 ± 0.022M/s
kretprobe-legacy : 3.081 ± 0.001M/s
kretprobe-multi-legacy: 3.193 ± 0.001M/s
Note how xxx-batch variants are measured with significantly higher
throughput, even though it's exactly the same in-kernel overhead. As
such, results can be compared only between benchmarks of the same kind
(syscall vs batched):
fentry-legacy : 8.329 ± 0.004M/s
fentry-batch : 31.002 ± 0.015M/s
kprobe-multi-legacy : 5.864 ± 0.022M/s
kprobe-multi-batch : 15.400 ± 0.007M/s
Note also that syscall-count is setting a theoretical limit for
syscall-triggered benchmarks, while kernel-count is setting similar
limits for batch variants. usermode-count is a happy and unachievable
case of user space counting without doing any syscalls, and is mostly
the measure of CPU speed for such a trivial benchmark.
As was mentioned, tp/raw_tp/fmodret require kernel-side kfunc to produce
similar benchmark, which we address in a separate patch.
Note that run_bench_trigger.sh allows to override a list of benchmarks
to run, which is very useful for performance work.
Cc: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240326162151.3981687-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Rename uprobe-base to more precise usermode-count (it will match other
baseline-like benchmarks, kernel-count and syscall-count). Also use
BENCH_TRIG_USERMODE() macro to define all usermode-based triggering
benchmarks, which include usermode-count and uprobe/uretprobe benchmarks.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240326162151.3981687-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
BPF verifier emits "unknown func" message when given BPF program type
does not support BPF helper. This message may be confusing for users, as
important context that helper is unknown only to current program type is
not provided.
This patch changes message to "program of this type cannot use helper "
and aligns dependent code in libbpf and tests. Any suggestions on
improving/changing this message are welcome.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/r/20240325152210.377548-1-yatsenko@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch extends the fib_lookup test suite by adding a few test
cases for each IP family to test the new BPF_FIB_LOOKUP_MARK flag
to the bpf_fib_lookup:
* Test destination IP address selection with and without a mark
and/or the BPF_FIB_LOOKUP_MARK flag set
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240326101742.17421-3-aspsk@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts, or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch adds a missing check to bloom filter creating, rejecting
values above KMALLOC_MAX_SIZE. This brings the bloom map in line with
many other map types.
The lack of this protection can cause kernel crashes for value sizes
that overflow int's. Such a crash was caught by syzkaller. The next
patch adds more guard-rails at a lower level.
Signed-off-by: Andrei Matei <andreimatei1@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240327024245.318299-2-andreimatei1@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Recently, I frequently hit the following test failure:
[root@arch-fb-vm1 bpf]# ./test_progs -n 33/1
test_lookup_update:PASS:skel_open 0 nsec
[...]
test_lookup_update:PASS:sync_rcu 0 nsec
test_lookup_update:FAIL:map1_leak inner_map1 leaked!
#33/1 btf_map_in_map/lookup_update:FAIL
#33 btf_map_in_map:FAIL
In the test, after map is closed and then after two rcu grace periods,
it is assumed that map_id is not available to user space.
But the above assumption cannot be guaranteed. After zero or one
or two rcu grace periods in different siturations, the actual
freeing-map-work is put into a workqueue. Later on, when the work
is dequeued, the map will be actually freed.
See bpf_map_put() in kernel/bpf/syscall.c.
By using workqueue, there is no ganrantee that map will be actually
freed after a couple of rcu grace periods. This patch removed
such map leak detection and then the test can pass consistently.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240322061353.632136-1-yonghong.song@linux.dev
|
|
To simplify the code, use BPF selftests helper start_server() in
bpf_tcp_ca.c instead of open-coding it. This helper is defined in
network_helpers.c, and exported in network_helpers.h, which is already
included in bpf_tcp_ca.c.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/9926a79118db27dd6d91c4854db011c599cabd0e.1711331517.git.tanggeliang@kylinos.cn
|
|
The arena_list selftest uses (1ull << 32) in the mmap address
computation for arm64. Use the same in the verifier_arena selftest.
This makes the selftest pass for arm64 on the CI[1].
[1] https://github.com/kernel-patches/bpf/pull/6622
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Link: https://lore.kernel.org/r/20240322133552.70681-1-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Some distros seem to enable the -fcf-protection=branch by default,
which breaks our setup on first instruction of uprobe trigger
functions and place there endbr64 instruction.
Marking them with nocf_check attribute to skip that.
Ignoring unknown attribute warning in gcc for bench objects, because
nocf_check can be used only when -fcf-protection=branch is enabled,
otherwise we get a warning and break compilation.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240322134936.1075395-1-jolsa@kernel.org
|
|
With glibc 2.28, selftests compilation fails for benchs/bench_trigger.c:
benchs/bench_trigger.c: In function ‘inc_counter’:
benchs/bench_trigger.c:25:23: error: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Werror=implicit-function-declaration]
25 | tid = gettid();
| ^~~~~~
| getgid
cc1: all warnings being treated as errors
It appears support for the gettid() wrapper is variable across glibc
versions, so may be safer to use syscall(SYS_gettid) instead.
Fixes: 520fad2e3206 ("selftests/bpf: scale benchmark counting by using per-CPU counters")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240322095728.95671-1-alan.maguire@oracle.com
|
|
When benchmarking with multiple threads (-pN, where N>1), we start
contending on single atomic counter that both BPF trigger benchmarks are
using, as well as "baseline" tests in user space (trig-base and
trig-uprobe-base benchmarks). As such, we start bottlenecking on
something completely irrelevant to benchmark at hand.
Scale counting up by using per-CPU counters on BPF side. On use space
side we do the next best thing: hash thread ID to approximate per-CPU
behavior. It seems to work quite well in practice.
To demonstrate the difference, I ran three benchmarks with 1, 2, 4, 8,
16, and 32 threads:
- trig-uprobe-base (no syscalls, pure tight counting loop in user-space);
- trig-base (get_pgid() syscall, atomic counter in user-space);
- trig-fentry (syscall to trigger fentry program, atomic uncontended per-CPU
counter on BPF side).
Command used:
for b in uprobe-base base fentry; do \
for p in 1 2 4 8 16 32; do \
printf "%-11s %2d: %s\n" $b $p \
"$(sudo ./bench -w2 -d5 -a -p$p trig-$b | tail -n1 | cut -d'(' -f1 | cut -d' ' -f3-)"; \
done; \
done
Before these changes, aggregate throughput across all threads doesn't
scale well with number of threads, it actually even falls sharply for
uprobe-base due to a very high contention:
uprobe-base 1: 138.998 ± 0.650M/s
uprobe-base 2: 70.526 ± 1.147M/s
uprobe-base 4: 63.114 ± 0.302M/s
uprobe-base 8: 54.177 ± 0.138M/s
uprobe-base 16: 45.439 ± 0.057M/s
uprobe-base 32: 37.163 ± 0.242M/s
base 1: 16.940 ± 0.182M/s
base 2: 19.231 ± 0.105M/s
base 4: 21.479 ± 0.038M/s
base 8: 23.030 ± 0.037M/s
base 16: 22.034 ± 0.004M/s
base 32: 18.152 ± 0.013M/s
fentry 1: 14.794 ± 0.054M/s
fentry 2: 17.341 ± 0.055M/s
fentry 4: 23.792 ± 0.024M/s
fentry 8: 21.557 ± 0.047M/s
fentry 16: 21.121 ± 0.004M/s
fentry 32: 17.067 ± 0.023M/s
After these changes, we see almost perfect linear scaling, as expected.
The sub-linear scaling when going from 8 to 16 threads is interesting
and consistent on my test machine, but I haven't investigated what is
causing it this peculiar slowdown (across all benchmarks, could be due
to hyperthreading effects, not sure).
uprobe-base 1: 139.980 ± 0.648M/s
uprobe-base 2: 270.244 ± 0.379M/s
uprobe-base 4: 532.044 ± 1.519M/s
uprobe-base 8: 1004.571 ± 3.174M/s
uprobe-base 16: 1720.098 ± 0.744M/s
uprobe-base 32: 3506.659 ± 8.549M/s
base 1: 16.869 ± 0.071M/s
base 2: 33.007 ± 0.092M/s
base 4: 64.670 ± 0.203M/s
base 8: 121.969 ± 0.210M/s
base 16: 207.832 ± 0.112M/s
base 32: 424.227 ± 1.477M/s
fentry 1: 14.777 ± 0.087M/s
fentry 2: 28.575 ± 0.146M/s
fentry 4: 56.234 ± 0.176M/s
fentry 8: 106.095 ± 0.385M/s
fentry 16: 181.440 ± 0.032M/s
fentry 32: 369.131 ± 0.693M/s
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Message-ID: <20240315213329.1161589-1-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add test validating BPF cookie can be passed during raw_tp/tp_btf
attachment and can be retried at runtime with bpf_get_attach_cookie()
helper.
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Message-ID: <20240319233852.1977493-6-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In some systems, the netcat server can incur in delay to start listening.
When this happens, the test can randomly fail in various points.
This is an example error message:
# ip gre none gso
# encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000
# test basic connectivity
# Ncat: Connection refused.
The issue stems from a race condition between the netcat client and server.
The test author had addressed this problem by implementing a sleep, which
I have removed in this patch.
This patch introduces a function capable of sleeping for up to two seconds.
However, it can terminate the waiting period early if the port is reported
to be listening.
Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240314105911.213411-1-alessandro.carminati@gmail.com
|
|
Add a sk_msg bpf program test where the program is running in a pid
namespace. The test is successful:
#165/4 ns_current_pid_tgid/new_ns_sk_msg:OK
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240315184915.2976718-1-yonghong.song@linux.dev
|
|
Add a cgroup bpf program test where the bpf program is running
in a pid namespace. The test is successfully:
#165/3 ns_current_pid_tgid/new_ns_cgrp:OK
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240315184910.2976522-1-yonghong.song@linux.dev
|
|
Refactor some functions in both user space code and bpf program
as these functions are used by later cgroup/sk_msg tests.
Another change is to mark tp program optional loading as later
patches will use optional loading as well since they have quite
different attachment and testing logic.
There is no functionality change.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240315184904.2976123-1-yonghong.song@linux.dev
|
|
Replace CHECK in selftest ns_current_pid_tgid with recommended ASSERT_* style.
I also shortened subtest name as the prefix of subtest name is covered
by the test name already.
This patch does fix a testing issue. Currently even if bss->user_{pid,tgid}
is not correct, the test still passed since the clone func returns 0.
I fixed it to return a non-zero value if bss->user_{pid,tgid} is incorrect.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240315184859.2975543-1-yonghong.song@linux.dev
|
|
Check that 4Gbyte arena can be allocated and overflow/underflow access in
the first and the last page behaves as expected.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20240315021834.62988-5-alexei.starovoitov@gmail.com
|
|
Remove hard coded PAGE_SIZE.
Add #include <sys/user.h> instead (that works on x86-64 and s390)
and fallback to slow getpagesize() for aarch64.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20240315021834.62988-4-alexei.starovoitov@gmail.com
|
|
The selftests use
to tell LLVM about special pointers. For LLVM there is nothing "arena"
about them. They are simply pointers in a different address space.
Hence LLVM diff https://github.com/llvm/llvm-project/pull/85161 renamed:
. macro __BPF_FEATURE_ARENA_CAST -> __BPF_FEATURE_ADDR_SPACE_CAST
. global variables in __attribute__((address_space(N))) are now
placed in section named ".addr_space.N" instead of ".arena.N".
Adjust libbpf, bpftool, and selftests to match LLVM.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20240315021834.62988-3-alexei.starovoitov@gmail.com
|
|
There are statements with two semicolons. Remove the second one, it
is redundant.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240315092654.2431062-1-colin.i.king@gmail.com
|
|
A new version of a type may have additional fields that do not exist in
older versions. Previously, libbpf would reject struct_ops maps with a new
version containing extra fields when running on a machine with an old
kernel. However, we have updated libbpf to ignore these fields if their
values are all zeros or null in order to provide backward compatibility.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240313214139.685112-3-thinker.li@gmail.com
|
|
According to a report, skeletons fail to assign shadow pointers when being
compiled with C++ programs. Unlike C doing implicit casting for void
pointers, C++ requires an explicit casting.
To support C++, we do explicit casting for each shadow pointer.
Also add struct_ops_module.skel.h to test_cpp to validate C++
compilation as part of BPF selftests.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20240312013726.1780720-1-thinker.li@gmail.com
|