| Age | Commit message (Collapse) | Author | Files | Lines |
|
The note_data at ptr is read as a nhdr but this may yield
out-of-bounds reads if there isn't nhdrs worth of data.
Be more defensive before doing the reads.
This is motivated by address sanitizer capturing out of bounds reads
running "perf top".
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When exploring building bpf_skel with GCC's BPF support, there was a
build failure because of bpf_core_field_exists vs the mem_hops bitfield:
```
In file included from util/bpf_skel/sample_filter.bpf.c:6:
util/bpf_skel/sample_filter.bpf.c: In function 'perf_get_sample':
tools/perf/libbpf/include/bpf/bpf_core_read.h:169:42: error: cannot take address of bit-field 'mem_hops'
169 | #define ___bpf_field_ref1(field) (&(field))
| ^
tools/perf/libbpf/include/bpf/bpf_helpers.h:222:29: note: in expansion of macro '___bpf_field_ref1'
222 | #define ___bpf_concat(a, b) a ## b
| ^
tools/perf/libbpf/include/bpf/bpf_helpers.h:225:29: note: in expansion of macro '___bpf_concat'
225 | #define ___bpf_apply(fn, n) ___bpf_concat(fn, n)
| ^~~~~~~~~~~~~
tools/perf/libbpf/include/bpf/bpf_core_read.h:173:9: note: in expansion of macro '___bpf_apply'
173 | ___bpf_apply(___bpf_field_ref, ___bpf_narg(args))(args)
| ^~~~~~~~~~~~
tools/perf/libbpf/include/bpf/bpf_core_read.h:188:39: note: in expansion of macro '___bpf_field_ref'
188 | __builtin_preserve_field_info(___bpf_field_ref(field), BPF_FIELD_EXISTS)
| ^~~~~~~~~~~~~~~~
util/bpf_skel/sample_filter.bpf.c:167:29: note: in expansion of macro 'bpf_core_field_exists'
167 | if (bpf_core_field_exists(data->mem_hops))
| ^~~~~~~~~~~~~~~~~~~~~
cc1: error: argument is not a field access
```
___bpf_field_ref1 was adapted for GCC in 12bbcf8e840f40b82b02981e96e0a5fbb0703ea9
but the trick added for compatibility in 3a8b8fc3174891c4c12f5766d82184a82d4b2e3e
isn't compatible with that as an address is used as an argument.
Workaround this by calling __builtin_preserve_field_info directly as the
bpf_core_field_exists macro does, but without the ___bpf_field_ref use.
Co-developed-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Signed-off-by: Sam James <sam@gentoo.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://gcc.gnu.org/PR121420
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
libtracefs isn't used by perf but not having it installed causes build
warnings.
Given the library isn't used, there is no need for the configuration or
warnings so remove.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250929170600.59000-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Removed in favor of the shell script version.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The 'import perf' test needs to set up a path to the python module as
well as to know the python command to invoke.
These are hard coded at build time to be build a directory and the
python used in the build, which is less than desirable.
Avoid the hard coded values by reusing the existing shell script python
setup and determine a potential built python module via the path of the
perf executable.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
BRBE emits IRQ and ERET branches for branching and returning from
trapped instructions. Add a test that loops on a trapped instruction
(MRS - Read special register) for this.
Extend the expected 'any_call' branches to include FAULT_DATA and
FAULT_INST as these are emitted by BRBE.
Reviewed-by: Ian Rogers <irogers@google.com>
Co-developed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adam Young <admiyo@os.amperecomputing.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Test that SYSCALL type branches are emitted from the expected 'getppid'
symbol. Test that when only 'k' is used, sources addresses are all in
the kernel. Test that no kernel addresses leak by checking for them in
the 'u' test.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adam Young <admiyo@os.amperecomputing.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
check_branches() will be used by other tests in a later commit so make
it a function. And the any_call filters are duplicated and will also
be extended in a later commit, so move them to a variable.
No functional changes intended.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adam Young <admiyo@os.amperecomputing.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Don't open evsels on all CPUs, open them just on the CPUs they
support. This avoids opening say an e-core event on a p-core and
getting a failure - achieve this by getting rid of the "all_cpu_map".
In install_pe functions don't use the cpu_map_idx as a CPU number,
translate the cpu_map_idx, which is a dense index into the cpu_map
skipping holes at the beginning, to a proper CPU number.
Before:
```
$ perf stat --bpf-counters -a -e cycles,instructions -- sleep 1
Performance counter stats for 'system wide':
<not supported> cpu_atom/cycles/
566,270,672 cpu_core/cycles/
<not supported> cpu_atom/instructions/
572,792,836 cpu_core/instructions/ # 1.01 insn per cycle
1.001595384 seconds time elapsed
```
After:
```
$ perf stat --bpf-counters -a -e cycles,instructions -- sleep 1
Performance counter stats for 'system wide':
443,299,201 cpu_atom/cycles/
1,233,919,737 cpu_core/cycles/
213,634,112 cpu_atom/instructions/ # 0.48 insn per cycle
2,758,965,527 cpu_core/instructions/ # 2.24 insn per cycle
1.001699485 seconds time elapsed
```
Fixes: 7fac83aaf2eecc9e ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: bpf@vger.kernel.org
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tengda Wu <wutengda@huaweicloud.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Reduce the API surface that is in bpf_counter.h, this helps compiler
analysis like unused static function, makes it easier to set a
breakpoint and just makes it easier to see the code is self contained.
When code is shared between BPF C code, put it inside HAVE_BPF_SKEL.
Move transitively found #includes into appropriate C files.
No functional change.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tengda Wu <wutengda@huaweicloud.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Remove the arch-specific guard around TYPE_STATE_MAX_REGS and define it
as 32 for all architectures.
The architecture that perf is built on may not match the architecture
that produced the perf.data file, so relying on __powerpc__ or similar
is fragile.
Using 32 as a fixed upper bound is safe since it is greater than the
previous maximum of 16.
Add a comment to clarify that TYPE_STATE_MAX_REGS is an arch-independent
maximum rather than a build-time choice.
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The process_event() function in "builtin-script.c" invokes
perf_sample__fprintf_synth() for displaying PERF_TYPE_SYNTH
type events.
if (attr->type == PERF_TYPE_SYNTH && PRINT_FIELD(SYNTH))
perf_sample__fprintf_synth(sample, evsel, fp);
perf_sample__fprintf_synth() process the sample depending on the value
in evsel->core.attr.config. Introduce perf_sample__fprintf_synth_vpadtl()
and invoke this for PERF_SYNTH_POWERPC_VPA_DTL
Sample output:
./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.300 MB perf.data ]
./perf script
perf 13322 [002] 233.835807: sched:sched_switch: perf:13322 [120] R ==> migration/2:27 [0]
migration/2 27 [002] 233.835811: sched:sched_migrate_task: comm=perf pid=13322 prio=120 orig_cpu=2 dest_cpu=3
migration/2 27 [002] 233.835818: sched:sched_stat_runtime: comm=migration/2 pid=27 runtime=9214 [ns]
migration/2 27 [002] 233.835819: sched:sched_switch: migration/2:27 [0] S ==> swapper/2:0 [120]
swapper 0 [002] 233.835822: vpa-dtl: timebase: 338954486062657 dispatch_reason:decrementer_interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:435, ready_to_enqueue_time:0, waiting_to_ready_time:34775058, processor_id: 202 c0000000000f8094 plpar_hcall_norets_notrace+0x18 ([kernel.kallsyms])
swapper 0 [001] 233.835886: vpa-dtl: timebase: 338954486095398 dispatch_reason:priv_doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:542, ready_to_enqueue_time:0, waiting_to_ready_time:1245360, processor_id: 201 c0000000000f8094 plpar_hcall_norets_notrace+0x18 ([kernel.kallsyms])
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Create samples from DTL entries for displaying in 'perf report'
and 'perf script'.
When the different PERF_RECORD_XX records are processed from perf
session, powerpc_vpadtl_process_event() will be invoked.
For each of the PERF_RECORD_XX record, compare the timestamp of perf
record with timestamp of top element in the auxtrace heap.
Process the auxtrace queue if the timestamp of element from heap is
lower than timestamp from entry in perf record.
Sometimes it could happen that one buffer is only partially processed.
if the timestamp of occurrence of another event is more than currently
processed element in the queue, it will move on to next perf record.
So keep track of position of buffer to continue processing next time.
Update the timestamp of the auxtrace heap with the timestamp of last
processed entry from the auxtrace buffer.
Generate perf sample for each entry in the dispatch trace log.
Fill in the sample details:
- sample ip is picked from srr0 field of dtl_entry
- sample cpu is picked from processor_id of dtl_entry
- sample id is from sample_id of powerpc_vpadtl
- cpumode is set to PERF_RECORD_MISC_KERNEL
- Additionally save the details in raw_data of sample.
This is to print the relevant fields in perf_sample__fprintf_synth()
when called from builtin-script
The sample is processed by calling perf_session__deliver_synth_event()
so that it gets included in perf report.
Sample Output:
./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.300 MB perf.data ]
./perf report
# Samples: 321 of event 'vpa-dtl'
# Event count (approx.): 321
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ..............................
#
100.00% 100.00% swapper [kernel.kallsyms] [k] plpar_hcall_norets_notrace
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
other events across CPU's
When the Dispatch Trace Log data is collected along with other events
like sched tracepoint events, it needs to be correlated and present
interleaved along with these events.
Perf events can be collected parallely across the CPUs. Hence it needs
to be ensured events/dtl entries are processed in timestamp order.
An auxtrace_queue is created for each CPU.
Data within each queue is in increasing order of timestamp. Each
auxtrace queue has a array/list of auxtrace buffers.
When processing the auxtrace buffer, the data is mmapp'ed.
All auxtrace queues is maintained in auxtrace heap.
Each queue has a queue number and a timestamp.
The queues are sorted/added to head based on the time stamp.
So always the lowest timestamp (entries to be processed first) is on top
of the heap.
The auxtrace queue needs to be allocated and heap needs to be populated
in the sorted order of timestamp.
The queue needs to be filled with data only once via
powerpc_vpadtl__update_queues() function.
powerpc_vpadtl__setup_queues() iterates through all the entries to
allocate and setup the auxtrace queue.
To add to auxtrace heap, it is required to fetch the timebase of first
entry for each of the queue.
The first entry in the queue for VPA DTL PMU has the boot timebase,
frequency details which are needed to get timestamp which is required to
correlate with other events.
The very next entry is the actual trace data that provides timestamp for
occurrence of DTL event.
Formula used to get the timestamp from dtl entry is:
((timbase from DTL entry - boot time) / frequency) * 1000000000
powerpc_vpadtl_decode() adds the boot time and frequency as part of
powerpc_vpadtl_queue structure so that it can be reused.
Each of the dtl_entry is of 48 bytes size. Sometimes it could happen
that one buffer is only partially processed (if the timestamp of
occurrence of another event is more than currently processed element in
queue, it will move on to next event).
In order to keep track of position of buffer, additional fields is added
to powerpc_vpadtl_queue structure.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
DTL samples
Dispatch Trace Log details are captured as-is in PERF_RECORD_AUXTRACE
records.
To present dtl entries as samples, create an event with name as
"vpa-dtl" and type PERF_TYPE_SYNTH.
Add perf_synth_id, "PERF_SYNTH_POWERPC_VPA_DTL" as config value for the
event.
Create a sample id to be a fixed offset from evsel id.
To present the relevant fields from the "struct dtl_entry", prepare the
entries as events of type PERF_TYPE_SYNTH.
By defining as PERF_TYPE_SYNTH type, samples can be printed as part of
perf_sample__fprintf_synth in builtin-script.c
From powerpc_vpadtl_process_auxtrace_info(), invoke
auxtrace_queues__process_index() function which will queue the auxtrace
buffers by invoke auxtrace_queues__add_event().
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add VPA DTL PMU auxtrace process function for "perf report -D".
The auxtrace event processing functions are defined in file
"util/powerpc-vpadtl.c".
Data structures used includes "struct powerpc_vpadtl_queue", "struct
powerpc_vpadtl" to store the auxtrace buffers in queue. Different
PERF_RECORD_XXX are generated during recording.
PERF_RECORD_AUXTRACE_INFO is processed first since it is of type
perf_user_event_type and perf session event delivers
perf_session__process_user_event() first.
Define function powerpc_vpadtl_process_auxtrace_info() to handle the
processing of PERF_RECORD_AUXTRACE_INFO records.
In this function, initialize the aux buffer queues using
auxtrace_queues__init().
Setup the required infrastructure for aux data processing.
The data is collected per CPU and auxtrace_queue is created for each
CPU.
Define powerpc_vpadtl_process_event() function to process
PERF_RECORD_AUXTRACE records.
In this, add the event to queue using auxtrace_queues__add_event() and
process the buffer in powerpc_vpadtl_dump_event().
The first entry in the buffer with timebase as zero has boot timebase
and frequency.
Remaining data is of format for "struct powerpc_vpadtl_entry".
Define the translation for dispatch_reasons and preempt_reasons, report
this when dump trace is invoked via powerpc_vpadtl_dump()
Sample output:
./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.300 MB perf.data ]
./perf report -D
0 0 0x39b10 [0x30]: PERF_RECORD_AUXTRACE size: 0x690 offset: 0 ref: 0 idx: 0 tid: -1 cpu: 0
.
. ... VPA DTL PMU data: size 1680 bytes, entries is 35
. 00000000: boot_tb: 21349649546353231, tb_freq: 512000000
. 00000030: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:7064, ready_to_enqueue_time:187, waiting_to_ready_time:6611773
. 00000060: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:146, ready_to_enqueue_time:0, waiting_to_ready_time:15359437
. 00000090: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:4868, ready_to_enqueue_time:232, waiting_to_ready_time:5100709
. 000000c0: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:179, ready_to_enqueue_time:0, waiting_to_ready_time:30714243
. 000000f0: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:197, ready_to_enqueue_time:0, waiting_to_ready_time:15350648
. 00000120: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:213, ready_to_enqueue_time:0, waiting_to_ready_time:15353446
. 00000150: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:212, ready_to_enqueue_time:0, waiting_to_ready_time:15355126
. 00000180: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:6368, ready_to_enqueue_time:164, waiting_to_ready_time:5104665
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The powerpc PMU collecting Dispatch Trace Log (DTL) entries makes use of
AUX support in perf infrastructure.
The PMU driver has the functionality to collect trace entries in the aux
buffer.
On the tools side, this data is made available as PERF_RECORD_AUXTRACE
records.
This record is generated by "perf record" command.
To enable the creation of PERF_RECORD_AUXTRACE, add functions to
initialize auxtrace records ie "auxtrace_record__init()".
Fill in fields for other callbacks like info_priv_size, info_fill, free,
recording options etc.
Define auxtrace_type as PERF_AUXTRACE_VPA_DTL. Add header file to
define vpa dtl pmu specific details.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
A union is used to set the priv value in thread (a void*) to a boolean
value through type punning. Undefined behavior sanitizer fails on this.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Test cases from perftool_testsuite are affected by the current
directory where the test are run. For this reason, the test
driver has to change the directory to the base_dir for references to
work correctly.
Utilize absolute paths when sourcing and referencing other scripts so
that the current working directory doesn't impact the test cases.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Jakub Brnak <jbrnak@redhat.com>
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
- "the the"
- "in in"
- "a a"
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Markus Heidelberg <m.heidelberg@cab.de>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 instruction decoder update from Borislav Petkov:
- Add instruction decoding support for the XOP-prefixed instruction set
present on the AMD Bulldozer uarch
[ These instructions don't normally happen, but a X86_NATIVE_CPU build
on a bulldozer host can make the compiler then use these unusual
instruction encodings ]
* tag 'x86_misc_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/insn: Add XOP prefix instructions decoder support
|
|
Update sierraforest events to v1.12 released in:
https://github.com/intel/perfmon/commit/8279984b0b2eef35412c0281983ef59ae74f19ed
Event JSON automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update sapphirerapids events to v1.35 released in:
https://github.com/intel/perfmon/commit/94ed947d7b5966e50a685118f69e8942976ec397
https://github.com/intel/perfmon/commit/9d993957a2a8fc1d4f843013c979a413141e9839
Also adds cpu_cstate_c0 and cpu_cstate_c6 metrics.
Event JSON automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update pantherlake events to v1.00 released in:
https://github.com/intel/perfmon/commit/b149786b6f9db18c09e6c89f3eb4fdb849ead2cf
Event JSON automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update lunarlake events to v1.17 released in:
https://github.com/intel/perfmon/commit/ec387ac70602283d01da32c2ff5fe9cb15d3d25b
Event JSON automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update lunarlake events to v1.18 released in:
https://github.com/intel/perfmon/commit/04e11e566618d781b854f526d7e394ed50420638
Event JSON automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update graniterapids events to v1.15 released in:
https://github.com/intel/perfmon/commit/8af960a34864d291ad5a5a9f8fdf5a40486b78a1
https://github.com/intel/perfmon/commit/943667ba5b3f3203e47aede6bc7d017a26b89a26
Event JSON automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update grandridge events to v1.10 released in:
https://github.com/intel/perfmon/commit/c3028836415b8b5d31b8439eefc087492148c19f
Event JSON automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update emeraldrapids events to v1.20 released in:
https://github.com/intel/perfmon/commit/868b433955f3e94126420ee9374b9e0a6ce2d83e
https://github.com/intel/perfmon/commit/43681e2817a960d06c5b8870cc6d3e5b7b6feeb9
Also adds cpu_cstate_c0 and cpu_cstate_c6 metrics.
Event JSON automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update arrowlake events to v1.13 released in:
https://github.com/intel/perfmon/commit/718cdcec8b9637819af5e9eff8f705f731b0f971
Event JSON automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update alderlake events to v1.34 released in:
https://github.com/intel/perfmon/commit/80b773ebcf601b0e48e31f2184ffef933c4d842e
Event JSON automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When building perf with LIBBPF_DYNAMIC=1 on a fedora system with
libbpf-devel 1.5 I it was breaking with:
util/bpf-event.c: In function ‘format_btf_variable’:
util/bpf-event.c:291:18: error: ‘const struct btf_dump_type_data_opts’ has no member named ‘emit_strings’
291 | .emit_strings = 1,
| ^~~~~~~~~~~~
util/bpf-event.c:291:33: error: initialized field overwritten [-Werror=override-init]
291 | .emit_strings = 1,
| ^
util/bpf-event.c:291:33: note: (near initialization for ‘opts.skip_names’)
Check the version before using that feature.
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We need it to fix some other libbpf version dependent issues when
building with LIBBPF_DYNAMIC=1.
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Building perf with LIBBPF_DYNAMIC (ie not the default static linking of
libbpf with perf) is breaking as the libbpf isn't version 1.7 or newer,
where dont_enable is added to bpf_perf_event_opts.
To avoid this breakage add a compile time version check and don't
declare the variable when not present.
Fixes: 5e2ac8e8571df54d ("perf bpf-filter: Enable events manually")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: bpf@vger.kernel.org
Cc: Hao Ge <gehao@kylinos.cn>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
"One notable addition is the creation of the 'transitional' keyword for
kconfig so CONFIG renaming can go more smoothly.
This has been a long-standing deficiency, and with the renaming of
CONFIG_CFI_CLANG to CONFIG_CFI (since GCC will soon have KCFI
support), this came up again.
The breadth of the diffstat is mainly this renaming.
- Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva)
- lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
(Junjie Cao)
- Add str_assert_deassert() helper (Lad Prabhakar)
- gcc-plugins: Remove TODO_verify_il for GCC >= 16
- kconfig: Fix BrokenPipeError warnings in selftests
- kconfig: Add transitional symbol attribute for migration support
- kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI"
* tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
lib/string_choices: Add str_assert_deassert() helper
kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
kconfig: Add transitional symbol attribute for migration support
kconfig: Fix BrokenPipeError warnings in selftests
gcc-plugins: Remove TODO_verify_il for GCC >= 16
stddef: Introduce __TRAILING_OVERLAP()
stddef: Remove token-pasting in TRAILING_OVERLAP()
lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
|
|
The kernel's CFI implementation uses the KCFI ABI specifically, and is
not strictly tied to a particular compiler. In preparation for GCC
supporting KCFI, rename CONFIG_CFI_CLANG to CONFIG_CFI (along with
associated options).
Use new "transitional" Kconfig option for old CONFIG_CFI_CLANG that will
enable CONFIG_CFI during olddefconfig.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250923213422.1105654-3-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The string result of build_id__snprintf() is unconditionally used in
places like dsos__fprintf_buildid_cb(). If the build id has size 0 then
this creates a use of uninitialized memory. Add null termination for the
size 0 case.
A similar fix was written by Jiri Olsa in commit 6311951d4f8f28c4 ("perf
tools: Initialize output buffer in build_id__sprintf") but lost in the
transition to snprintf.
Fixes: fccaaf6fbbc59910 ("perf build-id: Change sprintf functions to snprintf")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The fallback message is unconditionally printed in places like
record__open().
If no fallback is attempted this can lead to printing uninitialized
data, crashes, etc.
Fixes: c0a54341c0e89333 ("perf evsel: Introduce event fallback method")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The detection of uncore_imc may happen for free running PMUs and the
clockticks event may be present on uncore_clock. Rewrite the test to
detect duplicated/deduplicated events from perf list, not hardcoded to
uncore_imc.
If perf stat fails then assume it is permissions and skip the test.
Committer testing:
Before:
root@x1:~# perf test -vv uniquifyi
96: perf stat events uniquifying:
--- start ---
test child forked, pid 220851
stat event uniquifying test
grep: Unmatched [, [^, [:, [., or [=
Event is not uniquified [Failed]
perf stat -e clockticks -A -o /tmp/__perf_test.stat_output.X7ChD -- true
# started on Fri Sep 19 16:48:38 2025
Performance counter stats for 'system wide':
CPU0 2,310,956 uncore_clock/clockticks/
0.001746771 seconds time elapsed
---- end(-1) ----
96: perf stat events uniquifying : FAILED!
root@x1:~#
After:
root@x1:~# perf test -vv uniquifyi
96: perf stat events uniquifying:
--- start ---
test child forked, pid 222366
Uniquification of PMU sysfs events test
Testing event uncore_imc_free_running/data_read/ is uniquified to uncore_imc_free_running_0/data_read/
Testing event uncore_imc_free_running/data_total/ is uniquified to uncore_imc_free_running_0/data_total/
Testing event uncore_imc_free_running/data_write/ is uniquified to uncore_imc_free_running_0/data_write/
Testing event uncore_imc_free_running/data_read/ is uniquified to uncore_imc_free_running_1/data_read/
Testing event uncore_imc_free_running/data_total/ is uniquified to uncore_imc_free_running_1/data_total/
Testing event uncore_imc_free_running/data_write/ is uniquified to uncore_imc_free_running_1/data_write/
---- end(0) ----
96: perf stat events uniquifying : Ok
root@x1:~#
Fixes: 070b315333ee942f ("perf test: Restrict uniquifying test to machines with 'uncore_imc'")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The PMU name is appearing twice in:
```
$ perf stat -e uncore_imc_free_running/data_total/ -A true
Performance counter stats for 'system wide':
CPU0 1.57 MiB uncore_imc_free_running_0/uncore_imc_free_running,data_total/
CPU0 1.58 MiB uncore_imc_free_running_1/uncore_imc_free_running,data_total/
0.000892376 seconds time elapsed
```
Use the pmu_name_len_no_suffix to avoid this problem.
Committer testing:
After this patch:
root@x1:~# perf stat -e uncore_imc_free_running/data_total/ -A true
Performance counter stats for 'system wide':
CPU0 1.69 MiB uncore_imc_free_running_0/data_total/
CPU0 1.68 MiB uncore_imc_free_running_1/data_total/
0.002141605 seconds time elapsed
root@x1:~#
Fixes: 7d45f402d3117e0b ("perf evlist: Make uniquifying counter names consistent")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The test starts a workload and then opens events. If the events fail
to open, for example because of perf_event_paranoid, the gopipe of the
workload is leaked and the file descriptor leak check fails when the
test exits. To avoid this cancel the workload when opening the events
fails.
Before:
```
$ perf test -vv 7
7: PERF_RECORD_* events & perf_sample fields:
--- start ---
test child forked, pid 1189568
Using CPUID GenuineIntel-6-B7-1
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
exclude_kernel 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
exclude_kernel 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
Attempt to add: software/cpu-clock/
..after resolving event: software/config=0/
cpu-clock -> software/cpu-clock/
------------------------------------------------------------
perf_event_attr:
type 1 (PERF_TYPE_SOFTWARE)
size 136
config 0x9 (PERF_COUNT_SW_DUMMY)
sample_type IP|TID|TIME|CPU
read_format ID|LOST
disabled 1
inherit 1
mmap 1
comm 1
enable_on_exec 1
task 1
sample_id_all 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
{ wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 1189569 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
perf_evlist__open: Permission denied
---- end(-2) ----
Leak of file descriptor 6 that opened: 'pipe:[14200347]'
---- unexpected signal (6) ----
iFailed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
#0 0x565358f6666e in child_test_sig_handler builtin-test.c:311
#1 0x7f29ce849df0 in __restore_rt libc_sigaction.c:0
#2 0x7f29ce89e95c in __pthread_kill_implementation pthread_kill.c:44
#3 0x7f29ce849cc2 in raise raise.c:27
#4 0x7f29ce8324ac in abort abort.c:81
#5 0x565358f662d4 in check_leaks builtin-test.c:226
#6 0x565358f6682e in run_test_child builtin-test.c:344
#7 0x565358ef7121 in start_command run-command.c:128
#8 0x565358f67273 in start_test builtin-test.c:545
#9 0x565358f6771d in __cmd_test builtin-test.c:647
#10 0x565358f682bd in cmd_test builtin-test.c:849
#11 0x565358ee5ded in run_builtin perf.c:349
#12 0x565358ee6085 in handle_internal_command perf.c:401
#13 0x565358ee61de in run_argv perf.c:448
#14 0x565358ee6527 in main perf.c:555
#15 0x7f29ce833ca8 in __libc_start_call_main libc_start_call_main.h:74
#16 0x7f29ce833d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
#17 0x565358e391c1 in _start perf[851c1]
7: PERF_RECORD_* events & perf_sample fields : FAILED!
```
After:
```
$ perf test 7
7: PERF_RECORD_* events & perf_sample fields : Skip (permissions)
```
Fixes: 16d00fee703866c6 ("perf tests: Move test__PERF_RECORD into separate object")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If libperl is installed then the perf tool build will build against
it. There appears to be limited interest in the scripting support for
perl so let's make it opt-in and deprecate it.
With this patch applied you need to add LIBPERL=1 to get libperl
support in perf - there is no warning if libperl is missing, but
building will fail if libperl is missing and the build has LIBPERL=1.
The perf version output is changed to:
```
$ perf version --build-options
perf version 6.17.rc3.g8eca69269947
aio: [ on ] # HAVE_AIO_SUPPORT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
bpf_skeletons: [ on ] # HAVE_BPF_SKEL
debuginfod: [ on ] # HAVE_DEBUGINFOD_SUPPORT
dwarf: [ on ] # HAVE_LIBDW_SUPPORT
dwarf_getlocations: [ on ] # HAVE_LIBDW_SUPPORT
dwarf-unwind: [ on ] # HAVE_DWARF_UNWIND_SUPPORT
auxtrace: [ on ] # HAVE_AUXTRACE_SUPPORT
libbfd: [ OFF ] # HAVE_LIBBFD_SUPPORT ( tip: Deprecated, license incompatibility, use BUILD_NONDISTRO=1 and install binutils-dev[el] )
libbpf-strings: [ on ] # HAVE_LIBBPF_STRINGS_SUPPORT
libcapstone: [ on ] # HAVE_LIBCAPSTONE_SUPPORT
libdw-dwarf-unwind: [ on ] # HAVE_LIBDW_SUPPORT
libelf: [ on ] # HAVE_LIBELF_SUPPORT
libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT
libopencsd: [ OFF ] # HAVE_CSTRACE_SUPPORT
libperl: [ OFF ] # HAVE_LIBPERL_SUPPORT ( tip:
Deprecated, use LIBPERL=1 and install libperl-dev to build with it )
libpfm4: [ on ] # HAVE_LIBPFM
libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT
libslang: [ on ] # HAVE_SLANG_SUPPORT
libtraceevent: [ on ] # HAVE_LIBTRACEEVENT
libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT ( tip:
Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build
with it )
lzma: [ on ] # HAVE_LZMA_SUPPORT
numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT
zlib: [ on ] # HAVE_ZLIB_SUPPORT
zstd: [ on ] # HAVE_ZSTD_SUPPORT
```
i.e. there is a tip saying about deprecation and how to get support
back.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <qmo@kernel.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: Yuzhuo Jing <yuzhuo@google.com>
Link: https://lore.kernel.org/lkml/aMrk03gigBlGcYLK@x1/
Link: https://lore.kernel.org/lkml/CAP-5=fVX+bLBRJCiziDi_hBySgv2NFtDoghtpheSSxVAvvETGw@mail.gmail.com
[ Keep the pre-existing perl-ExtUtils-Embed hint for Fedora/RHEL systems ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If a user specifies an AUX buffer larger than 2 GiB, the returned size
may exceed 0x80000000. Since the err variable is defined as a signed
32-bit integer, such a value overflows and becomes negative.
As a result, the perf record command reports an error:
0x146e8 [0x30]: failed to process type: 71 [Unknown error 183711232]
Change the type of the err variable to a signed 64-bit integer to
accommodate large buffer sizes correctly.
Fixes: d5652d865ea734a1 ("perf session: Add ability to skip 4GiB or more")
Reported-by: Tamas Zsoldos <tamas.zsoldos@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20250808-perf_fix_big_buffer_size-v1-1-45f45444a9a4@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add two mmap() workloads: one that eagerly populates a region and
another that demand faults it in.
The intent is to probe the memory subsytem performance incurred
by mmap().
$ perf bench mem mmap -s 4gb -p 4kb -l 10 -f populate
# Running 'mem/mmap' benchmark:
# function 'populate' (Eagerly populated map())
# Copying 4gb bytes ...
1.811691 GB/sec
$ perf bench mem mmap -s 4gb -p 2mb -l 10 -f populate
# Running 'mem/mmap' benchmark:
# function 'populate' (Eagerly populated mmap())
# Copying 4gb bytes ...
12.272017 GB/sec
$ perf bench mem mmap -s 4gb -p 1gb -l 10 -f populate
# Running 'mem/mmap' benchmark:
# function 'populate' (Eagerly populated mmap())
# Copying 4gb bytes ...
17.085927 GB/sec
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Split mem benchmark options into common and memset/memcpy specific.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There can be a significant gap in memset/memcpy performance depending
on the size of the region being operated on.
With chunk-size=4kb:
$ echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
$ perf bench mem memset -p 4kb -k 4kb -s 4gb -l 10 -f x86-64-stosq
# Running 'mem/memset' benchmark:
# function 'x86-64-stosq' (movsq-based memset() in arch/x86/lib/memset_64.S)
# Copying 4gb bytes ...
13.011655 GB/sec
With chunk-size=1gb:
$ echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
$ perf bench mem memset -p 4kb -k 1gb -s 4gb -l 10 -f x86-64-stosq
# Running 'mem/memset' benchmark:
# function 'x86-64-stosq' (movsq-based memset() in arch/x86/lib/memset_64.S)
# Copying 4gb bytes ...
21.936355 GB/sec
So, allow the user to specify the chunk-size.
The default value is identical to the total size of the region, which
preserves current behaviour.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Page sizes that can be selected: 4KB, 2MB, 1GB.
Both the reservation and node from which hugepages are allocated
from are expected to be addressed by the user.
An example of page-size selection:
$ perf bench mem memset -s 4gb -p 2mb
# Running 'mem/memset' benchmark:
# function 'default' (Default memset() provided by glibc)
# Copying 4gb bytes ...
14.919194 GB/sec
# function 'x86-64-unrolled' (unrolled memset() in arch/x86/lib/memset_64.S)
# Copying 4gb bytes ...
11.514503 GB/sec
# function 'x86-64-stosq' (movsq-based memset() in arch/x86/lib/memset_64.S)
# Copying 4gb bytes ...
12.600568 GB/sec
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Using mmap() ensures that the buffer is always aligned at a fixed
boundary. Switch to that to remove one source of variability.
Since we always want to read/write from the allocated buffers map
with pagetables pre-populated.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
No functional change.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move benchmark function parameters in struct bench_params.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20250917152418.4077386-4-ankur.a.arora@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|