diff options
| author | Ian Rogers <irogers@google.com> | 2025-08-25 14:12:04 -0700 |
|---|---|---|
| committer | Arnaldo Carvalho de Melo <acme@redhat.com> | 2025-09-12 15:53:32 -0300 |
| commit | 035c17893082b403c98330f1fdb58fd925951038 (patch) | |
| tree | ded7894a9e341d60ae2f7a5ccfefee5c3d21b212 | |
| parent | perf stat: Don't skip failing group events (diff) | |
| download | linux-035c17893082b403c98330f1fdb58fd925951038.tar.gz linux-035c17893082b403c98330f1fdb58fd925951038.zip | |
perf parse-events: Add 'X' modifier to exclude an event from being regrouped
The function parse_events__sort_events_and_fix_groups is needed to fix
uncore events like:
```
$ perf stat -e '{data_read,data_write}' ...
```
so that the multiple uncore PMUs have a group each of data_read and
data_write events.
The same function will perform architecture sorting and group fixing,
in particular for Intel topdown/perf-metric events. Grouping multiple
perf metric events together causes perf_event_open to fail as the
group can only support one. This means command lines like:
```
$ perf stat -e 'slots,slots' ...
```
fail as the slots events are forced into a group together to try to
satisfy the perf-metric event constraints.
As the user may know better than
parse_events__sort_events_and_fix_groups add a 'X' modifier to skip
its regrouping behavior. This allows the following to succeed rather
than fail on the second slots event being opened:
```
$ perf stat -e 'slots,slots:X' -a sleep 1
Performance counter stats for 'system wide':
6,834,154,071 cpu_core/slots/ (50.13%)
5,548,629,453 cpu_core/slots/X (49.87%)
1.002634606 seconds time elapsed
```
Closes: https://lore.kernel.org/lkml/20250822082233.1850417-1-dapeng1.mi@linux.intel.com/
Reported-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reported-by: Xudong Hao <xudong.hao@intel.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Yoshihiro Furudera <fj5100bi@fujitsu.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| -rw-r--r-- | tools/perf/Documentation/perf-list.txt | 1 | ||||
| -rw-r--r-- | tools/perf/util/evsel.h | 1 | ||||
| -rw-r--r-- | tools/perf/util/parse-events.c | 5 | ||||
| -rw-r--r-- | tools/perf/util/parse-events.h | 1 | ||||
| -rw-r--r-- | tools/perf/util/parse-events.l | 5 |
5 files changed, 9 insertions, 4 deletions
diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt index 28215306a78a..a5039d1614f9 100644 --- a/tools/perf/Documentation/perf-list.txt +++ b/tools/perf/Documentation/perf-list.txt @@ -73,6 +73,7 @@ counted. The following modifiers exist: e - group or event are exclusive and do not share the PMU b - use BPF aggregration (see perf stat --bpf-counters) R - retire latency value of the event + X - don't regroup the event to match PMUs The 'p' modifier can be used for specifying how precise the instruction address should be. The 'p' modifier can be specified multiple times: diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h index e927a3a4fe0e..03f9f22e3a0c 100644 --- a/tools/perf/util/evsel.h +++ b/tools/perf/util/evsel.h @@ -89,6 +89,7 @@ struct evsel { bool use_config_name; bool skippable; bool retire_lat; + bool dont_regroup; int bpf_fd; struct bpf_object *bpf_obj; struct list_head config_terms; diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index 0026cff4d69e..452f12191f6e 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -1896,6 +1896,8 @@ static int parse_events__modifier_list(struct parse_events_state *parse_state, evsel->bpf_counter = true; if (mod.retire_lat) evsel->retire_lat = true; + if (mod.dont_regroup) + evsel->dont_regroup = true; } return 0; } @@ -2192,13 +2194,12 @@ static int parse_events__sort_events_and_fix_groups(struct list_head *list) * Set the group leader respecting the given groupings and that * groups can't span PMUs. */ - if (!cur_leader) { + if (!cur_leader || pos->dont_regroup) { cur_leader = pos; cur_leaders_grp = &pos->core; if (pos_force_grouped) force_grouped_leader = pos; } - cur_leader_pmu_name = cur_leader->group_pmu_name; if (strcmp(cur_leader_pmu_name, pos_pmu_name)) { /* PMU changed so the group/leader must change. */ diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h index 62dc7202e3ba..a5c5fc39fd6f 100644 --- a/tools/perf/util/parse-events.h +++ b/tools/perf/util/parse-events.h @@ -216,6 +216,7 @@ struct parse_events_modifier { bool guest : 1; /* 'G' */ bool host : 1; /* 'H' */ bool retire_lat : 1; /* 'R' */ + bool dont_regroup : 1; /* 'X' */ }; int parse_events__modifier_event(struct parse_events_state *parse_state, void *loc, diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l index 2034590eb789..294e943bcdb4 100644 --- a/tools/perf/util/parse-events.l +++ b/tools/perf/util/parse-events.l @@ -206,6 +206,7 @@ static int modifiers(struct parse_events_state *parse_state, yyscan_t scanner) CASE('e', exclusive); CASE('b', bpf); CASE('R', retire_lat); + CASE('X', dont_regroup); default: return PE_ERROR; } @@ -251,10 +252,10 @@ term_name {name_start}[a-zA-Z0-9_*?.\[\]!\-:]* quoted_name [\']{name_start}[a-zA-Z0-9_*?.\[\]!\-:,\.=]*[\'] drv_cfg_term [a-zA-Z0-9_\.]+(=[a-zA-Z0-9_*?\.:]+)? /* - * If you add a modifier you need to update check_modifier(). + * If you add a modifier you need to update modifiers(). * Also, the letters in modifier_event must not be in modifier_bp. */ -modifier_event [ukhpPGHSDIWebR]{1,16} +modifier_event [ukhpPGHSDIWebRX]{1,17} modifier_bp [rwx]{1,3} lc_type (L1-dcache|l1-d|l1d|L1-data|L1-icache|l1-i|l1i|L1-instruction|LLC|L2|dTLB|d-tlb|Data-TLB|iTLB|i-tlb|Instruction-TLB|branch|branches|bpu|btb|bpc|node) lc_op_result (load|loads|read|store|stores|write|prefetch|prefetches|speculative-read|speculative-load|refs|Reference|ops|access|misses|miss) |
