acrn-kernel/tools/perf
Kan Liang 12e89e65f4 perf hist: Add fast path for duplicate entries check
Perf checks for duplicate entries in a callchain before adding an entry.
However, the check is very slow, especially with deeper call stacks.
Almost 50% of the elapsed time of perf report is spent on the check when
the call stack depth is always 32.

hist_entry__cmp() is used to compare the new entry with the old
entries. It goes through all the available sorts in the sort_list
and calls the specific cmp of each sort, which is very slow.

In most cases there are no duplicate entries in a callchain, because
the symbols are usually different. It is much faster to do a quick check
on the symbols first, and to do the full cmp only when the symbols are
exactly the same.

The quick check compares only symbols, not the dso. Export
_sort__sym_cmp for this purpose.
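
The fast path described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual perf code: the `sketch_*` names, the two structures, and the call counter are all hypothetical stand-ins, and the "full" comparison here is far simpler than what hist_entry__cmp() does over the sort_list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; these are not perf's real structures. */
struct sketch_sym {
	const char *name;
};

struct sketch_entry {
	const struct sketch_sym *sym;	/* resolved symbol, may be NULL */
	const char *dso;		/* object name, assumed non-NULL */
};

/* Counts how often the expensive comparison actually runs. */
static unsigned long sketch_full_cmp_calls;

/* Cheap check: symbols only, the dso is deliberately ignored. */
static bool sketch_sym_equal(const struct sketch_sym *a,
			     const struct sketch_sym *b)
{
	if (a == b)
		return true;
	if (a == NULL || b == NULL)
		return false;
	return strcmp(a->name, b->name) == 0;
}

/* Stand-in for the expensive comparison over every sort key. */
static bool sketch_full_equal(const struct sketch_entry *a,
			      const struct sketch_entry *b)
{
	sketch_full_cmp_calls++;
	return sketch_sym_equal(a->sym, b->sym) &&
	       strcmp(a->dso, b->dso) == 0;
}

/* Returns true if new_entry duplicates one of the n entries in chain. */
static bool sketch_is_duplicate(const struct sketch_entry *chain, size_t n,
				const struct sketch_entry *new_entry)
{
	for (size_t i = 0; i < n; i++) {
		/* Fast path: different symbols can never be duplicates. */
		if (!sketch_sym_equal(chain[i].sym, new_entry->sym))
			continue;
		/* Slow path: only reached when the symbols match. */
		if (sketch_full_equal(&chain[i], new_entry))
			return true;
	}
	return false;
}

/* Small usage example exercising both paths. */
static void sketch_demo(void)
{
	struct sketch_sym foo = { "foo" }, bar = { "bar" }, baz = { "baz" };
	struct sketch_entry chain[] = { { &foo, "a.out" }, { &bar, "a.out" } };
	struct sketch_entry same = { &bar, "a.out" };
	struct sketch_entry fresh = { &baz, "a.out" };

	sketch_full_cmp_calls = 0;
	assert(sketch_is_duplicate(chain, 2, &same));
	assert(!sketch_is_duplicate(chain, 2, &fresh));
	/* Only the one symbol match paid for a full comparison. */
	assert(sketch_full_cmp_calls == 1);
}
```

In this sketch an entry whose symbol matches nothing in the chain never reaches the full comparison, which is the case the commit message says is the most common one.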

  $ perf record --call-graph lbr ./tchain_edit_64

  Without the patch
  $ time perf report --stdio
  real    0m21.142s
  user    0m21.110s
  sys     0m0.033s

  With the patch
  $ time perf report --stdio
  real    0m10.977s
  user    0m10.948s
  sys     0m0.027s

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Link: http://lore.kernel.org/lkml/20200319202517.23423-18-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-18 09:05:01 -03:00
Documentation perf c2c: Add option to enable the LBR stitching approach 2020-04-18 09:05:01 -03:00
arch tools headers: Update x86's syscall_64.tbl with the kernel sources 2020-04-14 11:02:52 -03:00
bench perf bench: Add event synthesis benchmark 2020-04-16 12:19:12 -03:00
examples/bpf Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-01-28 16:02:33 -08:00
include/bpf perf bpf: Remove bpf/ subdir from bpf.h headers used to build bpf events 2020-02-18 10:13:28 -03:00
jvmti
pmu-events perf pmu-events x86: Use CPU_CLK_UNHALTED.THREAD in Kernel_Utilization metric 2020-04-03 09:37:56 -03:00
python
scripts perf script: Add flamegraph.py script 2020-04-16 12:19:14 -03:00
tests perf parser: Add support to specify rXXX event with pmu 2020-04-18 09:05:00 -03:00
trace tools headers UAPI: Sync linux/mman.h with the kernel 2020-04-14 09:04:53 -03:00
ui perf report/top TUI: Fix title line formatting 2020-04-03 09:37:55 -03:00
util perf hist: Add fast path for duplicate entries check 2020-04-18 09:05:01 -03:00
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
Build
CREDITS
MANIFEST libperf: Move to tools/lib/perf 2020-01-06 11:46:09 -03:00
Makefile tools: Let O= makes handle a relative path with -C option 2020-03-06 17:08:28 -03:00
Makefile.config perf tools: Support Python 3.8+ in Makefile 2020-04-03 10:03:44 -03:00
Makefile.perf perf: Normalize gcc parameter when generating arch errno table 2020-03-26 11:04:01 -03:00
builtin-annotate.c perf annotate: Prefer cmdline option over default config 2020-02-27 10:45:08 -03:00
builtin-bench.c perf bench: Add event synthesis benchmark 2020-04-16 12:19:12 -03:00
builtin-buildid-cache.c
builtin-buildid-list.c
builtin-c2c.c perf c2c: Add option to enable the LBR stitching approach 2020-04-18 09:05:01 -03:00
builtin-config.c
builtin-data.c
builtin-diff.c perf tools: Basic support for CGROUP event 2020-04-03 09:37:55 -03:00
builtin-evlist.c
builtin-ftrace.c perf tools: Support CAP_PERFMON capability 2020-04-16 12:19:08 -03:00
builtin-help.c
builtin-inject.c perf inject: Fix processing of ID index for injected instruction tracing 2019-12-04 12:39:53 -03:00
builtin-kallsyms.c
builtin-kmem.c
builtin-kvm.c
builtin-list.c
builtin-lock.c
builtin-mem.c
builtin-probe.c perf probe: Check return value of strlist__add() for -ENOMEM 2020-02-27 11:03:13 -03:00
builtin-record.c perf record: Add --all-cgroups option 2020-04-03 09:37:55 -03:00
builtin-report.c perf report: Add option to enable the LBR stitching approach 2020-04-18 09:05:01 -03:00
builtin-sched.c perf sched timehist: Add support for filtering on CPU 2020-01-06 11:46:09 -03:00
builtin-script.c perf script: Add option to enable the LBR stitching approach 2020-04-18 09:05:01 -03:00
builtin-stat.c perf stat: Honour --timeout for forked workloads 2020-04-16 12:17:41 -03:00
builtin-timechart.c
builtin-top.c perf top: Add option to enable the LBR stitching approach 2020-04-18 09:05:01 -03:00
builtin-trace.c perf trace: Resolve prctl's 'option' arg strings to numbers 2020-02-11 16:41:50 -03:00
builtin-version.c
builtin.h
check-headers.sh tools headers: Synchronize linux/bits.h with the kernel sources 2020-04-14 11:40:05 -03:00
command-list.txt
design.txt perf tools: Support CAP_PERFMON capability 2020-04-16 12:19:08 -03:00
perf-archive.sh
perf-completion.sh
perf-read-vdso.c
perf-sys.h
perf-with-kcore.sh
perf.c
perf.h