License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
|
|
|
/* SPDX-License-Identifier: GPL-2.0 */
|
2009-11-05 08:31:31 +08:00
|
|
|
#ifndef BENCH_H
|
|
|
|
#define BENCH_H
|
|
|
|
|
2020-03-02 23:09:38 +08:00
|
|
|
#include <sys/time.h>
|
|
|
|
|
|
|
|
extern struct timeval bench__start, bench__end, bench__runtime;
|
|
|
|
|
2013-03-14 06:34:24 +08:00
|
|
|
/*
|
|
|
|
* The madvise transparent hugepage constants were added in glibc
|
|
|
|
* 2.13. For compatibility with older versions of glibc, define these
|
|
|
|
* tokens if they are not already defined.
|
|
|
|
*
|
|
|
|
* PA-RISC uses different madvise values from other architectures and
|
|
|
|
* needs to be special-cased.
|
|
|
|
*/
|
|
|
|
#ifdef __hppa__
|
|
|
|
# ifndef MADV_HUGEPAGE
|
|
|
|
# define MADV_HUGEPAGE 67
|
|
|
|
# endif
|
|
|
|
# ifndef MADV_NOHUGEPAGE
|
|
|
|
# define MADV_NOHUGEPAGE 68
|
|
|
|
# endif
|
|
|
|
#else
|
|
|
|
# ifndef MADV_HUGEPAGE
|
|
|
|
# define MADV_HUGEPAGE 14
|
|
|
|
# endif
|
|
|
|
# ifndef MADV_NOHUGEPAGE
|
|
|
|
# define MADV_NOHUGEPAGE 15
|
|
|
|
# endif
|
|
|
|
#endif
|
|
|
|
|
2017-03-27 22:47:20 +08:00
|
|
|
int bench_numa(int argc, const char **argv);
|
|
|
|
int bench_sched_messaging(int argc, const char **argv);
|
|
|
|
int bench_sched_pipe(int argc, const char **argv);
|
2019-03-09 02:17:47 +08:00
|
|
|
int bench_syscall_basic(int argc, const char **argv);
|
2017-03-27 22:47:20 +08:00
|
|
|
int bench_mem_memcpy(int argc, const char **argv);
|
|
|
|
int bench_mem_memset(int argc, const char **argv);
|
2020-07-30 06:00:34 +08:00
|
|
|
int bench_mem_find_bit(int argc, const char **argv);
|
2017-03-27 22:47:20 +08:00
|
|
|
int bench_futex_hash(int argc, const char **argv);
|
|
|
|
int bench_futex_wake(int argc, const char **argv);
|
|
|
|
int bench_futex_wake_parallel(int argc, const char **argv);
|
|
|
|
int bench_futex_requeue(int argc, const char **argv);
|
2015-07-07 16:55:53 +08:00
|
|
|
/* pi futexes */
|
2017-03-27 22:47:20 +08:00
|
|
|
int bench_futex_lock_pi(int argc, const char **argv);
|
perf bench: Add epoll parallel epoll_wait benchmark
This program benchmarks concurrent epoll_wait(2) for file descriptors
that are monitored with with EPOLLIN along various semantics, by a
single epoll instance. Such conditions can be found when using
single/combined or multiple queuing when load balancing.
Each thread has a number of private, nonblocking file descriptors,
referred to as fdmap. A writer thread will constantly be writing to the
fdmaps of all threads, minimizing each threads's chances of epoll_wait
not finding any ready read events and blocking as this is not what we
want to stress. Full details in the start of the C file.
Committer testing:
# perf bench
Usage:
perf bench [<common options>] <collection> <benchmark> [<options>]
# List of all available benchmark collections:
sched: Scheduler and IPC benchmarks
mem: Memory access benchmarks
numa: NUMA scheduling and MM benchmarks
futex: Futex stressing benchmarks
epoll: Epoll stressing benchmarks
all: All benchmarks
# perf bench epoll
# List of available benchmarks for collection 'epoll':
wait: Benchmark epoll concurrent epoll_waits
all: Run all futex benchmarks
# perf bench epoll wait
# Running 'epoll/wait' benchmark:
Run summary [PID 19295]: 3 threads monitoring on 64 file-descriptors for 8 secs.
[thread 0] fdmap: 0xdaa650 ... 0xdaa74c [ 328241 ops/sec ]
[thread 1] fdmap: 0xdaa900 ... 0xdaa9fc [ 351695 ops/sec ]
[thread 2] fdmap: 0xdaabb0 ... 0xdaacac [ 381423 ops/sec ]
Averaged 353786 operations/sec (+- 4.35%), total secs = 8
#
Committer notes:
Fix the build on debian:experimental-x-mips, debian:experimental-x-mipsel
and others:
CC /tmp/build/perf/bench/epoll-wait.o
bench/epoll-wait.c: In function 'writerfn':
bench/epoll-wait.c:399:12: error: format '%ld' expects argument of type 'long int', but argument 2 has type 'size_t' {aka 'unsigned int'} [-Werror=format=]
printinfo("exiting writer-thread (total full-loops: %ld)\n", iter);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~
bench/epoll-wait.c:86:31: note: in definition of macro 'printinfo'
do { if (__verbose) { printf(fmt, ## arg); fflush(stdout); } } while (0)
^~~
cc1: all warnings being treated as errors
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com> <jbaron@akamai.com>
Link: http://lkml.kernel.org/r/20181106152226.20883-2-dave@stgolabs.net
Link: http://lkml.kernel.org/r/20181106182349.thdkpvshkna5vd7o@linux-r8p5>
[ Applied above fixup as per Davidlohr's request ]
[ Use inttypes.h to print rlim_t fields, fixing the build on Alpine Linux / musl libc ]
[ Check if eventfd() is available, i.e. if HAVE_EVENTFD is defined ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-06 23:22:25 +08:00
|
|
|
int bench_epoll_wait(int argc, const char **argv);
|
perf bench: Add epoll_ctl(2) benchmark
Benchmark the various operations allowed for epoll_ctl(2). The idea is
to concurrently stress a single epoll instance doing add/mod/del
operations.
Committer testing:
# perf bench epoll ctl
# Running 'epoll/ctl' benchmark:
Run summary [PID 20344]: 4 threads doing epoll_ctl ops 64 file-descriptors for 8 secs.
[thread 0] fdmap: 0x21a46b0 ... 0x21a47ac [ add: 1680960 ops; mod: 1680960 ops; del: 1680960 ops ]
[thread 1] fdmap: 0x21a4960 ... 0x21a4a5c [ add: 1685440 ops; mod: 1685440 ops; del: 1685440 ops ]
[thread 2] fdmap: 0x21a4c10 ... 0x21a4d0c [ add: 1674368 ops; mod: 1674368 ops; del: 1674368 ops ]
[thread 3] fdmap: 0x21a4ec0 ... 0x21a4fbc [ add: 1677568 ops; mod: 1677568 ops; del: 1677568 ops ]
Averaged 1679584 ADD operations (+- 0.14%)
Averaged 1679584 MOD operations (+- 0.14%)
Averaged 1679584 DEL operations (+- 0.14%)
#
Lets measure those calls with 'perf trace' to get a glympse at what this
benchmark is doing in terms of syscalls:
# perf trace -m32768 -s perf bench epoll ctl
# Running 'epoll/ctl' benchmark:
Run summary [PID 20405]: 4 threads doing epoll_ctl ops 64 file-descriptors for 8 secs.
[thread 0] fdmap: 0x21764e0 ... 0x21765dc [ add: 1100480 ops; mod: 1100480 ops; del: 1100480 ops ]
[thread 1] fdmap: 0x2176790 ... 0x217688c [ add: 1250176 ops; mod: 1250176 ops; del: 1250176 ops ]
[thread 2] fdmap: 0x2176a40 ... 0x2176b3c [ add: 1022464 ops; mod: 1022464 ops; del: 1022464 ops ]
[thread 3] fdmap: 0x2176cf0 ... 0x2176dec [ add: 705472 ops; mod: 705472 ops; del: 705472 ops ]
Averaged 1019648 ADD operations (+- 11.27%)
Averaged 1019648 MOD operations (+- 11.27%)
Averaged 1019648 DEL operations (+- 11.27%)
Summary of events:
epoll-ctl (20405), 1264 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
eventfd2 256 9.514 0.001 0.037 5.243 68.00%
clone 4 1.245 0.204 0.311 0.531 24.13%
mprotect 66 0.345 0.002 0.005 0.021 7.43%
openat 45 0.313 0.004 0.007 0.073 21.93%
mmap 88 0.302 0.002 0.003 0.013 5.02%
futex 4 0.160 0.002 0.040 0.140 83.43%
sched_setaffinity 4 0.124 0.005 0.031 0.070 49.39%
read 44 0.103 0.001 0.002 0.013 15.54%
fstat 40 0.052 0.001 0.001 0.003 5.43%
close 39 0.039 0.001 0.001 0.001 1.48%
stat 9 0.034 0.003 0.004 0.006 7.30%
access 3 0.023 0.007 0.008 0.008 4.25%
open 2 0.021 0.008 0.011 0.013 22.60%
getdents 4 0.019 0.001 0.005 0.009 37.15%
write 2 0.013 0.004 0.007 0.009 38.48%
munmap 1 0.010 0.010 0.010 0.010 0.00%
brk 3 0.006 0.001 0.002 0.003 26.34%
rt_sigprocmask 2 0.004 0.001 0.002 0.003 43.95%
rt_sigaction 3 0.004 0.001 0.001 0.002 16.07%
prlimit64 3 0.004 0.001 0.001 0.001 5.39%
prctl 1 0.003 0.003 0.003 0.003 0.00%
epoll_create 1 0.003 0.003 0.003 0.003 0.00%
lseek 2 0.002 0.001 0.001 0.001 11.42%
sched_getaffinity 1 0.002 0.002 0.002 0.002 0.00%
arch_prctl 1 0.002 0.002 0.002 0.002 0.00%
set_tid_address 1 0.001 0.001 0.001 0.001 0.00%
getpid 1 0.001 0.001 0.001 0.001 0.00%
set_robust_list 1 0.001 0.001 0.001 0.001 0.00%
execve 1 0.000 0.000 0.000 0.000 0.00%
epoll-ctl (20406), 1245480 events, 14.6%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
epoll_ctl 619511 1034.927 0.001 0.002 6.691 0.67%
nanosleep 3226 616.114 0.006 0.191 10.376 7.57%
futex 2 11.336 0.002 5.668 11.334 99.97%
set_robust_list 1 0.001 0.001 0.001 0.001 0.00%
clone 1 0.000 0.000 0.000 0.000 0.00%
epoll-ctl (20407), 1243151 events, 14.5%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
epoll_ctl 618350 1042.181 0.001 0.002 2.512 0.40%
nanosleep 3220 366.261 0.012 0.114 18.162 9.59%
futex 4 5.463 0.001 1.366 5.427 99.12%
set_robust_list 1 0.002 0.002 0.002 0.002 0.00%
epoll-ctl (20408), 1801690 events, 21.1%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
epoll_ctl 896174 1540.581 0.001 0.002 6.987 0.74%
nanosleep 4667 783.393 0.006 0.168 10.419 7.10%
futex 2 4.682 0.002 2.341 4.681 99.93%
set_robust_list 1 0.002 0.002 0.002 0.002 0.00%
clone 1 0.000 0.000 0.000 0.000 0.00%
epoll-ctl (20409), 4254890 events, 49.8%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
epoll_ctl 2116416 3768.097 0.001 0.002 9.956 0.41%
nanosleep 11023 1141.778 0.006 0.104 9.447 4.95%
futex 3 0.037 0.002 0.012 0.029 70.50%
set_robust_list 1 0.008 0.008 0.008 0.008 0.00%
madvise 1 0.005 0.005 0.005 0.005 0.00%
clone 1 0.000 0.000 0.000 0.000 0.00%
#
Committer notes:
Fix build on fedora:24-x-ARC-uClibc, debian:experimental-x-mips,
debian:experimental-x-mipsel, ubuntu:16.04-x-arm and ubuntu:16.04-x-powerpc
CC /tmp/build/perf/bench/epoll-ctl.o
bench/epoll-ctl.c: In function 'init_fdmaps':
bench/epoll-ctl.c:214:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
for (i = 0; i < nfds; i+=inc) {
^
bench/epoll-ctl.c: In function 'bench_epoll_ctl':
bench/epoll-ctl.c:377:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
for (i = 0; i < nthreads; i++) {
^
bench/epoll-ctl.c:388:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
for (i = 0; i < nthreads; i++) {
^
cc1: all warnings being treated as errors
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Link: http://lkml.kernel.org/r/20181106152226.20883-3-dave@stgolabs.net
[ Use inttypes.h to print rlim_t fields, fixing the build on Alpine Linux / musl libc ]
[ Check if eventfd() is available, i.e. if HAVE_EVENTFD is defined ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-06 23:22:26 +08:00
|
|
|
int bench_epoll_ctl(int argc, const char **argv);
|
2020-04-02 23:43:53 +08:00
|
|
|
int bench_synthesize(int argc, const char **argv);
|
perf bench: Add kallsyms parsing
Add a benchmark for kallsyms parsing. Example output:
Running 'internals/kallsyms-parse' benchmark:
Average kallsyms__parse took: 103.971 ms (+- 0.121 ms)
Committer testing:
Test Machine: AMD Ryzen 5 3600X 6-Core Processor
[root@five ~]# perf bench internals kallsyms-parse
# Running 'internals/kallsyms-parse' benchmark:
Average kallsyms__parse took: 79.692 ms (+- 0.101 ms)
[root@five ~]# perf stat -r5 perf bench internals kallsyms-parse
# Running 'internals/kallsyms-parse' benchmark:
Average kallsyms__parse took: 80.563 ms (+- 0.079 ms)
# Running 'internals/kallsyms-parse' benchmark:
Average kallsyms__parse took: 81.046 ms (+- 0.155 ms)
# Running 'internals/kallsyms-parse' benchmark:
Average kallsyms__parse took: 80.874 ms (+- 0.104 ms)
# Running 'internals/kallsyms-parse' benchmark:
Average kallsyms__parse took: 81.173 ms (+- 0.133 ms)
# Running 'internals/kallsyms-parse' benchmark:
Average kallsyms__parse took: 81.169 ms (+- 0.074 ms)
Performance counter stats for 'perf bench internals kallsyms-parse' (5 runs):
8,093.54 msec task-clock # 0.999 CPUs utilized ( +- 0.14% )
3,165 context-switches # 0.391 K/sec ( +- 0.18% )
10 cpu-migrations # 0.001 K/sec ( +- 23.13% )
744 page-faults # 0.092 K/sec ( +- 0.21% )
34,551,564,954 cycles # 4.269 GHz ( +- 0.05% ) (83.33%)
1,160,584,308 stalled-cycles-frontend # 3.36% frontend cycles idle ( +- 1.60% ) (83.33%)
14,974,323,985 stalled-cycles-backend # 43.34% backend cycles idle ( +- 0.24% ) (83.33%)
58,712,905,705 instructions # 1.70 insn per cycle
# 0.26 stalled cycles per insn ( +- 0.01% ) (83.34%)
14,136,433,778 branches # 1746.632 M/sec ( +- 0.01% ) (83.33%)
141,943,217 branch-misses # 1.00% of all branches ( +- 0.04% ) (83.33%)
8.1040 +- 0.0115 seconds time elapsed ( +- 0.14% )
[root@five ~]#
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200501221315.54715-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-02 06:13:13 +08:00
|
|
|
int bench_kallsyms_parse(int argc, const char **argv);
|
perf bench: Add build-id injection benchmark
Sometimes I can see that 'perf record' piped with 'perf inject' take a
long time processing build-ids.
So introduce a inject-build-id benchmark to the internals benchmark
suite to measure its overhead regularly.
It runs the 'perf inject' command internally and feeds the given number
of synthesized events (MMAP2 + SAMPLE basically).
Usage: perf bench internals inject-build-id <options>
-i, --iterations <n> Number of iterations used to compute average (default: 100)
-m, --nr-mmaps <n> Number of mmap events for each iteration (default: 100)
-n, --nr-samples <n> Number of sample events per mmap event (default: 100)
-v, --verbose be more verbose (show iteration count, DSO name, etc)
By default, it measures average processing time of 100 MMAP2 events
and 10000 SAMPLE events. Below is a result on my laptop.
$ perf bench internals inject-build-id
# Running 'internals/inject-build-id' benchmark:
Average build-id injection took: 25.789 msec (+- 0.202 msec)
Average time per event: 2.528 usec (+- 0.020 usec)
Average memory usage: 8411 KB (+- 7 KB)
Committer testing:
$ perf bench
Usage:
perf bench [<common options>] <collection> <benchmark> [<options>]
# List of all available benchmark collections:
sched: Scheduler and IPC benchmarks
syscall: System call benchmarks
mem: Memory access benchmarks
numa: NUMA scheduling and MM benchmarks
futex: Futex stressing benchmarks
epoll: Epoll stressing benchmarks
internals: Perf-internals benchmarks
all: All benchmarks
$ perf bench internals
# List of available benchmarks for collection 'internals':
synthesize: Benchmark perf event synthesis
kallsyms-parse: Benchmark kallsyms parsing
inject-build-id: Benchmark build-id injection
$ perf bench internals inject-build-id
# Running 'internals/inject-build-id' benchmark:
Average build-id injection took: 14.202 msec (+- 0.059 msec)
Average time per event: 1.392 usec (+- 0.006 usec)
Average memory usage: 12650 KB (+- 10 KB)
Average build-id-all injection took: 12.831 msec (+- 0.071 msec)
Average time per event: 1.258 usec (+- 0.007 usec)
Average memory usage: 11895 KB (+- 10 KB)
$
$ perf stat -r5 perf bench internals inject-build-id
# Running 'internals/inject-build-id' benchmark:
Average build-id injection took: 14.380 msec (+- 0.056 msec)
Average time per event: 1.410 usec (+- 0.006 usec)
Average memory usage: 12608 KB (+- 11 KB)
Average build-id-all injection took: 11.889 msec (+- 0.064 msec)
Average time per event: 1.166 usec (+- 0.006 usec)
Average memory usage: 11838 KB (+- 10 KB)
# Running 'internals/inject-build-id' benchmark:
Average build-id injection took: 14.246 msec (+- 0.065 msec)
Average time per event: 1.397 usec (+- 0.006 usec)
Average memory usage: 12744 KB (+- 10 KB)
Average build-id-all injection took: 12.019 msec (+- 0.066 msec)
Average time per event: 1.178 usec (+- 0.006 usec)
Average memory usage: 11963 KB (+- 10 KB)
# Running 'internals/inject-build-id' benchmark:
Average build-id injection took: 14.321 msec (+- 0.067 msec)
Average time per event: 1.404 usec (+- 0.007 usec)
Average memory usage: 12690 KB (+- 10 KB)
Average build-id-all injection took: 11.909 msec (+- 0.041 msec)
Average time per event: 1.168 usec (+- 0.004 usec)
Average memory usage: 11938 KB (+- 10 KB)
# Running 'internals/inject-build-id' benchmark:
Average build-id injection took: 14.287 msec (+- 0.059 msec)
Average time per event: 1.401 usec (+- 0.006 usec)
Average memory usage: 12864 KB (+- 10 KB)
Average build-id-all injection took: 11.862 msec (+- 0.058 msec)
Average time per event: 1.163 usec (+- 0.006 usec)
Average memory usage: 12103 KB (+- 10 KB)
# Running 'internals/inject-build-id' benchmark:
Average build-id injection took: 14.402 msec (+- 0.053 msec)
Average time per event: 1.412 usec (+- 0.005 usec)
Average memory usage: 12876 KB (+- 10 KB)
Average build-id-all injection took: 11.826 msec (+- 0.061 msec)
Average time per event: 1.159 usec (+- 0.006 usec)
Average memory usage: 12111 KB (+- 10 KB)
Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
4,267.48 msec task-clock:u # 1.502 CPUs utilized ( +- 0.14% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
102,092 page-faults:u # 0.024 M/sec ( +- 0.08% )
3,894,589,578 cycles:u # 0.913 GHz ( +- 0.19% ) (83.49%)
140,078,421 stalled-cycles-frontend:u # 3.60% frontend cycles idle ( +- 0.77% ) (83.34%)
948,581,189 stalled-cycles-backend:u # 24.36% backend cycles idle ( +- 0.46% ) (83.25%)
5,835,587,719 instructions:u # 1.50 insn per cycle
# 0.16 stalled cycles per insn ( +- 0.21% ) (83.24%)
1,267,423,636 branches:u # 296.996 M/sec ( +- 0.22% ) (83.12%)
17,484,290 branch-misses:u # 1.38% of all branches ( +- 0.12% ) (83.55%)
2.84176 +- 0.00222 seconds time elapsed ( +- 0.08% )
$
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20201012070214.2074921-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-12 15:02:09 +08:00
|
|
|
int bench_inject_build_id(int argc, const char **argv);
|
perf bench: Add benchmark for evlist open/close operations
This new benchmark finds the total time that is taken to open, mmap,
enable, disable, munmap, close an evlist (time taken for new,
create_maps, config, delete is not counted in).
The evlist can be configured as in perf-record using the
-a,-C,-e,-u,--per-thread,-t,-p options.
The events can be duplicated in the evlist to quickly test performance
with many events using the -n options.
Furthermore, also the number of iterations used to calculate the
statistics is customizable.
Examples:
- Open one dummy event system-wide:
$ sudo ./perf bench internals evlist-open-close
Number of cpus: 4
Number of threads: 1
Number of events: 1 (4 fds)
Number of iterations: 100
Average open-close took: 613.870 usec (+- 32.852 usec)
- Open the group '{cs,cycles}' on CPU 0
$ sudo ./perf bench internals evlist-open-close -e '{cs,cycles}' -C 0
Number of cpus: 1
Number of threads: 1
Number of events: 2 (2 fds)
Number of iterations: 100
Average open-close took: 8503.220 usec (+- 252.652 usec)
- Open 10 'cycles' events for user 0, calculate average over 100 runs
$ sudo ./perf bench internals evlist-open-close -e cycles -n 10 -u 0 -i 100
Number of cpus: 4
Number of threads: 328
Number of events: 10 (13120 fds)
Number of iterations: 100
Average open-close took: 180043.140 usec (+- 2295.889 usec)
Committer notes:
Replaced a deprecated bzero() call with designated initialized zeroing.
Added some missing evlist allocation checks, one noted by Riccardo on
the mailing list.
Minor cosmetic changes (sent in private).
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210809201101.277594-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 04:11:02 +08:00
|
|
|
int bench_evlist_open_close(int argc, const char **argv);
|
2022-05-05 23:57:45 +08:00
|
|
|
int bench_breakpoint_thread(int argc, const char **argv);
|
|
|
|
int bench_breakpoint_enable(int argc, const char **argv);
|
perf bench: Add epoll parallel epoll_wait benchmark
This program benchmarks concurrent epoll_wait(2) for file descriptors
that are monitored with with EPOLLIN along various semantics, by a
single epoll instance. Such conditions can be found when using
single/combined or multiple queuing when load balancing.
Each thread has a number of private, nonblocking file descriptors,
referred to as fdmap. A writer thread will constantly be writing to the
fdmaps of all threads, minimizing each threads's chances of epoll_wait
not finding any ready read events and blocking as this is not what we
want to stress. Full details in the start of the C file.
Committer testing:
# perf bench
Usage:
perf bench [<common options>] <collection> <benchmark> [<options>]
# List of all available benchmark collections:
sched: Scheduler and IPC benchmarks
mem: Memory access benchmarks
numa: NUMA scheduling and MM benchmarks
futex: Futex stressing benchmarks
epoll: Epoll stressing benchmarks
all: All benchmarks
# perf bench epoll
# List of available benchmarks for collection 'epoll':
wait: Benchmark epoll concurrent epoll_waits
all: Run all futex benchmarks
# perf bench epoll wait
# Running 'epoll/wait' benchmark:
Run summary [PID 19295]: 3 threads monitoring on 64 file-descriptors for 8 secs.
[thread 0] fdmap: 0xdaa650 ... 0xdaa74c [ 328241 ops/sec ]
[thread 1] fdmap: 0xdaa900 ... 0xdaa9fc [ 351695 ops/sec ]
[thread 2] fdmap: 0xdaabb0 ... 0xdaacac [ 381423 ops/sec ]
Averaged 353786 operations/sec (+- 4.35%), total secs = 8
#
Committer notes:
Fix the build on debian:experimental-x-mips, debian:experimental-x-mipsel
and others:
CC /tmp/build/perf/bench/epoll-wait.o
bench/epoll-wait.c: In function 'writerfn':
bench/epoll-wait.c:399:12: error: format '%ld' expects argument of type 'long int', but argument 2 has type 'size_t' {aka 'unsigned int'} [-Werror=format=]
printinfo("exiting writer-thread (total full-loops: %ld)\n", iter);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~
bench/epoll-wait.c:86:31: note: in definition of macro 'printinfo'
do { if (__verbose) { printf(fmt, ## arg); fflush(stdout); } } while (0)
^~~
cc1: all warnings being treated as errors
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com> <jbaron@akamai.com>
Link: http://lkml.kernel.org/r/20181106152226.20883-2-dave@stgolabs.net
Link: http://lkml.kernel.org/r/20181106182349.thdkpvshkna5vd7o@linux-r8p5>
[ Applied above fixup as per Davidlohr's request ]
[ Use inttypes.h to print rlim_t fields, fixing the build on Alpine Linux / musl libc ]
[ Check if eventfd() is available, i.e. if HAVE_EVENTFD is defined ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-06 23:22:25 +08:00
|
|
|
|
2009-11-10 19:50:53 +08:00
|
|
|
#define BENCH_FORMAT_DEFAULT_STR "default"
|
|
|
|
#define BENCH_FORMAT_DEFAULT 0
|
|
|
|
#define BENCH_FORMAT_SIMPLE_STR "simple"
|
|
|
|
#define BENCH_FORMAT_SIMPLE 1
|
2009-11-10 07:19:59 +08:00
|
|
|
|
2009-11-10 19:50:53 +08:00
|
|
|
#define BENCH_FORMAT_UNKNOWN -1
|
2009-11-10 07:19:59 +08:00
|
|
|
|
|
|
|
extern int bench_format;
|
2014-06-17 02:14:19 +08:00
|
|
|
extern unsigned int bench_repeat;
|
2009-11-10 07:19:59 +08:00
|
|
|
|
2018-11-10 05:07:19 +08:00
|
|
|
#ifndef HAVE_PTHREAD_ATTR_SETAFFINITY_NP
|
|
|
|
#include <pthread.h>
|
|
|
|
#include <linux/compiler.h>
|
|
|
|
static inline int pthread_attr_setaffinity_np(pthread_attr_t *attr __maybe_unused,
|
|
|
|
size_t cpusetsize __maybe_unused,
|
|
|
|
cpu_set_t *cpuset __maybe_unused)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2009-11-05 08:31:31 +08:00
|
|
|
#endif
|