NAME
perf-config - Get and set variables in a configuration file.SYNOPSIS
perf config [<file-option>] [section.name[=value] ...] or perf config [<file-option>] -l | --list
DESCRIPTION
You can manage variables in a configuration file with this command.OPTIONS
-l, --listShow current config variables, name and value,
for all sections.
For writing and reading options: write to user
$HOME/.perfconfig file or read it.
For writing and reading options: write to
system-wide $(sysconfdir)/perfconfig or read it.
CONFIGURATION FILE
The perf configuration file contains many variables to change various aspects of each of its tools, including output, disk usage, etc. The $HOME/.perfconfig file is used to store a per-user configuration. The file $(sysconfdir)/perfconfig can be used to store a system-wide default configuration.Syntax
The file consist of sections. A section starts with its name surrounded by square brackets and continues till the next section begins. Each variable must be in a section, and have the form name = value, for example:[section] name1 = value1 name2 = value2
Example
Given a $HOME/.perfconfig like this:[colors] # Color variables top = red, default medium = green, default normal = lightgray, default selected = white, lightgray jump_arrows = blue, default addr = magenta, default root = white, blue
[tui] # Defaults if linked with libslang report = on annotate = on top = on
[buildid] # Default, disable using /dev/null dir = ~/.debug
[annotate] # Defaults hide_src_code = false use_offset = true jump_arrows = true show_nr_jumps = false
[help] # Format can be man, info, web or html format = man autocorrect = 0
[ui] show-headers = true
[call-graph] # fp (framepointer), dwarf record-mode = fp print-type = graph order = caller sort-key = function
[report] # Defaults sort_order = comm,dso,symbol percent-limit = 0 queue-size = 0 children = true group = true skip-empty = true
[llvm] dump-obj = true clang-opt = -g
% perf config annotate.hide_src_code=true
% perf config ui.show-headers=false kmem.default=slab
% perf config --user report.sort-order=srcline
% perf config --system colors.selected=yellow,green
% perf config call-graph.record-mode
% perf config report.queue-size call-graph.order report.children
% perf config --user call-graph.sort-order
% perf config --system buildid.dir
Variables
colors.*The variables for customizing the colors used
in the output for the report, top and annotate in the
TUI. They should specify the foreground and background colors, separated by a
comma, for example:
medium = green, lightgray
If you want to use the color configured for you terminal, just leave it as 'default', for example:
medium = default, lightgray
Available colors: red, yellow, green, cyan, gray, black, blue, white, default, magenta, lightgray
top means a overhead percentage which
is more than 5%. And values of this variable specify percentage colors. Basic
key values are foreground-color red and background-color
default.
medium means a overhead percentage
which has more than 0.5%. Default values are green and
default.
normal means the rest of overhead
percentages except top, medium, selected. Default values
are lightgray and default.
This selects the colors for the current entry
in a list of entries from sub-commands (top, report, annotate). Default values
are black and lightgray.
Colors for jump arrows on assembly code
listings such as jns, jmp, jane, etc. Default values are
blue, default.
This selects colors for addresses from
annotate. Default values are magenta, default.
Colors for headers in the output of a
sub-commands (top, report). Default values are white,
blue.
Sets a timeout (in milliseconds) for parsing
/proc/<pid>/maps files. Can be overridden by the --proc-map-timeout
option on supported subcommands. The default timeout is 500ms.
Subcommands that can be configured here are
top, report and annotate. These values are booleans, for
example:
[tui] top = true
will make the TUI be the default for the 'top' subcommand. Those will be available if the required libs were detected at tool build time.
Each executable and shared library in modern
distributions comes with a content based identifier that, if available, will
be inserted in a perf.data file header to, at analysis time find what
is needed to do symbol resolution, code annotation, etc.
The recording tools also stores a hard link or copy in a per-user directory, $HOME/.debug/, of binaries, shared libraries, /proc/kallsyms and /proc/kcore files to be used at analysis time.
The buildid.dir variable can be used to either change this directory cache location, or to disable it altogether. If you want to disable it, set buildid.dir to /dev/null. The default is $HOME/.debug
buildid-cache.debuginfod=URLs Specify
debuginfod URLs to be used when retrieving perf.data binaries, it follows the
same syntax as the DEBUGINFOD_URLS variable, like:
buildid-cache.debuginfod=http://192.168.122.174:8002
These are in control of addresses, jump
function, source code in lines of assembly code from a specific program.
annotate.disassembler_style: Use this to change the default disassembler style to some other value supported by binutils, such as "intel", see the '-M' option help in the 'objdump' man page.
If a program which is analyzed has source
code, this option lets annotate print a list of assembly code with the
source code. For example, let’s see a part of a program.
There’re four lines. If this option is true, they can be printed
without source code from a program as below.
│ push %rbp │ mov %rsp,%rbp │ sub $0x10,%rsp │ mov (%rdi),%rdx
But if this option is 'false', source code of the part can be also printed as below. Default is 'false'.
│ struct rb_node *rb_next(const struct rb_node *node) │ { │ push %rbp │ mov %rsp,%rbp │ sub $0x10,%rsp │ struct rb_node *parent; │ │ if (RB_EMPTY_NODE(node)) │ mov (%rdi),%rdx │ return n;
This option works with tui, stdio2 browsers.
Basing on a first address of a loaded
function, offset can be used. Instead of using original addresses of assembly
code, addresses subtracted from a base address can be printed. Let’s
illustrate an example. If a base address is 0XFFFFFFFF81624d50 as below,
ffffffff81624d50 <load0>
an address on assembly code has a specific absolute address as below
ffffffff816250b8:│ mov 0x8(%r14),%rdi
but if use_offset is 'true', an address subtracted from a base address is printed. Default is true. This option is only applied to TUI.
368:│ mov 0x8(%r14),%rdi
This option works with tui, stdio2 browsers.
There can be jump instruction among assembly
code. Depending on a boolean value of jump_arrows, arrows can be printed or
not which represent where do the instruction jump into as below.
│ ┌──jmp 1333 │ │ xchg %ax,%ax │1330:│ mov %r15,%r10 │1333:└─→cmp %r15,%r14
If jump_arrow is 'false', the arrows isn't printed as below. Default is 'false'.
│ ↓ jmp 1333 │ xchg %ax,%ax │1330: mov %r15,%r10 │1333: cmp %r15,%r14
This option works with tui browser.
When showing source code if this option is
true, line numbers are printed as below.
│1628 if (type & PERF_SAMPLE_IDENTIFIER) { │ ↓ jne 508 │1628 data->id = *array; │1629 array++; │1630 }
However if this option is 'false', they aren't printed as below. Default is 'false'.
│ if (type & PERF_SAMPLE_IDENTIFIER) { │ ↓ jne 508 │ data->id = *array; │ array++; │ }
This option works with tui, stdio2 browsers.
Let’s see a part of assembly code.
│1382: movb $0x1,-0x270(%rbp)
If use this, the number of branches jumping to that address can be printed as below. Default is 'false'.
│1 1382: movb $0x1,-0x270(%rbp)
This option works with tui, stdio2 browsers.
To compare two records on an instruction base,
with this option provided, display total number of samples that belong to a
line in assembly code. If this option is true, total periods are
printed instead of percent values as below.
302 │ mov %eax,%eax
But if this option is 'false', percent values for overhead are printed i.e. Default is 'false'.
99.93 │ mov %eax,%eax
This option works with tui, stdio2, stdio browsers.
By default perf annotate shows percentage of
samples. This option can be used to print absolute number of samples. Ex, when
set as false:
Percent│ 74.03 │ mov %fs:0x28,%rax
When set as true:
Samples│ 6 │ mov %fs:0x28,%rax
This option works with tui, stdio2, stdio browsers.
Default is 1, meaning just jump targets
will have offsets show right beside the instruction. When set to 2
call instructions will also have its offsets shown, 3 or higher will
show offsets for all instructions.
This option works with tui, stdio2 browsers.
Demangle symbol names to human readable form.
Default is true.
Demangle kernel symbol names to human readable
form. Default is true.
This option control the way to calculate
overhead of filtered entries - that means the value of this option is
effective only if there’s a filter (by comm, dso or symbol name).
Suppose a following example:
Overhead Symbols ........ ....... 33.33% foo 33.33% bar 33.33% baz
This is an original overhead and we'll filter out the first 'foo' entry. The value of 'relative' would increase the overhead of 'bar' and 'baz' to 50.00% for each, while 'absolute' would show their current overhead (33.33%).
This option controls display of column headers
(like Overhead and Symbol) in report and top. If
this option is false, they are hidden. This option is only applied to
TUI.
The following controls the handling of
call-graphs (obtained via the -g/--call-graph options).
The mode for user space can be fp
(frame pointer), dwarf and lbr. The value dwarf is
effective only if libunwind (or a recent version of libdw) is present on the
system; the value lbr only works for certain cpus. The method for
kernel space is controlled not by this option but by the kernel config
(CONFIG_UNWINDER_*).
The size of stack to dump in order to do
post-unwinding. Default is 8192 (byte). When using dwarf into record-mode, the
default size will be used if omitted.
The print-types can be graph (graph absolute),
fractal (graph relative), flat and folded. This option controls a way to show
overhead for each callchain entry. Suppose a following example.
Overhead Symbols ........ ....... 40.00% foo | ---foo | |--50.00%--bar | main | --50.00%--baz main
This output is a 'fractal' format. The 'foo' came from 'bar' and 'baz' exactly half and half so 'fractal' shows 50.00% for each (meaning that it assumes 100% total overhead of 'foo').
The 'graph' uses absolute overhead value of 'foo' as total so each of 'bar' and 'baz' callchain will have 20.00% of overhead. If 'flat' is used, single column and linear exposure of call chains. 'folded' mean call chains are displayed in a line, separated by semicolons.
This option controls print order of
callchains. The default is callee which means callee is printed at top
and then followed by its caller and so on. The caller prints it in
reverse order.
If this option is not set and report.children or top.children is set to true (or the equivalent command line option is given), the default value of this option is changed to 'caller' for the execution of 'perf report' or 'perf top'. Other commands will still default to 'callee'.
The callchains are merged if they contain same
information. The sort-key option determines a way to compare the callchains. A
value of sort-key can be function or address. The default
is function.
When there’re many callchains
it’d print tons of lines. So perf omits small callchains under a
certain overhead (threshold) and this option control the threshold. Default is
0.5 (%). The overhead is calculated by value depends on
call-graph.print-type.
This is a maximum number of lines of callchain
printed for a single histogram entry. Default is 0 which means no
limitation.
Allows changing the default sort order from
"comm,dso,symbol" to some other default, for instance
"sym,dso" may be more fitting for kernel developers.
This one is mostly the same as
call-graph.threshold but works for histogram entries. Entries having an
overhead lower than this percentage will not be printed. Default is 0.
If percent-limit is 10, only entries which have more than 10% of
overhead will be printed.
This option sets up the maximum allocation
size of the internal event queue for ordering events. Default is 0, meaning no
limit.
Children means functions called from
another function. If this option is true, perf report cumulates
callchains of children and show (accumulated) total overhead as well as
Self overhead. Please refer to the perf report manual. The
default is true.
This option is to show event group information
together. Example output with this turned on, notice that there is one column
per event in the group, ref-cycles and cycles:
# group: {ref-cycles,cycles} # ======== # # Samples: 7K of event 'anon group { ref-cycles, cycles }' # Event count (approx.): 6876107743 # # Overhead Command Shared Object Symbol # ................ ....... ................. ................... # 99.84% 99.76% noploop noploop [.] main 0.07% 0.00% noploop ld-2.15.so [.] strcmp 0.03% 0.00% noploop [kernel.kallsyms] [k] timerqueue_del
This option can change default stat behavior
with empty results. If it’s set true, perf report --stat will
not show 0 stats.
Same as report.children. So if it is
enabled, the output of top command will have Children overhead
column as well as Self overhead column by default. The default is
true.
This is identical to
call-graph.record-mode, except it is applicable only for top
subcommand. This option ONLY setup the unwind method. To enable perf
top to actually use it, the command line option -g must be
specified.
This option can assign a tool to view manual
pages when help subcommand was invoked. Supported tools are man,
woman (with emacs client) and konqueror. Default is man.
New man viewer tool can be also added using 'man.<tool>.cmd' or use different path using 'man.<tool>.path' config option.
When the subcommand is run on stdio, determine
whether it uses pager or not based on this value. Default is
unspecified.
This option decides which allocator is to be
analyzed if neither --slab nor --page option is used. Default is
slab.
This option can be cache,
no-cache, skip or mmap. cache is to post-process
data and save/update the binaries into the build-id cache (in ~/.debug). This
is the default. But if this option is no-cache, it will not update the
build-id cache. skip skips post-processing and does not update the
cache. mmap skips post-processing and reads build-ids from MMAP
events.
This is identical to
call-graph.record-mode, except it is applicable only for record
subcommand. This option ONLY setup the unwind method. To enable perf
record to actually use it, the command line option -g must be
specified.
Use n control blocks in asynchronous
(Posix AIO) trace writing mode ( n default: 1, max: 4).
Specify debuginfod URL to be used when
cacheing perf.data binaries, it follows the same syntax as the DEBUGINFOD_URLS
variable, like:
http://192.168.122.174:8002
If the URLs is 'system', the value of DEBUGINFOD_URLS system environment variable is used.
This option sets the number of columns to sort
the result. The default is 0, which means sorting by baseline. Setting it to 1
will sort the result by delta (or other compute method selected).
This options sets the method for computing the
diff result. Possible values are delta, delta-abs, ratio
and wdiff. Default is delta.
Allows adding a set of events to add to the
ones specified by the user, or use as a default one if none was specified. The
initial use case is to add augmented_raw_syscalls.o to activate the perf
trace logic that looks for syscall pointer contents after the normal
tracepoint payload.
Number of columns to align the argument list,
default is 70, use 40 for the strace default, zero to no alignment.
Do not follow children threads.
Should syscall argument names be printed? If
not then trace.show_zeros will be set.
Show syscall duration.
If set to yes will show common string
prefixes in tables. The default is to remove the common prefix in things like
"MAP_SHARED", showing just "SHARED".
Show syscall start timestamp.
Do not suppress syscall arguments that are
equal to zero.
Use "libtraceevent" to use that
library to augment the tracepoint arguments, "libbeauty", the
default, to use the same argument beautifiers used in the strace-like
sys_enter+sys_exit lines.
Can be used to select the default tracer when
neither -G nor -F option is not specified. Possible values are function
and function_graph.
Path to clang. If omit, search it from
$PATH.
Cmdline template. Below lines show its default
value. Environment variable is used to pass options. "$CLANG_EXEC -D
KERNEL -D NR_CPUS=$NR_CPUS "\
"-DLINUX_VERSION_CODE=$LINUX_VERSION_CODE " \ "$CLANG_OPTIONS
$PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \ "-Wno-unused-value
-Wno-pointer-sign " \ "-working-directory $WORKING_DIR " \
"-c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o -
$LLVM_OPTIONS_PIPE"
Options passed to clang.
kbuild directory. If not set, use
/lib/modules/uname -r/build. If set to "" deliberately, skip kernel
header auto-detector.
Options passed to make when detecting
kernel header options.
Enable perf dump BPF object files compiled by
LLVM.
Options passed to llc.
Define how many ns worth of time to show
around samples in perf report sample context browser.
Any option defines a script that is added to
the scripts menu in the interactive perf browser and whose output is
displayed. The name of the option is the name, the value is a script command
line. The script gets the same options passed as a full perf script, in
particular -i perfdata file, --cpu, --tid
Limit the size of ordered_events queue, so we
could control allocation size of perf data files without proper finished round
events.
(boolean) Change the default for
"--big-num". To make "--no-big-num" the default, set
"stat.big-num=false".
If set, Intel PT decoder will set the mispred
flag on all branches.
If set and non-zero, the maximum number of
unconditional branches decoded without consuming any trace packets. If the
maximum is exceeded there will be a "Never-ending loop" error. The
default is 100000.
s390 only. The directory to save the auxiliary
trace buffer can be changed using this option. Ex, auxtrace.dumpdir=/tmp. If
the directory does not exist or has the wrong file type, the current directory
is used.
Log size in bytes to output when using the
option --itrace=d+e Refer itrace option of perf-script(1) or
perf-report(1). The default is 16384.
Base path for daemon data. All sessions data
are stored under this path.
Defines new record session for daemon. The
value is record’s command line without the record keyword.
SEE ALSO
perf(1)2024-06-21 | perf |