In this section we describe the configuration and control of the profiling system with opcontrol in more depth. The opcontrol script has a default setup, but you can alter this with the options given below. In particular, if your hardware supports performance counters, you can configure them. There are a number of counters (for example, counter 0 and counter 1 on the Pentium III). Each of these counters can be programmed with an event to count, such as cache misses or MMX operations. The event chosen for each counter is reflected in the profile data collected by OProfile: functions and binaries at the top of the profiles reflect that most of the chosen events happened within that code.
Additionally, each counter has a "count" value: this corresponds to how detailed the profile is. The lower the value, the more frequently profile samples are taken. A counter can choose to sample only kernel code, user-space code, or both (both is the default). Finally, some events have a "unit mask" - this is a value that further restricts the types of event that are counted. The event types and unit masks for your CPU are listed by opcontrol --list-events.
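To make the effect of the count value concrete, the sketch below estimates how many samples per second a counter would produce. The function name samples_per_second is hypothetical (it is not part of OProfile), and the estimate assumes an event that fires once per clock cycle on a CPU that is never halted; real rates vary with the event chosen and the workload.

```shell
# Hypothetical helper, not part of OProfile: estimate the sample rate
# for an event assumed to fire once per clock cycle.
# Usage: samples_per_second <clock_hz> <count>
samples_per_second() {
    clock_hz=$1
    count=$2
    # One sample is taken every <count> events (integer division).
    echo $((clock_hz / count))
}

# An 800MHz clock with a count of 400000, then with a count of 100000.
samples_per_second 800000000 400000   # prints 2000
samples_per_second 800000000 100000   # prints 8000
```

Lowering the count from 400000 to 100000 quadruples the estimated sample rate, which gives a more detailed profile at the cost of more profiling overhead.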
The opcontrol script provides the following actions:
--init
Loads the OProfile module if required and makes the OProfile driver interface available.
--setup
Followed by a list of arguments for profiling setup. The list of arguments
is saved in /root/.oprofile/daemonrc.
Giving this option is not necessary; you can pass one of the setup
options directly, e.g. opcontrol --no-vmlinux.
--status
Show configuration information.
--start-daemon
Start the oprofile daemon without starting actual profiling. The profiling
can then be started using --start. This is useful for avoiding
measuring the cost of daemon startup, as --start is a simple
write to a file in oprofilefs. Not available in 2.2/2.4 kernels.
--start
Start data collection with either the arguments provided by --setup
or the information saved in /root/.oprofile/daemonrc. Additionally
specifying --verbose makes the daemon generate lots of debug data
whilst it is running.
--dump
Force a flush of the collected profiling data to the daemon.
--stop
Stop data collection (this separate step is not possible with 2.2 or 2.4 kernels).
--shutdown
Stop data collection and kill the daemon.
--reset
Clears out data from current session, but leaves saved sessions.
--save=session_name
Save data from current session to session_name.
--deinit
Shuts down the daemon and unloads the OProfile module and oprofilefs.
--list-events
List event types and unit masks.
--help
Generate usage messages.
There are a number of possible settings, of which only
--vmlinux (or --no-vmlinux)
is required. These settings are stored in ~/.oprofile/daemonrc.
--buffer-size=num
Number of samples in the kernel buffer. When using a 2.6 kernel, the buffer watershed needs to be adjusted when changing this value.
--buffer-watershed=num
Set the kernel buffer watershed to num samples (2.6 only). When only buffer-size - buffer-watershed free entries remain in the kernel buffer, data is flushed to the daemon. The most useful values are in the range [0.25 - 0.5] * buffer-size.
--cpu-buffer-size=num
Number of samples in the kernel per-CPU buffer (2.6 only). If you profile at a high rate, increasing this can help when the log file shows an excessive count of samples lost to CPU buffer overflow.
--event=[eventspec]
Use the given performance counter event to profile. See Section 1.2, “Specifying performance counter events” below.
--session-dir=dir_path
Create/use sample database out of directory dir_path instead of
the default location (/var/lib/oprofile).
--separate=[none,lib,kernel,thread,cpu,all]
By default, every profile is stored in a single file. Thus, for example,
samples in the C library are all accredited to the /lib/libc.o
profile. However, you can choose to create separate sample files by specifying
one of the options below.
| none | No profile separation (default) |
| lib | Create per-application profiles for libraries |
| kernel | Create per-application profiles for the kernel and kernel modules |
| thread | Create profiles for each thread and each task |
| cpu | Create profiles for each CPU |
| all | All of the above options |
Note that --separate=kernel also turns on --separate=lib.
When using --separate=kernel, samples in hardware interrupts, soft-irqs, or other
asynchronous kernel contexts are credited to the task currently running. This means you will see
seemingly nonsense profiles such as /bin/bash showing samples for the PPP modules,
etc.
On 2.2/2.4 kernels, only kernel threads already started when profiling begins are correctly profiled; samples from newly started kernel threads are credited to the vmlinux (kernel) profile.
Using --separate=thread creates a lot
of sample files if you leave OProfile running for a while; it's most
useful when used for short sessions, or when using image filtering.
--callgraph=#depth
Enable call-graph sample collection with a maximum depth. Use 0 to disable callgraph profiling. NOTE: Callgraph support is available on a limited number of platforms at this time; for example:
x86 with recent 2.6 kernel
ARM with recent 2.6 kernel
PowerPC with 2.6.17 kernel
--image=image,[images]|"all"
Image filtering. If you specify one or more absolute
paths to binaries, OProfile will only produce profile results for those
binary images. This is useful for restricting the sometimes voluminous
output you may get otherwise, especially with
--separate=thread. Note that if you are using
--separate=lib or
--separate=kernel and you specify an
application binary, the shared libraries and kernel code
are included. Specify the value
"all" to profile everything (the default).
--vmlinux=file
vmlinux kernel image.
--no-vmlinux
Use this when you don't have a kernel vmlinux file, and you don't want to profile the kernel. This still counts the total number of kernel samples, but can't give symbol-based results for the kernel or any modules.
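As an illustration of the --buffer-size and --buffer-watershed guidance above, the sketch below derives a watershed as a percentage of the buffer size. The function name buffer_options is hypothetical (not part of OProfile), and the buffer size of 131072 is an arbitrary example value, not a recommended or default setting.

```shell
# Hypothetical helper, not part of OProfile: print --buffer-size and
# --buffer-watershed options, with the watershed set to a percentage
# of the buffer size (25 to 50 matches the range suggested above).
# Usage: buffer_options <buffer_size> <percent>
buffer_options() {
    size=$1
    percent=$2
    watershed=$((size * percent / 100))
    echo "--buffer-size=$size --buffer-watershed=$watershed"
}

buffer_options 131072 25   # prints --buffer-size=131072 --buffer-watershed=32768
buffer_options 131072 50   # prints --buffer-size=131072 --buffer-watershed=65536
```

The printed options could then be passed to opcontrol together, so that the watershed is always recalculated whenever the buffer size changes.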
Here, we have a Pentium III running at 800MHz, and we want to look at where data memory references are happening most, and also get results for CPU time.
# opcontrol --event=CPU_CLK_UNHALTED:400000 --event=DATA_MEM_REFS:10000
# opcontrol --vmlinux=/boot/2.6.0/vmlinux
# opcontrol --start
Here, we have an Intel laptop without support for performance counters, running on 2.4 kernels.
# ophelp -r
CPU with RTC device
# opcontrol --vmlinux=/boot/2.4.13/vmlinux --event=RTC_INTERRUPTS:1024
# opcontrol --start
If we're running 2.6 kernels, we can use --start-daemon to avoid
the profiler startup affecting results.
# opcontrol --vmlinux=/boot/2.6.0/vmlinux
# opcontrol --start-daemon
# my_favourite_benchmark --init
# opcontrol --start ; my_favourite_benchmark --run ; opcontrol --stop
Here, we want to see a profile of the OProfile daemon itself, including when it was running inside the kernel driver, and its use of shared libraries.
# opcontrol --separate=kernel --vmlinux=/boot/2.6.0/vmlinux
# opcontrol --start
# my_favourite_stress_test --run
# opreport -l -p /lib/modules/2.6.0/kernel /usr/local/bin/oprofiled
It can often be useful to split up profiling data into several different time periods. For example, you may want to collect data on an application's startup separately from the normal runtime data. You can use the opcontrol --save command to do this. For example:
# opcontrol --save=blah
will create a sub-directory in $SESSION_DIR/samples containing the samples
up to that point (the current session's sample files are moved into this
directory). You can then pass this session name as a parameter to the post-profiling
analysis tools, to only get data up to the point you named the
session. If you do not want to save a session, you can do
rm -rf $SESSION_DIR/samples/sessionname or, for the
current session, opcontrol --reset.
The --event option to opcontrol
takes a specification that indicates how each
hardware performance counter should be set up. If you want to
revert to OProfile's default setting (--event
is strictly optional), use --event=default. Use of this
option overrides all previous event selections.
You can pass multiple event specifications. OProfile will allocate
hardware counters as necessary. Note that some combinations are not
allowed by the CPU; running opcontrol --list-events gives the details
of each event. The event specification is a colon-separated string
of the form name:count:unitmask:kernel:user as described in this table:
| name | The symbolic event name, e.g. CPU_CLK_UNHALTED |
| count | The counter reset value, e.g. 100000 |
| unitmask | The unit mask, as given in the events list: e.g. 0x0f; or a symbolic name as given by the first word of the description (only valid for unit masks having an "extra:" parameter) |
| kernel | Whether to profile kernel code |
| user | Whether to profile userspace code |
The last three values are optional; if you omit them (e.g. --event=DATA_MEM_REFS:30000),
they will be set to the default values (a unit mask of 0, and profiling both kernel and
userspace code). Note that some events require a unit mask.
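The defaulting rule for the optional fields can be sketched as follows. The function name expand_event_spec is hypothetical (not part of OProfile), and representing "profile this code" as 1 is an assumption based on the default event strings listed in this section. The sketch only fills in missing fields; it does not check that the event exists or that a unit mask of 0 is valid for it.

```shell
# Hypothetical helper, not part of OProfile: split an event
# specification of the form name:count:unitmask:kernel:user and fill in
# the defaults described above for the three optional fields.
# Usage: expand_event_spec <eventspec>
expand_event_spec() {
    # Split on colons; missing trailing fields are left empty.
    IFS=: read -r name count unitmask kernel user <<EOF
$1
EOF
    # Default: unit mask 0, profile kernel (1) and userspace (1) code.
    echo "$name:$count:${unitmask:-0}:${kernel:-1}:${user:-1}"
}

expand_event_spec DATA_MEM_REFS:30000             # prints DATA_MEM_REFS:30000:0:1:1
expand_event_spec CPU_CLK_UNHALTED:100000:0:1:0   # prints CPU_CLK_UNHALTED:100000:0:1:0
```

Writing out the full five-field form makes it explicit which unit mask and which kernel and user settings a shortened specification stands for.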
For the PowerPC platforms, all events specified must be in the same group; i.e., the group number
appended to the event name (e.g. <some-event-name>_GRP9) must be the same.
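The PowerPC group rule can be checked before any events are passed to opcontrol. The sketch below is a minimal illustration: the function name same_group and the event names used (SOME_EVENT_GRP9, OTHER_EVENT_GRP9, THIRD_EVENT_GRP2) are placeholders, not real OProfile identifiers, and the check assumes the group number is everything after the final _GRP in the name.

```shell
# Hypothetical helper, not part of OProfile: succeed only if every
# event name ends in the same _GRP<n> suffix.
# Usage: same_group <eventspec>...
same_group() {
    first=""
    for spec in "$@"; do
        name=${spec%%:*}              # drop :count and any later fields
        case $name in
            *_GRP[0-9]*) group=${name##*_GRP} ;;
            *) return 1 ;;            # no group suffix at all
        esac
        if [ -z "$first" ]; then
            first=$group              # remember the first group seen
        elif [ "$group" != "$first" ]; then
            return 1                  # a later event is in another group
        fi
    done
    return 0
}

# Placeholder event names, used only for illustration.
same_group SOME_EVENT_GRP9:10000 OTHER_EVENT_GRP9:10000 && echo "same group"
same_group SOME_EVENT_GRP9:10000 THIRD_EVENT_GRP2:10000 || echo "group mismatch"
```

The first call succeeds because both names end in _GRP9; the second fails because the groups differ.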
If OProfile is using RTC mode, and you want to alter the default counter value,
you can use something like --event=RTC_INTERRUPTS:2048. Note the last
three values here are ignored.
If OProfile is using timer-interrupt mode, there is no configuration possible.
The table below lists the events selected by default
(--event=default) for the various computer architectures:
| Processor | cpu_type | Default event |
| Alpha EV4 | alpha/ev4 | CYCLES:100000:0:1:1 |
| Alpha EV5 | alpha/ev5 | CYCLES:100000:0:1:1 |
| Alpha PCA56 | alpha/pca56 | CYCLES:100000:0:1:1 |
| Alpha EV6 | alpha/ev6 | CYCLES:100000:0:1:1 |
| Alpha EV67 | alpha/ev67 | CYCLES:100000:0:1:1 |
| ARM/XScale PMU1 | arm/xscale1 | CPU_CYCLES:100000:0:1:1 |
| ARM/XScale PMU2 | arm/xscale2 | CPU_CYCLES:100000:0:1:1 |
| ARM/MPCore | arm/mpcore | CPU_CYCLES:100000:0:1:1 |
| AVR32 | avr32 | CPU_CYCLES:100000:0:1:1 |
| Athlon | i386/athlon | CPU_CLK_UNHALTED:100000:0:1:1 |
| Pentium Pro | i386/ppro | CPU_CLK_UNHALTED:100000:0:1:1 |
| Pentium II | i386/pii | CPU_CLK_UNHALTED:100000:0:1:1 |
| Pentium III | i386/piii | CPU_CLK_UNHALTED:100000:0:1:1 |
| Pentium M (P6 core) | i386/p6_mobile | CPU_CLK_UNHALTED:100000:0:1:1 |
| Pentium 4 (non-HT) | i386/p4 | GLOBAL_POWER_EVENTS:100000:1:1:1 |
| Pentium 4 (HT) | i386/p4-ht | GLOBAL_POWER_EVENTS:100000:1:1:1 |
| Hammer | x86-64/hammer | CPU_CLK_UNHALTED:100000:0:1:1 |
| Family10h | x86-64/family10 | CPU_CLK_UNHALTED:100000:0:1:1 |
| Family11h | x86-64/family11h | CPU_CLK_UNHALTED:100000:0:1:1 |
| Itanium | ia64/itanium | CPU_CYCLES:100000:0:1:1 |
| Itanium 2 | ia64/itanium2 | CPU_CYCLES:100000:0:1:1 |
| TIMER_INT | timer | None selectable |
| IBM iseries | PowerPC 4/5/970 | CYCLES:10000:0:1:1 |
| IBM pseries | PowerPC 4/5/970/Cell | CYCLES:10000:0:1:1 |
| IBM s390 | timer | None selectable |
| IBM s390x | timer | None selectable |