Overview
OProfile is a system-wide profiler for Linux systems, capable of profiling
all running code at low overhead. OProfile is released under the GNU GPL.
For versions 0.9.7 and earlier, the profiler consists of a kernel driver and a daemon for collecting sample data.
In version 0.9.8, the kernel driver/daemon method of collecting sample data is deprecated in favor of profiling
with the Linux Kernel Performance Events Subsystem, which requires kernel version 2.6.31 or higher. Several
post-profiling tools are available for turning profile data into human-readable information.
OProfile uses the CPU's hardware performance counters to profile a wide variety
of low-level statistics; the same counters can also be used for basic
time-spent profiling. All code is profiled: hardware and software interrupt handlers, kernel modules, the kernel,
shared libraries, and applications.
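To illustrate the sampling idea behind the paragraph above: a statistical profiler records where the CPU was executing each time a counter overflows, and the share of samples attributed to a symbol approximates that symbol's share of the counted event. The sketch below aggregates such samples into a ranked list. It is a toy model, not OProfile's implementation; the `Sample` type, the `hit_list` function, and the sample data are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class Sample(NamedTuple):
    """One hypothetical sample: where the CPU was when a counter overflowed."""
    image: str   # binary, shared library, or kernel image name
    symbol: str  # function name within that image


def hit_list(samples: Iterable[Sample]) -> List[Tuple[Sample, int, float]]:
    """Return (location, sample count, percent of total), most-sampled first."""
    counts = Counter(samples)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(loc, n, 100.0 * n / total) for loc, n in counts.most_common()]


if __name__ == "__main__":
    # Made-up data for illustration only.
    demo = (
        [Sample("app", "compute")] * 6
        + [Sample("libc.so", "memcpy")] * 3
        + [Sample("vmlinux", "schedule")] * 1
    )
    for loc, n, pct in hit_list(demo):
        print(f"{n:6d} {pct:6.2f}% {loc.image:10s} {loc.symbol}")
```

Because the result is statistical, locations with only a handful of samples should be read with caution; a longer run or a higher sampling frequency gives more samples at the cost of more overhead.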
OProfile is currently in alpha status; however, it has proven stable over a large number
of differing configurations and is being used on machines ranging from laptops to
16-way NUMA-Q boxes. As always, there is no warranty.
Features
- Unobtrusive
-
No special recompilations, wrapper libraries or the like are necessary. Even debug symbols
(-g option to gcc) are not necessary unless you want to produce annotated source.
Kernel patches are usually unnecessary, except in cases where the running kernel may not yet support
some newer processor models.
- System-wide profiling
-
All code running on the system is profiled, enabling analysis of system performance. Note: Root
authority is required to do system-wide profiling.
- Single process profiling
-
Application developers will find the single process profiling feature very convenient since it
does not require root authority, and profile data is collected only for the specified process
(or command). This method has the added benefit of "following" fork/execs and collecting
profile information on those child processes as well. Note: This method of profiling requires
a kernel version of 2.6.31 or higher.
- Performance counter support
-
Enables collection of various low-level data and its association with particular sections
of code.
- Call-graph support
-
With an x86 or ARM 2.6 kernel, OProfile can provide gprof-style call-graph
profiling data.
- Low overhead
-
OProfile has a typical overhead of 1-8%, depending on sampling frequency and workload.
- Post-profile analysis
-
Profile data can be produced at function-level or instruction-level detail. Source trees
annotated with profile information can be created. A hit list of applications and functions
that take the most time across the whole system can be produced.
- System support
-
OProfile works across a range of CPUs, including the Intel range, AMD's Athlon and AMD64 processor ranges,
the Alpha, ARM, IBM PowerPC, and more.
OProfile will work against almost any 2.2, 2.4, or 2.6 kernel, and works on both UP and SMP
systems from desktops to the scariest NUMA-Q boxes. Note: As of version 0.9.8, only 2.6 kernels
are supported.
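The profiling modes and post-profiling steps listed above correspond to the command-line tools shipped with OProfile 0.9.8 and later (operf, opreport, opannotate). The sketch below only builds and prints argument lists for a typical session; it does not run anything, and actually running the commands assumes the tools are installed and on the PATH. The helper names and the path /path/to/my_app are hypothetical.

```python
import shlex
from typing import List, Optional, Sequence


def operf_command(
    target: Optional[Sequence[str]] = None,
    system_wide: bool = False,
    callgraph: bool = False,
) -> List[str]:
    """Build an operf invocation for one command or for the whole system."""
    if system_wide == (target is not None):
        raise ValueError("give either a target command or system_wide=True")
    cmd = ["operf"]
    if callgraph:
        cmd.append("--callgraph")    # also record call-graph data
    if system_wide:
        cmd.append("--system-wide")  # requires root authority
    else:
        cmd.extend(target)           # single process: no root needed
    return cmd


def report_commands(binary: str) -> List[List[str]]:
    """Build typical post-profiling commands for one profiled binary."""
    return [
        ["opreport"],                        # per-image summary (hit list)
        ["opreport", "--symbols", binary],   # function-level detail
        ["opannotate", "--source", binary],  # annotated source (needs gcc -g)
    ]


if __name__ == "__main__":
    session = [operf_command(["/path/to/my_app"], callgraph=True)]
    session += report_commands("/path/to/my_app")
    for cmd in session:
        print(shlex.join(cmd))
```

Building the commands as lists rather than shell strings keeps arguments with spaces intact if they are later passed to `subprocess.run`.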
Example reports
You can see what sort of output OProfile can produce with the example reports.
History
The early versions of OProfile were developed as part credit for an M.Sc. in Computer Science. The
basic principles of the design were inspired by Compaq's DCPI profiler.
It is a capital mistake to theorise before one has data. Insensibly one begins to twist facts to suit theories instead of theories to suit facts.
- Sherlock Holmes
2012/08/28