
PPC E500 v2 events

This is a list of the PPC E500 v2's performance counter event types. For details, see Chapter 7, "Performance Monitor", of the PowerPC e500 Core Complex Reference Manual, which can be downloaded from freescale.com.

Name | Description | Counters usable | Unit mask options
CPU_CLK Cycles all
COMPLETED_INSNS Completed Instructions (0, 1, or 2 per cycle) all
COMPLETED_OPS Completed Micro-ops (counts 2 for load/store w/update) all
INSTRUCTION_FETCHES Instruction fetches all
DECODED_OPS Micro-ops decoded all
COMPLETED_BRANCHES Branch Instructions completed all
COMPLETED_LOAD_OPS Load micro-ops completed all
COMPLETED_STORE_OPS Store micro-ops completed all
COMPLETION_REDIRECTS Number of completion buffer redirects all
BRANCHES_FINISHED Branches finished all
TAKEN_BRANCHES_FINISHED Taken branches finished all
BIFFED_BRANCHES_FINISHED Biffed branches finished all
BRANCHES_MISPREDICTED Branch instructions mispredicted due to direction, target, or IAB prediction all
BRANCHES_MISPREDICTED_DIRECTION Branches mispredicted due to direction prediction all
BTB_HITS Branches that hit in the BTB, or missed but are not taken all
DECODE_STALLED Cycles the instruction buffer was not empty, but 0 instructions decoded all
ISSUE_STALLED Cycles the issue buffer is not empty but 0 instructions issued all
BRANCH_ISSUE_STALLED Cycles the branch buffer is not empty but 0 instructions issued all
SRS0_SCHEDULE_STALLED Cycles SRS0 is not empty but 0 instructions scheduled all
SRS1_SCHEDULE_STALLED Cycles SRS1 is not empty but 0 instructions scheduled all
VRS_SCHEDULE_STALLED Cycles VRS is not empty but 0 instructions scheduled all
LRS_SCHEDULE_STALLED Cycles LRS is not empty but 0 instructions scheduled all
BRS_SCHEDULE_STALLED Cycles BRS is not empty but 0 instructions scheduled all
TOTAL_TRANSLATED Total load/store micro-ops translated. all
LOADS_TRANSLATED Number of cacheable L* or EVL* microops translated. (This includes microops from load-multiple, load-update, and load-context instructions.) all
STORES_TRANSLATED Number of cacheable ST* or EVST* microops translated. (This includes microops from store-multiple, store-update, and save-context instructions.) all
TOUCHES_TRANSLATED Number of cacheable DCBT and DCBTST instructions translated (L1 only). Does not count touches that are converted to nops, i.e. on exceptions, noncacheable accesses, or when the hid0[nopti] bit is set. all
CACHEOPS_TRANSLATED Number of dcba, dcbf, dcbst, and dcbz instructions translated (e500 traps on dcbi) all
CACHEINHIBITED_ACCESSES_TRANSLATED Number of cache inhibited accesses translated all
GUARDED_LOADS_TRANSLATED Number of guarded loads translated all
WRITETHROUGH_STORES_TRANSLATED Number of write-through stores translated all
MISALIGNED_ACCESSES_TRANSLATED Number of misaligned load or store accesses translated. all
TOTAL_ALLOCATED_DLFB Total allocated to dLFB all
LOADS_TRANSLATED_ALLOCATED_DLFB Loads translated and allocated to dLFB (Applies to same class of instructions as loads translated.) all
STORES_COMPLETED_ALLOCATED_DLFB Stores completed and allocated to dLFB (Applies to same class of instructions as stores translated.) all
TOUCHES_TRANSLATED_ALLOCATED_DLFB Touches translated and allocated to dLFB (Applies to same class of instructions as touches translated.) all
STORES_COMPLETED Number of cacheable ST* or EVST* microops completed. (Applies to the same class of instructions as stores translated.) all
DL1_LOCKS Number of cache lines locked in the dL1. (Counts a lock even if an overlock condition is encountered.) all
DL1_RELOADS This is historically used to determine dcache miss rate (along with loads/stores completed). This counts dL1 reloads for any reason. all
DL1_CASTOUTS dL1 castouts. Does not count castouts due to DCBF. all
DETECTED_REPLAYS Times detected replay condition - Load miss with dLFB full. all
LOAD_MISS_QUEUE_FULL_REPLAYS Load miss with load queue full. all
LOAD_GUARDED_MISS_NOT_LAST_REPLAYS Load guarded miss when the load is not yet at the bottom of the completion buffer. all
STORE_TRANSLATED_QUEUE_FULL_REPLAYS Translate a store when the StQ is full. all
ADDRESS_COLLISION_REPLAYS Address collision. all
DMMU_MISS_REPLAYS DMMU miss. all
DMMU_BUSY_REPLAYS DMMU busy. all
SECOND_PART_MISALIGNED_AFTER_MISS_REPLAYS Second part of misaligned access when first part missed in cache. all
LOAD_MISS_DLFB_FULL_CYCLES Cycles stalled on replay condition - Load miss with dLFB full. all
LOAD_MISS_QUEUE_FULL_CYCLES Cycles stalled on replay condition - Load miss with load queue full. all
LOAD_GUARDED_MISS_NOT_LAST_CYCLES Cycles stalled on replay condition - Load guarded miss when the load is not yet at the bottom of the completion buffer. all
STORE_TRANSLATED_QUEUE_FULL_CYCLES Cycles stalled on replay condition - Translate a store when the StQ is full. all
ADDRESS_COLLISION_CYCLES Cycles stalled on replay condition - Address collision. all
DMMU_MISS_CYCLES Cycles stalled on replay condition - DMMU miss. all
DMMU_BUSY_CYCLES Cycles stalled on replay condition - DMMU busy. all
SECOND_PART_MISALIGNED_AFTER_MISS_CYCLES Cycles stalled on replay condition - Second part of misaligned access when first part missed in cache. all
IL1_LOCKS Number of cache lines locked in the iL1. (Counts a lock even if an overlock condition is encountered.) all
IL1_FETCH_RELOADS Reloads due to demand fetch. This is historically used to determine icache miss rate (along with instructions completed). all
FETCHES Counts the number of fetches that write at least one instruction to the instruction buffer. (With instructions fetched, can be used to compute instructions per fetch.) all
IMMU_TLB4K_RELOADS iMMU TLB4K reloads all
IMMU_VSP_RELOADS iMMU VSP reloads all
DMMU_TLB4K_RELOADS dMMU TLB4K reloads all
DMMU_VSP_RELOADS dMMU VSP reloads all
L2MMU_MISSES Counts iTLB/dTLB error interrupts all
BIU_MASTER_REQUESTS Number of master transactions. (Number of master TSs.) all
BIU_MASTER_I_REQUESTS Number of master I-Side transactions. (Number of master I-Side TSs.) all
BIU_MASTER_D_REQUESTS Number of master D-Side transactions. (Number of master D-Side TSs.) all
BIU_MASTER_D_CASTOUT_REQUESTS Number of master D-Side non-program-demand castout transactions. This counts replacement pushes and snoop pushes. This does not count DCBF castouts. (Number of master D-side non-program-demand castout TSs.) all
BIU_MASTER_RETRIES Number of transactions which were initiated by this processor which were retried on the BIU interface. (Number of master ARTRYs.) all
SNOOP_REQUESTS Number of externally generated snoop requests. (Counts snoop TSs.) all
SNOOP_HITS Number of snoop hits on all D-side resources regardless of the cache state (modified, exclusive, or shared) all
SNOOP_PUSHES Number of snoop pushes from all D-side resources. (Counts snoop ARTRY/WOPs.) all
SNOOP_RETRIES Number of snoop requests retried. (Counts snoop ARTRYs.) all
PMC0_OVERFLOW Counts the number of times PMC0[32] transitioned from 1 to 0. all
PMC1_OVERFLOW Counts the number of times PMC1[32] transitioned from 1 to 0. all
PMC2_OVERFLOW Counts the number of times PMC2[32] transitioned from 1 to 0. all
PMC3_OVERFLOW Counts the number of times PMC3[32] transitioned from 1 to 0. all
INTERRUPTS Number of interrupts taken all
EXTERNAL_INTERRUPTS Number of external input interrupts taken all
CRITICAL_INTERRUPTS Number of critical input interrupts taken all
SC_TRAP_INTERRUPTS Number of system call and trap interrupts all
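
Several descriptions in the list mention ratios that can be derived from raw counts: dcache miss rate (DL1_RELOADS with loads/stores completed), icache miss rate (IL1_FETCH_RELOADS with instructions completed), and instructions per fetch (FETCHES with instructions fetched). The following Python sketch shows one way such ratios might be computed. It rests on assumptions: the pairing of event names (for example, using COMPLETED_LOAD_OPS plus COMPLETED_STORE_OPS as "loads/stores completed", and INSTRUCTION_FETCHES as "instructions fetched") is a guess from the descriptions, the function names are hypothetical, and the sample counts are made up for illustration rather than measured.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None if the denominator is zero."""
    return numerator / denominator if denominator else None


def derived_metrics(counts):
    """Compute a few ratios from a dict of raw event counts.

    The event pairings below are assumptions based on the event
    descriptions, not formulas taken from the reference manual.
    """
    # Assumed meaning of "loads/stores completed".
    loads_stores = counts["COMPLETED_LOAD_OPS"] + counts["COMPLETED_STORE_OPS"]
    return {
        # dL1 reloads per completed load/store micro-op.
        "dcache_miss_rate": ratio(counts["DL1_RELOADS"], loads_stores),
        # iL1 demand-fetch reloads per completed instruction.
        "icache_miss_rate": ratio(counts["IL1_FETCH_RELOADS"],
                                  counts["COMPLETED_INSNS"]),
        # Assumed meaning of "instructions fetched" per fetch.
        "insns_per_fetch": ratio(counts["INSTRUCTION_FETCHES"],
                                 counts["FETCHES"]),
    }


# Made-up counts, for illustration only.
sample_counts = {
    "COMPLETED_INSNS": 1_000_000,
    "COMPLETED_LOAD_OPS": 200_000,
    "COMPLETED_STORE_OPS": 100_000,
    "DL1_RELOADS": 3_000,
    "IL1_FETCH_RELOADS": 500,
    "INSTRUCTION_FETCHES": 1_200_000,
    "FETCHES": 400_000,
}

if __name__ == "__main__":
    for name, value in derived_metrics(sample_counts).items():
        print(f"{name}: {value}")
```

The list names only four counters (PMC0 to PMC3), so counts for all seven events used here would likely have to come from more than one profiling run of the same workload. Ratios built that way are estimates, not exact figures.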
2014/09/12