
Intel Phi (Knights Landing) Microarchitecture events

This is a list of all Intel Phi (Knights Landing) Microarchitecture performance counter event types. For details, see the Intel Xeon Phi(TM) Processor Performance Monitoring Reference and the Intel Architecture Optimization Reference Manual.

Name | Description | Counters usable | Unit mask options
CPU_CLK_UNHALTED Clock cycles when not halted all
UNHALTED_REFERENCE_CYCLES Unhalted reference cycles all 0x01: No unit mask
INST_RETIRED Number of instructions retired all
LLC_MISSES Last level cache demand requests from this core that missed the LLC all 0x41: No unit mask
LLC_REFS Last level cache demand requests from this core all 0x4f: No unit mask
BR_INST_RETIRED Number of branch instructions retired all
BR_MISS_PRED_RETIRED Number of mispredicted branches retired (precise) all
recycleq Counts the number of retired load or store micro-ops that get pushed into the Recycle Queue all 0x01: (name=ld_block_st_forward) Counts the number of times a retired load is blocked because its address partially overlaps with a store.
0x02: (name=ld_block_std_notready) Counts the number of times a retired load is blocked because its address overlaps with a store whose data is not ready.
0x04: (name=st_splits) Counts the number of retired stores that are cache line splits. Each split should be counted only once.
0x08: (name=ld_splits) Counts the number of retired loads that are cache line splits. Each split should be counted only once.
0x10: (name=lock) Counts all retired locked loads. Stores are excluded because counting them would double count.
0x20: (name=sta_full) Counts the store micro-ops retired that were pushed into the recycle queue because the store address buffer is full.
0x40: (name=any_ld) Counts any retired load that was pushed into the recycle queue for any reason.
0x80: (name=any_st) Counts any retired store that was pushed into the recycle queue for any reason.
mem_uops_retired Counts the number of memory micro-ops retired. all 0x01: (name=l1_miss_loads) Counts the number of load micro-ops retired that miss in L1 D cache.
0x02: (name=l2_hit_loads) Counts the number of load micro-ops retired that hit in the L2.
0x04: (name=l2_miss_loads) Counts the number of load micro-ops retired that miss in the L2.
0x08: (name=dtlb_miss_loads) Counts the number of load micro-ops retired that cause a DTLB miss.
0x10: (name=utlb_miss_loads) Counts the number of load micro-ops retired that caused micro TLB miss.
0x20: (name=hitm) Counts the loads retired that get the data from the other core in the same tile in M state.
0x40: (name=any_loads) Counts all the load micro-ops retired.
0x80: (name=any_stores) Counts all the store micro-ops retired.
page_walks Counts the number of core cycles for page walks all 0x01: (name=d_side_walks) Counts the total D-side page walks that are completed or started. The page walks started in the speculative path will also be counted.
0x01: (name=d_side_cycles) Counts the total number of core cycles for all the D-side page walks. The cycles for page walks started in speculative path will also be included.
0x02: (name=i_side_walks) Counts the total I-side page walks that are completed.
0x02: (name=i_side_cycles) Counts the total number of core cycles for all the I-side page walks. The cycles for page walks started in speculative path will also be included.
0x03: (name=walks) Counts the total page walks completed (I-side and D-side)
0x03: (name=cycles) Counts the total number of core cycles for all the page walks. The cycles for page walks started in speculative path will also be included.
l2_requests_reject Counts the number of MEC requests from the L2Q that reference a cache line and were rejected. all 0x00: (name=all) Counts the number of MEC requests from the L2Q that reference a cache line, excluding SW prefetches filling only to L2 cache and L1 evictions (automatically excludes L2HWP, UC, WC), that were rejected. Multiple repeated rejects should be counted multiple times.
core_reject_l2q Number of requests not accepted into the L2Q because of any L2 queue reject condition. all 0x00: (name=all) Counts the number of MEC requests that were not accepted into the L2Q because of any L2 queue reject condition. There is no concept of at-ret here. It might include requests due to instructions in the speculative path.
icache Instruction fetches all 0x03: (name=accesses) All instruction fetches including uncacheable
0x01: (name=hits) All instruction fetches that hit instruction cache
0x02: (name=misses) All instruction fetches that missed instruction cache (produced a memory request); counted only once, not once per outstanding cycle
fetch_stall Counts the number of core cycles the instruction fetch pipe was stalled all 0x01: (name=icache_fill_pending_cycles) Counts the number of core cycles the fetch stalls because of an icache miss. This is a cumulative count of core cycles the fetch stalled for all icache misses.
0x01: (name=icache_fill_pending_edge) Counts the number of times the fetch stalls because of an icache miss.
l2_requests L2 cache requests all 0x41: (name=miss) Counts the total number of L2 cache misses.
0x4f: (name=reference) Counts the total number of L2 cache references.
uops_retired Retired uops all 0x01: (name=ms) Counts the number of uops retired that are from complex flows issued by the micro-sequencer
0x10: (name=all) Counts the number of uops retired
0x20: (name=scalar_simd) Counts the number of scalar SSE, AVX, AVX2, AVX-512 micro-ops except for loads (memory-to-register mov-type micro ops), division, sqrt.
0x40: (name=packed_simd) Counts the number of packed SSE, AVX, AVX2, AVX-512 micro-ops (both floating point and integer) except for loads (memory-to-register mov-type micro-ops), packed byte and word multiplies.
machine_clears Counts the number of times that the machine clears at retire. all 0x01: (name=smc) Counts the number of times that the machine clears due to program modifying data within 1K of a recently fetched code page.
0x02: (name=memory_ordering) Counts the number of times the machine clears due to memory ordering hazards.
0x04: (name=fp_assist) Counts the number of floating-point operations retired that required microcode assists
0x08: (name=all) Counts all machine clears
br_inst_retired Counts the number of branch instructions retired all 0x00: (name=any) Counts the number of branch instructions retired
0x7e: (name=jcc) Counts the number of branch instructions retired that were conditional jumps.
0xfe: (name=taken_jcc) Counts the number of branch instructions retired that were conditional jumps and predicted taken.
0xf9: (name=call) Counts the number of near CALL branch instructions retired.
0xfd: (name=rel_call) Counts the number of near relative CALL branch instructions retired.
0xfb: (name=ind_call) Counts the number of near indirect CALL branch instructions retired.
0xf7: (name=return) Counts the number of near RET branch instructions retired.
0xeb: (name=non_return_ind) Counts the number of branch instructions retired that were near indirect CALL or near indirect JMP.
0xbf: (name=far_branch) Counts the number of far branch instructions retired.
br_misp_retired Counts the number of mispredicted branch instructions retired all 0x00: (name=any) All mispredicted branches
0x7e: (name=jcc) Number of mispredicted conditional branch instructions retired
0xfe: (name=taken_jcc) Number of mispredicted taken conditional branch instructions retired
0xf9: (name=call) Counts the number of mispredicted near CALL branch instructions retired.
0xfd: (name=rel_call) Counts the number of mispredicted near relative CALL branch instructions retired.
0xfb: (name=ind_call) Number of mispredicted indirect call branch instructions retired
0xf7: (name=return) Number of mispredicted return branch instructions retired
0xeb: (name=non_return_ind) Number of mispredicted non-return branch instructions retired
0xbf: (name=far_branch) Counts the number of mispredicted far branch instructions retired.
no_alloc_cycles Counts the number of core cycles when no micro-ops are allocated all 0x01: (name=rob_full) Counts the number of core cycles when no micro-ops are allocated and the ROB is full
0x02: (name=mispredicts) Counts the number of core cycles when no micro-ops are allocated and the alloc pipe is stalled waiting for a mispredicted branch to retire.
0x20: (name=rat_stall) Counts the number of core cycles when no micro-ops are allocated and a RAT stall (caused by reservation station full) is asserted.
0x7f: (name=all) Counts the total number of core cycles when no micro-ops are allocated for any reason.
rs_full_stall Counts the number of core cycles when the allocate stalls because the required RS is full. all 0x01: (name=mec) Counts the number of core cycles when allocation pipeline is stalled and is waiting for a free MEC reservation station entry.
0x1f: (name=all) Counts the total number of core cycles the Alloc pipeline is stalled when any one of the reservation stations is full.
cycles_div_busy Number of core cycles when divider is busy all 0x01: (name=all) Counts the number of core cycles when divider is busy, does not imply a stall waiting for the divider
baclears Counts the number of times Branch Target Buffer (BTB) prediction was corrected by a later branch predictor all 0x01: (name=all) Counts the number of times front-end resteers for any branch as a result of another branch handling mechanism in the front-end.
0x08: (name=return) Counts the number of times the front-end resteers for RET branches as a result of another branch handling mechanism in the front-end.
0x10: (name=cond) Counts the number of times the front-end resteers for conditional branches as a result of another branch handling mechanism in the front-end.
ms_decoded Microcode sequencer decode entrypoints all 0x01: (name=ms_entry) Counts the number of times the MSROM starts a flow of uops.
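
The unit mask entries in the table above follow a regular "0xNN: (name=...) description" pattern, so they can be read mechanically. The following is a minimal Python sketch, not part of any official tool: the names UnitMask, MASK_RE and parse_unit_masks are hypothetical helpers introduced here for illustration, and the sample text is copied from the icache rows (with the last description shortened). It parses the rows and checks one relationship visible in the table, namely that the icache "accesses" mask (0x03) is the bitwise OR of "hits" (0x01) and "misses" (0x02).

```python
import re
from typing import List, NamedTuple


class UnitMask(NamedTuple):
    """One named unit mask entry (hypothetical helper type)."""
    value: int
    name: str
    description: str


# Matches entries such as "0x01: (name=hits) All instruction fetches ...".
# Entries without a name, e.g. "0x01: No unit mask", are deliberately skipped.
MASK_RE = re.compile(r"(0x[0-9a-fA-F]+):\s*\(name=(\w+)\)\s*(.*)")


def parse_unit_masks(text: str) -> List[UnitMask]:
    """Return every named unit mask found in the given table rows."""
    masks = []
    for line in text.splitlines():
        match = MASK_RE.search(line)
        if match:
            value = int(match.group(1), 16)
            masks.append(UnitMask(value, match.group(2), match.group(3).strip()))
    return masks


# Sample rows copied from the icache event in the table above.
ICACHE_ROWS = """\
icache Instruction fetches all 0x03: (name=accesses) All instruction fetches including uncacheable
0x01: (name=hits) All instruction fetches that hit instruction cache
0x02: (name=misses) All instruction fetches that missed instruction cache
"""

icache = {m.name: m.value for m in parse_unit_masks(ICACHE_ROWS)}

# In the table, "accesses" combines the "hits" and "misses" bits.
assert icache["accesses"] == icache["hits"] | icache["misses"]
```

The same check works for page_walks, where the "walks" and "cycles" masks (0x03) combine the D-side (0x01) and I-side (0x02) bits. Whether other masks of an event may be OR-ed together depends on the profiling tool and the hardware, so that should be confirmed in the Intel reference before relying on it.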
2017/07/24