PMC.CORE(3)          FreeBSD Library Functions Manual          PMC.CORE(3)
pmc.core —
measurement events for Intel Core Solo and Core Duo family
CPUs
Performance Counters Library (libpmc, -lpmc)
Intel Core Solo and Core Duo CPUs contain PMCs conforming to version 1 of the
Intel performance measurement architecture.
These PMCs are documented in
Volume 3: System Programming
Guide, IA-32 Intel® Architecture Software
Developer's Manual, Order Number 253669-027US,
Intel Corporation, July
2008.
CPUs conforming to version 1 of the Intel performance measurement architecture
contain two programmable PMCs of class PMC_CLASS_IAP .
The PMCs are 40 bits wide and offer the following capabilities:
Capability           Support
PMC_CAP_CASCADE      No
PMC_CAP_EDGE         Yes
PMC_CAP_INTERRUPT    Yes
PMC_CAP_INVERT       Yes
PMC_CAP_READ         Yes
PMC_CAP_PRECISE      No
PMC_CAP_SYSTEM       Yes
PMC_CAP_TAGGING      No
PMC_CAP_THRESHOLD    Yes
PMC_CAP_USER         Yes
PMC_CAP_WRITE        Yes
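Because the counters are 40 bits wide, a raw counter value wraps after 2^40 events. The following is a minimal sketch of computing the number of events between two raw 40-bit readings. The helper name counter_delta40() is hypothetical and is not part of libpmc, and the sketch assumes the counter wrapped at most once between the two reads. Whether a given libpmc interface already widens values for the caller is not covered here.

```c
#include <stdint.h>

#define PMC_WIDTH_BITS	40
#define PMC_VALUE_MASK	((UINT64_C(1) << PMC_WIDTH_BITS) - 1)

/*
 * Events elapsed between two raw 40-bit counter readings.
 * Hypothetical helper: assumes at most one wrap-around between reads.
 */
uint64_t
counter_delta40(uint64_t before, uint64_t after)
{
	/* Unsigned subtraction wraps modulo 2^64; the mask reduces it to 2^40. */
	return ((after - before) & PMC_VALUE_MASK);
}
```

Masking after the subtraction gives the correct difference whether or not the counter wrapped, as long as fewer than 2^40 events occurred between the reads.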
Event specifiers for these PMCs support the following common qualifiers:
cmask=value
- Configure the PMC to increment only if the number of configured events
measured in a cycle is greater than or equal to
value.
edge
- Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers. If
specified, the counter will increment only once whenever a condition
becomes true, irrespective of the number of clocks during which the
condition remains true.
inv
- Invert the sense of comparison when the “cmask” qualifier is present,
making the counter increment when the number of events per cycle is less
than the value specified by the “cmask” qualifier.
os
- Configure the PMC to count events happening at processor privilege level
0.
usr
- Configure the PMC to count events occurring at privilege levels 1, 2 or
3.
If neither the “os” nor the “usr” qualifier is specified, the
default is to enable both.
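The interaction of the “cmask”, “inv” and “edge” qualifiers described above can be illustrated with a small software model that replays a trace of per-cycle event counts. This is only a sketch and not libpmc or hardware code: struct qual and model_count() are hypothetical names. Treating a cmask of 0 as "no threshold, add every event" is an assumption taken from Intel's general description of these counters, not from this page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the common qualifiers of one counter. */
struct qual {
	unsigned cmask;	/* 0 is treated as "no threshold" (assumption) */
	bool inv;	/* invert the cmask comparison */
	bool edge;	/* count false-to-true transitions only */
};

/*
 * Replay a trace of per-cycle event counts and return the value the
 * modeled counter would hold afterwards.
 */
uint64_t
model_count(const unsigned *events, size_t ncycles, struct qual q)
{
	uint64_t total = 0;
	bool prev = false;	/* condition state in the previous cycle */

	for (size_t i = 0; i < ncycles; i++) {
		unsigned n = events[i];
		bool cond;

		if (q.cmask == 0)
			cond = n > 0;		/* assumption: any event */
		else if (q.inv)
			cond = n < q.cmask;	/* inverted comparison */
		else
			cond = n >= q.cmask;	/* threshold comparison */

		if (q.edge) {
			/* Count once each time the condition becomes true. */
			if (cond && !prev)
				total++;
		} else if (q.cmask == 0) {
			total += n;		/* plain event counting */
		} else if (cond) {
			total++;		/* one count per qualifying cycle */
		}
		prev = cond;
	}
	return (total);
}
```

For the trace {0, 2, 3, 0, 1, 2, 2, 0}, this model yields 10 with no qualifiers, 4 with cmask=2, and 2 with cmask=2 plus edge, since the condition becomes true twice.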
Events that require core-specificity to be specified use an
additional qualifier “core=value”,
where argument value is one of:
all
- Measure event conditions on all cores.
this
- Measure event conditions on this core.
The default is “this”.
Events that require an agent qualifier to be specified use an
additional qualifier “agent=value”,
where argument value is one of:
this
- Measure events associated with this bus agent.
any
- Measure events caused by any bus agent.
The default is “this”.
Events that require a hardware prefetch qualifier to be specified
use an additional qualifier
“prefetch=value”,
where argument value is one of:
both
- Include all prefetches.
only
- Only count hardware prefetches.
exclude
- Exclude hardware prefetches.
The default is “both”.
Events that require a cache coherence qualifier to be specified
use an additional qualifier
“cachestate=value”,
where argument value contains one or more of the
following letters:
e
- Count cache lines in the exclusive state.
i
- Count cache lines in the invalid state.
m
- Count cache lines in the modified state.
s
- Count cache lines in the shared state.
The default is “eims”.
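The “core”, “agent”, “prefetch” and “cachestate” qualifiers and their defaults can be summarized with a small validating parser. This is a sketch for illustration only and is not the parser used by libpmc: struct spec_quals and parse_quals() are hypothetical names. It assumes a comma-separated specifier whose first field is the event name, accepts only lowercase values, and ignores qualifiers it does not know about (such as “os” or “cmask=value”).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the four event-specific qualifiers. */
struct spec_quals {
	char core[8];		/* "all" or "this" */
	char agent[8];		/* "this" or "any" */
	char prefetch[8];	/* "both", "only" or "exclude" */
	char cachestate[8];	/* one or more of the letters e, i, m, s */
};

static bool
one_of(const char *v, const char *const *allowed)
{
	for (; *allowed != NULL; allowed++)
		if (strcmp(v, *allowed) == 0)
			return (true);
	return (false);
}

/* Fill *q from spec, applying the documented defaults; false on bad values. */
bool
parse_quals(const char *spec, struct spec_quals *q)
{
	static const char *const cores[] = { "all", "this", NULL };
	static const char *const agents[] = { "this", "any", NULL };
	static const char *const prefs[] = { "both", "only", "exclude", NULL };
	char buf[128];
	char *tok, *next, *val;

	/* Defaults as stated in the text above. */
	strcpy(q->core, "this");
	strcpy(q->agent, "this");
	strcpy(q->prefetch, "both");
	strcpy(q->cachestate, "eims");

	if (strlen(spec) >= sizeof(buf))
		return (false);
	strcpy(buf, spec);

	/* The first comma-separated field is the event name; skip it. */
	for (tok = strchr(buf, ','); tok != NULL; tok = next) {
		tok++;
		next = strchr(tok, ',');
		if (next != NULL)
			*next = '\0';
		val = strchr(tok, '=');
		if (val == NULL)
			continue;	/* bare keyword such as "os" or "edge" */
		*val++ = '\0';

		if (strcmp(tok, "core") == 0) {
			if (!one_of(val, cores))
				return (false);
			strcpy(q->core, val);
		} else if (strcmp(tok, "agent") == 0) {
			if (!one_of(val, agents))
				return (false);
			strcpy(q->agent, val);
		} else if (strcmp(tok, "prefetch") == 0) {
			if (!one_of(val, prefs))
				return (false);
			strcpy(q->prefetch, val);
		} else if (strcmp(tok, "cachestate") == 0) {
			size_t n = strlen(val);

			if (n == 0 || n >= sizeof(q->cachestate) ||
			    strspn(val, "eims") != n)
				return (false);
			strcpy(q->cachestate, val);
		}
		/* Other key=value qualifiers (for example cmask=) are ignored. */
	}
	return (true);
}
```

With this sketch, the specifier "L2_LD,cachestate=ms,core=all" yields core "all" and cachestate "ms", while agent and prefetch keep their defaults of "this" and "both".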
The following event names are case insensitive. Whitespace, hyphens and
underscore characters in these names are ignored.
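The matching rule above (case-insensitive, with whitespace, hyphens and underscores ignored) can be sketched as a comparison function. This is an illustrative sketch, not the routine libpmc itself uses; event_name_equal() and ignored() are hypothetical names.

```c
#include <ctype.h>
#include <stdbool.h>

/* Characters that the matching rule ignores. */
static bool
ignored(unsigned char c)
{
	return (c == '-' || c == '_' || isspace(c));
}

/*
 * Compare two event names, ignoring case as well as whitespace,
 * hyphens and underscores.
 */
bool
event_name_equal(const char *a, const char *b)
{
	for (;;) {
		while (*a != '\0' && ignored((unsigned char)*a))
			a++;
		while (*b != '\0' && ignored((unsigned char)*b))
			b++;
		if (*a == '\0' || *b == '\0')
			return (*a == *b);	/* equal only if both ended */
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return (false);
		a++;
		b++;
	}
}
```

Under this rule "Br_Instr_Ret", "brinstrret" and "BR-INSTR-RET" all name the same event.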
Core PMCs support the following events:
BAClears
- (Event E6H, Umask 00H) The number of BAClear conditions asserted.
BTB_Misses
- (Event E2H, Umask 00H) The number of branches for which the branch target
buffer did not produce a prediction.
Br_BAC_Missp_Exec
- (Event 8AH, Umask 00H) The number of branch instructions executed that
were mispredicted at the front end.
Br_Bogus
- (Event E4H, Umask 00H) The number of bogus branches.
Br_Call_Exec
- (Event 92H, Umask 00H) The number of
CALL
instructions executed.
Br_Call_Missp_Exec
- (Event 93H, Umask 00H) The number of
CALL
instructions executed that were mispredicted.
Br_Cnd_Exec
- (Event 8BH, Umask 00H) The number of conditional branch instructions
executed.
Br_Cnd_Missp_Exec
- (Event 8CH, Umask 00H) The number of conditional branch instructions
executed that were mispredicted.
Br_Ind_Call_Exec
- (Event 94H, Umask 00H) The number of indirect
CALL
instructions executed.
Br_Ind_Exec
- (Event 8DH, Umask 00H) The number of indirect branches executed.
Br_Ind_Missp_Exec
- (Event 8EH, Umask 00H) The number of indirect branch instructions executed
that were mispredicted.
Br_Inst_Exec
- (Event 88H, Umask 00H) The number of branch instructions executed
including speculative branches.
Br_Instr_Decoded
- (Event E0H, Umask 00H) The number of branch instructions decoded.
Br_Instr_Ret
- (Event C4H, Umask 00H) (Alias “Branch Instruction Retired”)
The number of branch instructions retired. This is an architectural
performance event.
Br_MisPred_Ret
- (Event C5H, Umask 00H) (Alias “Branch Misses Retired”) The
number of mispredicted branch instructions retired. This is an
architectural performance event.
Br_MisPred_Taken_Ret
- (Event CAH, Umask 00H) The number of taken and mispredicted branches
retired.
Br_Missp_Exec
- (Event 89H, Umask 00H) The number of branch instructions executed and
mispredicted at execution including branches that were not predicted.
Br_Ret_BAC_Missp_Exec
- (Event 91H, Umask 00H) The number of return branch instructions that were
mispredicted at the front end.
Br_Ret_Exec
- (Event 8FH, Umask 00H) The number of return branch instructions
executed.
Br_Ret_Missp_Exec
- (Event 90H, Umask 00H) The number of return branch instructions executed
that were mispredicted.
Br_Taken_Ret
- (Event C9H, Umask 00H) The number of taken branches retired.
Bus_BNR_Clocks
- (Event 61H, Umask 00H) The number of external bus cycles while BNR (bus
not ready) was asserted.
Bus_DRDY_Clocks
[,agent=agent]
- (Event 62H, Umask 00H) The number of external bus cycles while DRDY was
asserted.
Bus_Data_Rcv
- (Event 64H, Umask 40H) The number of cycles during which the processor is
busy receiving data.
Bus_Locks_Clocks
[,core=core]
- (Event 63H) The number of external bus cycles while the bus lock signal
was asserted.
Bus_Not_In_Use
[,core=core]
- (Event 7DH) The number of cycles when there is no transaction from the
core.
Bus_Req_Outstanding
[,agent=agent] [,core=core]
- (Event 60H) The weighted cycles of cacheable bus data read requests from
the data cache unit or hardware prefetcher.
Bus_Snoop_Stall
- (Event 7EH, Umask 00H) The number of bus cycles while a bus snoop is
stalled.
Bus_Snoops
[,agent=agent]
[,cachestate=mesi]
- (Event 77H) The number of snoop responses to bus transactions.
Bus_Trans_Any
[,agent=agent]
- (Event 70H) The number of completed bus transactions.
Bus_Trans_Brd
[,core=core]
- (Event 65H) The number of read bus transactions.
Bus_Trans_Burst
[,agent=agent]
- (Event 6EH) The number of completed burst transactions. Retried
transactions may be counted more than once.
Bus_Trans_Def
[,core=core]
- (Event 6DH) The number of completed deferred transactions.
Bus_Trans_IO
[,agent=agent] [,core=core]
- (Event 6CH) The number of completed I/O transactions counting both reads
and writes.
Bus_Trans_Ifetch
[,agent=agent] [,core=core]
- (Event 68H) The number of completed instruction fetch transactions.
Bus_Trans_Inval
[,agent=agent] [,core=core]
- (Event 69H) The number of completed invalidate transactions.
Bus_Trans_Mem
[,agent=agent]
- (Event 6FH) The number of completed memory transactions.
Bus_Trans_P
[,agent=agent] [,core=core]
- (Event 6BH) The number of completed partial transactions.
Bus_Trans_Pwr
[,agent=agent] [,core=core]
- (Event 6AH) The number of completed partial write transactions.
Bus_Trans_RFO
[,agent=agent] [,core=core]
- (Event 66H) The number of completed read-for-ownership transactions.
Bus_Trans_WB
[,agent=agent]
- (Event 67H) The number of completed write-back transactions from the data
cache unit, excluding L2 write-backs.
Cycles_Div_Busy
- (Event 14H, Umask 00H) The number of cycles the divider is busy. The event
is only available on PMC0.
Cycles_Int_Masked
- (Event C6H, Umask 00H) The number of cycles while interrupts were
disabled.
Cycles_Int_Pending_Masked
- (Event C7H, Umask 00H) The number of cycles while interrupts were disabled
and interrupts were pending.
DCU_Snoop_To_Share
[,core=core]
- (Event 78H) The number of data cache unit snoops to L1 cache lines in the
shared state.
DCache_Cache_Lock
[,cachestate=mesi]
- (Event 42H) The number of cacheable locked read operations to invalid
state.
DCache_Cache_LD
[,cachestate=mesi]
- (Event 40H) The number of cacheable L1 data read operations.
DCache_Cache_ST
[,cachestate=mesi]
- (Event 41H) The number of cacheable L1 data write operations.
DCache_M_Evict
- (Event 47H, Umask 00H) The number of M state data cache lines that were
evicted.
DCache_M_Repl
- (Event 46H, Umask 00H) The number of M state data cache lines that were
allocated.
DCache_Pend_Miss
- (Event 48H, Umask 00H) The weighted cycles an L1 miss was
outstanding.
DCache_Repl
- (Event 45H, Umask 0FH) The number of data cache line replacements.
Data_Mem_Cache_Ref
- (Event 44H, Umask 02H) The number of cacheable read and write operations
to L1 data cache.
Data_Mem_Ref
- (Event 43H, Umask 01H) The number of L1 data reads and writes, both
cacheable and un-cacheable.
Dbus_Busy
[,core=core]
- (Event 22H) The number of core cycles during which the data bus was
busy.
Dbus_Busy_Rd
[,core=core]
- (Event 23H) The number of cycles during which the data bus was busy
transferring data to a core.
Div
- (Event 13H, Umask 00H) The number of divide operations including
speculative operations for integer and floating point divides. This event
can only be counted on PMC1.
Dtlb_Miss
- (Event 49H, Umask 00H) The number of data references that missed the
TLB.
ESP_Uops
- (Event D7H, Umask 00H) The number of ESP folding instructions
decoded.
EST_Trans
[,trans=transition]
- (Event 3AH) Count the number of Intel Enhanced SpeedStep transitions. The
argument transition can be one of the following
values:
any
- (Umask 00H) Count all transitions.
frequency
- (Umask 01H) Count frequency transitions.
The default is “any”.
FP_Assist
- (Event 11H, Umask 00H) The number of floating point operations that
required microcode assists. The event is only available on PMC1.
FP_Comp_Instr_Ret
- (Event C1H, Umask 00H) The number of X87 floating point compute
instructions retired. The event is only available on PMC0.
FP_Comps_Op_Exe
- (Event 10H, Umask 00H) The number of floating point computational
instructions executed.
FP_MMX_Trans
- (Event CCH, Umask 01H) The number of transitions from X87 to MMX.
Fused_Ld_Uops_Ret
- (Event DAH, Umask 01H) The number of fused load uops retired.
Fused_St_Uops_Ret
- (Event DAH, Umask 02H) The number of fused store uops retired.
Fused_Uops_Ret
- (Event DAH, Umask 00H) The number of fused uops retired.
HW_Int_Rx
- (Event C8H, Umask 00H) The number of hardware interrupts received.
ICache_Misses
- (Event 81H, Umask 00H) The number of instruction fetch misses in the
instruction cache and streaming buffers.
ICache_Reads
- (Event 80H, Umask 00H) The number of instruction fetches from the
instruction cache and streaming buffers counting both cacheable and
un-cacheable fetches.
IFU_Mem_Stall
- (Event 86H, Umask 00H) The number of cycles the instruction fetch unit was
stalled while waiting for data from memory.
ILD_Stall
- (Event 87H, Umask 00H) The number of instruction length decoder
stalls.
ITLB_Misses
- (Event 85H, Umask 00H) The number of instruction TLB misses.
Instr_Decoded
- (Event D0H, Umask 00H) The number of instructions decoded.
Instr_Ret
- (Event C0H, Umask 00H) (Alias “Instruction Retired”) The
number of instructions retired. This is an architectural performance
event.
L1_Pref_Req
- (Event 4FH, Umask 00H) The number of L1 prefetch requests due to data cache
misses.
L2_ADS
[,core=core]
- (Event 21H) The number of L2 address strobes.
L2_IFetch
[,cachestate=mesi]
[,core=core]
- (Event 28H) The number of instruction fetches by the instruction fetch
unit from L2 cache including speculative fetches.
L2_LD
[,cachestate=mesi]
[,core=core]
- (Event 29H) The number of L2 cache reads.
L2_Lines_In
[,core=core]
[,prefetch=prefetch]
- (Event 24H) The number of L2 cache lines allocated.
L2_Lines_Out
[,core=core]
[,prefetch=prefetch]
- (Event 26H) The number of L2 cache lines evicted.
L2_M_Lines_In
[,core=core]
- (Event 25H) The number of L2 M state cache lines allocated.
L2_M_Lines_Out
[,core=core]
[,prefetch=prefetch]
- (Event 27H) The number of L2 M state cache lines evicted.
L2_No_Request_Cycles
[,cachestate=mesi] [,core=core]
[,prefetch=prefetch]
- (Event 32H) The number of cycles there was no request to access L2
cache.
L2_Reject_Cycles
[,cachestate=mesi] [,core=core]
[,prefetch=prefetch]
- (Event 30H) The number of cycles the L2 cache was busy and rejecting new
requests.
L2_Rqsts
[,cachestate=mesi] [,core=core]
[,prefetch=prefetch]
- (Event 2EH) The number of L2 cache requests.
L2_ST
[,cachestate=mesi]
[,core=core]
- (Event 2AH) The number of L2 cache writes including speculative
writes.
LD_Blocks
- (Event 03H, Umask 00H) The number of load operations delayed due to store
buffer blocks.
LLC_Misses
- (Event 2EH, Umask 41H) The number of cache misses for references to the
last level cache, excluding misses due to hardware prefetches. This is an
architectural performance event.
LLC_Reference
- (Event 2EH, Umask 4FH) The number of references to the last level cache,
excluding those due to hardware prefetches. This is an architectural
performance event.
MMX_Assist
- (Event CDH, Umask 00H) The number of EMMS instructions executed.
MMX_FP_Trans
- (Event CCH, Umask 00H) The number of transitions from MMX to X87.
MMX_Instr_Exec
- (Event B0H, Umask 00H) The number of MMX instructions executed excluding
MOVQ and MOVD stores.
MMX_Instr_Ret
- (Event CEH, Umask 00H) The number of MMX instructions retired.
Misalign_Mem_Ref
- (Event 05H, Umask 00H) The number of misaligned data memory references,
counting loads and stores.
Mul
- (Event 12H, Umask 00H) The number of multiply operations, including
speculative floating point and integer multiplies. This event is available
on PMC1 only.
NonHlt_Ref_Cycles
- (Event 3CH, Umask 01H) (Alias “Unhalted Reference Cycles”)
The number of non-halted bus cycles. This is an architectural performance
event.
Pref_Rqsts_Dn
- (Event F8H, Umask 00H) The number of hardware prefetch requests issued in
backward streams.
Pref_Rqsts_Up
- (Event F0H, Umask 00H) The number of hardware prefetch requests issued in
forward streams.
Resource_Stall
- (Event A2H, Umask 00H) The number of cycles where there is a resource
related stall.
SD_Drains
- (Event 04H, Umask 00H) The number of cycles while draining store
buffers.
SIMD_FP_DP_P_Ret
- (Event D8H, Umask 02H) The number of SSE/SSE2 packed double precision
instructions retired.
SIMD_FP_DP_P_Comp_Ret
- (Event D9H, Umask 02H) The number of SSE/SSE2 packed double precision
compute instructions retired.
SIMD_FP_DP_S_Ret
- (Event D8H, Umask 03H) The number of SSE/SSE2 scalar double precision
instructions retired.
SIMD_FP_DP_S_Comp_Ret
- (Event D9H, Umask 03H) The number of SSE/SSE2 scalar double precision
compute instructions retired.
SIMD_FP_SP_P_Comp_Ret
- (Event D9H, Umask 00H) The number of SSE/SSE2 packed single precision
compute instructions retired.
SIMD_FP_SP_Ret
- (Event D8H, Umask 00H) The number of SSE/SSE2 single precision
instructions retired, both packed and scalar.
SIMD_FP_SP_S_Ret
- (Event D8H, Umask 01H) The number of SSE/SSE2 scalar single precision
instructions retired.
SIMD_FP_SP_S_Comp_Ret
- (Event D9H, Umask 01H) The number of SSE/SSE2 scalar single precision
compute instructions retired.
SIMD_Int_128_Ret
- (Event D8H, Umask 04H) The number of SSE2 128-bit integer instructions
retired.
SIMD_Int_Pari_Exec
- (Event B3H, Umask 20H) The number of SIMD integer packed arithmetic
instructions executed.
SIMD_Int_Pck_Exec
- (Event B3H, Umask 04H) The number of SIMD integer pack operations
instructions executed.
SIMD_Int_Plog_Exec
- (Event B3H, Umask 10H) The number of SIMD integer packed logical
instructions executed.
SIMD_Int_Pmul_Exec
- (Event B3H, Umask 01H) The number of SIMD integer packed multiply
instructions executed.
SIMD_Int_Psft_Exec
- (Event B3H, Umask 02H) The number of SIMD integer packed shift
instructions executed.
SIMD_Int_Sat_Exec
- (Event B1H, Umask 00H) The number of SIMD integer saturating instructions
executed.
SIMD_Int_Upck_Exec
- (Event B3H, Umask 08H) The number of SIMD integer unpack instructions
executed.
SMC_Detected
- (Event C3H, Umask 00H) The number of times self-modifying code was
detected.
SSE_NTStores_Miss
- (Event 4BH, Umask 03H) The number of times an SSE streaming store
instruction missed all caches.
SSE_NTStores_Ret
- (Event 07H, Umask 03H) The number of SSE streaming store instructions
executed.
SSE_PrefNta_Miss
- (Event 4BH, Umask 00H) The number of times
PREFETCHNTA missed all caches.
SSE_PrefNta_Ret
- (Event 07H, Umask 00H) The number of
PREFETCHNTA
instructions retired.
SSE_PrefT1_Miss
- (Event 4BH, Umask 01H) The number of times
PREFETCHT1 missed all caches.
SSE_PrefT1_Ret
- (Event 07H, Umask 01H) The number of
PREFETCHT1
instructions retired.
SSE_PrefT2_Miss
- (Event 4BH, Umask 02H) The number of times
PREFETCHT2 missed all caches.
SSE_PrefT2_Ret
- (Event 07H, Umask 02H) The number of
PREFETCHT2
instructions retired.
Seg_Reg_Loads
- (Event 06H, Umask 00H) The number of segment register loads.
Serial_Execution_Cycles
- (Event 3CH, Umask 02H) The number of non-halted bus cycles of this core
while the other core was halted.
Thermal_Trip
- (Event 3BH, Umask C0H) The duration in a thermal trip based on the current
core clock.
Unfusion
- (Event DBH, Umask 00H) The number of unfusion events.
Unhalted_Core_Cycles
- (Event 3CH, Umask 00H) The number of core clock cycles when the clock
signal on a specific core is not halted. This is an architectural
performance event.
Uops_Ret
- (Event C2H, Umask 00H) The number of micro-ops retired.
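As an illustration of how the event and umask codes listed above are used, the following sketch tabulates the events this page marks as architectural and packs one of them into an event-select value. The names struct arch_event, find_arch_event() and evtsel_encode() are hypothetical and not part of libpmc. The bit positions are taken from Intel's description of the IA32_PERFEVTSELx registers and are not stated in this page; the kernel driver programs these registers itself, and other fields (such as the enable and interrupt bits) are omitted here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: one event name with its codes from the list above. */
struct arch_event {
	const char *name;
	uint8_t event;
	uint8_t umask;
};

/* The events this page marks as architectural performance events. */
static const struct arch_event arch_events[] = {
	{ "Br_Instr_Ret",		0xC4, 0x00 },
	{ "Br_MisPred_Ret",		0xC5, 0x00 },
	{ "Instr_Ret",			0xC0, 0x00 },
	{ "LLC_Misses",			0x2E, 0x41 },
	{ "LLC_Reference",		0x2E, 0x4F },
	{ "NonHlt_Ref_Cycles",		0x3C, 0x01 },
	{ "Unhalted_Core_Cycles",	0x3C, 0x00 },
};

/* Exact-match lookup; returns NULL if the name is not in the table. */
const struct arch_event *
find_arch_event(const char *name)
{
	size_t n = sizeof(arch_events) / sizeof(arch_events[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(arch_events[i].name, name) == 0)
			return (&arch_events[i]);
	return (NULL);
}

/* Qualifier bits, per Intel's IA32_PERFEVTSELx layout (not from this page). */
#define EVTSEL_USR	(UINT32_C(1) << 16)	/* "usr" */
#define EVTSEL_OS	(UINT32_C(1) << 17)	/* "os" */
#define EVTSEL_EDGE	(UINT32_C(1) << 18)	/* "edge" */
#define EVTSEL_INV	(UINT32_C(1) << 23)	/* "inv" */

/* Pack event code, umask, qualifier flags and cmask into one value. */
uint32_t
evtsel_encode(const struct arch_event *ev, uint32_t flags, uint8_t cmask)
{
	return ((uint32_t)ev->event |
	    ((uint32_t)ev->umask << 8) |
	    flags |
	    ((uint32_t)cmask << 24));
}
```

For example, LLC_Misses (Event 2EH, Umask 41H) counted at all privilege levels encodes as 0x0003412E under these assumptions.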
The following table shows the mapping between the PMC-independent aliases
supported by Performance Counters Library (libpmc,
-lpmc) and the underlying hardware events used.
The following errata affect performance measurement on these processors. These
errata are documented in Intel®
Core™ Duo Processor and Intel® Core™ Solo Processor on 65 nm
Process, Specification Update,
Order Number 309222-017, Intel
Corporation, July 2008.
- AE19
- Data prefetch performance monitoring events can only be enabled on a
single core.
- AE25
- Performance monitoring counters that count external bus events may report
incorrect values after processor power state transitions.
- AE28
- Performance monitoring events for retired floating point operations (C1H)
may not be accurate.
- AE29
- DR3 address match on MOVD/MOVQ/MOVNTQ memory store instruction may
incorrectly increment performance monitoring count for saturating SIMD
instructions retired (Event CFH).
- AE33
- Hardware prefetch performance monitoring events may be counted
inaccurately.
- AE36
- The
CPU_CLK_UNHALTED performance monitoring event
(Event 3CH) counts clocks when the processor is in the C1/C2 processor
power states.
- AE39
- Certain performance monitoring counters related to bus, L2 cache and power
management are inaccurate.
- AE51
- Performance monitoring events for retired instructions (Event C0H) may not
be accurate.
- AE67
- Performance monitoring event
FP_ASSIST may not be
accurate.
- AE78
- Performance monitoring event for hardware prefetch requests (Event 4EH)
and hardware prefetch request cache misses (Event 4FH) may not be
accurate.
- AE82
- Performance monitoring event
FP_MMX_TRANS_TO_MMX
may not count some transitions.
The pmc library first appeared in
FreeBSD 6.0.
The Performance Counters Library (libpmc, -lpmc)
was written by Joseph Koshy
<jkoshy@FreeBSD.org>.