Additional Performance Analysis Tools:

Intel® Trace Analyzer and Collector - MPI Analyzer and Profiler
Intel® VTune™ Amplifier - Performance Profiler
Intel® Advisor - Vectorization Optimization & Thread Prototyping
Storage Performance Snapshot - Visualize System Storage Bottlenecks

Application Performance Snapshot

Metric                  Current run   Target   Delta
MPI Time                              <10%
Serial Time                           <15%
OpenMP Imbalance                      <10%
CPU Utilization                       >90%
Memory Stalls                         <20%
Back-End Stalls                       <20%
FPU Utilization                       >50%
SIMD Instr. per Cycle                 >1
I/O Bound                             <10%
Application:
Report creation date:
Rank:
Number of ranks:
Ranks per node:
OpenMP threads:
HW Platform:
Logical Core Count per node:
Elapsed Time
SP FLOPS
CPI
(MAX , MIN )

MPI Time

of Elapsed Time
()

MPI Imbalance

of Elapsed Time
()
TOP 5 MPI Functions %

Serial Time

of Elapsed Time
()

OpenMP Imbalance

of Elapsed Time
()

CPU Utilization

Average CPU Usage

Out of logical CPUs

Memory Stalls

of pipeline slots

Cache Stalls

of cycles

DRAM Stalls

of cycles

NUMA

of remote accesses

Back-End Stalls

of pipeline slots

L2 Hit Bound

of cycles

L2 Miss Bound

of cycles

FPU Utilization

SP FLOPs per Cycle

Out of

Vector Capacity Usage

FP Instruction Mix

% of Packed FP Instr.:
% of 128-bit:
% of 256-bit:
% of 512-bit:
% of Scalar FP Instr.:

FP Arith/Mem Rd Instr. Ratio

FP Arith/Mem Wr Instr. Ratio

SIMD Instr. per Cycle

FP Instruction Mix

% of Packed SIMD Instr.:
% of Scalar SIMD Instr.:

I/O Bound


(AVG , PEAK )

Read

AVG , MAX

Write

AVG , MAX

Memory Footprint

Resident total:
Resident:
  Per node:
    Peak:
    Average:
  Per rank:
    Peak:
    Average:
Virtual total:
Virtual:
  Per node:
    Peak:
    Average:
  Per rank:
    Peak:
    Average:
Current run: Metric value collected during the application profiling run.
Target: Metric threshold used to indicate possible performance issues. Threshold values are fixed and may not accurately reflect the nature of your application.
Delta: Visual representation of the current run value compared to the target threshold. The Delta is set to zero if the current run value is within the target threshold.
Percentage of Elapsed Time
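
The relationship between a current run value, its target threshold, and the Delta can be sketched in a few lines of Python. This is an illustration only, not the report's own implementation: the threshold values are copied from the Target column, but the names `Target`, `TARGETS`, and `delta` are hypothetical, and the report does not state how it measures Delta, so the sketch assumes a plain difference between the value and the threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """A fixed threshold: the metric should stay below or above `limit`."""
    limit: float
    higher_is_better: bool


# Thresholds copied from the Target column of the report.
# "<10%" becomes (10.0, False); ">90%" becomes (90.0, True).
TARGETS = {
    "MPI Time": Target(10.0, False),
    "Serial Time": Target(15.0, False),
    "OpenMP Imbalance": Target(10.0, False),
    "CPU Utilization": Target(90.0, True),
    "Memory Stalls": Target(20.0, False),
    "Back-End Stalls": Target(20.0, False),
    "FPU Utilization": Target(50.0, True),
    "SIMD Instr. per Cycle": Target(1.0, True),
    "I/O Bound": Target(10.0, False),
}


def delta(metric: str, current: float) -> float:
    """Return how far `current` lies past the target, or 0.0 if within it.

    Assumption: the distance is a plain difference; the report itself
    does not document the formula behind its Delta column.
    """
    target = TARGETS[metric]
    if target.higher_is_better:
        return max(0.0, target.limit - current)
    return max(0.0, current - target.limit)
```

For example, under these assumptions `delta("MPI Time", 25.0)` yields a non-zero value because 25% exceeds the <10% target, while `delta("CPU Utilization", 95.0)` yields zero because 95% already satisfies the >90% target. Since the thresholds are fixed, a non-zero Delta points to a place worth investigating rather than a confirmed problem.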