MPI Performance Snapshot Summary
Application: ./a.out
Rank:
Number of ranks: 44
Used statistics: stats.txt, app_stat.txt
Overview
MPI Time: 12.56 sec (75.61%)
MPI Imbalance: 12.52 sec (75.32%)
Computation Time: 4.05 sec (24.39%)
OpenMP Time:
OpenMP Imbalance:
Serial Time:
WallClock time: 16.62 sec
MPI Time: Time spent inside the MPI library. High values are usually bad.
This value is HIGH. The application is Communication-bound.
MPI Imbalance: Mean unproductive wait time per process spent in the MPI library calls when a process is waiting for data. This time is part of the MPI time above. High values are usually bad.
This value is HIGH. The application workload is NOT well balanced between MPI ranks.
Computation Time: Mean time per process spent in the application code. This is the sum of the OpenMP Time and the Serial time. High values are usually good.
This value is AVERAGE. The application is Computation-bound.
OpenMP Time
OpenMP Imbalance
Serial Time
Hardware Metrics
GFLOPS:
Cycles Per Instruction Rate:
Memory Bound Coefficient:
Memory Usage
Memory consumption:
Per-process memory usage affects the application scalability.
Peak memory consumption (rank 38): 0.32 MB
Mean memory consumption: 0.29 MB
Performance by Metric
WallClock time: 16.62 sec
Total application lifetime, measured as the elapsed time of the slowest process. This metric includes the MPI Time and the Computation Time below.
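As a rough check of how these figures relate, the sketch below recomputes each component's share of the WallClock time from the rounded values in this report. The variable and function names are illustrative and not part of the tool. Because the inputs are rounded to two decimals, the results may differ from the reported percentages by a few hundredths of a percent.

```python
# Rounded values copied from this report (seconds).
WALLCLOCK_SEC = 16.62
MPI_SEC = 12.56
COMPUTATION_SEC = 4.05


def share_of_wallclock(seconds, wallclock=WALLCLOCK_SEC):
    """Return the given time as a percentage of the wallclock time."""
    return 100.0 * seconds / wallclock


mpi_share = share_of_wallclock(MPI_SEC)
computation_share = share_of_wallclock(COMPUTATION_SEC)

print(f"MPI share:         {mpi_share:.2f}%")
print(f"Computation share: {computation_share:.2f}%")
print(f"MPI + computation: {MPI_SEC + COMPUTATION_SEC:.2f} sec")
```

The two components sum to within rounding error of the WallClock time, which is consistent with the statement that the metric includes both.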
MPI Time: 12.56 sec (75.61%)
Time spent inside the MPI library. High values are usually bad.
This value is HIGH. The application is Communication-bound.

This might be caused by:
  • High wait times inside the library - see the MPI Imbalance metric below.
  • Active communications - see the diagrams 'MPI Time per Rank' (key '-t' or '-t -D' for per MPI-function details) & 'Collective Operations Time per Rank' (key '-c' or '-c -D' for per MPI-function details).
  • Unoptimized settings of the MPI library. You can tune Intel® MPI Library for your application and cluster configuration using the mpitune utility available as part of the library package.
MPI Imbalance: 12.52 sec (75.32%)
Mean unproductive wait time per process spent in the MPI library calls when a process is waiting for data. This time is part of the MPI time above. High values are usually bad.
This value is HIGH. The application workload is NOT well balanced between MPI ranks.

For more details about the MPI communication scheme use Intel® Trace Analyzer and Collector available as part of Intel® Parallel Studio XE Cluster Edition.
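Since the imbalance is reported as part of the MPI time, subtracting one from the other gives a rough estimate of the MPI time that is not attributed to waiting. The sketch below does this with the rounded values from this report. The variable names are illustrative, and the result is only as precise as the two-decimal inputs.

```python
# Rounded values copied from this report (seconds).
MPI_SEC = 12.56
MPI_IMBALANCE_SEC = 12.52

# MPI time that the report does not attribute to unproductive waiting.
non_wait_sec = MPI_SEC - MPI_IMBALANCE_SEC

# Fraction of the MPI time that is unproductive waiting.
wait_fraction = MPI_IMBALANCE_SEC / MPI_SEC

print(f"MPI time not attributed to waiting: {non_wait_sec:.2f} sec")
print(f"Share of MPI time spent waiting:    {100 * wait_fraction:.1f}%")
```

With these inputs almost all of the MPI time is waiting, which suggests that rebalancing the workload is likely to matter more here than tuning the communication itself.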
Computation Time: 4.05 sec (24.39%)
Mean time per process spent in the application code. This is the sum of the OpenMP Time and the Serial time. High values are usually good.
This value is AVERAGE. The application is Computation-bound.
  • For more information about basic CPU counters see the diagram 'Counters and Memory usage statistics' (key '-o').
  • For more information about the performance profile of the computation code we recommend looking at CPU utilization at node level using Intel® VTune™ Amplifier XE. The tool is available as part of Intel® Parallel Studio XE Cluster Edition.
OpenMP Time:
OpenMP Imbalance:
Serial Time: