EA2P: Energy-Aware Application Profiler
EA2P is an energy profiling tool designed to measure the energy consumption of computer devices, including RAM, CPU, and GPU. It supports hardware from multiple vendors (Nvidia, AMD, and Intel), so measurements can cover heterogeneous systems. Its distinguishing features are flexible selection of the target devices and support for energy measurement on AMD devices.
Please consult the documentation or support resources for your specific CPU and GPU models to find the appropriate configuration and instructions for monitoring energy consumption. Keep in mind that the availability of such features may vary depending on your hardware.
Features
- Granular Results: Provides detailed and fine-grained energy measurements per device and power domains, particularly for Intel-based components, offering a comprehensive understanding of energy consumption across the system.
- Multi-Device Measurement: Supports measurement for a variety of devices including RAM, AMD GPU & CPU, Nvidia GPU, and Intel CPU. This comprehensive coverage allows for holistic energy analysis.
- Code Instrumentation: Offers an API for code instrumentation as well as a Command Line Interface (CLI) for flexible usage, enabling both direct integration into applications and standalone usage for measurement purposes.
- Sampling Frequency Control: Provides users with the option to set the sampling frequency, allowing for customizable energy measurement intervals based on specific requirements and precision needs.
- Automatic Device Detection: Automatically detects device vendors and selects appropriate commands, simplifying usage for users and ensuring compatibility across different hardware configurations.
- Selective Device Measurement: Allows users to select specific devices for measurement, offering the flexibility to focus on a subset of the system components, which can be advantageous for targeted analysis.
- Multi-node Measurement: Monitors energy consumption across multiple nodes in traditional HPC or cluster computing environments. For systems with homogeneous nodes, it reports an energy breakdown per rank (node) and the total energy consumption for each device type.
- Docker support: The tool has been containerized with Docker to improve its usability and portability.
Requirements
- RAPL (Running Average Power Limit): A feature of modern Intel processors for monitoring and controlling power consumption. RAPL provides a set of registers that can be used to read power-related information, such as power consumption, and to set power limits for the processor. If it is not installed, you can run the code below:
- ROCm-SMI: ROCm-SMI (Radeon Open Compute System Management Interface) is a command-line interface developed by AMD as part of the ROCm (Radeon Open Compute) software stack. It provides a set of tools for managing and monitoring AMD GPUs that are compatible with the ROCm platform. If the ROCm stack is not already installed on your AMD GPU platform, install it to enable GPU profiling: install ROCm
- Nvidia-SMI: Nvidia-SMI (Nvidia System Management Interface) is the counterpart of ROCm-SMI for Nvidia GPUs. It generally comes with the Nvidia driver installation: install Nvidia Drivers
- Perf tools: Used to monitor energy on AMD CPUs, since we have not yet found a way to access the AMD RAPL files on Linux systems.
- MPI library (for multi-node profiling): Ensure that an MPI implementation is installed on your system. Common implementations include MPICH and OpenMPI. Use the following to run the instrumented code:
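The exact launch line depends on the MPI implementation and on how the code was instrumented. As a minimal sketch, assuming an `mpirun` launcher on the PATH, a `python` interpreter, and a hypothetical instrumented script named `my_app.py`, the command can be assembled and checked from Python before it is run:

```python
import shlex
import shutil


def mpi_launch_command(script, n_ranks, launcher="mpirun"):
    """Build an argument list that starts `script` on `n_ranks` MPI ranks."""
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    # `-n` is the rank-count option accepted by both MPICH and OpenMPI launchers.
    return [launcher, "-n", str(n_ranks), "python", script]


# my_app.py is a placeholder name, not a file shipped with EA2P.
cmd = mpi_launch_command("my_app.py", 4)
print(shlex.join(cmd))

# Report a missing launcher instead of failing later at launch time.
if shutil.which(cmd[0]) is None:
    print(f"{cmd[0]} was not found on PATH; install MPICH or OpenMPI first")
```

The resulting list can be passed to `subprocess.run` to start the job. Cluster schedulers often provide their own launcher, in which case the `launcher` argument and its options would need to change accordingly.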