Native module for OneAgent NVBit extension

Overview

Native module for gathering kernel execution performance metrics via code instrumentation. It is compiled into a dynamic library, which then needs to be preloaded into the process one wishes to monitor.

Example usage:

ONEAGENT_NVBIT_EXTENSION_CONF_FILE=<path-to>/nvbit-module.conf LD_PRELOAD=<path-to>/libnvbit-module.so <the application being instrumented>
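When the monitored application is started from a script rather than a shell, the same two environment variables can be set programmatically. The following is a minimal sketch, not part of this project: the helper names build_env and run_instrumented are hypothetical, and the paths passed in are placeholders supplied by the caller.

```python
import os
import subprocess


def build_env(conf_file, module_lib, base_env=None):
    """Return a copy of the environment with the two variables from the usage line set.

    build_env is a hypothetical helper name, not part of the module.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ONEAGENT_NVBIT_EXTENSION_CONF_FILE"] = conf_file
    # Keep libraries that are already preloaded; the dynamic linker accepts
    # a colon-separated list in LD_PRELOAD.
    existing = env.get("LD_PRELOAD")
    env["LD_PRELOAD"] = f"{module_lib}:{existing}" if existing else module_lib
    return env


def run_instrumented(argv, conf_file, module_lib):
    """Run argv with the module preloaded and return the process exit code."""
    completed = subprocess.run(argv, env=build_env(conf_file, module_lib), check=False)
    return completed.returncode


# Usage (placeholder paths, to be replaced with real locations):
# run_instrumented(["./my_cuda_app"], "/path/to/nvbit-module.conf",
#                  "/path/to/libnvbit-module.so")
```

Prepending to an existing LD_PRELOAD instead of overwriting it is a design choice of this sketch, so that other preloaded libraries keep working.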

External dependencies

| Dependency | Tested version |
| --- | --- |
| spdlog | 1.3.1 |
| CUDA Toolkit | 11.0 |
| NVBit | 1.5.3 |
| Boost | 1.71.0 |
| Google Test | 1.10.0 |
| CMake | 3.18.4 |
| vcpkg | N/A |
| Compiler with C++20 support (C++17 for CUDA) | gcc 10.2.0 |

Building

Set up vcpkg

git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
./vcpkg integrate install
./vcpkg install spdlog:x64-linux boost-program-options:x64-linux gtest:x64-linux

Download NVBit

Download the NVBit release package and unpack it. NVBit does not require separate compilation; see the "Getting Started with NVBit" section of the README located in the root directory of the NVBit release package.

Build the project

mkdir build
cd build
cmake -G "Unix Makefiles" -DNVBIT_PATH="<path_to_nvbit_release>" -DCMAKE_TOOLCHAIN_FILE="<vcpkg_directory>/scripts/buildsystems/vcpkg.cmake" ..
cmake --build .

Configuration

The module is configured in two ways:

  1. Startup configuration is read once at startup from the file specified via the ONEAGENT_NVBIT_EXTENSION_CONF_FILE environment variable.
  2. Runtime configuration is read every runtime_config_polling_interval seconds from the file specified via runtime_config_path.
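The periodic re-read of the runtime configuration can be pictured with the sketch below. It is an illustration under stated assumptions, not the module's implementation: the class name RuntimeConfigPoller is hypothetical, and skipping the re-read when the file's modification time is unchanged is an optimization added here that the module may not perform.

```python
import os
import time


class RuntimeConfigPoller:
    """Hypothetical poller that re-reads a file at a fixed interval."""

    def __init__(self, path, interval_seconds=10):
        self.path = path                          # plays the role of runtime_config_path
        self.interval_seconds = interval_seconds  # runtime_config_polling_interval
        self._last_mtime = None

    def poll_once(self):
        """Return the file contents if the file changed since the last poll, else None."""
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            return None  # the file may not have been created yet
        if mtime == self._last_mtime:
            return None  # ASSUMPTION: unchanged files are skipped in this sketch
        self._last_mtime = mtime
        with open(self.path, encoding="utf-8") as handle:
            return handle.read()

    def run(self, on_change, max_polls=None):
        """Poll repeatedly, calling on_change(text) whenever new contents are seen."""
        polls = 0
        while max_polls is None or polls < max_polls:
            text = self.poll_once()
            if text is not None:
                on_change(text)
            polls += 1
            time.sleep(self.interval_seconds)
```

A caller would construct the poller with the values of runtime_config_path and runtime_config_polling_interval taken from the startup configuration and pass a callback that applies the new settings.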

Startup configuration

The path to the startup configuration file must be provided upfront via the ONEAGENT_NVBIT_EXTENSION_CONF_FILE environment variable. Lines starting with # are treated as comments and ignored. The available settings are listed in the table below.

| Key | Value type | Default value | Description |
| --- | --- | --- | --- |
| logfile | Valid filesystem path | unset | Path to the log file |
| runtime_config_path | Valid filesystem path | unset | Path to the runtime configuration file |
| runtime_config_polling_interval | Positive integer | 10 | Runtime configuration polling interval in seconds |
| measurements_output_dir | Valid filesystem path | unset | Directory to which measurements are written |
| verbose | Boolean | false | Enable verbose (debug) logging |
| console_log_enabled | Boolean | false | Enable logging to stdout |
| count_warp_level | Boolean | true | Count instructions at warp level (true) or thread level (false) |
| exclude_pred_off | Boolean | false | Exclude predicated-off instructions from the count |
| mangled_names | Boolean | true | Print kernel names mangled (true) or demangled (false) |

See nvbit-module.conf for an example.
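To make the table concrete, here is a small sketch of a parser that applies the documented defaults and value types. It is not the module's parser. The key=value line syntax and the literal spellings true/false are assumptions made for illustration; nvbit-module.conf remains the authoritative reference for the real syntax. The function name parse_startup_config is hypothetical.

```python
# Defaults copied from the settings table; None stands for "unset".
DEFAULTS = {
    "logfile": None,
    "runtime_config_path": None,
    "runtime_config_polling_interval": 10,
    "measurements_output_dir": None,
    "verbose": False,
    "console_log_enabled": False,
    "count_warp_level": True,
    "exclude_pred_off": False,
    "mangled_names": True,
}


def parse_startup_config(text):
    """Parse configuration text into a dict, starting from the documented defaults."""
    config = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        # ASSUMPTION: settings are written as key=value.
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or key not in DEFAULTS:
            raise ValueError(f"unrecognized line: {raw!r}")
        default = DEFAULTS[key]
        if isinstance(default, bool):  # checked before int, since bool is an int subtype
            if value.lower() not in ("true", "false"):
                raise ValueError(f"{key} expects true or false, got {value!r}")
            config[key] = value.lower() == "true"
        elif isinstance(default, int):
            number = int(value)
            if number <= 0:
                raise ValueError(f"{key} expects a positive integer, got {value!r}")
            config[key] = number
        else:
            config[key] = value  # filesystem paths are kept as plain strings
    return config
```

Rejecting unknown keys and malformed values early is a choice of this sketch; the module itself may handle such lines differently.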

Runtime configuration

Runtime configuration is created on the fly by the Python extension and contains a list of PIDs that should be instrumented, along with the instrumentation functions to apply to each of them.

See nvbit-module-runtime.conf for an example. For detailed documentation of the communication protocol, see here.

Limitations

Multiple GPU code injection routines cannot be enabled at the same time. For example:

  • gmem_access_coalescence combined with count_instr will not work
  • gmem_access_coalescence combined with occupancy will work

This limitation may be lifted in a future release.
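A tool that generates the runtime configuration could guard against this limitation before writing the file. The check below is a sketch only. Treating gmem_access_coalescence and count_instr as the code injection routines, and occupancy as a routine that does not inject code, is inferred from the two examples above and is an assumption; the function name check_compatible is hypothetical.

```python
# ASSUMPTION: inferred from the examples above, not an authoritative list.
CODE_INJECTION_ROUTINES = {"gmem_access_coalescence", "count_instr"}


def check_compatible(requested):
    """Raise ValueError if more than one code injection routine is requested.

    Returns the sorted list of requested code injection routines (zero or one item).
    """
    injecting = sorted(set(requested) & CODE_INJECTION_ROUTINES)
    if len(injecting) > 1:
        raise ValueError(
            "only one GPU code injection routine can be enabled at once, got: "
            + ", ".join(injecting)
        )
    return injecting
```

With these assumptions, check_compatible(["gmem_access_coalescence", "occupancy"]) passes, while check_compatible(["gmem_access_coalescence", "count_instr"]) raises ValueError.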