tuda-parallel/TMIO


TMIO

Tracing MPI-IO

This repository contains the TMIO source code. TMIO is a C++ tracing library that can be attached to existing code to monitor MPI-IO online. The tool traces both synchronous and asynchronous I/O, and it offers a C as well as a C++ interface. There are two methods for linking the library to the application, depending on whether the collected information is used for offline or online analysis. The resulting traces can be analyzed as described in Exploring the Traces.

Table of Contents
  1. Getting Started
  2. Usage
  3. Exploring the Traces
  4. Contributing
  5. Contact
  6. License
  7. Acknowledgments

For the latest updates, see Latest News.

Getting Started

Prerequisites

Installation

Go to the build directory:

cd build

Build the library using the Makefile. The library can be built with or without MessagePack:

  • Standard:

     make library
  • MessagePack support:

     make msgpack_library

Options can be passed with flags to the make command or through the file ./include/ioflags.h.

Testing

To test that the library works, run:

make clean
make 

To test the MessagePack support, call:

make msgpack

Usage

There are two ways to use this library to trace I/O: offline or online tracing. The offline mode uses the LD_PRELOAD mechanism. Upon MPI_Finalize, the collected data is written into a single file that can be analyzed with the tools from the FTIO repo. In the online mode, the application is compiled with the TMIO library, and a line is added to indicate when to flush the results out to a file (JSON Lines or MessagePack). The same set of tools can be used in both modes. Note that the online version was developed to work with the predictor tool from the FTIO repo, which detects the period of the I/O phases during the execution of an application.

Offline Analysis:

For offline tracing, the LD_PRELOAD mechanism is used. First, build the library with or without MessagePack support (see Installation). Then launch the application with the library preloaded:

LD_PRELOAD=path_to_lib/libtmio.so $(MPIRUN)  -np $(PROCS) $(MPI_RUN_FLAGS) ./your_code variable_1 variable_2
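
The launch line above can be wrapped in a small helper that checks the library path before starting the job. This is only a sketch: tmio_launch and the TMIO_DRY_RUN variable are hypothetical names introduced here and are not part of TMIO, and the launcher is assumed to be mpirun unless MPIRUN is set. On glibc systems, a wrong LD_PRELOAD path typically produces only a loader warning and the application then runs untraced, so checking the path first can save a wasted run.

```shell
#!/bin/sh
# Hypothetical helper (not part of TMIO): launch an MPI program with
# libtmio.so preloaded for offline tracing.
# Usage: tmio_launch <path/to/libtmio.so> <nprocs> <program> [args...]
# Set TMIO_DRY_RUN=1 to print the command instead of running it.
tmio_launch() {
    if [ "$#" -lt 3 ]; then
        echo "usage: tmio_launch <libtmio.so> <nprocs> <program> [args...]" >&2
        return 2
    fi
    lib=$1
    nprocs=$2
    shift 2

    # Fail early if the library is missing, rather than running untraced.
    if [ ! -f "$lib" ]; then
        echo "tmio_launch: library not found: $lib" >&2
        return 1
    fi

    # Assumption: the MPI launcher is mpirun unless MPIRUN says otherwise.
    launcher=${MPIRUN:-mpirun}

    if [ "${TMIO_DRY_RUN:-0}" = "1" ]; then
        # Print the command so it can be inspected before a real run.
        echo "LD_PRELOAD=$lib $launcher -np $nprocs $*"
        return 0
    fi

    LD_PRELOAD="$lib" "$launcher" -np "$nprocs" "$@"
}
```

For example, TMIO_DRY_RUN=1 tmio_launch path_to_lib/libtmio.so 4 ./your_code variable_1 variable_2 prints the command that would be executed, provided the library file exists.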

Online Analysis:

The code needs to be compiled with the library (see the IOR example). Three steps are required:

  1. Include the library header in the code:
      // Somewhere at the beginning of the code
      #include "tmio_c.h"
  2. Add a single line at the point where the data should be flushed. Whenever this line is reached, the collected traces are flushed out to the tracing file:
      // Somewhere in the code
      iotrace_summary();
  3. Recompile the application for the changes to take effect.

An example of how to modify IOR is provided here.

Exploring the Traces:

The generated file can be examined with the tools provided in the FTIO repo to:

  1. Find the period of the I/O phases offline or online
  2. Visualize the traced results with ioplot
  3. Use ioparse to merge several traces into a single profile that can be examined with Extra-P

Contributing

If you have a suggestion that would make this better, please fork the repo and create a pull request.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Contact

License

Distributed under the BSD 3-Clause License. See LICENSE for more information.

Acknowledgments

Authors:

  • Ahmad Tarraf

Citation

  @inproceedings{AT24a,
   author={Tarraf, Ahmad and Muñoz, Javier Fernandez and Singh, David E. and Özden, Taylan and Carretero, Jesus and Wolf, Felix},
   title={I/O Behind the Scenes: Bandwidth Requirements of HPC Applications With Asynchronous I/O},
   booktitle={2024 IEEE International Conference on Cluster Computing (CLUSTER)},
   address={Kobe, Japan},
   year={2024},
   month=sep,
   note={(accepted)}
  }

  @inproceedings{Tarraf_Bandet_Boito_Pallez_Wolf_2024,
   author={Tarraf, Ahmad and Bandet, Alexis and Boito, Francieli and Pallez, Guillaume and Wolf, Felix},
   title={Capturing Periodic I/O Using Frequency Techniques},
   booktitle={2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS)},
   address={San Francisco, CA, USA},
   year={2024},
   month=may,
   pages={1--14}
  }

Publications

  1. A. Tarraf, A. Bandet, F. Boito, G. Pallez, and F. Wolf, “Capturing Periodic I/O Using Frequency Techniques,” in 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS), San Francisco, CA, USA, May 2024, pp. 1–14.