An integrated tutorial on InfiniBand, verbs, and MPI
MacArthur, P., Liu, Q., Russell, R. D., Mizero, F., Veeraraghavan, M., & Dennis, J. M. (2017). An integrated tutorial on InfiniBand, verbs, and MPI. IEEE Communications Surveys & Tutorials, doi:10.1109/COMST.2017.2746083
Title | An integrated tutorial on InfiniBand, verbs, and MPI |
---|---|
Author(s) | Patrick MacArthur, Qian Liu, Robert D. Russell, Fabrice Mizero, Malathi Veeraraghavan, John M. Dennis |
Abstract | This tutorial presents the details of the interconnection network utilized in many high performance computing (HPC) systems today. "InfiniBand" is the hardware interconnect utilized by over 35% of the top 500 supercomputers in the world as of June 2017. "Verbs" is the term used both for the semantic description of the interface in the InfiniBand architecture specifications and for the functions defined in the widely used OpenFabrics Alliance implementation of the software interface to InfiniBand. "Message Passing Interface" (MPI) is the primary software library by which HPC applications portably pass messages between processes across a wide range of interconnects, including InfiniBand. Our goal is to explain how these three components are designed and how they interact to provide a powerful, efficient interconnect for HPC applications. We provide a succinct look into the inner technical workings of each component that should be instructive both to novices to HPC applications and to those who may be familiar with one component, but not necessarily the others, in the design and functioning of the total interconnect. A supercomputer interconnect is not a monolithic structure, and this tutorial aims to give non-experts a "big-picture" overview of its substructure with an appreciation of how and why features in one component influence those in others. We believe this is one of the first tutorials to discuss these three major components as one integrated whole. In addition, we give detailed examples of practical experience and typical algorithms used within each component in order to give insights into what issues and trade-offs are important. |
Publication Title | IEEE Communications Surveys & Tutorials |
Publication Date | Oct 1, 2017 |
Publisher's Version of Record | https://dx.doi.org/10.1109/COMST.2017.2746083 |
OpenSky Citable URL | https://n2t.net/ark:/85065/d7kh0qxc |
CISL Affiliations | TDD, ASAP |