HOMME trace analysis [presentation]

Mizero, F., Dennis, J. M., Veeraraghavan, M., Russell, R., Liu, Q. (2014). HOMME trace analysis [presentation].

Title HOMME trace analysis [presentation]
Genre Conference Material
Author(s) Fabrice Mizero, John M. Dennis, M. Veeraraghavan, R. Russell, Q. Liu
Abstract The High-Order Method Modeling Environment (HOMME) is the default dynamical core within CAM, and consumes a significant fraction of the total cycles on Yellowstone. While HOMME has a scalable communication library that has been shown to scale to large processor counts, its scalability on Yellowstone's InfiniBand network has shown unexpected performance variability. Our work during the summer has consisted of analyzing several traces from HOMME runs to determine the source of this high variability in network latency. We have developed a statistics-based methodology that consists of monitoring congestion and analyzing message arrival times using Extrae. We have applied this method to look at two different suspected causes of performance variability: OS jitter and network congestion. OS jitter introduces delays in an application's execution due to kernel interrupts. Network congestion increases communication time due to queuing delays within the InfiniBand network. We present the results of our analysis and suggest a potential solution that uses virtual lanes to prevent congestion due to head-of-line blocking.
Publication Date Jul 31, 2014
OpenSky Citable URL https://n2t.org/ark:/85065/d73t9jvk
CISL Affiliations CISLAODEPT, TDD, ASAP