HOMME trace analysis [presentation]
Mizero, F., Dennis, J. M., Veeraraghavan, M., Russell, R., Liu, Q. (2014). HOMME trace analysis [presentation].
Title | HOMME trace analysis [presentation] |
---|---|
Genre | Conference Material |
Author(s) | Fabrice Mizero, John M. Dennis, M. Veeraraghavan, R. Russell, Q. Liu |
Abstract | The High-Order Method Modeling Environment (HOMME) is the default dynamical core within CAM, and consumes a significant fraction of the total cycles on Yellowstone. While HOMME has a scalable communication library that has been shown to scale to large processor counts, its performance on Yellowstone's InfiniBand network has shown unexpected variability. Our work during the summer has consisted of analyzing several traces from HOMME runs to determine the source of this high variability in network latency. We have developed a statistics-based methodology that consists of monitoring congestion and analyzing message arrival times using Extrae. We have applied this method to look at two different suspected causes of performance variability: OS jitter and network congestion. OS jitter introduces delays in an application's execution due to kernel interrupts. Network congestion increases communication time due to queuing delays within the InfiniBand network. We present the results of our analysis and suggest a potential solution that uses virtual lanes to prevent congestion due to head-of-line blocking. |
Publication Title | |
Publication Date | Jul 31, 2014 |
Publisher's Version of Record | |
OpenSky Citable URL | https://n2t.org/ark:/85065/d73t9jvk |
CISL Affiliations | CISLAODEPT, TDD, ASAP |