SIParCS 2021 - Leila Ghaffari
Performance Portability of Shallow Water Model with DPC++
The computational capacity of high-performance computing platforms has increased rapidly in recent years, in part due to accelerators such as GPGPUs, which can increase performance by offloading and parallelizing computations. Domain scientists are interested in using accelerators alongside traditional CPUs for parallel computations, but it can be difficult and time-consuming to port or develop multiple versions of a code that each run only on a specific architecture. A single performance-portable source code is more desirable. A number of new frameworks advertise the ability to execute the same code on CPUs or accelerators with limited or no modifications. This project aimed to assess the performance portability of the Shallow Water Model (SWM) mini-app using one prominent framework of this kind, the Intel oneAPI toolkit.
The first step in this project was to learn the core concepts of Data Parallel C++ (DPC++), the direct programming language in Intel oneAPI, by parallelizing a simple C++ vector addition code. Next, we ported the SWM mini-app to DPC++ and ran it on an Intel Xeon Skylake CPU and an Intel Xe GPU with different problem sizes. This presentation compares the performance of the DPC++ port with the original serial, OpenMP, and OpenACC versions of the SWM mini-app. The DPC++ version performed poorly on the Intel GPU but performed very well on the Intel CPU for large problem sizes.
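For readers unfamiliar with DPC++, the vector addition exercise mentioned above might look roughly like the following. This is a minimal sketch and not the project's actual code: the function name `vector_add` is hypothetical, the SYCL 2020 buffer/accessor style is one of several ways to write it, and the guard assumes the compiler predefines `SYCL_LANGUAGE_VERSION` in SYCL mode (as DPC++ does with `-fsycl`). When built with an ordinary C++ compiler, the sketch falls back to a serial loop so that the same source still compiles and runs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: a SYCL compiler (e.g. DPC++ with -fsycl) predefines this macro.
#ifdef SYCL_LANGUAGE_VERSION
#include <sycl/sycl.hpp>
#endif

// Hypothetical helper (not from the project): returns c with c[i] = a[i] + b[i].
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::vector<float> c(n);
    if (n == 0) return c;  // nothing to offload

#ifdef SYCL_LANGUAGE_VERSION
    sycl::queue q;  // default device selection (CPU or GPU, whichever is chosen)
    {
        // Buffers wrap the host arrays for the duration of this scope.
        sycl::buffer<float, 1> a_buf(a.data(), sycl::range<1>(n));
        sycl::buffer<float, 1> b_buf(b.data(), sycl::range<1>(n));
        sycl::buffer<float, 1> c_buf(c.data(), sycl::range<1>(n));

        q.submit([&](sycl::handler& h) {
            sycl::accessor A(a_buf, h, sycl::read_only);
            sycl::accessor B(b_buf, h, sycl::read_only);
            sycl::accessor C(c_buf, h, sycl::write_only, sycl::no_init);
            // One work-item per element; each adds a single pair of values.
            h.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
                C[i] = A[i] + B[i];
            });
        });
    }  // buffer destruction waits for the kernel and copies results into c
#else
    // Serial fallback: the same per-element operation as the kernel body.
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
#endif
    return c;
}
```

The kernel body is identical on every device; only the queue's device selection changes, which is the single-source property the project set out to evaluate.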
Mentors: Supreeth Suresh, Cena Miller, Jian Sun, & John Dennis
Slides and poster