Authors
Sreeram Potluri, Anshuman Goswami, Davide Rossetti, Chris J Newburn, Manjunath Gorentla Venkata, Neena Imam
Publication date
2017/12/18
Conference
2017 IEEE 24th International Conference on High Performance Computing (HiPC)
Pages
253-262
Publisher
IEEE
Description
GPUs have become an essential component for building compute clusters with high compute density and high performance per watt. As such clusters scale to thousands of GPUs, efficiently moving data between the GPUs becomes imperative for maximum performance. NVSHMEM is an implementation of the OpenSHMEM standard for NVIDIA GPU clusters that allows communication to be issued from inside GPU kernels. In earlier work, we showed how NVSHMEM can be used to achieve better application performance on GPUs connected through PCIe or NVLink. As part of this effort, we implement InfiniBand (IB) verbs for Mellanox InfiniBand adapters in CUDA. We evaluate different design alternatives, taking into consideration the relaxed memory model, automatic memory access coalescing, and thread hierarchy on the GPU. We also consider correctness issues that arise in these designs. We take advantage of …
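To make the core idea concrete, below is a minimal sketch of device-initiated communication in the NVSHMEM style described above: a one-sided put issued from inside a GPU kernel to a symmetric buffer on a neighboring PE. It assumes the standard public NVSHMEM host/device API (nvshmem_init, nvshmem_malloc, device-side nvshmem_int_p); it is illustrative only, not the paper's benchmark code or its internal IB verbs implementation.

// Sketch: each PE writes its rank into a symmetric buffer on its
// right neighbor, entirely from inside a GPU kernel (no host MPI/SHMEM
// call on the data path). Error handling omitted for brevity.
#include <cstdio>
#include <nvshmem.h>
#include <nvshmemx.h>

__global__ void put_to_neighbor(int *dest, int my_pe, int n_pes) {
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        int peer = (my_pe + 1) % n_pes;
        // One-sided put issued from device code.
        nvshmem_int_p(dest, my_pe, peer);
    }
}

int main() {
    nvshmem_init();
    int my_pe = nvshmem_my_pe();
    int n_pes = nvshmem_n_pes();

    // Symmetric allocation: the same object exists on every PE.
    int *dest = (int *) nvshmem_malloc(sizeof(int));

    put_to_neighbor<<<1, 32>>>(dest, my_pe, n_pes);
    cudaDeviceSynchronize();
    nvshmem_barrier_all();  // ensure all remote puts have completed

    int received;
    cudaMemcpy(&received, dest, sizeof(int), cudaMemcpyDeviceToHost);
    printf("PE %d received %d\n", my_pe, received);

    nvshmem_free(dest);
    nvshmem_finalize();
    return 0;
}

Because the put is initiated by the kernel itself, the GPU's thread hierarchy and relaxed memory model (discussed in the abstract) determine how such calls must be ordered and completed when issued by many threads concurrently.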
Total citations
35 total (2018: 5, 2019: 8, 2020: 8, 2021: 3, 2022: 2, 2023: 6, 2024: 3)
Scholar articles
S Potluri, A Goswami, D Rossetti, CJ Newburn… - 2017 IEEE 24th International Conference on High …, 2017