Authors
Elena Agostini, Davide Rossetti, Sreeram Potluri
Publication date
2017/5/14
Conference
2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)
Pages
248-257
Publisher
IEEE
Description
NVIDIA GPUDirect is a family of technologies aimed at optimizing data movement among GPUs (P2P) or between GPUs and third-party devices (RDMA). GPUDirect Async, introduced in CUDA 8.0, is a new addition which allows direct synchronization between GPU and third party devices. For example, Async allows an NVIDIA GPU to directly trigger and poll for completion of communication operations queued to an InfiniBand Connect-IB network adapter, removing CPU involvement from the critical path in GPU accelerated applications. In this paper, we present the building blocks of GPUDirect Async and explain the supported usage models of this new technology. We also present a performance evaluation using a micro-benchmark and a synthetic stencil benchmark. Finally, we demonstrate the use of Async in a few multi-GPU MPI applications: HPGMG-FV (geometric multi-grid), achieving up to 25% improvement …
Total citations
201720182019202020212022202320241614342
Scholar articles
E Agostini, D Rossetti, S Potluri - 2017 17th IEEE/ACM International Symposium on …, 2017