Authors
Mikhail Asiatici, Paolo Ienne
Publication date
2021/6/14
Conference
2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA)
Pages
609-622
Publisher
IEEE
Description
Efficient large-scale graph processing is crucial to many disciplines. Yet, while graph algorithms naturally expose massive parallelism opportunities, their performance is limited by the memory system because of irregular memory accesses. State-of-the-art FPGA graph processors, such as ForeGraph and FabGraph, address the memory issues by using scratchpads and regularly streaming edges from DRAM, but then they end up wasting bandwidth on unneeded data. Yet, where classic caches and scratchpads fail to deliver, FPGAs make powerful unorthodox solutions possible. In this paper, we resort to extreme nonblocking caches that handle tens of thousands of outstanding read misses. They significantly increase the ability of memory systems to coalesce multiple accelerator accesses into fewer DRAM memory requests; essentially, when latency is not the primary concern, they bring the advantages expected …
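The core mechanism sketched in the description, coalescing tens of thousands of outstanding read misses so that many accelerator accesses to the same cache line generate a single DRAM request, can be illustrated with a small conceptual model. The snippet below is only a software sketch of MSHR-style (miss status holding register) coalescing under assumed parameters; it is not the authors' hardware design, and the names (NonblockingCache, dram_requests, LINE_BYTES) are hypothetical.

```python
from collections import defaultdict

LINE_BYTES = 64  # assumed cache-line size


class NonblockingCache:
    """Conceptual model of MSHR-style miss coalescing: many outstanding
    read misses to the same line produce a single DRAM request."""

    def __init__(self):
        self.lines = {}                  # resolved lines held on chip
        self.mshrs = defaultdict(list)   # pending line -> waiting request ids
        self.dram_requests = 0           # DRAM requests actually issued

    def read(self, req_id, addr):
        line = addr // LINE_BYTES
        if line in self.lines:
            return                       # hit: data already available
        if line not in self.mshrs:
            self.dram_requests += 1      # primary miss: issue one DRAM request
        self.mshrs[line].append(req_id)  # secondary misses merge into the MSHR

    def dram_response(self, line, data):
        self.lines[line] = data          # fill the line
        waiters = self.mshrs.pop(line, [])
        return waiters                   # all merged requests complete together


# Irregular accesses typical of graph traversal: 10,000 reads touching
# only 100 distinct lines translate into 100 DRAM requests, not 10,000.
cache = NonblockingCache()
for i in range(10_000):
    cache.read(i, (i % 100) * LINE_BYTES)
print(cache.dram_requests)  # -> 100
```

In this toy model, throughput comes from merging misses rather than hiding latency, which mirrors the paper's premise that when latency is not the primary concern, very deep miss handling recovers much of the bandwidth that streaming or scratchpad designs waste on unneeded data.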
Scholar articles
M Asiatici, P Ienne - 2021 ACM/IEEE 48th Annual International Symposium …, 2021