Authors
Giuseppe Tagliavini, Daniele Cesarini, Andrea Marongiu
Publication date
2018/3/12
Journal
IEEE Transactions on Parallel and Distributed Systems
Volume
29
Issue
9
Pages
2150-2163
Publisher
IEEE
Description
In recent years, programmable many-core accelerators (PMCAs) have been introduced in embedded systems to satisfy stringent performance/Watt requirements. This has increased the need for programming models capable of effectively leveraging hundreds to thousands of processors. Task-based parallelism has the potential to provide such capabilities, offering high-level abstractions to outline abundant and irregular parallelism in embedded applications. However, efficiently supporting this programming paradigm on embedded PMCAs is challenging, due to the large time and space overheads it introduces. In this paper, we describe a lightweight OpenMP tasking runtime environment (RTE) design for a state-of-the-art embedded PMCA, the Kalray MPPA 256. We provide an exhaustive characterization of the costs of our RTE, considering both synthetic workloads and real programs, and we compare to several …
Total citations
[Chart: citations per year, 2018 to 2024]
Scholar articles
G Tagliavini, D Cesarini, A Marongiu - IEEE Transactions on Parallel and Distributed Systems, 2018