ChordMap: Automated Mapping of Streaming Applications onto CGRA

Abstract

Streaming applications, which consist of several communicating kernels, are ubiquitous in embedded computing systems. The synchronous data flow (SDF) model is commonly used to capture the complex communication patterns among the kernels. General-purpose processors cannot meet the throughput requirements of the compute-intensive kernels in current and emerging applications. Coarse-grained reconfigurable arrays (CGRAs) are well suited to accelerating individual kernels, and compiler technology for mapping a single kernel onto a CGRA accelerator is well developed. However, the system-level mapping of an entire streaming application onto a resource-constrained CGRA to maximize throughput remains unexplored. We introduce a novel CGRA mapper, called ChordMap, that automatically generates high-quality mappings of streaming applications, represented as SDF graphs, onto CGRAs. We propose an optimized spatio-temporal mapping with modulo scheduling that judiciously employs concurrent execution of multiple kernels to improve parallelism and thereby maximize throughput. ChordMap achieves, on average, 1.74× higher throughput across eight streaming applications compared to the state of the art.
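
For readers unfamiliar with SDF, the following is a minimal sketch of the standard balance-equation computation that yields how many times each kernel must fire per steady-state iteration. It illustrates the SDF model only and is not ChordMap's mapping algorithm; the function name `repetition_vector`, the edge tuple layout, and the three-kernel example graph are assumptions made for illustration. It requires Python 3.9 or later for multi-argument `math.lcm` and `math.gcd`.

```python
from fractions import Fraction
from math import gcd, lcm


def repetition_vector(actors, edges):
    """Return the smallest positive integer firing counts for an SDF graph.

    actors: list of kernel names (hypothetical example input).
    edges:  list of (src, dst, produced, consumed) tuples, where `src`
            produces `produced` tokens per firing and `dst` consumes
            `consumed` tokens per firing on that channel.
    """
    # Balance equation per edge: q[src] * produced == q[dst] * consumed.
    neighbours = {a: [] for a in actors}
    for src, dst, produced, consumed in edges:
        neighbours[src].append((dst, Fraction(produced, consumed)))
        neighbours[dst].append((src, Fraction(consumed, produced)))

    # Propagate relative firing rates from an arbitrary start actor.
    rate = {actors[0]: Fraction(1)}
    stack = [actors[0]]
    while stack:
        a = stack.pop()
        for b, ratio in neighbours[a]:
            expected = rate[a] * ratio
            if b not in rate:
                rate[b] = expected
                stack.append(b)
            elif rate[b] != expected:
                raise ValueError("inconsistent SDF graph: no valid schedule")

    if len(rate) != len(actors):
        raise ValueError("SDF graph is not connected")

    # Scale the rational rates to the smallest integer solution.
    scale = lcm(*(r.denominator for r in rate.values()))
    counts = {a: int(rate[a] * scale) for a in actors}
    common = gcd(*counts.values())
    return {a: n // common for a, n in counts.items()}


# Hypothetical three-kernel pipeline: A -(2:3)-> B -(1:2)-> C
example = repetition_vector(
    ["A", "B", "C"],
    [("A", "B", 2, 3), ("B", "C", 1, 2)],
)
print(example)
```

In this example, kernel A fires three times, B twice, and C once per iteration, so every channel returns to its initial token count. A system-level mapper has to account for such differing firing counts when dividing a resource-constrained array among kernels.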

Publication
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems