Abstract: Data-parallel programming languages have many desirable features, such as single-thread semantics and the ability to express fine-grained parallelism. However, it is challenging to implement such languages on conventional MIMD multiprocessors, because these machines incur a high overhead for small grain sizes. This paper presents compile-time analysis techniques for data-parallel program graphs that reduce these overheads in two ways: by stepping up the grain size, and by relaxing the synchronous nature of the computation without altering the program semantics. The algorithms partition the program graph into "clusters" of nodes such that all nodes in a cluster have the same loop structure, and further refine these clusters into "epochs" based on generation and consumption patterns of data vectors. This converts the fine-grain parallelism in the original program to medium-grain loop parallelism, which is better suited to MIMD machines. A compiler has been implemented based on these ideas. We present performance results for data-parallel kernels analyzed by the compiler and converted to single-program multiple-data (SPMD) code running on an Encore Multimax.
@inproceedings{Chatterjee91a,
  author    = "Siddhartha Chatterjee and Guy~E. Blelloch and Allan~L. Fisher",
  title     = "Size and Access Inference for Data-Parallel Programs",
  booktitle = "ACM SIGPLAN '91 Conference on Programming Language Design and Implementation",
  pages     = "130--144",
  month     = jun,
  year      = 1991
}