Abstract: This report describes NESL, a strongly-typed, applicative, data-parallel language. NESL is intended to be used as a portable interface for programming a variety of parallel and vector computers, and as a basis for teaching parallel algorithms. Parallelism is supplied through a simple set of data-parallel constructs based on sequences, including a mechanism for applying any function over the elements of a sequence in parallel and a rich set of parallel functions that manipulate sequences.
NESL fully supports nested sequences and nested parallelism---the ability to take a parallel function and apply it over multiple instances in parallel. Nested parallelism is important for implementing algorithms with irregular nested loops (where the inner loop lengths depend on the outer iteration) and for divide-and-conquer algorithms. NESL also provides a performance model for calculating the asymptotic performance of a program on various parallel machine models. This is useful for estimating running times of algorithms on actual machines and, when teaching algorithms, for supplying a close correspondence between the code and the theoretical complexity.
This report defines NESL and describes several examples of algorithms coded in the language. The examples include algorithms for median finding, sorting, string searching, finding prime numbers, and finding a planar convex hull. NESL currently compiles to an intermediate language called VCODE, which runs on vector multiprocessors (the CRAY C90 and J90), distributed memory machines (the IBM SP2, Intel Paragon, and Connection Machine CM-5), and sequential workstations. For many algorithms, the current implementation gives performance close to optimized machine-specific code for these machines.
Note: This report is an updated version of CMU-CS-92-103, which described version 2.4 of the language, and of CMU-CS-93-129, which described version 2.6 of the language.