Introduction
Intermediate languages are used by many modern compilers. Typically they are produced by a compiler's front end, which handles parsing and error checking for a particular high-level language, and are consumed by the back end, which handles code generation for a particular machine architecture. Intermediate languages simplify the inevitable process of porting a compiler to a new architecture by enabling the developer to re-use the front end of the compiler. Creating a new high-level language is also made easier if an existing intermediate language is used, because back ends can then be reused.
Choosing the right intermediate language is an important decision for language developers and compiler writers. Ideally, an intermediate language should be simple enough to serve as a good compiler target, while at the same time allowing for efficient execution on a range of platforms. Moreover, it should insulate the compiler writer from low-level issues such as error checking and memory management. This paper was inspired by the observation that the recently developed Java programming language [14] appears to possess all of these characteristics. We wanted to know whether Java would make a good intermediate language for current and future compilers.
Java has several attractions as an intermediate language. The first is the design of the language itself. Because Java is strongly typed, a compiler that targets it is easier to debug: many code generator bugs will trigger type errors when the generated Java code is compiled. In contrast, if a weakly typed intermediate language were used, these bugs might not be found until runtime. Java also provides garbage collection, thereby removing a large source of potential memory-management bugs in the generated code. In addition, if the language being implemented must itself provide garbage collection, the developer can simply push the responsibility down to Java.
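To make the type-checking argument concrete, the following is a minimal sketch of what generated Java code might look like. The class name GeneratedSketch and the methods addInts and lessThan are hypothetical names chosen for illustration; they are not taken from any actual code generator. The commented-out line shows the kind of mistake a buggy generator could emit, which the Java compiler would reject rather than leaving it to fail at runtime.

```java
import java.util.Arrays;

// Hypothetical output of a code generator targeting Java. The class and
// method names are illustrative assumptions, not part of any real system.
public class GeneratedSketch {

    // Elementwise addition over integer vectors.
    static int[] addInts(int[] a, int[] b) {
        int[] result = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = a[i] + b[i];
        }
        return result;
    }

    // Elementwise comparison producing a boolean vector.
    static boolean[] lessThan(int[] a, int[] b) {
        boolean[] result = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = a[i] < b[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 3};
        int[] y = {10, 20, 30};

        boolean[] flags = lessThan(x, y);
        int[] sums = addInts(x, y);

        // A generator bug that confused the two temporaries would emit:
        //     int[] bad = addInts(flags, y);
        // javac rejects that line because boolean[] cannot be converted
        // to int[], so the bug surfaces before the program ever runs.

        System.out.println(Arrays.toString(sums));   // prints [11, 22, 33]
        System.out.println(Arrays.toString(flags));  // prints [true, true, true]
    }
}
```

In a weakly typed target, the equivalent of the commented-out call would typically be accepted and would misbehave only when executed, which is the contrast the paragraph above draws.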
Another attraction of Java is that it is a portable, network-aware language. The details of machine architecture, operating system, and display environment are all handled transparently by the Java virtual machine [15]. The same Java program can run on a Unix workstation, a PC, and a Macintosh, while retaining the same "look and feel" on each platform. Using Java as an intermediate language also allows programs to be distributed in an executable form (Java bytecode) over the Internet.
Finally, Java is a highly successful commercial product. This fact has huge advantages for a language developer. In particular, there is a whole industry devoted to porting Java to new platforms, improving Java compilers and run-time systems, writing libraries of Java objects, and fixing bugs. The developer would normally have to do all of these chores if a special-purpose intermediate language were used.
Given all these advantages, what might stop us from using Java as an intermediate language? There are three main questions to consider:
- Is Java easy to use in a new or existing system?
- Does Java provide sufficient functionality to model the features of the source language?
- Can the resulting programs be efficiently executed by a Java virtual machine?
To try to answer these questions, we built a system that translates VCODE [5] (a specialized intermediate language for the high-level parallel language NESL [4]) into Java, and performed a series of benchmarks to compare this new implementation with the original.
The rest of this paper is organized as follows. Section 2 gives an overview of NESL, VCODE, and the current NESL system. Section 3 describes the translation of VCODE, its run-time system, and its libraries into Java. Section 4 discusses our experiences in building the system and outlines additional optimizations that we incorporated into the final version. Section 5 presents benchmark results, and Section 6 describes related projects. Finally, Section 7 summarizes the work and our conclusions.