Keywords: Andrew User Interface System, AUIS, Andrew Toolkit, ATK, C++, object-oriented toolkit, Ness scripting language, recursive embedding, integrated user interface system, compound documents, Object Linking and Embedding, OLE
Some readers may recognize AUIS by the name Andrew Toolkit or ATK. The Andrew Toolkit remains a component of AUIS and provides a compound-document architecture and tools for building applications and new objects [Borenstein, 1990; Sherman, 1991; Palay, 1992]. However, AUIS is considerably more; beyond ATK it includes complete editor applications for word processing, source editing, drawings, equations, spreadsheets, fonts, preferences, and more. The most elaborate application is the Andrew Message System, a full-featured, MIME-compatible system for reading, composing, and managing mail and bulletin boards. AUIS is an open system with the source code distributed under the X tape license.
A hallmark of AUIS is its architecture for recursive embedding of objects,
which means that the object for one variety of information may be included
within that of another. Figure 1, for example, shows several types
of object embedded in a spreadsheet, which is in turn embedded in a
document.
Other toolkits are available for Unix and X, but none offer the scope
of Andrew. Motif, OpenLook, the Athena widget set, and other widget
sets provide "interactors" which each manage a small rectangle and provide
callbacks to the application when the user operates the interactor.
The newest and most publicized integrated user interface system is
the Object Linking and Embedding (OLE) component of Microsoft's Windows
version 3.1 [Microsoft, 1992]. This system supports recursive embedding
and even provides that embedded objects may execute in a separate process.
One user interface system that is already in Unix/X/C++ is the Fresco
project [Linton, 1993], which is based on InterViews [Linton, 1989]
and also on ideas from AUIS. Fresco is still far from complete. The
plan is that applications will not be part of Fresco, but will be created
by interested vendors.
Each visible AUIS object is implemented internally as two objects, one derived from the "dataobject" class and the other from the "view" class. Dataobjects retain the information to be displayed and are responsible for reading/writing the information from/to a datastream. Views are responsible for displaying the information from a dataobject within a rectangle on the screen. They also handle interaction with the user and printing. Splitting visible objects into two internal objects has the advantage that there can be multiple views on a single data object, as can happen when a document is viewed in two windows or when, for instance, a spreadsheet is viewed as both a table and a pie chart.
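The split between data and display can be sketched in a few lines of modern C++. This is only an illustration under stated assumptions: the names DataObject, View, NumberData, DigitView, and BarView are hypothetical and are not the AUIS declarations. The sketch shows one data object being rendered by two views that are both notified when the data changes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the "dataobject" and "view" roles.
class View;

class DataObject {
public:
    virtual ~DataObject() = default;
    void AddObserver(View *v) { observers.push_back(v); }
    void NotifyObservers();            // ask every attached view to redisplay
private:
    std::vector<View *> observers;
};

class View {
public:
    virtual ~View() = default;
    virtual void Update() = 0;         // re-render from the data object
    std::string rendered;              // stands in for the screen rectangle
};

void DataObject::NotifyObservers() {
    for (View *v : observers) v->Update();
}

// A data object holding one non-negative number.
class NumberData : public DataObject {
public:
    void Set(int v) { assert(v >= 0); value = v; NotifyObservers(); }
    int Get() const { return value; }
private:
    int value = 0;
};

// Two different views on the same data.
class DigitView : public View {
public:
    explicit DigitView(NumberData *d) : data(d) { assert(d); d->AddObserver(this); }
    void Update() override { rendered = std::to_string(data->Get()); }
private:
    NumberData *data;
};

class BarView : public View {
public:
    explicit BarView(NumberData *d) : data(d) { assert(d); d->AddObserver(this); }
    void Update() override {
        rendered = std::string(static_cast<std::size_t>(data->Get()), '#');
    }
private:
    NumberData *data;
};

// Usage: one change to the data reaches both views.
void check_views() {
    NumberData d;
    DigitView digits(&d);
    BarView bars(&d);
    d.Set(3);
    assert(digits.rendered == "3");
    assert(bars.rendered == "###");
}
```

Keeping the value only in the data object is what lets any number of views, in any number of windows, stay consistent.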
The heart of ATK is its architecture for recursive embedding of objects. In practice, the embedding is represented by a tree of view objects where the window is the root and each object is the parent of those it contains. The architecture defines methods on views that a parent calls on the child to pass events and other methods which a child calls on a parent to request future events. With these methods the parent and child negotiate the sharing of resources such as screen space, keyboard, mouse, menu, other user input devices, data stream space, printed page space, execution time and memory, extension language interfaces, and so on.
The HelloWorld application typically just displays HelloWorld and quits. The simplest version in AUIS, as shown in Figure 2, does much more: it displays on screen, prints, can be edited, and is available for cut/copy/paste. Only a few more lines would be required to change "Hello" to bold-italic.
/* helloworldapp.H */
#include <application.H>
class helloworldapp : public application {
public:
/* helloworldapp.C */
#include <andrewos.h>
#include <helloworldapp.H>
#include <im.H>
#include <text.H>
#include <textview.H>
ATKdefineRegistry(helloworldapp, application, NULL);
helloworldapp::helloworldapp(){}
boolean helloworldapp::Start(){
Unlike HelloWorld, typical applications display a variety of objects scattered through a substrate widget such as a text, table, or drawing. Figure 3 shows the code to place a single button within an Andrew 'layout' object; each additional object requires similar code.
/* make the NextPage button */
NextPageComp = dobj->CreateComponent();
dobj->FillInComponent("cel", NextPageComp);
cel = (struct cel *)(NextPageComp->data);
cel->SetObjectByName("pushbutton");
cel->SetViewName(NULL, TRUE);
cel->SetRefName("Nextpage");
((struct pushbutton *)cel->GetObject())->SetText("Next Page");
dolay->SetComponentSize(NextPageComp, W-92, 0, 91, 40);
One facility of AUIS is an application called createinset which generates
the source code and help files for a new object with a given name.
Although this new object is functional, its real role is to be modified
to provide some new service. Such modification is usually easier than
creation of a new object from scratch. As part of the conversion to
C++, createinset was modified so it now produces code for C++ objects.
Inline procedures are the appropriate conversion for many operations
that were done with preprocessor macros in C. In general, creating
a function from a macro may not be possible. However, many macros
in AUIS were simple functions to access or assign to object components;
the converter automatically transforms these to the appropriate in-line
code. The macros
Multiple inheritance would have aided in several cases in the original Andrew development. One obvious case is that of views and graphics. Views are derived from a storage management hierarchy while graphics are derived from a hierarchy intended to allow the system to utilize multiple window systems. It was not practical to have either base class derived from the other, and yet we wanted the convenience of doing graphics to view objects. With multiple inheritance, we could have defined a class that inherited from both bases; lacking it, we devised a macro-kludge wherein each method of a graphic is defined as a macro function offered by views. One consequence is the need to modify both classes for any change to a graphics method. It may be too late to revise the system to exploit multiple inheritance, but other opportunities to do so will arise.
In order to determine which function calls in the C code needed to be converted to method calls in C++, the converter must determine which methods are defined in each class in the original system. This information is extracted from the original header files by one script, called C++Index. The main script--called C++Conv--is then invoked with the names of a collection of .c files. These and their corresponding header files are converted to C++ by local syntactic transformations.
Processing of .c files affected primarily function declarations and
method calls. It was the convention in the old code that functions
were declared with names of the form class__name, that is, the class
name followed by two underlines and then the name of the specific function.
Such declarations were usually converted by changing the double underline
to a double colon: class::name. In addition, the declaration was changed
from old C with parameters declared after the right parenthesis to
ANSI C with declarations in-line within the header:
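A minimal sketch of this rewrite follows. The class counter and the function Add are hypothetical and are not taken from AUIS; the old form appears only in a comment, and the sketch assumes that the explicit self parameter becomes the implicit this of the member function.

```cpp
#include <cassert>

// Old Class-C style, shown only as a comment (hypothetical names):
//
//   long counter__Add(self, n)
//       struct counter *self;
//       long n;
//   {
//       self->total += n;
//       return self->total;
//   }

// After conversion: "__" becomes "::" and the parameter types move
// into the parameter list.
class counter {
public:
    long Add(long n);
    long total = 0;
};

long counter::Add(long n)
{
    this->total += n;
    return this->total;
}

// Usage: the converted function is an ordinary member function.
void check_counter() {
    counter c;
    assert(c.Add(2) == 2);
    assert(c.Add(3) == 5);
}
```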
Method calls were formerly disguised as function calls with the affected
object as the first argument:
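As an illustration only, the sketch below assumes the old call was spelled as the class name, an underscore, and the method name, with the object passed first. The class label and its methods are hypothetical; the old forms are kept as comments next to the converted calls.

```cpp
#include <cassert>
#include <string>

// Hypothetical class used to show the call rewrite.
class label {
public:
    void SetText(const std::string &t) { text = t; }
    const std::string &GetText() const { return text; }
private:
    std::string text;
};

std::string relabel(label *l)
{
    // Old form (assumed):  label_SetText(l, "Next Page");
    l->SetText("Next Page");
    // Old form (assumed):  return label_GetText(l);
    return l->GetText();
}

// Usage: the object moves from first argument to call target.
void check_relabel() {
    label l;
    assert(relabel(&l) == "Next Page");
}
```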
A crucial trick in the conversion simplified parsing: a first pass converted every comment and string to a fixed length value containing an index into a table. This meant that pattern match searches during processing would find no spurious matches within comments or strings. After processing, the fixed length values were re-expanded to their original values.
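The masking pass can be sketched as below. This is not the converter's code (the text describes it as a set of scripts); it is a C++ illustration under stated assumptions: the token format "@@" plus six digits plus "@@" is invented here, the input is assumed not to contain "@@" already, and character literals such as '"' are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Every placeholder has the same width: "@@" + 6 digits + "@@".
const std::size_t kTokenLen = 10;

static std::string make_token(std::size_t index) {
    assert(index < 1000000);           // six digits must be enough
    char buf[16];
    std::snprintf(buf, sizeof buf, "@@%06zu@@", index);
    return buf;
}

// Replace each /* comment */ and "string" by a fixed-length token and
// store the original text in `table`.
std::string mask(const std::string &src, std::vector<std::string> &table) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        std::size_t end = std::string::npos;
        if (src.compare(i, 2, "/*") == 0) {
            std::size_t close = src.find("*/", i + 2);
            end = (close == std::string::npos) ? src.size() : close + 2;
        } else if (src[i] == '"') {
            std::size_t j = i + 1;
            while (j < src.size() && src[j] != '"') {
                if (src[j] == '\\') ++j;   // skip the escaped character
                ++j;
            }
            end = (j < src.size()) ? j + 1 : src.size();
        }
        if (end == std::string::npos) {
            out += src[i++];               // ordinary code: copy through
        } else {
            table.push_back(src.substr(i, end - i));
            out += make_token(table.size() - 1);
            i = end;
        }
    }
    return out;
}

// Expand every token back to the text it replaced.
std::string unmask(const std::string &masked, const std::vector<std::string> &table) {
    std::string out;
    std::size_t i = 0;
    while (i < masked.size()) {
        if (masked.compare(i, 2, "@@") == 0 && i + kTokenLen <= masked.size()) {
            std::size_t index = std::stoul(masked.substr(i + 2, 6));
            assert(index < table.size());
            out += table[index];
            i += kTokenLen;
        } else {
            out += masked[i++];
        }
    }
    return out;
}

// Usage: "__" inside the string and the comment is hidden from pattern
// matching, while "__" in real code stays visible.
void check_mask() {
    std::vector<std::string> table;
    const std::string src = "x = f(\"a__b\"); /* c__d */ y__z();";
    const std::string m = mask(src, table);
    assert(m == "x = f(@@000000@@); @@000001@@ y__z();");
    assert(unmask(m, table) == src);
}
```

Because every token has the same length and contains no identifier characters that the rewrite patterns look for, the transformations can run on the masked text without special cases.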
The converter did not attempt global parsing of the C code because
this would only have been necessary to determine precise type information.
Instead, the compiler was employed as an adjunct to the converter
to find all the type errors. These were then corrected manually.
In many cases the correction was a revision of the code that went well
beyond what any converter could have done.
Where C++ lacked services, they have been implemented in a base class, ATK, from which all Andrew Toolkit objects must derive. (Objects derived from other bases or none at all can certainly be used, but they will not have these services.)
Class initialization. In monolithic systems, the main program
can call on an initialization function in each class. In AUIS, the
main program is not necessarily aware of all classes that will be utilized
during an execution; it could be considerably wasteful to initialize
several hundred unused classes. (And it may be impossible for the
main program to initialize all dynamically loadable classes.) In Class-C,
each class had a class procedure InitializeClass that was automatically
called before any execution of code in the class. To emulate this
mechanism in C++, the convention is that a class is initialized the
first time one of its functions is called. Each constructor and static
member function must include as its first operation the statement:
We considered introducing file scope objects whose constructors would implement the initialization for a class. However, several current C++ implementations construct all file scope objects before main is called. Since ATK is a very large system, we believed that lazy execution of class initialization code was required.
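The lazy scheme can be sketched as follows. The class ruler and the helper EnsureClassInitialized are hypothetical, and the sketch does not reproduce the actual statement that AUIS requires; it only shows the idea that every constructor and static member function begins by making sure the class initializer has run exactly once.

```cpp
#include <cassert>

// Hypothetical class with a one-time class initializer.
class ruler {
public:
    ruler() { EnsureClassInitialized(); }
    static int DefaultWidth() {
        EnsureClassInitialized();
        assert(default_width > 0);     // set by InitializeClass
        return default_width;
    }
    static int init_count;             // how many times InitializeClass ran
private:
    static void InitializeClass() { default_width = 80; ++init_count; }
    static void EnsureClassInitialized() {
        static bool done = false;
        if (!done) { done = true; InitializeClass(); }
    }
    static int default_width;
};
int ruler::init_count = 0;
int ruler::default_width = 0;

// Usage: nothing runs until the class is first touched, and the
// initializer runs only once however many entry points are used.
void check_lazy_init() {
    assert(ruler::init_count == 0);
    ruler a, b;
    assert(ruler::init_count == 1);
    assert(ruler::DefaultWidth() == 80);
    assert(ruler::init_count == 1);
}
```

A class that is never used costs nothing at startup, which is the property the file scope approach could not guarantee.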
Creation by type name. When an AUIS data stream is read in, the type of each subordinate object is denoted with a character string giving the name of the object class. The ATK class provides a static member function ATK::NewObject whose argument is a character string and whose value is an object of the named class.
In order to implement ATK::NewObject, each class of objects must be registered. A class definition prepares for this by including in the .C file a call on the ATKdefineRegistry macro. This macro creates a table entry which is installed in a central table when the main program or loader calls ATKregister for the class.
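Creation by name through a registry can be sketched in miniature as below. None of these names are the AUIS declarations: Base, RegistryEntry, RegisterEntry, NewObject, and the DEFINE_REGISTRY macro are hypothetical stand-ins for the roles the text assigns to ATK, the table entry, ATKregister, ATK::NewObject, and ATKdefineRegistry.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical root class of all registered classes.
class Base {
public:
    virtual ~Base() = default;
    virtual const char *TypeName() const = 0;
};

struct RegistryEntry {
    const char *name;
    Base *(*create)();        // makes a new instance of the class
    RegistryEntry *next;      // next class in the central table
};

static RegistryEntry *registry_head = nullptr;

// Install an entry in the central table (idempotent).
void RegisterEntry(RegistryEntry *e) {
    assert(e != nullptr);
    for (RegistryEntry *p = registry_head; p; p = p->next)
        if (p == e) return;   // already registered
    e->next = registry_head;
    registry_head = e;
}

// Create an object from the name of its class.
Base *NewObject(const char *name) {
    for (RegistryEntry *e = registry_head; e; e = e->next)
        if (std::strcmp(e->name, name) == 0) return e->create();
    return nullptr;           // unknown or unregistered class
}

// Placed in the source file of each class: builds the table entry.
#define DEFINE_REGISTRY(cls) \
    static Base *cls##_create() { return new cls; } \
    static RegistryEntry cls##_entry = { #cls, cls##_create, nullptr }

class Text : public Base {
public:
    const char *TypeName() const override { return "Text"; }
};
class Table : public Base {
public:
    const char *TypeName() const override { return "Table"; }
};
DEFINE_REGISTRY(Text);
DEFINE_REGISTRY(Table);

// Usage: what a data stream reader would do with a class name.
void check_registry() {
    RegisterEntry(&Text_entry);
    RegisterEntry(&Table_entry);
    Base *obj = NewObject("Table");
    assert(obj != nullptr);
    assert(std::strcmp(obj->TypeName(), "Table") == 0);
    delete obj;
    assert(NewObject("Figure") == nullptr);
}
```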
Object initialization. The Class-C method of initializing objects was for each class to provide an initialization function with the following signature:
Obviously, returning a boolean to indicate success or failure does not map directly onto C++ constructors, which have no return type. The expected way for a constructor to fail is to throw an exception. However, exceptions are not yet widely implemented; indeed, initially none of the C++ compilers available to us had working exceptions. For the time being, InitializeObject methods are converted to constructors, and the return statements are converted to a macro which simply prints an error message if the value passed indicates failure.
Header file incompatibilities. Header files in various compilers and operating systems are incompatible with C++. For example, the IBM AIX 3.1.5 header files used the keyword "new" as a parameter name. The Cfront 2.1 and GNU C++ implementations we had available did not remedy this situation. Moreover, some functions which aren't specified by any standard simply had no prototype. One result was that we factored out network socket code into C source files. We also adopted a coding standard requiring inclusion of andrewos.h as the first included file. This header file includes a number of standard system header files, doing whatever is necessary to address any failings of the header files with respect to C++ compatibility. The Class-C to C++ converter imposed this standard on all converted source files.
Nested types. An attempt was made to use nested types. In particular several classes used function pointers for callbacks, and it seemed sensible to provide typedefs for these pointer types within the class declaration. This approach was abandoned when we discovered the GNU C++ compiler had several bugs in this area and the Cfront 2.1 compiler didn't implement nested types at all. We settled for manual name scoping of the typedefs by prepending classname_ to the typedefs.
Inheritance. In C++ all names in the scope of a class are inherited by derived classes. In Class-C, however, only ordinary methods were inherited; access to data members of base classes was via a "header" member as in self->header.dataobject.id. Conversion led to some silent name clashes between base and derived classes. Clashes with class procedure names were harmless, since our original code would never try to access the wrong version of the function. Data conflicts were hazardous, however, since initially the converter lost the information about which instance of the variable name was desired. Constructs like self->header.dataobject.id were converted to self->id, so a derived class version of id might be silently substituted for the base class version. To resolve the problem, the converter was fixed to retain the information; the example converts to this->dataobject::id.
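The difference between the two converted forms can be seen in a small sketch. The class note and the member values are hypothetical; only the qualified form this->dataobject::id comes from the text.

```cpp
#include <cassert>

// Hypothetical classes mirroring the name clash.
class dataobject {
public:
    long id = 1;              // base-class member
};

class note : public dataobject {
public:
    long id = 2;              // derived-class member hiding the base one

    long DerivedId() { return this->id; }               // finds note::id
    long BaseId()    { return this->dataobject::id; }   // finds dataobject::id
};

// Usage: both members coexist in one object; only the qualified
// name reaches the base-class one.
void check_ids() {
    note n;
    assert(n.DerivedId() == 2);
    assert(n.BaseId() == 1);
}
```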
Name scope. The introduction of nested types in some compilers broke some code where structures, enums, and unions were defined inside structures or classes, or where the first reference to a type was within a struct or class. These nested types were now placed in the scope of the struct or class, instead of in the global scope as before. To avoid this problem the converter was modified to warn of type definitions within structs and classes. Where these occurred we manually either provided a forward declaration or moved the definition of the type outside the class.
In Class-C each class has separate name spaces for member functions (methods, macros and class procedures) and data. This led to situations where a function and a data member of a class had the same name. The converter was extended to warn about these name conflicts and they were resolved manually.
In C++ the names of data members are in the scope of member functions, effectively between file scope variables and arguments. This means that an unqualified reference to a global variable could be silently overridden by a class member of the same name. The converter resolved this by utilizing :: to ensure that any potential conflicts would be resolved automatically or result in a compiler error. In Class-C all class data references were to structure members, so the converter added :: before any name of a class member which was not preceded by '.' or '->' within the body of member functions. Unfortunately, the case of local variables shadowing class members also triggered the addition of ::, so a compiler-detectable syntax error resulted in this case. Avoiding this problem would have required full type information to distinguish between local variable declarations and statements.
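The effect of the added :: can be shown with a small sketch. The variable page_width and the class tally are hypothetical names chosen for the illustration.

```cpp
#include <cassert>

int page_width = 10;                   // file-scope variable

class tally {
public:
    int page_width = 3;                // data member with the same name

    // Unqualified: inside a member function the member wins.
    int Member() const { return page_width; }
    // Qualified with "::": always the file-scope variable.
    int Global() const { return ::page_width; }
    void SetGlobal(int v) { ::page_width = v; }
};

// Usage: the two names refer to different variables.
void check_scope() {
    tally t;
    assert(t.Member() == 3);
    assert(t.Global() == 10);
}
```

In the old code the unqualified name could only have meant the global, since members were always reached through a structure pointer; the qualifier preserves that meaning after conversion.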
Conflicts also arose from the use of structs and functions of the same name, since the compiler thought the function call was a call to the constructor. This problem was avoided by manually renaming the structure or function where possible, and by making sure that a prototype for the function was seen before the call.
Multiple inheritance. ATK is a "single root class" toolkit. In Class-C the root class didn't exist as such, but there was a struct basicobject, to which any object could be cast and still provide type information. This proved adequate since Class-C supported only single inheritance. When the ATK runtime system for C++ was designed an explicit root class seemed the best solution. Now that we have started to look at making it safe for client code to use multiple inheritance with ATK classes, we have discovered some problems. If the root class ATK is derived non-virtually, multiple instances will be included in each derived class. In order to cast a derived class pointer up to an ATK pointer in this case an explicit series of casts will be needed to pick one of the ATK instances. It would then be impossible to cast the pointer back to the original type without knowing the exact sequence of casts used to create the ATK pointer. Not only is this dangerous, but the current C++ definition makes it impossible if the derivation from ATK is virtual. (We hope that RTTI will allow down casts from a virtual base.)
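The duplicated root can be demonstrated with a short sketch. The classes Root, Left, Right, and Both are hypothetical. The last function uses dynamic_cast, which is part of the run-time type identification later standardized in C++ and is therefore newer than the compilers discussed in the text.

```cpp
#include <cassert>

// Hypothetical root and two branches, derived non-virtually.
class Root {
public:
    int tag = 0;
    virtual ~Root() = default;
};
class Left  : public Root {};
class Right : public Root {};
class Both  : public Left, public Right {};   // holds two Root subobjects

// Root *r = &b;   // would not compile: Root is an ambiguous base of Both

// An explicit series of casts picks one of the two Root instances.
Root *ViaLeft(Both *b)  { return static_cast<Left *>(b); }
Root *ViaRight(Both *b) { return static_cast<Right *>(b); }

// With standard RTTI the down cast no longer needs to know the path.
Both *Back(Root *r) { return dynamic_cast<Both *>(r); }

// Usage: the two paths reach different subobjects of the same object.
void check_roots() {
    Both b;
    assert(ViaLeft(&b) != ViaRight(&b));
    ViaLeft(&b)->tag = 1;
    assert(ViaRight(&b)->tag == 0);
    assert(Back(ViaLeft(&b)) == &b);
    assert(Back(ViaRight(&b)) == &b);
}
```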
Run-time systems. Class-C provides run time type information, virtual constructors, and dynamic object construction by class name. A separate section of this paper will address our design considerations for dynamic loading in C++.
Run time type information for ATK in C++ (of the same sort as the RTTI proposal before the C++ ANSI committee) is provided via a common base class (ATK) and a class registry.
The root class ATK provides static methods to display a message or throw an exception on failure of a constructor, create a new object given a string representing the class name, compare two classes for a base/derived class relationship by name, query by name whether a given class has been registered, load a class by name, or register a class. A single virtual function ATKregistry returns a pointer to the ATKregistryEntry for the object's class. This function is implemented by the ATKdefineRegistry macro described below. Other methods of the ATK class provide for accessing the class name of an object, creating a new object of the same class, and testing an object for a base/derived relationship with another object or class.
An ATKregistryEntry structure for each class contains the class name, a pointer to a function to create a new instance of the class, a pointer to the class initialization function, a list of the parent classes, and a pointer to the next class in the registry. Currently the run time system is limited to single inheritance. This is because the function to create a new instance returns an ATK *; without compiler support, casting it down to the appropriate type would be impossible if the class is derived from ATK multiply or virtually.
The ATKregistryEntry for a class is defined with the macro ATKdefineRegistry(classname, baseclassname, classinitfunction) in the top level of the source file implementing the class classname. The ATKregister(classname) macro is used to enter the class in the class registry. Generally a source file with a function containing the ATKregister calls is generated automatically by a program, given a list of the desired classes and/or libraries. The generated function is then called from the main() function of the program.
Future work will probably include phasing out the use of the C++ ATK run time type information system in favor of the ANSI standard support. One particular feature the C++ ATK system lacks is the safety of the proposed checked cast.
Dynamic loading. Class-C provides very flexible, on-demand dynamic loading. The header file for a class is sufficient to compile and run code utilizing it. During execution the class is dynamically loaded when the code first executes a "class procedure" (constructor or static member function) of the class. Another facility offered is to create an object of a class from a C string giving the class's name. With C++, dynamic loading is more difficult and less portable; the tricks used in Class-C involved preprocessor definition of function names, but static member functions are usually called with "::" qualifiers and there is no good way to replace them with preprocessor magic. In consequence, dynamic loading in the C++ version will be restricted to loading a class given its string name. Methods can be applied to objects of loaded classes only if they are virtual methods of a base class linked with the system.
Weak vs. strong typing. AUIS code in Class-C is primarily based on traditional, non-ANSI C without function prototypes. The C++ converter automatically added prototypes and the C++ source was modified by hand where compilation using these revealed type errors. Many such problems occurred with function pointers because the original code assumed that function pointer values are interchangeable. For instance, the "proctable" stores pairs: name and function pointer. The canonical prototype for these functions is (void foo(struct basicobject *self, long rock)). However, sometimes these functions return values and often the rock was a pointer. Casts were necessary in many places to force the code to compile. Almost always the actual function was defined to take a pointer to a derived class of view, for instance textview, figview, rasterview, and so on. However, passing a derived object in this fashion will break with multiple inheritance: it may not be possible for the recipient function to access the passed object as a view if it derives from other classes as well. We believe that the only truly type-safe solution requires templates.
Memory management. Both of the main classes derived from class ATK use reference counting. In consequence, such objects cannot be terminated with the normal C++ delete operator. Instead they must apply the inherited method Destroy. For the same reason, objects should not be declared automatic. (Pointers to objects can be automatic.)
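A reference-counted base of this kind can be sketched as below. Only the method name Destroy comes from the text; the classes Counted and Doc, the Reference method, and the use of protected destructors to reject delete and automatic objects at compile time are assumptions made for the illustration.

```cpp
#include <cassert>

// Hypothetical reference-counted base class.
class Counted {
public:
    Counted() : refs(1) {}               // the creator holds one reference
    void Reference() { ++refs; }
    // Drop one reference; the object deletes itself when none remain.
    void Destroy() {
        assert(refs > 0);
        if (--refs == 0) delete this;
    }
    int References() const { return refs; }
    static int destroyed;                // counts completed destructions
protected:
    virtual ~Counted() { ++destroyed; }
private:
    int refs;
};
int Counted::destroyed = 0;

class Doc : public Counted {
protected:
    ~Doc() override = default;           // keeps Doc off the stack as well
};

// Usage: the object survives until the last holder calls Destroy.
void check_refcount() {
    int before = Counted::destroyed;
    Doc *d = new Doc;                    // one reference
    d->Reference();                      // a second holder appears
    d->Destroy();                        // first holder lets go
    assert(d->References() == 1);
    assert(Counted::destroyed == before);
    d->Destroy();                        // last holder lets go
    assert(Counted::destroyed == before + 1);
    // Doc onStack;                      // would not compile
    // delete d;                         // would not compile
}
```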
Test 2 - New. Three thousand objects were created. This measured primarily the object creation mechanism, but some methods were called during object initialization.
Test 3 - Dup(n). A string containing styled text was concatenated
with itself n times in the form
    s := s ~ s
This tested many method calls and object creations as the
styles were copied.
Results are reported for two platforms in Table 1, where all measurements
are in seconds. Several runs were made and the lowest value is reported
as being the least likely to have been affected by other processes.
In most cases the other values were within three percent of the lowest.
Table 1. Execution times in seconds (lowest of several runs).

Platform      | Test  | C    | C++
PMAX/Ultrix   | Count | 1.08 | 1.26
PMAX/Ultrix   | New   | 1.04 | 0.80
PMAX/Ultrix   | Dup-6 | 1.26 | 1.41
PMAX/Ultrix   | Dup-8 | 4.57 | 30.32
RS6000/AIX3.2 | Count | 0.73 | 0.52
RS6000/AIX3.2 | New   | 0.83 | 0.47
RS6000/AIX3.2 | Dup-8 | 3.01 | 3.25
The Count test shows that g++ produced faster code for the RS6000 and
slower code on the PMAX. The New test, however, shows that creating
objects is faster with C++. Possibly it is using a faster malloc package.
In both cases, the Dup test showed the C code faster. On the PMAX,
the parameter was reduced from eight to six since it seemed possible
that page thrashing accounted for the discrepancy. Even the lower
parameter, however, showed that the C code was faster.
If you do acquire the Andrew User Interface System, you will find yourself
with an excellent environment for word processing, editing program
source text, and many other realms. You will also have the capability
to extend this environment in new and imaginative ways. If you do
the latter, we would be delighted to have you submit your work for
incorporation into the AUIS distribution so it can be enjoyed by all.
Hansen, Wilfred J., Enhancing documents with embedded programs: How Ness extends insets in the Andrew Toolkit, Proceedings of IEEE Computer Society 1990 International Conference on Computer Languages, March, 1990, New Orleans, IEEE Computer Society Press (Los Alamitos, CA) 23-32.
Hansen, W. J., Subsequence References: First Class Values for Substrings, ACM Trans. Prog. Lang. and Sys. 14, 4, Oct. 1992.
Linton, M. A., J. M. Vlissides, P. R. Calder, Composing user interfaces with InterViews, IEEE Computer 22(2), February, 1989, 8-22.
Linton, M. and C. Price, Building Distributed User Interfaces with Fresco, Proceedings of the 7th X Technical Conference, Boston, Massachusetts, January, 1993, pp.77-87.
Microsoft Corp., Object Linking and Embedding (OLE), Part No. 098-31727, Microsoft Corp. (Redmond, WA, 1992).
Palay, Andrew J., Wilfred J. Hansen, et al., The Andrew Toolkit - An Overview, presented at the Usenix Conference, Dallas, TX, January, 1988.
Palay, Andrew J., Towards an "Operating System" for User Interface Components, in Multimedia Interface Design ed. Meera M. Blattner and Roger B. Dannenberg, ACM Press (New York, 1992).
Sherman, Mark, D. Anderson, W. J. Hansen, T. P. Neuendorffer, A. J.
Palay, Z. Stern, Allocation of User-Interface Resources in the Andrew
Toolkit, Proceedings of the International Conference on Multimedia
Information Systems, (Singapore) McGraw-Hill, January, 1991.