CMU CS 17-670 Fall 2022

Project 3: Wee-O-Wasm

In this project, you will extend your WeeWasm engine to handle a dynamic object model. The objective is for you to gain familiarity with object representations, including how dynamically typed object systems are implemented.

Introduction

WebAssembly is a low-level code format with a large, linear memory, which makes it a good target for native languages like C, C++, and Rust. However, as we've seen in class, many higher-level languages compile to VMs that do not have low-level memory, but instead have an object model and bytecodes for manipulating objects. Examples are C#, Java, and Python. In such VMs, object references are opaque values, rather than addresses, and the VM manages the allocation and reclamation of objects.

Wee-O-Wasm Extension

We wish to add an object capability to Wasm that makes it an easier target for compiling object-oriented languages. While there is a proposal to add a statically typed, garbage-collected object model, it contains enough complexity (including type canonicalization) that we'll focus on a simpler, dynamically typed object model.

The externref type

We're going to reuse a mechanism that is in standard WebAssembly but not yet in WeeWasm, the opaque externref type. In standard Wasm, modules use this type in order to interface with a host language environment, like JavaScript on the web. Code in a module cannot create values of this type, since it is completely opaque, but can receive values returned from imported functions, pass these values around internally, and pass them to further imported functions. Because the value representation is opaque, Wasm does not offer instructions to load or store externref values to memory.

We'll reuse the externref type to represent references in our new object system in Wee-O-Wasm.

In standard Wasm, very few operations are available on externref beyond checking for null. In Wee-O-Wasm, however, we will define a new set of operations that create new objects, read and write properties, etc. Because they are new, these won't be standard Wasm bytecodes, but a fictional set for the purpose of this class.

Operation	Signature	Semantic description
obj.new	[] -> externref	Create a new empty object.
obj.box_i32	i32 -> externref	Create an object that represents the given signed i32 value, i.e. a boxed i32.
obj.box_f64	f64 -> externref	Create an object that represents the given double value, i.e. a boxed f64.
obj.get	[externref externref] -> externref	Get a property on an object, using another object as the key. Traps if either the object or the key is null. Traps if the object is a boxed number. Returns null if the key is not found in the object.
obj.set	[externref externref externref] -> []	Set a property on an object, using another object as the key. Traps if either the object or the key is null. Traps if the object is a boxed number.
i32.unbox	externref -> i32	Convert a previously-boxed 32-bit integer back to a signed integer. Traps if the object is null or is not a boxed i32.
f64.unbox	externref -> f64	Convert a previously-boxed double back to a double. Traps if the object is null or is not a boxed f64.
obj.eq	[externref externref] -> i32	Compare two objects for equality, where boxed numbers do not have distinguishable identity. Returns 1 if equal, 0 otherwise. Boxed numbers are never equal to null.
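The semantics in the table can be sketched as a small Python model. This is an illustration under stated assumptions, not the engine's actual representation: the names Trap, Obj, Box, and the Python function names are hypothetical, and null is modeled as None. The sketch also assumes that property keys compare the same way obj.eq does, that a boxed i32 never equals a boxed f64, and that null equals null. The table does not settle these points, nor how NaN behaves as a key.

```python
class Trap(Exception):
    """Raised wherever the table says an operation traps."""


class Obj:
    """Plain object: a property map, compared and hashed by identity."""

    def __init__(self):
        self.props = {}


class Box:
    """Boxed number; kind is "i32" or "f64". Equal by kind and value."""

    def __init__(self, kind, value):
        self.kind, self.value = kind, value

    def __eq__(self, other):
        if not isinstance(other, Box):
            return NotImplemented
        return (self.kind, self.value) == (other.kind, other.value)

    def __hash__(self):
        return hash((self.kind, self.value))


def _to_i32(v):
    # Wrap a Python int into the signed 32-bit range.
    return ((v + 2**31) % 2**32) - 2**31


def _check_target(obj, key):
    if obj is None or key is None:
        raise Trap("null object or key")
    if isinstance(obj, Box):
        raise Trap("object is a boxed number")


def obj_new():
    return Obj()


def obj_box_i32(v):
    return Box("i32", _to_i32(v))


def obj_box_f64(v):
    return Box("f64", float(v))


def obj_get(obj, key):
    _check_target(obj, key)
    return obj.props.get(key)  # None (null) when the key is absent


def obj_set(obj, key, value):
    _check_target(obj, key)
    obj.props[key] = value


def i32_unbox(ref):
    if not isinstance(ref, Box) or ref.kind != "i32":
        raise Trap("null or not a boxed i32")
    return ref.value


def f64_unbox(ref):
    if not isinstance(ref, Box) or ref.kind != "f64":
        raise Trap("null or not a boxed f64")
    return ref.value


def obj_eq(a, b):
    if a is None or b is None:
        return 1 if a is b else 0  # assumption: null equals only null
    return 1 if a == b else 0
```

Because Box defines value-based equality and hashing while Obj keeps Python's default identity semantics, two separately boxed copies of the same number find the same property, whereas two distinct plain objects are always distinct keys.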

Import "weewasm"."operation"

While these new operations make a lot of sense to add as new bytecodes, each getting a numeric opcode, we'll instead choose to encode them as imported functions. That is, a module will request access to each operation it needs by importing a function by its operation name. Once imported, each becomes part of the index space for functions and can be called directly by index, or indirectly through a table. This not only saves us the task of picking a new binary encoding for each instruction (modifying assemblers and disassemblers, etc.), but also allows these operations to be emulated in other embeddings of Wasm, like JavaScript on the web. In essence, we model each bytecode as a foreign function.

In Wasm, all imports have a two-level namespace, i.e. a module name and a member name. While neither name is interpreted by the Wasm engine itself, the names are meaningful to the host. In the JavaScript embedding of Wasm, the first name specifies a module, which is loaded from the import object supplied to the Wasm instantiation API, and the member name specifies a property within that result.

In Wee-O-Wasm, a module accesses these operations by importing functions from the "weewasm" module, using the operation names from the table above as member names. The declared signature of each import must match the signature given in the table.
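The import check described above might look like the following sketch. The signature table is copied from the operation table; everything else is hypothetical, including the names WEEWASM_SIGS, LinkError, and resolve_import, the tuple encoding of signatures, and the choice to reject a bad import at link time (the text only says the signature must match, not what happens when it does not).

```python
# Expected signatures from the operation table: (param types, result types).
WEEWASM_SIGS = {
    "obj.new": ((), ("externref",)),
    "obj.box_i32": (("i32",), ("externref",)),
    "obj.box_f64": (("f64",), ("externref",)),
    "obj.get": (("externref", "externref"), ("externref",)),
    "obj.set": (("externref", "externref", "externref"), ()),
    "i32.unbox": (("externref",), ("i32",)),
    "f64.unbox": (("externref",), ("f64",)),
    "obj.eq": (("externref", "externref"), ("i32",)),
}


class LinkError(Exception):
    """Hypothetical error for an import the engine cannot satisfy."""


def resolve_import(module_name, member_name, declared_sig):
    """Return the operation name if the import names a known operation
    with the expected signature; otherwise raise LinkError."""
    if module_name != "weewasm":
        raise LinkError(f"unknown import module {module_name!r}")
    expected = WEEWASM_SIGS.get(member_name)
    if expected is None:
        raise LinkError(f"unknown operation {member_name!r}")
    if declared_sig != expected:
        raise LinkError(f"signature mismatch for {member_name!r}")
    return member_name
```

For example, resolve_import("weewasm", "obj.eq", (("externref", "externref"), ("i32",))) returns "obj.eq", while the same import declared with an f64 result raises LinkError.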

Native Wee-O-Wasm implementation

Your task in this project is to implement these functions "natively" in your WeeWasm interpreter. Your engine effectively intrinsifies each function as a known built-in. In your implementation, you are free to implement them as actual calls to imported functions or to rewrite the bytecode to use reserved bytecodes representing each function. Note that imports can be called indirectly, so your implementation needs to handle this, too.
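One way to picture the "actual calls to imported functions" option is the sketch below, in which built-ins occupy the function index space alongside module-defined functions, so direct and indirect calls reach them through the same path. It relies on the standard Wasm rule that imported functions come first in the function index space. The names Instance, call, call_indirect, and make_builtins are hypothetical, the two built-ins are simplified stand-ins rather than real object-model code, and the signature check that call_indirect normally performs is omitted.

```python
class Trap(Exception):
    """Raised when a call traps."""


def make_builtins():
    # Simplified stand-ins for two operations; a real engine would
    # plug in its object-model implementation here.
    return {
        "obj.box_i32": lambda v: ("boxed_i32", v),
        "i32.unbox": lambda ref: ref[1],
    }


class Instance:
    def __init__(self, import_names, local_funcs, table):
        builtins = make_builtins()
        # Imported functions take the lowest indices, followed by the
        # functions defined in the module itself.
        self.funcs = [builtins[name] for name in import_names] + list(local_funcs)
        # Each table slot holds a function index, or None if empty.
        self.table = table

    def call(self, func_index, *args):
        # Direct call: index straight into the function index space.
        return self.funcs[func_index](*args)

    def call_indirect(self, slot, *args):
        # Indirect call: look up the slot, then dispatch like a direct
        # call, so a built-in stored in the table needs no special case.
        if not 0 <= slot < len(self.table) or self.table[slot] is None:
            raise Trap("undefined table element")
        return self.call(self.table[slot], *args)
```

With imports ["obj.box_i32", "i32.unbox"] and a table whose slot 0 holds function index 1, call(0, 41) boxes the value and call_indirect(0, ref) unboxes it through the table. An engine that instead rewrites calls into reserved bytecodes would still need some callable form of each built-in for table entries like this one.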