Starter code is now on GitHub!
The sections in a WeeWasm module are:
Id | Kind | Description |
---|---|---|
1 | Type | declares signatures of functions imported or defined in the module |
2 | Import | declares imports into the module |
3 | Function | declares functions defined later in the module |
4 | Table | defines tables |
5 | Memory | defines a memory |
6 | Global | defines global variables |
7 | Export | exports any imports or definitions from the module |
8 | Start | declares a start function |
9 | Element | declares elements used to initialize tables |
10 | Code | defines the bodies of functions declared previously in this module |
11 | Data | defines data segments that are used to initialize memory |
In WeeWasm text, each section is introduced by a keyword indicating the section kind, followed by the contents.
Examples:
A disassembled type section:

    type 1                   ; declares a type section with 1 entry
      sig 2 i32 i32 1 i32    ; declares the signature [i32 i32] -> i32
A disassembled function section:

    func 2    ; declares a function section with 2 entries
      3       ; declares a function with signature #3 [i32 i32] -> i32
      4       ; declares a function with signature #4 i64 -> []
A disassembled global section:

    global 3               ; declares a global section with 3 entries
      i32                  ; a global of type i32, immutable
      f64 mutable          ; a global of type f64, mutable
      externref mutable    ; a global of type externref, mutable
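To make the mapping from bytes to text concrete, here is a minimal sketch in C that turns one binary type entry into a `sig` line like the one in the type section example. It assumes WeeWasm uses the standard WebAssembly encodings (0x60 for a function type, 0x7F for i32, 0x7C for f64, 0x6F for externref). The helper names `type_name`, `append`, and `emit_type_entry` are hypothetical and not part of the starter code. For brevity it reads each count as a single byte, which only works for fewer than 128 parameters or results.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Map a binary value-type code to its WeeWasm text name. */
static const char *type_name(uint8_t code) {
    switch (code) {
    case 0x7F: return "i32";
    case 0x7C: return "f64";
    case 0x6F: return "externref";
    default:   return "?";  /* not reached for valid input */
    }
}

/* Append s to the NUL-terminated string in out (capacity cap). */
static void append(char *out, size_t cap, const char *s) {
    assert(strlen(out) + strlen(s) < cap);
    strcat(out, s);
}

/* Decode one type entry starting at p into out and return the next unread
   byte. Counts are read as single bytes (assumption: fewer than 128). */
static const uint8_t *emit_type_entry(const uint8_t *p, char *out, size_t cap) {
    char num[16];
    uint8_t form = *p++;
    assert(form == 0x60);                    /* function-type marker */
    (void)form;
    out[0] = '\0';
    append(out, cap, "sig");
    for (int part = 0; part < 2; part++) {   /* params first, then results */
        unsigned count = *p++;
        snprintf(num, sizeof num, " %u", count);
        append(out, cap, num);
        for (unsigned i = 0; i < count; i++) {
            append(out, cap, " ");
            append(out, cap, type_name(*p++));
        }
    }
    return p;
}
```

Given the bytes `60 02 7F 7F 01 7F`, this sketch produces the text `sig 2 i32 i32 1 i32`.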
In the binary format, every section has a byte length at the beginning. In the WeeWasm text format, this length is not present. (The numbers in the examples above are the number of *entries* in each section, not byte lengths.)
Thus you can use whitespace in your disassembler however you like and emit comments or other debugging info.
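As an illustration of the difference between the byte length and the entry count, the sketch below reads a section header in C. It assumes WeeWasm encodes these integers as unsigned LEB128, as standard WebAssembly does. The names `read_u32leb`, `section_header_t`, and `read_section_header` are hypothetical. It does no bounds checking, since the disassembler is not required to handle invalid input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 value of up to 32 bits and advance *pp. */
static uint32_t read_u32leb(const uint8_t **pp) {
    uint32_t result = 0;
    unsigned shift = 0;
    uint8_t b;
    do {
        b = *(*pp)++;
        result |= (uint32_t)(b & 0x7F) << shift;  /* low 7 bits are payload */
        shift += 7;
    } while ((b & 0x80) && shift < 32);           /* high bit means "more" */
    return result;
}

typedef struct {
    uint8_t  id;      /* section id, 1..11 in the table above */
    uint32_t length;  /* byte length: consumed but never printed */
    uint32_t count;   /* entry count: printed after the section keyword */
} section_header_t;

/* Read the header of a section that begins with an entry count
   (for example type, func, or global). */
static section_header_t read_section_header(const uint8_t **pp) {
    section_header_t h;
    assert(pp != NULL && *pp != NULL);
    h.id = *(*pp)++;
    h.length = read_u32leb(pp);
    h.count = read_u32leb(pp);
    return h;
}
```

For the bytes `01 07 01 ...`, this yields id 1, length 7, and count 1; only the count would appear in the text, as `type 1`. Sections such as start and custom do not begin with an entry count and would need their own handling.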
To better understand the required output of your disassembler, below is a BNF of the output format.
Note that comments in this grammar are to explain
    WeeWasm ::= CustomSection* TypeSection?
                CustomSection* ImportSection?
                CustomSection* FunctionSection?
                CustomSection* TableSection?
                CustomSection* MemorySection?
                CustomSection* GlobalSection?
                CustomSection* ExportSection?
                CustomSection* StartSection?
                CustomSection* ElementSection?
                CustomSection* CodeSection?
                CustomSection* DataSection?
                CustomSection*

    TypeSection     ::= "type" <INT> TypeEntry*
    ImportSection   ::= "import" <INT> ImportEntry*
    FunctionSection ::= "func" <INT> <INT>*
    TableSection    ::= "table" <INT> TableType*
    MemorySection   ::= "memory" "1" MemoryType
    GlobalSection   ::= "global" <INT> GlobalEntry*
    ExportSection   ::= "export" <INT> ExportEntry*
    StartSection    ::= "start" <INT>
    ElementSection  ::= "element" <INT> ElementEntry*
    CodeSection     ::= "code" <INT> CodeEntry*
    DataSection     ::= "data" <INT> DataEntry*
    CustomSection   ::= "custom" <INT>   ; section length
                        String           ; section name

    TypeEntry ::= "sig" <INT>   ; param count
                  Type*
                  <INT>         ; result count
                  Type*
    Type ::= "i32" | "f64" | "externref"

    ImportEntry  ::= String String (FuncImport | TableImport | MemoryImport | GlobalImport)
    FuncImport   ::= "func" <INT>
    TableImport  ::= "table" TableType
    MemoryImport ::= "memory" MemoryType
    GlobalImport ::= "global" GlobalType

    MemoryType ::= <INT>    ; initial
                   <INT>?   ; maximum
    GlobalType ::= Type "mutable"?
    TableType  ::= (Type | "funcref") <INT>   ; initial
                   <INT>?                     ; maximum

    ExportEntry  ::= String (FuncExport | TableExport | MemoryExport | GlobalExport)
    FuncExport   ::= "func" <INT>
    TableExport  ::= "table" <INT>
    MemoryExport ::= "memory" <INT>
    GlobalExport ::= "global" <INT>

    GlobalEntry  ::= GlobalType Instr*
    ElementEntry ::= <INT>    ; offset
                     <INT>    ; count
                     <INT>*   ; function indexes
    CodeEntry ::= "body" <INT>   ; length
                  <INT>          ; locals count
                  LocalDecl*
                  Instr*
                  "end"
    LocalDecl ::= <INT> Type

    Instr ::= "unreachable" | "nop" | "block" | "loop" | "if" | "else" | "end"
            | "br" <INT> | "br_if" <INT> | "br_table" <INT> <INT>* <INT>
            | "return" | "call" <INT> | "call_indirect" <INT>
            | "drop" | "select"
            | "local.get" <INT> | "local.set" <INT> | "local.tee" <INT>
            | "global.get" <INT> | "global.set" <INT>
            | "i32.load" <INT> | "f64.load" <INT>
            | "i32.load8_s" <INT> | "i32.load8_u" <INT>
            | "i32.load16_s" <INT> | "i32.load16_u" <INT>
            | "i32.store" <INT> | "f64.store" <INT>
            | "i32.store8" <INT> | "i32.store16" <INT>
            | "i32.const" <INT> | "f64.const" <HEX64>
            | "i32.eqz" | "i32.eq" | "i32.ne"
            | "i32.lt_s" | "i32.lt_u" | "i32.gt_s" | "i32.gt_u"
            | "i32.le_s" | "i32.le_u" | "i32.ge_s" | "i32.ge_u"
            | "f64.eq" | "f64.ne" | "f64.lt" | "f64.gt" | "f64.le" | "f64.ge"
            | "i32.add" | "i32.sub" | "i32.mul"
            | "i32.div_s" | "i32.div_u" | "i32.rem_s" | "i32.rem_u"
            | "i32.and" | "i32.or" | "i32.xor"
            | "i32.shl" | "i32.shr_s" | "i32.shr_u"
            | "f64.add" | "f64.sub" | "f64.mul" | "f64.div"
            | "i32.trunc_f64_s" | "i32.trunc_f64_u"
            | "f64.convert_i32_s" | "f64.convert_i32_u"
            | "i32.extend8_s" | "i32.extend16_s"

    DataEntry ::= <INT>     ; offset
                  <INT>     ; count
                  <BYTE>*
    String ::= <INT> QUOTED_UTF8
Q: Does my disassembler need to handle invalid inputs?
A: No. Think of your disassembler as a dumb tool that translates bytes into text.
Specifically it
Q: The Element and Data entries are more complicated than the grammar. What gives?
A: You only need to support element and data entries that have the flags byte (first byte) as 0 and use an i32.const N as their offset expression. You should output the N as the offset.
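One way to read that prefix is sketched below in C. It assumes the standard WebAssembly encodings: opcode 0x41 for i32.const, a signed LEB128 immediate, and 0x0B for the terminating end. The names `read_i32leb` and `read_segment_offset` are hypothetical. Treating N as a signed 32-bit value is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a signed LEB128 value of up to 32 bits and advance *pp. */
static int32_t read_i32leb(const uint8_t **pp) {
    uint32_t result = 0;
    unsigned shift = 0;
    uint8_t b;
    do {
        b = *(*pp)++;
        result |= (uint32_t)(b & 0x7F) << shift;
        shift += 7;
    } while ((b & 0x80) && shift < 35);
    /* Sign-extend when the last byte's sign bit (0x40) is set. */
    if (shift < 32 && (b & 0x40))
        result |= ~(uint32_t)0 << shift;
    return (int32_t)result;
}

/* Consume "flags, i32.const N, end" at the start of an element or data
   entry and return N. On return *pp points at the entry count. */
static int32_t read_segment_offset(const uint8_t **pp) {
    uint8_t flags = *(*pp)++;
    assert(flags == 0x00);        /* only flag value 0 is supported */
    uint8_t opcode = *(*pp)++;
    assert(opcode == 0x41);       /* i32.const */
    int32_t offset = read_i32leb(pp);
    uint8_t end = *(*pp)++;
    assert(end == 0x0B);          /* end of the offset expression */
    (void)flags; (void)opcode; (void)end;
    return offset;
}
```

For an entry beginning `00 41 10 0B`, the sketch returns 16, which is the value to print as the offset before the count.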
Q: The block, loop, and if bytecodes have immediates that declare a block type. Where are these?
A: You should assume that block types for these instructions will always be empty (i.e. -64) and not output them.
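A small C sketch of this rule follows. It assumes the standard WebAssembly opcodes (0x02 block, 0x03 loop, 0x04 if) and that the empty block type is the single byte 0x40, which is -64 when read as a 7-bit signed LEB128 value. The function name `emit_block_instr` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write the mnemonic for a block, loop, or if instruction into out and
   skip its block-type byte. Returns a pointer to the next instruction. */
static const uint8_t *emit_block_instr(const uint8_t *p, char *out, size_t cap) {
    static const char *const names[] = { "block", "loop", "if" };
    uint8_t opcode = *p++;
    assert(opcode >= 0x02 && opcode <= 0x04);
    uint8_t blocktype = *p++;
    assert(blocktype == 0x40);    /* empty block type: consumed, not printed */
    (void)blocktype;
    const char *name = names[opcode - 0x02];
    assert(strlen(name) < cap);
    strcpy(out, name);
    return p;
}
```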
Q: How should <BYTE> be formatted?
A: Your solution should output each byte in the grammar as two upper-case hexadecimal digits (i.e. the %02X format-specifier in C).
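For instance, a data entry's bytes could be formatted with a helper like the following C sketch. The name `format_bytes` is hypothetical. Separating bytes with a single space is also an assumption, since the text format leaves whitespace up to you.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write n bytes into out as two-digit upper-case hex, space-separated. */
static void format_bytes(const uint8_t *data, size_t n, char *out, size_t cap) {
    assert(cap > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        size_t used = strlen(out);
        int w = snprintf(out + used, cap - used, "%s%02X",
                         i ? " " : "", (unsigned)data[i]);
        assert(w > 0 && (size_t)w < cap - used);  /* output was not truncated */
        (void)w;
    }
}
```

With the input bytes 0x00, 0x0A, 0xFF, 0x7B this produces `00 0A FF 7B`.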
Q: How should <HEX64> be formatted?
A: Your solution should output this quantity as a 64-bit, zero-padded, 16-character hexadecimal number (i.e. the %016X format-specifier in C, applied to a 64-bit value). You may need to use a standard macro, such as PRIX64 from <inttypes.h>, to avoid a compiler warning.
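A minimal C sketch of this formatting is shown below, using the `PRIX64` macro from `<inttypes.h>`. It assumes the f64.const immediate is stored as 8 little-endian bytes, as in standard WebAssembly, and that the expected text is the raw 64-bit pattern printed as one number. The helper names are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assemble 8 little-endian bytes into a 64-bit value. */
static uint64_t read_u64le(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Reinterpret a double as its 64-bit pattern without aliasing problems. */
static uint64_t double_bits(double d) {
    uint64_t bits;
    assert(sizeof bits == sizeof d);
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Format bits as 16 zero-padded upper-case hex digits plus a NUL. */
static void format_hex64(uint64_t bits, char out[17]) {
    snprintf(out, 17, "%016" PRIX64, bits);
}
```

On a platform with IEEE 754 doubles, the constant 1.0 is stored as the bytes `00 00 00 00 00 00 F0 3F` and comes out as `3FF0000000000000`.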
Q: Load and store instructions have an additional flags byte that includes an alignment. Where is this?
A: Your solution should not output the flags byte for loads and stores.
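In practice this means consuming the flags and printing only the offset. The C sketch below assumes the standard WebAssembly layout, where the flags/alignment field comes first and the offset second, both as unsigned LEB128. The names `read_u32leb` and `read_memarg_offset` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 value of up to 32 bits and advance *pp. */
static uint32_t read_u32leb(const uint8_t **pp) {
    uint32_t result = 0;
    unsigned shift = 0;
    uint8_t b;
    do {
        b = *(*pp)++;
        result |= (uint32_t)(b & 0x7F) << shift;
        shift += 7;
    } while ((b & 0x80) && shift < 32);
    return result;
}

/* Called with *pp just past a load or store opcode: skips the
   flags/alignment field and returns the offset immediate. */
static uint32_t read_memarg_offset(const uint8_t **pp) {
    assert(pp != NULL && *pp != NULL);
    (void)read_u32leb(pp);        /* flags/alignment: consumed, not printed */
    return read_u32leb(pp);       /* offset: the <INT> in the grammar */
}
```

For an `i32.load` followed by the bytes `02 10`, the returned offset is 16, so the output would be `i32.load 16`.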
Q: Where is the table index in the grammar?
A: Your solution should assume there is at most one table and not output the table index.