Lecture 19 - JIT compilers

- "JIT" will be used as shorthand for dynamic compiler
  - compiles while the program is running (or just before)
  - translates from source (maybe an AST) or bytecode to a target machine
    - usually the target machine is a CPU, but it could be any machine that allows new code to be loaded
      - e.g. a VM written in Java that compiles another language into Java bytecode
  - also possible in languages with "eval": Lisp, Scheme, JavaScript, Smalltalk, SELF
  - will focus on compilation to machine code in this series of lectures
- compilation time becomes part of the runtime of the program
  - different tradeoffs:
    - code quality vs. compile time
    - memory consumption of JIT code vs. execution time
      - buffer for compiled code might be a fixed size

Broadly, there are 3 classes of JIT compiler designs:

1. Baseline compilers
   - designed for either:
     - simplicity
     - compilation speed
   - do few optimizations
   - usually a single pass
   - no or minimal IR
   - often compile directly from bytecode, source, or AST
   - no, simple, or local register allocation
   - coupled tightly with a machine code assembler (or macro assembler)

2. Tracing compilers
   - compile a subset of a method, sometimes only a basic block
   - linear or tree-structured control flow
   - gather a trace either statically or, more often, dynamically
   - simple or local register allocation

3. Optimizing compilers
   - look a lot like a traditional compiler
   - designed for better code quality
   - SSA or another IR designed for optimization
   - perform one or more optimizations on the IR
   - full-fledged register allocator
     - linear scan, graph coloring, or another algorithm
   - multiple backends for multiple architectures
   - make use of profiling and type feedback
   - differentiation from static compilers: speculative optimization and deopt

- Selecting what to compile
  - a static, offline compiler can compile everything
    - mostly covered in a typical compilers class
  - a JIT can be more selective
  - what a JIT compiles depends on the tiering setup
    - no other execution tier: must at least compile that which is executed
    - have an interpreter?
      - one that supports the whole language/bytecode set?
      - one that supports debugging?
    - have a higher optimizing tier?
  - static hints: the program or a profiling run provides a list of hints
  - method/program characteristics:
    - e.g. only compile units with or without certain features
      - in Wasm: SIMD instructions might only be supported in one tier
      - in V8: the first optimizing compiler did not support try/catch
    - e.g. only compile units less than a maximum size
  - dynamic selection:
    - compile units that:
      - are frequently executed
      - are not being debugged
      - don't have instrumentation attached

- Selecting when to compile
  - lazy compilation: only compile a unit (e.g. a function) when it is first executed
  - background compilation: concurrently compile using additional threads
  - how do we detect when something is "hot"? (see the counter sketch after this list)
    - profiling techniques
      - sampling profiling
        - counter-based sampling
        - time-based sampling
    - tier-up heuristics
      - branch profiling
      - type profiling
      - path profiling
  - on-stack replacement: for when the program is stuck in a loop and never exits the function
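As a concrete illustration of counter-based hotness detection, here is a minimal C++ sketch. All names (`Function`, `OnFunctionEntry`, `kTierUpThreshold`) are hypothetical, not from any particular engine: each function carries a counter that its baseline-compiled prologue bumps on entry, and crossing a threshold queues the function for the optimizing tier.

```cpp
// Minimal sketch of counter-based tier-up detection. Real engines
// use more elaborate heuristics, but the shape is the same.
#include <cstdint>
#include <cstdio>

struct Function {
    const char* name;
    uint32_t call_count = 0;  // bumped on every entry
    bool optimized = false;   // has tier-up already happened?
};

// Threshold at which a function is considered "hot". Chosen
// arbitrarily here; real engines tune this per tier.
constexpr uint32_t kTierUpThreshold = 1000;

// Called from the baseline-compiled prologue of every function.
void OnFunctionEntry(Function* f) {
    if (f->optimized) return;
    if (++f->call_count >= kTierUpThreshold) {
        f->optimized = true;
        // A real engine would typically enqueue the function for a
        // background (concurrent) compile rather than compiling inline.
        std::printf("tier-up: compiling %s with the optimizing tier\n",
                    f->name);
    }
}

int main() {
    Function fib{"fib"};
    for (int i = 0; i < 2000; i++) OnFunctionEntry(&fib);
}
```

A back-edge counter of the same shape inside loop bodies is what typically triggers on-stack replacement: the counter fires while the function is still running, so the engine must swap the frame to optimized code in place instead of waiting for the next call.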
- Selecting how to compile
  - multiple JIT tiers, or JIT modes
  - Single-pass compilation
    - focus on simplicity (and therefore correctness) and/or compile speed
    - generally do not build a full intermediate representation of the bytecode or AST
      - AST-walking
      - bytecode-by-bytecode
    - both styles require a form of abstract interpretation
      - basically, "running the code without running the code"
      - instead of concrete values, abstract values represent a possible set of runtime values
        - e.g. just the type, or the register in which the runtime value is stored
        - can also represent facts that may be known about a value, e.g. that it is a constant
      - abstract interpreters have to start with some unknown inputs
      - merges or loops in the control flow require approximation
    - example of abstract interpretation of Wasm code (sketched below)
    - baseline compiler design for Wasm
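To make the abstract-interpretation idea concrete, here is a minimal single-pass sketch in C++ over a tiny Wasm-like bytecode. The opcode set, the bump register allocator, and the printf "assembler" are all simplifications invented for this example; a real baseline compiler would call into a macro assembler and reuse freed registers. Each abstract value records only what the compiler knows about the runtime value: either a compile-time constant or the register it lives in.

```cpp
// Single-pass abstract interpretation over a Wasm-like stack bytecode.
// Instead of computing values, the compiler tracks where each value
// lives and emits code instruction by instruction, with no IR.
#include <cstdint>
#include <cstdio>
#include <vector>

enum class Opcode { kI32Const, kI32Add, kEnd };
struct Instr { Opcode op; int32_t imm = 0; };

// An abstract value: not the runtime value itself, but the compiler's
// knowledge about it.
struct AbstractValue {
    enum Kind { kConst, kRegister } kind;
    int32_t constant;  // valid if kind == kConst
    int reg;           // valid if kind == kRegister
};

struct BaselineCompiler {
    std::vector<AbstractValue> stack;  // mirrors the Wasm operand stack
    int next_reg = 0;                  // trivial bump register allocator

    // Materialize an abstract value into a register, emitting code only
    // if needed (constants stay virtual until an operation uses them).
    int IntoRegister(const AbstractValue& v) {
        if (v.kind == AbstractValue::kRegister) return v.reg;
        int r = next_reg++;
        std::printf("  mov r%d, #%d\n", r, v.constant);
        return r;
    }

    void Compile(const std::vector<Instr>& code) {
        for (const Instr& i : code) {
            switch (i.op) {
                case Opcode::kI32Const:
                    // No code emitted yet: just record the constant.
                    stack.push_back({AbstractValue::kConst, i.imm, -1});
                    break;
                case Opcode::kI32Add: {
                    AbstractValue b = stack.back(); stack.pop_back();
                    AbstractValue a = stack.back(); stack.pop_back();
                    if (a.kind == AbstractValue::kConst &&
                        b.kind == AbstractValue::kConst) {
                        // Both operands known: fold at compile time
                        // (unsigned arithmetic, since Wasm i32.add wraps).
                        int32_t sum = static_cast<int32_t>(
                            static_cast<uint32_t>(a.constant) +
                            static_cast<uint32_t>(b.constant));
                        stack.push_back({AbstractValue::kConst, sum, -1});
                    } else {
                        int ra = IntoRegister(a), rb = IntoRegister(b);
                        std::printf("  add r%d, r%d, r%d\n", ra, ra, rb);
                        stack.push_back({AbstractValue::kRegister, 0, ra});
                    }
                    break;
                }
                case Opcode::kEnd:
                    std::printf("  ; result is %s\n",
                                stack.back().kind == AbstractValue::kConst
                                    ? "a compile-time constant"
                                    : "in a register");
                    break;
            }
        }
    }
};

int main() {
    // (i32.const 1) (i32.const 2) i32.add  =>  folds to the constant 3
    BaselineCompiler c;
    c.Compile({{Opcode::kI32Const, 1}, {Opcode::kI32Const, 2},
               {Opcode::kI32Add}, {Opcode::kEnd}});
}
```

Note how `i32.const` emits no code at all: the constant stays virtual on the abstract stack until an operation actually needs it in a register. Even this shallow abstract interpretation buys a baseline compiler constant folding for free, while merges and loops would force the abstract stack back to the conservative "in a register" state.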