15-411

Guest Lecture, November 9

Title: Bodo approach: how to build a compiler for optimizing and parallelizing Python computation

Abstract

Bodo is a parallel compute platform that combines the simplicity of Python with HPC performance and scalability for large-scale data analytics, AI, and ML. Bodo uses a new auto-parallelizing just-in-time (JIT) compiler approach that compiles code to optimized parallel binaries with MPI, whereas other distributed data processing frameworks rely on a high-overhead driver-executor runtime library. As a result, Bodo is usually more scalable and orders of magnitude faster than alternatives such as Spark and Dask. Find out more at https://bodo.ai.

In this talk, we discuss how Bodo’s automatic parallelization and optimization machinery works internally, including how high-level Python APIs are optimized as deeply embedded DSLs. We will use interactive Jupyter notebooks to walk through example code and explain the compiler stages, from Python bytecode to optimized high-level IR and LLVM IR.

Bio

Ehsan Totoni is an entrepreneur, computer science researcher, and software engineer working on the democratization of high-performance computing (HPC) for data analytics, AI, and ML. Ehsan received his PhD in computer science from the University of Illinois at Urbana-Champaign, where he worked on various aspects of HPC and parallel computing. He then worked as a research scientist at Intel Labs and Carnegie Mellon University, focusing on programming systems that address the gap between programmer productivity and computing performance.