Proxy Cache

My name is Tengda. I recently graduated with a Master of Computational Data Science (MCDS) degree from Carnegie Mellon University. Prior to this, I obtained my bachelor's degree from the School of Computing, National University of Singapore (NUS).
I'm rigorously trained in two core areas: 1) designing and implementing cutting-edge machine learning algorithms to address complex business problems, and 2) building reliable distributed systems that scale effectively. I’m also experienced in natural language processing and its practical applications, having completed my bachelor's thesis on neural abstractive summarization under the guidance of Professor Vaibhav Rajan.
As a passionate problem-solver, I currently work as an ML engineer at Uber, focusing on driver trip pricing. I’m fortunate to have worked with TikTok (Recommendation System), Shopee (Search Algorithms), Bank of America Merrill Lynch (Full-Stack Development), and GIC (Fintech) in the past.
Uber
Machine Learning Engineer - Present
Working on driver trip pricing...
TikTok
Machine Learning Engineer Intern -
My internship project improved the recall/retrieval stage of a large-scale recommendation system (TikTok Shop) by introducing onsite/offsite advertisement signals. I experimented with the following two types of methods.
Shopee
Software Engineer -
As a software engineer on the search algorithm team, I was responsible for developing and deploying algorithms that make search results more relevant to the intent behind users' queries.
I spent roughly half my time there on machine learning model research and implementation (e.g. gradient boosting decision trees, embedding + MLP, and transformer-based models). The other half went mostly to building reliable batch pipelines for data delivery and the model deployment layer.
In A/B tests, my implementation improved click-through rate by 3.8% and decreased the bad case rate by 20%, benefiting millions of users across the 8 regions where Shopee operates.
Bank of America Merrill Lynch
Software Engineer Intern -
At BAML, I formulated workflows and developed multiple full-stack web applications including frontend (AngularJS), backend (Scala), and unit-testing (ScalaTest) to help clients manage portfolios. I collaborated closely with the product side to ensure a smooth user experience.
GIC
Data Scientist Intern -
As a data scientist intern in the equity technology team, I developed tools, models, and dashboards to help portfolio managers make well-informed investment decisions.
Carnegie Mellon University
Master of Computational Data Science -
Northwestern University
Exchange Program in Math & Computer Science -
National University of Singapore
Bachelor of Science (Honors) in Business Analytics -
Proxy Cache
Two-Phase Commit
BusTub RDBMS
file RPC
QE Search Engine
NUS Lighthouse
Distributed File-Caching Proxy
This is a Java 8 implementation of a distributed file system with caching. Each proxy server can handle multiple clients concurrently, while multiple local proxies connect to a centralized server. The file-caching proxy employs a check-on-use technique to guarantee consistency at open-close session granularity, similar to the Andrew File System (AFS). It also supports whole-file caching with LRU eviction.
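To illustrate how check-on-use and LRU eviction can fit together, here is a minimal sketch in Java. It is not the project's code: the class name `LruFileCache`, the per-file version number, and the byte-based capacity are all assumptions made for the example, and the call to the central server that supplies the current version is left out.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a whole-file cache with LRU eviction and check-on-use.
class LruFileCache {
    // One cached whole-file copy plus the server version it was fetched at (assumed scheme).
    static final class Entry {
        final long version;
        final byte[] data;
        Entry(long version, byte[] data) { this.version = version; this.data = data; }
    }

    private final long capacityBytes;
    private long usedBytes = 0;
    // accessOrder = true keeps iteration order from least to most recently used.
    private final LinkedHashMap<String, Entry> entries = new LinkedHashMap<>(16, 0.75f, true);

    LruFileCache(long capacityBytes) { this.capacityBytes = capacityBytes; }

    // Check-on-use: on open, serve the cached copy only if it matches the server's version.
    // Returns null on a miss or a stale copy; the caller then fetches the file and calls put().
    synchronized byte[] open(String path, long serverVersion) {
        Entry e = entries.get(path); // get() also marks the entry as recently used
        if (e != null && e.version == serverVersion) {
            return e.data;
        }
        removeEntry(path); // drop a stale copy, if any
        return null;
    }

    // Stores a freshly fetched whole file, evicting least recently used files to make room.
    synchronized void put(String path, long version, byte[] data) {
        removeEntry(path);
        if (data.length > capacityBytes) {
            return; // larger than the whole cache: do not cache it
        }
        Iterator<Map.Entry<String, Entry>> it = entries.entrySet().iterator();
        while (usedBytes + data.length > capacityBytes && it.hasNext()) {
            usedBytes -= it.next().getValue().data.length;
            it.remove();
        }
        entries.put(path, new Entry(version, data));
        usedBytes += data.length;
    }

    synchronized boolean contains(String path) {
        return entries.containsKey(path); // does not change recency
    }

    private void removeEntry(String path) {
        Entry old = entries.remove(path);
        if (old != null) {
            usedBytes -= old.data.length;
        }
    }
}
```

Validating only in `open` is what gives open-close session semantics in this sketch: a client keeps working on the copy it opened, and newer versions are picked up at the next open.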
Two-Phase Commit
This is a Java 8 implementation of the distributed consensus protocol Two-Phase Commit. In Phase I, the coordinator proposes a transaction to each participant in the hope of unanimous consent. Upon receiving all votes from the participants, it enters Phase II and distributes the final decision (either commit or abort) to each participant. Both the coordinator and the participants use write-ahead logging for persistent state and crash recovery. A timeout mechanism handles message loss.
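The two phases can be sketched in a few lines of Java. This is a simplified, in-process illustration rather than the project's implementation: the `Participant` interface, the `Coordinator` and `SimpleParticipant` classes, and the log record strings are hypothetical, an in-memory list stands in for the write-ahead log, and a thrown exception stands in for a vote that timed out or was lost.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical participant interface for the sketch.
interface Participant {
    // Phase I: vote on the transaction. Throwing models a lost message or timeout.
    boolean prepare(String txId) throws Exception;
    // Phase II: receive the coordinator's final decision.
    void decide(String txId, boolean commit);
}

class Coordinator {
    // Stand-in for a write-ahead log; a real one would be flushed to disk before proceeding.
    final List<String> log = new ArrayList<>();
    private final List<Participant> participants;

    Coordinator(List<Participant> participants) { this.participants = participants; }

    // Runs one transaction and returns true if it committed.
    boolean run(String txId) {
        log.add("START " + txId);

        // Phase I: collect a vote from every participant; a missing vote counts as "no".
        boolean allYes = true;
        for (Participant p : participants) {
            try {
                if (!p.prepare(txId)) {
                    allYes = false;
                }
            } catch (Exception lostOrTimedOut) {
                allYes = false;
            }
        }

        // Record the decision before announcing it, so it can be replayed after a crash.
        log.add((allYes ? "COMMIT " : "ABORT ") + txId);

        // Phase II: distribute the decision to every participant.
        for (Participant p : participants) {
            p.decide(txId, allYes);
        }
        log.add("END " + txId);
        return allYes;
    }
}

// Minimal in-memory participant with a fixed vote, for experimenting with the sketch.
class SimpleParticipant implements Participant {
    private final boolean vote;
    Boolean outcome = null; // null until a decision arrives

    SimpleParticipant(boolean vote) { this.vote = vote; }

    @Override
    public boolean prepare(String txId) { return vote; }

    @Override
    public void decide(String txId, boolean commit) { outcome = commit; }
}
```

Treating a missing vote as an abort is the conservative choice here: the coordinator can only commit once it knows every participant agreed.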
BusTub Relational Database
BusTub is a relational database management system written in C++17. It is part of the database project course taught by Professor Andy Pavlo at CMU. I implemented the following four components of the system.
file RPC
This is a client-server implementation of remote file procedure calls written in C. The client side offers interposed file system calls including open, close, read, write, lseek, stat, unlink, getdirentries, etc. The server can handle multiple clients operating on different files concurrently. The RPCs are implemented on top of TCP connections. Both the client and the server use custom stubs to serialize and deserialize function parameters, results, and errno.
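As a rough illustration of what such a stub does, the sketch below marshals an `open` call and its reply into byte arrays. It is written in Java only to keep it short; the project itself is in C, and the wire format shown (opcode, flags, path length, path bytes; then result and errno) and all names such as `FileRpcStub` and `OP_OPEN` are assumptions for the example, not the project's actual protocol.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stub: fixed-width big-endian integers followed by a length-prefixed path.
class FileRpcStub {
    static final int OP_OPEN = 1; // assumed opcode for open()

    // Decoded form of an open() request.
    static final class OpenRequest {
        final String path;
        final int flags;
        OpenRequest(String path, int flags) { this.path = path; this.flags = flags; }
    }

    // Client side: serialize open(path, flags) into bytes to send over the TCP connection.
    static byte[] marshalOpen(String path, int flags) {
        byte[] p = path.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(12 + p.length)
                .putInt(OP_OPEN)
                .putInt(flags)
                .putInt(p.length) // length prefix so the receiver knows where the path ends
                .put(p)
                .array();
    }

    // Server side: decode the request bytes back into parameters.
    static OpenRequest unmarshalOpen(byte[] bytes) {
        ByteBuffer in = ByteBuffer.wrap(bytes);
        int op = in.getInt();
        if (op != OP_OPEN) {
            throw new IllegalArgumentException("unexpected opcode " + op);
        }
        int flags = in.getInt();
        byte[] p = new byte[in.getInt()];
        in.get(p);
        return new OpenRequest(new String(p, StandardCharsets.UTF_8), flags);
    }

    // Server side: serialize the call's return value together with errno.
    static byte[] marshalReply(int result, int errno) {
        return ByteBuffer.allocate(8).putInt(result).putInt(errno).array();
    }

    // Client side: decode the reply as {result, errno}.
    static int[] unmarshalReply(byte[] bytes) {
        ByteBuffer in = ByteBuffer.wrap(bytes);
        return new int[] { in.getInt(), in.getInt() };
    }
}
```

Sending errno alongside the result lets the client-side stub reproduce the failure exactly as a local call would have reported it.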
QryEval Search Engine
QryEval is a full-fledged search engine based on the open-source Apache Lucene. It leverages Python's inheritance for extensibility and presents a uniform interface to users. It features the following key components: