Experience
From March 2025 to August 2025, I was a research associate member at Sea AI Lab, advised by Longxu Dou and Min Lin.
We released Reptile, an LLM-based terminal agent. My primary focus was on agent scaffolding,
human-in-the-loop training data collection, and benchmarking.
From January 2024 to July 2024, I was a research intern at THUKEG, collaborating closely with Xiao Liu and Yu
Gu, under the supervision of Professor Jie
Tang and Yuxiao Dong. We released VisualAgentBench, a systematic benchmark to
evaluate and develop vision language models as visual foundation agents. It has received over 70
citations.
Prior to that, I worked on enhancing the graph reasoning ability of large language models with Ziwei Chai, under the
supervision of Professor Yang Yang. We proposed GraphLLM, which enables large language models to solve
fundamental graph reasoning tasks with near-perfect accuracy. It has received over 170 citations.
|
Publications
* denotes equal contribution.
|
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation
Agents
Xiao Liu*,
Tianjie Zhang*,
Yu Gu*,
Iat Long Iong,
Yifan Xu,
Xixuan Song,
Shudan Zhang,
Hanyu Lai,
Xinyi Liu,
Hanlin Zhao,
Jiadai Sun,
Xinyue Yang,
Yu Yang,
Zehan Qi,
Shuntian Yao,
Xueqiao Sun,
Siyi Cheng,
Qinkai Zheng,
Hao Yu,
Hanchen Zhang,
Wenyi Hong,
Ming Ding,
Lihang Pan,
Xiaotao Gu,
Aohan Zeng,
Zhengxiao Du,
Chan Hee Song,
Yu Su,
Yuxiao Dong,
Jie Tang
ICLR, 2025
Paper / Code & Data / Poster
A systematic benchmark for evaluating and developing large multimodal models as visual foundation
agents across Embodied, GUI, and Visual Design scenarios.
|
GraphLLM: Boosting Graph Reasoning Ability of Large Language Model
Ziwei Chai*,
Tianjie Zhang*,
Liang Wu,
Kaiqiao Han,
Xiaohai Hu,
Xuanwen Huang,
Yang Yang
IEEE Transactions on Big Data, 2025
Paper / Code
Enabling large language models to proficiently interpret and reason on graph data.
|
An Expert is Worth One Token: Synergizing Multiple Expert LLMs as
Generalist via Expert Token Routing
Ziwei Chai,
Guoyin Wang,
Jing Su,
Tianjie Zhang,
Xuanwen Huang,
Xuwu Wang,
Jingjing Xu,
Jianbo Yuan,
Hongxia Yang,
Fei Wu,
Yang Yang
ACL, 2024
Paper / Code
A unified framework that facilitates the seamless integration of multiple expert LLMs.