First-Author Publications
FlexLLM: Token-Level Co-Serving of LLM Inference and Finetuning with SLO Guarantees
Gabriele Oliaro, Xupeng Miao, Xinhao Cheng, Vineeth Kada, Mengdi Wu, Ruohan Gao, Yingyi Huang, Remi Delacourt, April Yang, Yingcheng Wang, Colin Unger, Zhihao Jia
NSDI 2026
SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia
ASPLOS 2024 (Cited 450+ times 🏆)
Collaborations
Event Tensor: A Unified Abstraction for Compiling Dynamic Megakernel
Hongyi Jin, Bohan Hou, Guanjie Wang, Ruihang Lai, Jinqi Chen, Zihao Ye, Yaxing Cai, Yixin Dong, Xinhao Cheng, Zhihao Zhang, Yilong Zhao, Yingyi Huang, Lijie Yang, Jinchen Jiang, Gabriele Oliaro, Jianan Ji, Xupeng Miao, Vinod Grover, Todd C. Mowry, Zhihao Jia, Tianqi Chen
MLSys 2026
AdaServe: SLO-Customized LLM Serving with Fine-Grained Speculative Decoding
Zikun Li, Zhuofu Chen, Remi Delacourt, Gabriele Oliaro, Zeyu Wang, Qinghan Chen, Shuhuai Lin, April Yang, Zhihao Zhang, Zhuoming Chen, Sean Lai, Xupeng Miao, Zhihao Jia
EuroSys 2026
Optimal Kernel Orchestration for Tensor Programs with Korch
Muyan Hu, Ashwin Venkatram, Shreyashri Biswas, Balamurugan Marimuthu, Bohan Hou, Gabriele Oliaro, Haojie Wang, Liyan Zheng, Xupeng Miao, Jidong Zhai, Zhihao Jia
ASPLOS 2024
Direct Telemetry Access
Jonatan Langlet, Ran Ben Basat, Gabriele Oliaro, Michael Mitzenmacher, Minlan Yu, Gianni Antichi
SIGCOMM 2023