Gabriele Oliaro
Publications
Gabriele Oliaro, Zhihao Jia, Daniel Campos, Aurick Qiao (2024). SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference. ArXiv 2024.
Zhengxin Zhang, Dan Zhao, Xupeng Miao, Gabriele Oliaro, Qing Li, Yong Jiang, Zhihao Jia (2024). Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models. ACL 2024 Oral (Outstanding paper award 🏆).
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia (2024). SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification. ASPLOS 2024.
Xupeng Miao, Gabriele Oliaro, Xinhao Cheng, Mengdi Wu, Colin Unger, Zhihao Jia (2024). FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning. ArXiv 2024.
Muyan Hu, Ashwin Venkatram, Shreyashri Biswas, Balamurugan Marimuthu, Bohan Hou, Gabriele Oliaro, Haojie Wang, Liyan Zheng, Xupeng Miao, Jidong Zhai, Zhihao Jia (2024). Optimal Kernel Orchestration for Tensor Programs with Korch. ASPLOS 2024.
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, Zhihao Jia (2023). Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems. ArXiv 2023.
Jonatan Langlet, Ran Ben Basat, Gabriele Oliaro, Michael Mitzenmacher, Minlan Yu, Gianni Antichi (2023). Direct Telemetry Access. SIGCOMM 2023.
Jonatan Langlet, Ran Ben Basat, Sivaram Ramanathan, Gabriele Oliaro, Michael Mitzenmacher, Minlan Yu, Gianni Antichi (2021). Zero-CPU Collection with Direct Telemetry Access. HotNets 2021.