Gabriele Oliaro
SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference
Gabriele Oliaro, Zhihao Jia, Daniel Campos, Aurick Qiao
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
Xupeng Miao, Gabriele Oliaro, Xinhao Cheng, Mengdi Wu, Colin Unger, Zhihao Jia
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, Zhihao Jia