Gabriele Oliaro

CS PhD Student

Carnegie Mellon University

About

I am a third-year Ph.D. student in the Computer Science Department at Carnegie Mellon University, where I am fortunate to work with Zhihao Jia as part of the Catalyst Lab and the Parallel Data Lab.

I am interested in machine learning systems, parallel computing and distributed systems, with a particular focus on large language models (LLMs).

Download my CV.

Interests
  • Machine Learning
  • Distributed Systems
  • Parallel Computing
  • Networking
Education
  • PhD in Computer Science, 2028

    Carnegie Mellon University

  • MS in Advanced Computing, 2023

    Tsinghua University

  • BS in Electrical Engineering, 2021

    Harvard University

Recent Publications

SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference
ArXiv 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
ACL 2024 Oral (Outstanding paper award 🏆)
SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification
ASPLOS 2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
ArXiv 2024
Optimal Kernel Orchestration for Tensor Programs with Korch
ASPLOS 2024
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
ArXiv 2023
Direct Telemetry Access
SIGCOMM 2023
Zero-CPU Collection with Direct Telemetry Access
HotNets 2021

Work Experience

Snowflake AI Research
Research Intern
May 2024 – Oct 2024, San Mateo, CA
Worked on Cortex Analyst and built SuffixDecoding

Whist Technologies
Software Engineer
Sep 2021 – Aug 2022, New York, NY
Worked in the Systems Engineering Team

Contact

  • goliaro@cs.cmu.edu
  • Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213