Ghostdecoding: leveraging random-feature kernels for error-aware and training-free KV cache selection
Guo, Hao
Permalink: https://hdl.handle.net/2142/129705
Title: Ghostdecoding: leveraging random-feature kernels for error-aware and training-free KV cache selection
Author(s): Guo, Hao
Issue Date: 2025-04-22
Thesis Advisor: Mendis, Charith
Department of Study: Siebel School of Computing and Data Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois Urbana-Champaign
Degree Name: M.S.
Degree Level: Thesis
Keyword(s): Large Language Models; Efficient ML; KV Cache
Abstract
The key-value (KV) cache is essential for efficient inference in large language models (LLMs): it stores intermediate representations to avoid redundant computation. However, as sequence lengths grow, it becomes a major bottleneck due to increasing computational and memory demands. Existing KV cache compression methods mitigate this issue by pruning or selecting critical entries, but they often suffer from several limitations, including unpredictable errors, limited dynamism, and strong assumptions about context relevance. In this work, we propose GhostDecoding, a novel training-free KV cache selection mechanism that dynamically reduces the effective sequence length while remaining error-aware for evicted entries. Using a random-feature softmax kernel, our method estimates the attention scores of an arbitrary number of evicted positions with O(1) time and space overhead, and selectively recomputes them when necessary. We develop efficient sparse CUDA kernels to support our algorithm. Experimental results demonstrate that GhostDecoding achieves up to 1.6× more computation reduction than H2O, leading to up to 1.9× decoding speed-up over full attention on long sequences.
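To illustrate how a random-feature softmax kernel can summarize evicted entries in O(1), here is a minimal NumPy sketch using Performer-style positive random features, phi(x) = exp(Wx − ‖x‖²/2)/√m, for which E[phi(q)·phi(k)] = exp(q·k). All names, dimensions, and scalings below are illustrative assumptions, not the thesis's actual implementation:

```python
import numpy as np

def phi(x, W):
    """Positive random features for the softmax kernel.

    x: (d,) or (n, d) vectors; W: (m, d) random Gaussian projections.
    phi(x) = exp(W x - ||x||^2 / 2) / sqrt(m), so phi(q) . phi(k) is an
    unbiased estimate of exp(q . k).
    """
    m = W.shape[0]
    x2 = np.atleast_2d(x)
    sq = np.sum(x2 ** 2, axis=-1, keepdims=True)
    out = np.exp(x2 @ W.T - sq / 2) / np.sqrt(m)
    return out if x.ndim > 1 else out[0]

rng = np.random.default_rng(0)
d, m, n_evicted = 64, 256, 100      # head dim, feature dim, evicted count
W = rng.normal(size=(m, d))         # projections shared by queries and keys

# Small-norm synthetic vectors: estimator variance grows with ||q + k||,
# so practical systems rescale q and k before featurization.
q = rng.normal(scale=d ** -0.5, size=d)
evicted_keys = rng.normal(scale=d ** -0.5, size=(n_evicted, d))

# One m-dimensional summary vector covers all evicted keys; it is updated
# incrementally at eviction time, so the per-step query cost does not
# depend on how many positions were evicted.
phi_sum = phi(evicted_keys, W).sum(axis=0)

est = phi(q, W) @ phi_sum                  # ~ sum_i exp(q . k_i)
exact = np.exp(evicted_keys @ q).sum()     # true unnormalized score mass
print(f"estimated evicted mass {est:.3f} vs exact {exact:.3f}")
```

Comparing the estimated evicted-mass against the kept entries' exact scores is what would let a selection policy decide, per decoding step, whether the evicted positions carry enough attention weight to warrant recomputation.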