Director of Research (if dissertation) or Advisor (if thesis)
Chang, Kevin Chen-Chuan
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Efficient Natural Language Processing
Retrieval-augmented Generation
Language
eng
Abstract
Retrieval-Augmented Generation (RAG) is a technique for augmenting language models with external knowledge from a corpus. Despite the rapid evolution of large language models, RAG remains a promising way to address two of their weaknesses, the difficulty of updating information and unreliable memorization, and many research efforts and commercial services use it to improve reliability. However, RAG has drawbacks, including high latency and intensive use of computational resources. The inefficiency has two sources: the long input produced by the retrieved documents and slow autoregressive generation. To address these two issues, we propose Efficient Title Reranker, a fast reranker that selects the important documents for the input, and Cascade Speculative Drafting, which improves upon speculative decoding to increase the generation efficiency of large language models. The Efficient Title Reranker achieves state-of-the-art retrieval accuracy on the KILT knowledge benchmark while being more efficient than the baseline. Cascade Speculative Drafting outperforms speculative decoding in generation speed on both GSM8k and MMLU without additional training.