Entity-based long document summarization using LLMs
Potluri, Abhilash
This item's files can only be accessed by the Administrator group.
Permalink
https://hdl.handle.net/2142/124661
Description
Title
Entity-based long document summarization using LLMs
Author(s)
Potluri, Abhilash
Issue Date
2024-04-16
Director of Research (if dissertation) or Advisor (if thesis)
Han, Jiawei
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Summarization
Long Documents
NLP
LLM
Entity Extraction
Chain-of-density
Language
eng
Abstract
Recent studies have found that summaries generated by Large Language Models (LLMs) such as OpenAI's Generative Pre-trained Transformer (GPT) tend to be ranked as the most fluent abstractive summaries. Existing long document summarization research has focused on changing the model architecture (for example, using different attention modules). Because LLMs seem to be the best at producing fluent summaries, especially now that recent models have very large context windows, we seek to understand whether we can augment LLMs with information so that they produce the most accurate summaries. Specifically, in this project we investigate whether a tandem approach of entity extraction and LLM prompting can generate the highest-quality summary possible for scientific papers (long documents). We compare three settings: summarization using GPT only, using GPT with an entity extraction approach, and using a GPT Chain-of-Density based approach with the extracted entities. We find that providing the entities improves summary quality. Although the long documents contain over 6000 tokens on average, we find that our Chain-of-Density method generates an adequate-to-good summary in over half the cases (nearly 80% of inputs in two of the datasets). We also show that our entity extraction method performs better in this setting than some contemporary approaches, and we experiment with variations of early stopping and entity decay in the Chain-of-Density based prompting. While this still leaves significant room for improvement, our results are promising first steps towards a new methodology for long document summarization of scientific papers.
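As a rough illustration of the approach the abstract describes, the sketch below feeds pre-extracted entities into an iterative, Chain-of-Density style prompting loop. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names (build_prompt, chain_of_density), the prompt wording, the steps and per_step parameters, and the llm callable are all hypothetical, and the loop simply ends when the entity list is exhausted, which is not necessarily one of the early stopping variants studied in the thesis.

```python
from typing import Callable, List


def build_prompt(document: str, previous_summary: str, new_entities: List[str]) -> str:
    """Build one densification prompt (hypothetical wording, not from the thesis)."""
    entity_list = ", ".join(new_entities)
    if previous_summary:
        # Later rounds: keep the length fixed and fold in more entities.
        task = (
            "Rewrite the summary below at the same length so that it also "
            f"covers these entities: {entity_list}.\n\n"
            f"Previous summary:\n{previous_summary}"
        )
    else:
        # First round: there is no summary yet, so ask for an initial one.
        task = (
            "Write a short summary of the document that covers these "
            f"entities: {entity_list}."
        )
    return f"{task}\n\nDocument:\n{document}"


def chain_of_density(
    document: str,
    entities: List[str],
    llm: Callable[[str], str],
    steps: int = 5,
    per_step: int = 3,
) -> List[str]:
    """Return the summary produced at each round.

    `llm` is any function mapping a prompt string to a completion string;
    it stands in for a real model call (an assumption of this sketch).
    """
    remaining = list(entities)
    summary = ""
    history: List[str] = []
    for _ in range(steps):
        if not remaining:
            break  # no entities left to add, so further rounds would change nothing
        batch, remaining = remaining[:per_step], remaining[per_step:]
        summary = llm(build_prompt(document, summary, batch))
        history.append(summary)
    return history
```

To try it, pass any callable that wraps a model, for example chain_of_density(paper_text, extracted_entities, call_model), where call_model is a placeholder for whatever client function sends a prompt and returns the completion text. The last element of the returned list is the densest summary.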