Enhancing the verifiability of large language model based medical question answering systems
Wang, Xiao
Permalink
https://hdl.handle.net/2142/129236
Description
- Title
- Enhancing the verifiability of large language model based medical question answering systems
- Author(s)
- Wang, Xiao
- Issue Date
- 2025-05-02
- Director of Research (if dissertation) or Advisor (if thesis)
- Zhang, Minjia
- Department of Study
- Electrical & Computer Engineering
- Discipline
- Electrical & Computer Engineering
- Degree Granting Institution
- University of Illinois Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Large Language Models
- Retrieval-Augmented Generation
- Trustworthy AI
- Verifiability
- Citation Generation
- Biomedical Question Answering
- Abstract
- Large language models (LLMs) have demonstrated impressive capabilities across a wide range of natural language processing tasks. However, their tendency to generate fluent yet unverifiable statements poses a fundamental challenge for deployment in high-stakes domains such as medicine, law, and education. This thesis addresses the central question of *verifiability*: how can LLMs produce outputs that are not only accurate but also supported by transparent, checkable evidence? Focusing on the concrete case of medical question answering (QA), this work investigates citation generation as a mechanism for enhancing verifiability. Rather than adhering to a fixed pipeline, the thesis follows an iterative, design-driven approach to evaluate how system-level decisions—including the use of parametric versus non-parametric knowledge, retrieval-augmented generation (RAG), and fine-grained attribution strategies—affect the alignment between model-generated content and external sources. Based on these insights, a two-pass citation framework is proposed. The approach first encourages in-context citation generation during answer formulation, followed by a post hoc retrieval and reranking stage that refines attribution at the statement level. This pipeline improves citation recall and precision while maintaining fluency and factual correctness. Additionally, a human annotation study reveals that recent general-purpose LLMs can serve as effective automatic judges of citation quality, often outperforming domain-specific NLI models in aligning with expert judgments. In summary, this thesis contributes both practical methods and conceptual frameworks for improving LLM verifiability in biomedical QA, with broader implications for developing trustworthy, evidence-supported AI in other knowledge-intensive fields.
- Graduation Semester
- 2025-05
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/129236
- Copyright and License Information
- Copyright 2025 Xiao Wang
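To make the two-pass design summarized in the abstract concrete, the sketch below is a minimal, self-contained illustration rather than the method implemented in the thesis: a first pass attaches candidate citations as the answer is formed, and a second pass re-retrieves and reranks evidence for each statement. The toy lexical retriever, the `Statement` class, the `first_pass`/`second_pass` functions, the 0.3 support threshold, and the PMID document IDs are all hypothetical stand-ins (a real system would use an LLM and a biomedical retriever).

```python
"""Illustrative sketch of a two-pass citation pipeline, in the spirit of the
framework described in the abstract. All components here are hypothetical
stand-ins: a toy lexical retriever replaces both the LLM's in-context citation
step and the post hoc reranker of the actual system."""

from dataclasses import dataclass, field


@dataclass
class Statement:
    text: str
    citations: list[str] = field(default_factory=list)  # cited document IDs


def lexical_score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words that also appear in doc."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / max(len(q_words), 1)


def first_pass(draft: list[str], corpus: dict[str, str], k: int = 1) -> list[Statement]:
    """Pass 1: in a real system an LLM drafts the answer with inline citations.
    Here we simulate that by attaching the top-k lexically similar documents."""
    answer = []
    for sent in draft:
        ranked = sorted(corpus, key=lambda d: lexical_score(sent, corpus[d]), reverse=True)
        answer.append(Statement(sent, ranked[:k]))
    return answer


def second_pass(answer: list[Statement], corpus: dict[str, str],
                threshold: float = 0.3) -> list[Statement]:
    """Pass 2: post hoc refinement at the statement level. Re-retrieve and
    rerank candidates per statement, keeping only well-supported citations."""
    for stmt in answer:
        scored = sorted(
            ((lexical_score(stmt.text, corpus[d]), d) for d in corpus),
            reverse=True,
        )
        stmt.citations = [d for score, d in scored if score >= threshold][:2]
    return answer


if __name__ == "__main__":
    corpus = {  # hypothetical evidence snippets keyed by document ID
        "PMID:111": "Metformin is a first-line therapy for type 2 diabetes.",
        "PMID:222": "Common side effects of metformin include gastrointestinal upset.",
    }
    draft = [
        "Metformin is a first-line therapy for type 2 diabetes.",
        "Its common side effects include gastrointestinal upset.",
    ]
    for stmt in second_pass(first_pass(draft, corpus), corpus):
        print(stmt.text, "->", stmt.citations)
```

Under these assumptions, the second pass can both drop a weakly supported citation produced in the first pass and add a better-supported one, which is the intuition behind refining attribution at the statement level.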
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY