Robust foundation model for healthcare
Wu, Zhenbang
Permalink
https://hdl.handle.net/2142/132499
Description
- Title
- Robust foundation model for healthcare
- Author(s)
- Wu, Zhenbang
- Issue Date
- 2025-11-18
- Director of Research (if dissertation) or Advisor (if thesis)
- Sun, Jimeng
- Doctoral Committee Chair(s)
- Sun, Jimeng
- Committee Member(s)
- Tong, Hanghang
- Zhao, Han
- Nalls, Mike
- Faghri, Faraz
- Department of Study
- Siebel Computing & Data Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- deep learning
- healthcare
- foundation models
- Abstract
- Clinical data are routinely collected during patient visits, encompassing demographics, diagnoses, laboratory test results, medication prescriptions, medical images, and clinical notes. While there is growing interest in applying deep learning techniques to clinical predictive modeling, developing robust models for healthcare remains challenging due to several key obstacles: (1) data fragmentation, with patient health records scattered across multiple institutions; (2) data missingness, as records frequently contain gaps in both features and labels; (3) distribution shift, where training and testing data often differ in distribution; and (4) complex data schemas, given that clinical data come with varying structures and standardizations. This dissertation presents a comprehensive approach to addressing these challenges through multiple contributions. First, I developed foundational methods for handling healthcare data complexities: MedLink links de-identified patient health records across hospitals by matching health patterns without relying on sensitive patient identifiers; MUSE extends model training to include patients with missing features and labels, leveraging additional training data to improve model performance; SLDG supports the development of clinical predictive models that effectively adapt to domain shifts in target data, ensuring better generalizability; and Llemr introduces a general framework for instruction-tuning large language models (LLMs) to process and interpret clinical data with complex structures. Building upon these methodological contributions, I co-led the development of PyHealth, an open-source Python library that provides a comprehensive framework for deep learning on healthcare data. PyHealth integrates lessons learned from the aforementioned research, offering standardized data processing pipelines, implementations of state-of-the-art healthcare ML models, and evaluation protocols specifically designed for clinical applications. The library provides practical tools for handling complex medical data and supporting various clinical prediction tasks. Together, these contributions, from foundational research methods to practical tools, provide a comprehensive framework for advancing deep learning applications in healthcare, enabling researchers and practitioners to develop more robust and reliable clinical predictive models while addressing the fundamental challenges inherent in medical data.
- Graduation Semester
- 2025-12
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/132499
- Copyright and License Information
- Copyright 2025 Zhenbang Wu
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)
Graduate Theses and Dissertations at Illinois