Knowledge base integration in biomedical natural language processing applications
Sakakini, Tarek
Permalink
https://hdl.handle.net/2142/110453
Description
- Title
- Knowledge base integration in biomedical natural language processing applications
- Author(s)
- Sakakini, Tarek
- Issue Date
- 2021-04-09
- Director of Research (if dissertation) or Advisor (if thesis)
- Bhat, Suma
- Doctoral Committee Chair(s)
- Bhat, Suma
- Committee Member(s)
- Viswanath, Pramod
- Hasegawa-Johnson, Mark
- Morrow, Daniel
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Natural Language Processing
- Biomedical Text
- Low-resource
- Knowledge Bases
- Abstract
- Despite the progress of natural language processing in the biomedical field, the lack of annotated data due to regulations and expensive labor remains an issue. In this work, we study the potential of knowledge bases to compensate for this shortage in biomedical language processing. Accordingly, we experiment with the integration of a rigorous biomedical knowledge base, the Unified Medical Language System, in three different biomedical natural language processing applications: text simplification, automatic evaluation of medical students' chart notes, and conversational agents for medication adherence. In the first task, our use case is simplifying medication instructions to enhance medication adherence among patients. Given the lack of an appropriate parallel corpus, we devise an unsupervised system that draws simpler synonyms from the Unified Medical Language System, and we show a positive impact on comprehension through a human subjects study. For the second task, we devise an unsupervised system to automatically evaluate chart notes written by medical students. The purpose of the system is to speed up the feedback process and enhance the educational experience. Given the lack of training corpora, integrating the Unified Medical Language System into the baseline system improved the accuracy of evaluation. For the final task, the Unified Medical Language System was used to augment the training data of a conversational agent that educates patients on their medications. As part of the educational procedure, the agent needed to assess the comprehension of the patients by evaluating their answers to predefined questions. Starting with a small seed set of paraphrases of acceptable answers, we used the Unified Medical Language System to artificially augment that set via synonymy. Results did not show an increase in the quality of system output after knowledge base integration, because the majority of errors resulted from mishandling of counts and negations. We later demonstrate the importance of an entity linking system, which is currently lacking, for optimal integration of biomedical knowledge bases, and we offer a first stride towards solving that problem, along with conclusions on proper training setup and processes for automatic collection of an annotated dataset for biomedical word sense disambiguation.
- Graduation Semester
- 2021-05
- Type of Resource
- Thesis
- Permalink
- http://hdl.handle.net/2142/110453
- Copyright and License Information
- Copyright 2021 Tarek Sakakini
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)
Dissertations and Theses - Electrical and Computer Engineering