Completing the knowledge lifecycle for language models
Zhang, Zixuan
Description
- Title
- Completing the knowledge lifecycle for language models
- Author(s)
- Zhang, Zixuan
- Issue Date
- 2024-07-08
- Director of Research (if dissertation) or Advisor (if thesis)
- Ji, Heng
- Doctoral Committee Chair(s)
- Ji, Heng
- Committee Member(s)
- Zhai, Chengxiang
- Tong, Hanghang
- Small, Kevin
- Yih, Scott Wen-tau
- Department of Study
- Siebel Computing & Data Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Language model
- Knowledge and language
- Abstract
- In recent years, language models (LMs) have achieved remarkable success, driven by the highly effective Transformer architecture. Supported by in-context learning and language model scaling laws, nearly all natural language processing (NLP) tasks can be cast as language modeling problems and solved with exceptionally large and powerful language models. However, current language models still face critical challenges, such as hallucination and the difficulty of adapting trained models to frequent updates in knowledge. In this dissertation, we identify two root causes of these challenges: implicit knowledge representation and sparse knowledge distribution. First, knowledge is represented implicitly as parameters in language models, which makes it challenging to identify, locate, and edit a specific piece of knowledge inside a model. Second, real-world new knowledge is distributed sparsely across huge amounts of unstructured data, making it difficult to extract and consolidate useful knowledge without an effective knowledge extraction system. In short, significant flaws remain despite the great success of language model research. In this dissertation, we aim to tackle these challenges by completing the knowledge lifecycle: establishing a knowledge-oriented updating cycle that enables existing language models to continuously improve by extracting new knowledge, eliciting existing internal knowledge, and subsequently integrating the updates back into the model. We propose three steps to enable this self-improvement ability: Knowledge Extraction and Summarization, Knowledge Elicitation from Language Models, and Knowledge Update and Integration. First, we develop a powerful extraction system capable of extracting and summarizing useful knowledge from vast amounts of real-world data. Next, we create algorithms that efficiently draw out the implicit knowledge already embedded within language models. Finally, we merge the elicited knowledge with the newly extracted information, manage updates and resolve conflicts, and integrate the refined knowledge back into the model to complete an updating cycle (a minimal sketch of this cycle follows the description below). Toward the goal of building such a self-updating cycle, we make a number of novel contributions. For extracting structured knowledge from textual documents, we build RESIN, the first cross-lingual, multi-document, multimedia information extraction system, which can process hundreds of multimedia document clusters at scale and generate high-quality event graphs. For knowledge elicitation, we propose sparse latent typing (SLT), a pre-training objective designed to encourage language models to develop a structured understanding of the pre-training text. We also study the knowledge update problem in both semi-parametric (retrieval-augmented generation, RAG) and fully parametric settings, and propose effective methods that improve the model's generalizability upon knowledge updates by mitigating knowledge over-memorization. Extensive experiments demonstrate that language models can be significantly improved through this knowledge updating cycle.
- Graduation Semester
- 2024-08
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/125564
- Copyright and License Information
- Copyright 2024 Zixuan Zhang
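
The extract-elicit-integrate cycle from the abstract can be made concrete with a small sketch. The Python below is a hypothetical illustration only: the `Fact` triple, the pattern-matching extractor, and the dict-backed "model" are stand-ins invented for this example, not the dissertation's actual RESIN, SLT, or knowledge-editing interfaces.

```python
# Toy sketch of the three-step knowledge lifecycle: (1) extract new facts
# from unstructured text, (2) elicit what the model currently believes,
# (3) resolve conflicts and integrate updates back into the model.
# All names here are illustrative, not the dissertation's real APIs.
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A knowledge triple: (subject, relation, object)."""
    subject: str
    relation: str
    obj: str


def extract_knowledge(documents: list[str]) -> set[Fact]:
    """Step 1: extract structured facts from unstructured text.

    A real pipeline (e.g. multi-document information extraction, as
    RESIN does for events) does far more; this stub matches one pattern.
    """
    facts = set()
    for doc in documents:
        parts = doc.rstrip(".").split(" is the capital of ")
        if len(parts) == 2:
            city, country = parts
            facts.add(Fact(country, "has_capital", city))
    return facts


def elicit_knowledge(model: dict) -> set[Fact]:
    """Step 2: elicit the model's current (implicit) knowledge.

    The "model" is simulated as a lookup table; in practice this step
    would prompt or probe an actual language model.
    """
    return {Fact(s, r, o) for (s, r), o in model.items()}


def integrate(model: dict, new_facts: set[Fact]) -> None:
    """Step 3: prefer newer evidence on conflict and write the result
    back into the model, closing the updating cycle."""
    for f in new_facts:
        model[(f.subject, f.relation)] = f.obj


if __name__ == "__main__":
    model = {("Australia", "has_capital"): "Sydney"}      # stale belief
    docs = ["Canberra is the capital of Australia."]      # new evidence

    new = extract_knowledge(docs)
    old = elicit_knowledge(model)
    conflicts = {f for f in old
                 if any(f.subject == g.subject and f.relation == g.relation
                        and f.obj != g.obj for g in new)}
    print("conflicting beliefs:", conflicts)
    integrate(model, new)
    print("updated model:", model)  # now maps Australia -> Canberra
```

In the dissertation's setting, step 3 is the hard part: the "model" is not a dict but billions of parameters, which is why the work studies both semi-parametric (RAG) and fully parametric update strategies.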
Owning Collections
Graduate Dissertations and Theses at Illinois (Primary)
Graduate Theses and Dissertations at Illinois