Optimal graph learning
Xu, Zhe
Permalink
https://hdl.handle.net/2142/129837
Description
- Title
- Optimal graph learning
- Author(s)
- Xu, Zhe
- Issue Date
- 2025-07-07
- Director of Research (if dissertation) or Advisor (if thesis)
- Tong, Hanghang
- Doctoral Committee Chair(s)
- Tong, Hanghang
- Committee Member(s)
- Banerjee, Arindam
- Chen, Yuzhong
- Han, Jiawei
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- graph machine learning
- graph data augmentation
- Abstract
- The past decades have seen significant advancements in graph machine learning, with numerous sophisticated models and algorithms crafted for a variety of learning tasks, including ranking, classification, regression, and anomaly detection. Generally, most existing works focus on addressing the question: given a graph, what is the best way to mine it? Despite their remarkable achievements, little attention is paid to the graph data itself, which could be noisy, huge, and imbalanced at every stage of the data collection process. In this thesis, our focus is on the relatively unexplored realm of graph data, intending to enhance various downstream graph machine learning tasks. We term this line of research "optimal graph learning", aiming to identify the most effective graph data to improve efficiency, effectiveness, and expressiveness. However, some unique challenges arise. First (formulation), it is not clear how to formulate data optimization in a data-driven way, especially considering that the downstream tasks can be versatile. Second (volume), the sheer volume of graph datasets can result in significant time and space complexity for underlying optimization solutions. Third (pattern), capturing various essential graph patterns at different granularities presents a challenge. This thesis introduces our progress towards the optimal graph learning problem. Concretely, we categorize our work into three directions: graph refinement, graph augmentation, and graph distillation. For graph refinement, we developed (1) a purely data-driven solution named GaSoliNe against noisy data and (2) Stager, a solution tailored for addressing imbalanced data.
For graph augmentation, we developed three augmentation solutions: (1) ALT, which enhances the performance of a broad range of models on graphs with arbitrary heterophily, (2) DisCo, which generates realistic graphs based on the training graphs, and (3) AuGLM, which incorporates the graph structure into the textual input so that language models can handle the node classification task. For graph distillation, we developed (1) a bilevel optimization-based solution named KiDD, which shrinks the size of given graphs while preserving the utility of the training data, and (2) a graph rationale discovery framework named FIG, which finds the critical subgraph in every given graph to enhance performance on graph-level tasks. Collectively, these contributions establish foundational progress toward data-centric graph machine learning and demonstrate the value of optimizing graph data itself to improve downstream task performance.
- Graduation Semester
- 2025-08
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/129837
- Copyright and License Information
- Copyright 2025 Zhe Xu
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois