Anticipating and resolving ad hoc information needs during discourse
Ros, Kevin
Permalink
https://hdl.handle.net/2142/132493
Description
- Title
- Anticipating and resolving ad hoc information needs during discourse
- Author(s)
- Ros, Kevin
- Issue Date
- 2025-11-14
- Director of Research (if dissertation) or Advisor (if thesis)
- Zhai, ChengXiang
- Doctoral Committee Chair(s)
- Zhai, ChengXiang
- Committee Member(s)
- Chen-Chuan Chang, Kevin
- Huang, Yun
- Deng, Yu
- Department of Study
- Siebel School of Computing and Data Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- information retrieval
- proactive
- search engines
- information needs
- Abstract
- The ability to access information is the foundation of a functional society, but this access is sometimes hindered by the challenges of resolving ad hoc information needs: those that arise spontaneously while a user is exposed to new information (e.g., while a student is listening to a lecture). Traditionally, users resolve these needs manually, a process often limited by personal or environmental constraints. This thesis addresses this problem by studying methods for anticipating and automatically resolving likely ad hoc information needs, contributing five main studies motivated by gaps in the existing literature. First, we study how to generate likely student questions using online, asynchronous lecture video transcripts in low-data settings. After curating a dataset of lecture transcripts and asked questions, we use low-data techniques to train generative language models, and find that pre-training with search engine queries leads to more precise questions but that continuous prefix tuning offers mixed results. Second, we investigate how retrieval models perform at anticipating and resolving ad hoc information needs using different pre-search contexts. We evaluate the models using a constructed dataset of webpages mentioned in online discussion threads, and find that the models struggle to proactively anticipate needed information. Third, we build on the first two studies by developing TextData, a system demonstration that provides users with predicted questions or search results based on highlighted context. Fourth, we build InstInfo, another system demonstration that suggests research papers to academic conference presentation attendees in real time. We conduct a small user study with InstInfo, and find that the needs of conference attendees are mostly centered around content from the paper being presented, and that timing is critical for user utility. Fifth, based on this user study, we develop the RTPR (Real-Time Proactive Retrieval) framework, a suite of metrics for evaluating real-time proactive retrieval systems. We then apply these metrics to retrieval models trained using a newly created dataset (ProPres, a refined academic conference setting), and find that existing retrieval models are significantly limited in their ability to provide user utility in real-time proactive retrieval settings. Collectively, the contribution of this thesis is the systematic exploration of how to anticipate and resolve ad hoc information needs across various settings, with the goal of laying the groundwork for future proactive retrieval systems.
- Graduation Semester
- 2025-12
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/132493
- Copyright and License Information
- Copyright 2025 Kevin Ros
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois