An agentic approach to information seeking for knowledge acquisition
Gangi Reddy, Revanth
Permalink
https://hdl.handle.net/2142/129865
Description
- Title
- An agentic approach to information seeking for knowledge acquisition
- Author(s)
- Gangi Reddy, Revanth
- Issue Date
- 2025-07-14
- Director of Research (if dissertation) or Advisor (if thesis)
- Ji, Heng
- Doctoral Committee Chair(s)
- Ji, Heng
- Committee Member(s)
- Zhai, ChengXiang
- Hakkani-Tur, Dilek
- Das, Dipanjan
- Yih, Scott Wen-tau
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Information Seeking
- Agentic Search
- Retrieval and Ranking
- Knowledge Extraction
- Abstract
- The digital age has created an information paradox: while access to data is unprecedented, our ability to synthesize it into actionable knowledge is increasingly overwhelmed. Modern search systems, optimized for simple, linear tasks, are unsuited for the multi-source aggregation and iterative reasoning that define human-like information seeking. Further, current web agents fail at the breadth-first exploration required to answer complex queries. This thesis bridges these critical gaps by introducing a new paradigm for automated information seeking, moving beyond simple retrieval to a more holistic and human-like approach to knowledge acquisition. The foundational contribution is a modular, agent-based framework that decomposes the complex information-seeking process into specialized Navigator, Extractor, and Aggregator components. Guided by a feedback loop, our proposed approach intelligently explores the web and aggregates information across a variety of information access settings, enabling a more human-like, exploratory approach to information gathering. The true utility of this general framework is demonstrated through its adaptation to high-impact, specialized domains. To combat information latency in crowd-sourced knowledge bases, we introduce an agentic system that automates Wikipedia updates by monitoring online sources and generating edit suggestions for human review. For the software development domain, where code’s unique semantics challenge traditional search, we leverage large-scale, curated code data to train specialized ranking models that achieve state-of-the-art performance in code retrieval and software issue localization. An effective information-seeking system is fundamentally dependent on its core ranking and retrieval engine. This thesis addresses three critical bottlenecks in modern search systems.
To overcome the limitation of fixed-granularity ranking, we introduce an approach that leverages multi-vector embeddings for any-granularity ranking, allowing flexible retrieval from passages down to atomic propositions without separate indexing. To resolve the efficiency-effectiveness trade-off of powerful but slow listwise LLM rerankers, we propose ranking with a single token output, speeding up inference by up to 50% while improving ranking accuracy. Finally, to break the recall ceiling imposed by initial retrieval, we introduce an inference-time feedback mechanism that uses the reranker's output to refine the retriever's query representations and improve search recall. Beyond finding documents, true understanding requires extracting precise, structured knowledge. This dissertation introduces novel methods for fine-grained knowledge extraction involving claim detection and cross-media question answering. We introduce a zero-shot framework that uses question answering to extract not only factual claims from news but also their critical attributes, such as the claim source and subject topic. For contexts where information is split across modalities, we drive progress in complex question answering that requires grounding entities between images and text for cross-media reasoning. These advancements culminate in a human-centered system that combines the principles of information seeking and knowledge acquisition to address the complex needs of intelligence analysis. By automating the generation of structured, verifiable, and event-driven situation reports, we provide a powerful demonstration of how the research in this thesis can empower humans to focus on high-level strategic thinking. Together, these contributions provide a robust foundation for the next generation of information-seeking systems that can enable humans to more effectively navigate and make sense of our complex, information-rich world.
- Graduation Semester
- 2025-08
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/129865
- Copyright and License Information
- Copyright 2025 Revanth Gangi Reddy
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois