Harnessing data priors to mitigate 3D data scarcity
Zhao, Xiaoming
Permalink
https://hdl.handle.net/2142/127176
Description
- Title
- Harnessing data priors to mitigate 3D data scarcity
- Author(s)
- Zhao, Xiaoming
- Issue Date
- 2024-11-04
- Director of Research (if dissertation) or Advisor (if thesis)
- Schwing, Alexander Gerhard
- Doctoral Committee Chair(s)
- Schwing, Alexander Gerhard
- Committee Member(s)
- Hoiem, Derek W
- Wang, Shenlong
- Colburn, Alex
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- 3D Computer Vision
- Novel View Synthesis
- Dynamic Novel View Synthesis
- Generative Adversarial Network
- Diffusion Model
- 3D Relighting
- Abstract
- Recently, we have witnessed remarkable advances in various fields related to machine learning and artificial intelligence, e.g., the rise of Large Language Models (LLMs) for natural language processing and diffusion models for visual content generation. Alongside improvements in algorithm design, the ability to train models on massive data has undoubtedly been a cornerstone of such progress. However, not all fields or tasks have the privilege of obtaining data at a similar scale. This naturally raises the question: how can we address the challenge of data scarcity in areas where large datasets are unavailable? In this dissertation, we study this challenge in the context of 3D computer vision and demonstrate how harnessing data priors from various domains can help mitigate 3D data scarcity. We begin by focusing on the task of dynamic view synthesis. By incorporating various data priors, e.g., those from a pre-trained static view synthesis model, we develop a system capable of producing high-quality free-viewpoint and free-time rendering from a monocular video without access to large-scale real-world multiview 4D (3D + time) data. In the domain of category-specific 3D content generation, we propose to leverage data priors from a pre-trained 2D Generative Adversarial Network (GAN), enabling us to obtain a 3D-aware GAN model very efficiently without relying on real-world multiview 3D data. Additionally, we utilize data priors from a pre-trained text-to-image diffusion model to tackle the task of 3D relighting, a new paradigm that outperforms many state-of-the-art inverse rendering approaches without the need for extensive real-world multiview 3D relighting data. Finally, we conclude this dissertation with promising future directions.
- Graduation Semester
- 2024-12
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/127176
- Copyright and License Information
- Copyright 2024 Xiaoming Zhao
Owning Collections
Graduate Dissertations and Theses at Illinois (Primary)