From objects to worlds: scalable learning of 3D assets
Huang, Zixuan
This item is only available for download by members of the University of Illinois community. Students, faculty, and staff at the U of I may log in with their NetID and password to view the item. If you are trying to access an Illinois-restricted dissertation or thesis, you can request a copy through your library's Inter-Library Loan office or purchase a copy directly from ProQuest.
Permalink
https://hdl.handle.net/2142/132639
Description
Title
From objects to worlds: scalable learning of 3D assets
Author(s)
Huang, Zixuan
Issue Date
2025-11-13
Director of Research (if dissertation) or Advisor (if thesis)
Rehg, James M.
Doctoral Committee Chair(s)
Rehg, James M.
Committee Member(s)
Schwing, Alexander
Wang, Shenlong
Wu, Jiajun
Vedaldi, Andrea
Department of Study
Siebel School Comp & Data Sci
Discipline
Computer Science
Degree Granting Institution
University of Illinois Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
3D Generation
3D Reconstruction
Video Generation
World Models
Abstract
Learning to reconstruct and generate the 3D world is a fundamental research problem in computer vision, with critical applications across diverse domains. However, the development of robust 3D generation and reconstruction systems is hindered by the scarcity of high-quality 3D data. This thesis aims to address this scaling challenge along several dimensions. First, we introduce ShapeClipper, which leverages semantic consistency from unlabeled 2D images to learn 3D shape reconstruction models. This enables scalable 3D learning from only single-view images, without any 3D annotations. Second, we present PointInfinity, a resolution-invariant point diffusion model for learning continuous 3D surfaces from point clouds. PointInfinity facilitates 3D learning using noisy point clouds derived from object-centric videos. Third, we introduce ZeroShape and re-examine the classical regression-based 3D reconstruction approach. We show that it outperforms diffusion methods in accuracy, as well as in computational and data efficiency. Finally, we explore the feasibility of learning 3D from in-the-wild videos without any 3D prior or data. As an initial yet solid step, we evaluate the 3D awareness of recent video foundation models and find that state-of-the-art video generative models already possess strong 3D understanding. Together, this thesis makes significant advances in the scalable learning of 3D, providing practical solutions for reconstructing and generating 3D objects and worlds under limited high-quality 3D data.