Uni4D: Unifying visual foundation models for 4D modeling from a single video

Yao, David Yifan

Permalink

https://hdl.handle.net/2142/129198

Description

Title

Uni4D: Unifying visual foundation models for 4D modeling from a single video

Author(s)

Yao, David Yifan

Issue Date

2025-04-15

Director of Research (if dissertation) or Advisor (if thesis)

Wang, Shenlong

Department of Study

Siebel School of Computing and Data Science

Discipline

Computer Science

Degree Granting Institution

University of Illinois Urbana-Champaign

Degree Name

M.S.

Degree Level

Thesis

Keyword(s)

Computer Vision
Machine Learning
Structure from Motion
4D Reconstruction
Dynamic Modeling
Video Depth Estimation
Pose Estimation

Language

eng

Abstract

This paper presents a unified approach to understanding dynamic scenes from casual videos. Large pretrained vision foundation models, such as vision-language, video depth prediction, motion tracking, and segmentation models, offer promising capabilities. However, training a single model for comprehensive 4D understanding remains challenging. We introduce Uni4D, a multi-stage optimization framework that harnesses multiple pretrained models to advance dynamic 3D modeling, including static/dynamic reconstruction, camera pose estimation, and dense 3D motion tracking. Our results show state-of-the-art performance in dynamic 4D modeling with superior visual quality. Notably, Uni4D requires no retraining or fine-tuning, highlighting the effectiveness of repurposing visual foundation models for 4D understanding.

Graduation Semester

2025-05

Type of Resource

Thesis

Handle URL

https://hdl.handle.net/2142/129198

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois

Dissertations and Theses - Computer Science

Dissertations and Theses from the Siebel School of Computer Science
