Uni4D: Unifying visual foundation models for 4D modeling from a single video
Yao, David Yifan
Permalink
https://hdl.handle.net/2142/129198
Description
- Title
- Uni4D: Unifying visual foundation models for 4D modeling from a single video
- Author(s)
- Yao, David Yifan
- Issue Date
- 2025-04-15
- Director of Research (if dissertation) or Advisor (if thesis)
- Wang, Shenlong
- Department of Study
- Siebel School Comp & Data Sci
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Computer Vision
- Machine Learning
- Structure from Motion
- 4D Reconstruction
- Dynamic Modeling
- Video Depth Estimation
- Pose Estimation
- Abstract
- This paper presents a unified approach to understanding dynamic scenes from casual videos. Large pretrained vision foundation models, such as vision-language, video depth prediction, motion tracking, and segmentation models, offer promising capabilities. However, training a single model for comprehensive 4D understanding remains challenging. We introduce Uni4D, a multi-stage optimization framework that harnesses multiple pretrained models to advance dynamic 3D modeling, including static/dynamic reconstruction, camera pose estimation, and dense 3D motion tracking. Our results show state-of-the-art performance in dynamic 4D modeling with superior visual quality. Notably, Uni4D requires no retraining or fine-tuning, highlighting the effectiveness of repurposing visual foundation models for 4D understanding.
- Graduation Semester
- 2025-05
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/129198
- Copyright and License Information
- Copyright 2025 David Yifan Yao
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois