Build realistic and interactive 3D scene simulation from in-the-wild videos
Xie, Ziyang
Permalink
https://hdl.handle.net/2142/129543
Description
Title
Build realistic and interactive 3D scene simulation from in-the-wild videos
Author(s)
Xie, Ziyang
Issue Date
2025-04-28
Director of Research (if dissertation) or Advisor (if thesis)
Forsyth, David
Department of Study
Siebel School of Computing and Data Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
3D Reconstruction, 3D Simulation, Gaussian Splatting
Abstract
This thesis proposes a unified system for constructing realistic and interactive 3D scene simulations directly from in-the-wild monocular videos. Given unstructured video input, the system reconstructs complete virtual environments that support high-fidelity rendering and real-time physical interaction. The pipeline is designed to enable a broad range of applications, including embodied agent training, sim-to-real transfer, and virtual content creation.
The pipeline addresses key challenges in 3D scene simulation through four core components: (1) a geometry-consistent background reconstruction module that combines Structure-from-Motion, 3D Gaussian Splatting, and learned geometric priors to recover consistent large-scale environments; (2) a tri-branch foreground object modeling framework that supports object-level reconstruction, retrieval from large-scale 3D asset databases, and generation via 3D generative models; (3) a scene composition, relighting, and rendering module that ensures photorealistic compositing of foreground and background with consistent placement, lighting, and shadows; and (4) a simulation layer that supports realistic sensor simulation and physics-based interaction within the reconstructed environment.
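To make the four-stage structure concrete, the sketch below shows one way the components described above could be chained together. It is a minimal illustration, not the thesis's actual code: every module, function, and data structure name here is a hypothetical placeholder standing in for the real reconstruction, composition, and simulation modules.

```python
# Minimal sketch of the four-stage pipeline described in the abstract.
# All names are illustrative placeholders, not the thesis implementation.
from dataclasses import dataclass, field


@dataclass
class Scene:
    """Container for intermediate pipeline outputs."""
    background: dict = None                       # e.g. a 3D Gaussian Splatting model
    foreground_objects: list = field(default_factory=list)
    composed: bool = False


def reconstruct_background(frames):
    """(1) SfM poses + 3D Gaussian Splatting + learned geometric priors."""
    return {"representation": "3D Gaussians", "num_frames": len(frames)}


def model_foreground(frames):
    """(2) Tri-branch objects: reconstructed from the video, retrieved from an
    asset database, or generated with a 3D generative model."""
    return [{"source": branch} for branch in ("reconstruction", "retrieval", "generation")]


def compose_and_relight(scene):
    """(3) Place objects in the background with consistent lighting and shadows."""
    scene.composed = True
    return scene


def simulate(scene, steps=3):
    """(4) Sensor rendering plus a physics-based interaction loop."""
    for t in range(steps):
        print(f"step {t}: render sensors, advance physics (composed={scene.composed})")


if __name__ == "__main__":
    frames = ["frame_%03d.jpg" % i for i in range(10)]  # in-the-wild video frames
    scene = Scene(
        background=reconstruct_background(frames),
        foreground_objects=model_foreground(frames),
    )
    simulate(compose_and_relight(scene))
```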
A key feature of the system is its extensibility: each component is designed to incorporate future advances in neural rendering, generative modeling, and geometry reconstruction. This design enables the system to serve both as a practical simulation tool and as a research platform for complex, real-world 3D scene simulation. By bridging unconstrained video input with high-fidelity interactive simulation, this work offers a scalable and generalizable framework for building realistic virtual environments from everyday video data.
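One common way such per-component extensibility can be realized is a registry of interchangeable backends for each stage, so that a newer method can replace an existing one without touching the rest of the pipeline. The abstract does not specify the mechanism; the sketch below is only an assumption about how swappable stages might be wired, with all names invented for illustration.

```python
# Hypothetical plugin-style registry for swappable pipeline stages.
# The abstract does not describe this mechanism; names are illustrative only.
BACKENDS = {"background": {}, "foreground": {}, "renderer": {}, "simulator": {}}


def register(stage, name):
    """Decorator that records an implementation for a given pipeline stage."""
    def wrap(fn):
        BACKENDS[stage][name] = fn
        return fn
    return wrap


@register("background", "gaussian_splatting")
def gs_background(frames):
    return {"method": "3D Gaussian Splatting", "frames": len(frames)}


@register("background", "future_method")
def future_background(frames):
    # A newer neural-rendering method can be dropped in without changing callers.
    return {"method": "future neural renderer", "frames": len(frames)}


if __name__ == "__main__":
    build = BACKENDS["background"]["gaussian_splatting"]
    print(build(["f0.jpg", "f1.jpg"]))
```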