LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes
arXiv 2023
Our method can generate navigable 3D scenes from a single text prompt or a single image. Click and drag to navigate; shift and scroll to zoom and explore the scene in 3D.
Abstract
With the widespread use of VR devices and content, demand for 3D scene generation techniques is growing. Existing 3D scene generation models, however, limit the target scene to a specific domain, primarily because their training strategies rely on 3D scan datasets that are far from the real world. To address this limitation, we propose LucidDreamer, a domain-free scene generation pipeline that fully leverages the power of an existing large-scale diffusion-based generative model.
LucidDreamer alternates between two steps: Dreaming and Alignment. First, to generate multi-view consistent images from the inputs, we use the point cloud as a geometric guideline for each image generation. Specifically, we project a portion of the point cloud onto the desired view and provide the projection as guidance for inpainting with the generative model. The inpainted images are lifted to 3D space with estimated depth maps, forming new 3D points. Second, to aggregate the new points into the 3D scene, we propose an alignment algorithm that harmoniously integrates the newly generated portions of the 3D scene. The resulting 3D scene serves as the initial points for optimizing Gaussian splats. LucidDreamer produces Gaussian splats that are highly detailed compared to previous 3D scene generation methods, with no constraint on the domain of the target scene.
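The lifting and projection steps described above can be illustrated with a standard pinhole camera model. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `unproject` and `project`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the camera motion (a pure translation `shift_x` along the x axis) are all hypothetical choices made for illustration. Pixels left empty after projection are the ones an inpainting model would be asked to fill.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


def unproject(depth: List[List[float]], fx: float, fy: float,
              cx: float, cy: float) -> List[Point]:
    """Lift every pixel (u, v) with depth z to a 3D point in the camera frame."""
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def project(points: List[Point], shift_x: float, fx: float, fy: float,
            cx: float, cy: float, width: int,
            height: int) -> List[List[Optional[float]]]:
    """Project points into a camera translated by shift_x along the x axis.

    Returns a z-buffer of nearest depths. None marks pixels that no point
    lands on, i.e. the region left for inpainting (assumed toy setup).
    """
    zbuf: List[List[Optional[float]]] = [[None] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:
            continue  # point is behind the camera
        u = round(fx * (x - shift_x) / z + cx)
        v = round(fy * y / z + cy)
        if 0 <= u < width and 0 <= v < height:
            if zbuf[v][u] is None or z < zbuf[v][u]:
                zbuf[v][u] = z  # keep the closest surface
    return zbuf
```

For example, lifting a flat 4×4 depth map and re-projecting it from a camera shifted to the right leaves one image column uncovered; the mask `[[z is None for z in row] for row in zbuf]` would then mark the region to inpaint.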
Introducing LucidDreamer
LucidDreamer maintains and expands its world model by recursive dreaming and alignment.
Dynamic Re-prompting
LucidDreamer can accept a sequence of text prompts for scene generation, enabling fine-grained controls.
Perceptual Quality
CLIP-based quantitative comparison of scenes generated from images produced by Stable Diffusion. We quantitatively compare our results with RGBD2 using CLIP-Score and CLIP-IQA. For CLIP-IQA, we use the quality, colorful, and sharp criteria. LucidDreamer outperforms RGBD2 on all metrics.
\[ \begin{array}{c|c|ccc} \hline \text{Models} & \text{CLIP-Score} \uparrow & \text{CLIP-IQA Quality} \uparrow & \text{CLIP-IQA Colorful} \uparrow & \text{CLIP-IQA Sharp} \uparrow \\ \hline \text{RGBD2} & 0.2035 & 0.1279 & 0.2081 & 0.0126 \\ \textbf{LucidDreamer} & \textbf{0.2110} & \textbf{0.6161} & \textbf{0.8453} & \textbf{0.5356} \\ \hline \end{array} \]
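As background for the CLIP-Score column: this kind of score is commonly computed as the cosine similarity between a CLIP image embedding and the CLIP embedding of the text prompt, averaged over rendered views. The page does not state the exact variant used (some formulations clamp or rescale the similarity), so the sketch below is only an illustration. It assumes the embeddings have already been produced by a CLIP model; the helper names and the toy vectors in the usage note are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def mean_clip_score(image_embeddings: Sequence[Sequence[float]],
                    text_embedding: Sequence[float]) -> float:
    """Average image-text similarity over a set of rendered views."""
    scores = [cosine_similarity(e, text_embedding) for e in image_embeddings]
    return sum(scores) / len(scores)
```

Calling `mean_clip_score([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])` averages one perfectly aligned view and one orthogonal view; real CLIP embeddings have hundreds of dimensions, but the computation is the same.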
Reconstruction Quality
Reconstruction metrics of Gaussian splats according to the source of the initial SfM points. As a baseline, we initialize from the point cloud generated by COLMAP and compare the reconstruction results. Initializing from our point cloud yields better reconstruction metrics at every iteration count.
\[ \begin{array}{c|c|ccc} \hline \text{Iters} & \text{Source of SfM points} & \text{PSNR} \uparrow & \text{SSIM} \uparrow & \text{LPIPS} \downarrow \\ \hline 1000 & \text{COLMAP} & 23.15 & 0.7246 & 0.2910 \\ & \textbf{LucidDreamer} & \textbf{32.59} & \textbf{0.9672} & \textbf{0.0272} \\ \hline 3000 & \text{COLMAP} & 30.87 & 0.9478 & 0.0353 \\ & \textbf{LucidDreamer} & \textbf{33.80} & \textbf{0.9754} & \textbf{0.0178} \\ \hline 7000 & \text{COLMAP} & 32.52 & 0.9687 & 0.0208 \\ & \textbf{LucidDreamer} & \textbf{34.24} & \textbf{0.9781} & \textbf{0.0164} \\ \hline \end{array} \]
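For readers unfamiliar with the PSNR column, the standard definition is PSNR = 10 log10(MAX² / MSE), where MSE is the mean squared error between the rendered and reference images and MAX is the largest possible pixel value. The sketch below assumes images flattened to lists of floats in [0, 1]; the function names are illustrative and are not taken from the authors' evaluation code.

```python
import math
from typing import Sequence


def mse(rendered: Sequence[float], reference: Sequence[float]) -> float:
    """Mean squared error between two flattened images."""
    if len(rendered) != len(reference) or not rendered:
        raise ValueError("images must be non-empty and the same size")
    return sum((r - t) ** 2 for r, t in zip(rendered, reference)) / len(rendered)


def psnr(rendered: Sequence[float], reference: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    err = mse(rendered, reference)
    if err == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)
```

Because PSNR is logarithmic, the 9.44 dB gap at 1000 iterations (32.59 versus 23.15) corresponds to roughly 10^0.944 ≈ 8.8 times lower mean squared error, with the caveat that this conversion is exact only for a single image rather than for PSNR averaged over many views.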
More 3D Gaussian Splatting Scenes
Click and drag to navigate. Shift and scroll to zoom in/out.
Citation
@article{chung2023luciddreamer,
title={LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes},
author={Chung, Jaeyoung and Lee, Suyoung and Nam, Hyeongjin and Lee, Jaerin and Lee, Kyoung Mu},
journal={arXiv preprint arXiv:2311.13384},
year={2023}
}