canev9910 / luciddreamer

This project forked from luciddreamer-cvlab/luciddreamer

Official code for the paper "LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes".

License: Other

C++ 2.77% Python 85.30% C 0.15% Cuda 11.60% CMake 0.18%

luciddreamer's Introduction

😴 LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes 😴

*Denotes equal contribution.


🤖 Install

Ubuntu

Prerequisite

  • CUDA>=11.4 (higher version is OK).
  • Python==3.9 (cannot use 3.10 due to open3d compatibility)

Installation script

conda create -n lucid python=3.9
conda activate lucid
pip install peft diffusers scipy numpy imageio[ffmpeg] opencv-python Pillow open3d torch==2.0.1  torchvision==0.15.2 gradio omegaconf
# ZoeDepth
pip install timm==0.6.7
# Gaussian splatting
pip install plyfile==0.8.1

cd submodules/depth-diff-gaussian-rasterization-min
# sudo apt-get install libglm-dev # may be required for the compilation.
python setup.py install
cd ../simple-knn
python setup.py install
cd ../..
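
After installation, a quick sanity check can confirm that the interpreter and packages match the prerequisites above. The following is a minimal sketch and not part of the repository: the helper names `python_ok` and `missing_packages` are hypothetical, and the import names (for example `cv2` for opencv-python and `PIL` for Pillow) are assumed to be the usual ones for these packages. It only checks that the modules can be located, not that they work with your CUDA setup.

```python
import importlib.util
import sys

# Import names for the pip packages listed in the installation script.
# Some differ from the pip name (opencv-python -> cv2, Pillow -> PIL).
REQUIRED_MODULES = [
    "peft", "diffusers", "scipy", "numpy", "imageio", "cv2", "PIL",
    "open3d", "torch", "torchvision", "gradio", "omegaconf", "timm", "plyfile",
]


def python_ok(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.9, as the prerequisites ask."""
    return tuple(version_info[:2]) == (3, 9)


def missing_packages(module_names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Warning: expected Python 3.9, found", sys.version.split()[0])
    missing = missing_packages(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it inside the activated `lucid` environment; any module it reports as missing can be reinstalled with the matching `pip install` line above.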

Windows (Experimental, Tested on Windows 11 with VS2022)

Checklist

Installation script

conda create -n lucid python=3.9
conda activate lucid
conda install pytorch=2.0.1 torchvision=0.15.2 torchaudio=2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install peft diffusers scipy numpy imageio[ffmpeg] opencv-python Pillow open3d gradio omegaconf
# ZoeDepth
pip install timm==0.6.7
# Gaussian splatting
pip install plyfile==0.8.1

# The prebuilt wheel file currently has an issue, so please install the modules manually for now.
cd submodules\depth-diff-gaussian-rasterization-min\third_party
git clone https://github.com/g-truc/glm.git
cd ..\
python setup.py install
cd ..\simple-knn
python setup.py install
cd ..\..

⚡ Usage

We offer several ways to interact with LucidDreamer:

  1. A demo is available on the ironjr/LucidDreamer HuggingFace Space (includes custom SD checkpoints) and on the ironjr/LucidDreamer-mini HuggingFace Space (minimal features; try it if the former is down). We thank the HF and Gradio teams for their support.
  2. Another demo is available on Colab, implemented by @camenduru. We greatly thank @camenduru for the contribution.
  3. You can run the Gradio demo locally with CUDA_VISIBLE_DEVICES=0 python app.py (full features, including HuggingFace model download; requires ~15GB) or CUDA_VISIBLE_DEVICES=0 python app_mini.py (minimum viable demo; uses only SD1.5).
  4. You can also use the command line interface, as described below.

Run with your own samples

# Default Example
python run.py --image <path_to_image> --text <path_to_text_file> [Other options]
  • Replace <path_to_image> and <path_to_text_file> with the paths to your image and text files.

Other options

  • --image (-img): Specify the path to the input image for scene generation.
  • --text (-t): Path to the text file containing the prompt that guides the scene generation.
  • --neg_text (-nt): Optional. A negative text prompt to refine and constrain the scene generation.
  • --campath_gen (-cg): Choose a camera path for scene generation (options: lookdown, lookaround, rotate360).
  • --campath_render (-cr): Select a camera path for video rendering (options: back_and_forth, llff, headbanging).
  • --model_name: Optional. Name of the inpainting model used for dreaming. Leave blank to use the default (SD 1.5).
  • --seed: Set a seed value for reproducibility in the inpainting process.
  • --diff_steps: Number of steps to perform in the inpainting process.
  • --save_dir (-s): Directory to save the generated scenes and videos. Specify to organize outputs.
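
When running many scenes from a script, it can help to assemble the command from keyword arguments and validate the camera path names before launching. The sketch below is one way to do that, under stated assumptions: `build_run_command` is a hypothetical helper, not part of the repository, and the file names in the usage note are placeholders. It uses only the long flags and camera path choices listed above.

```python
# Camera path choices copied from the option list above.
CAMPATH_GEN = {"lookdown", "lookaround", "rotate360"}
CAMPATH_RENDER = {"back_and_forth", "llff", "headbanging"}


def build_run_command(image, text, neg_text=None, campath_gen=None,
                      campath_render=None, seed=None, diff_steps=None,
                      save_dir=None):
    """Assemble the argument list for `python run.py` (hypothetical helper)."""
    if campath_gen is not None and campath_gen not in CAMPATH_GEN:
        raise ValueError(f"unknown --campath_gen value: {campath_gen!r}")
    if campath_render is not None and campath_render not in CAMPATH_RENDER:
        raise ValueError(f"unknown --campath_render value: {campath_render!r}")

    cmd = ["python", "run.py", "--image", str(image), "--text", str(text)]
    optional = [
        ("--neg_text", neg_text),
        ("--campath_gen", campath_gen),
        ("--campath_render", campath_render),
        ("--seed", seed),
        ("--diff_steps", diff_steps),
        ("--save_dir", save_dir),
    ]
    # Only pass the flags the caller actually set; run.py keeps its defaults
    # for everything else.
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd
```

For example, `build_run_command("room.png", "room.txt", campath_gen="rotate360", seed=1)` returns a list that can be passed to `subprocess.run(cmd, check=True)` from the repository root.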

Prompting guidelines / Troubleshooting

General guides

  1. If your image is an indoor scene (possibly with a character in it), put the simplest description of the scene first, such as a cozy livingroom for christmas or a dark garage. Avoid prompts like 1girl, because every inpainting task will then generate humans.
  2. If you start from a heavily engineered image (e.g., from a StableDiffusion model) or a photo from another source, you can use the WD14 tagger (huggingface demo) to extract danbooru tags from the image. Remove any comma-separated tags that you do not want to appear multiple times. This includes human-related tags, e.g., 1girl, white shirt, boots, smiling face, red eyes, etc. Make sure to specify the objects you do want multiples of.
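
The tag cleanup described in the second guideline can be scripted. The following is a minimal sketch, assuming the tagger output is a single comma-separated string; `clean_tags` is a hypothetical helper, not part of the repository. It drops the tags you name (case-insensitively), removes duplicates, and keeps the remaining tags in their original order.

```python
def clean_tags(tag_string, unwanted):
    """Remove unwanted and duplicate tags from a comma-separated tag string."""
    unwanted_lower = {tag.strip().lower() for tag in unwanted}
    seen = set()
    kept = []
    for raw in tag_string.split(","):
        tag = raw.strip()
        key = tag.lower()
        # Skip empty entries, tags the caller wants gone, and repeats.
        if not tag or key in unwanted_lower or key in seen:
            continue
        seen.add(key)
        kept.append(tag)
    return ", ".join(kept)


if __name__ == "__main__":
    tags = "1girl, white shirt, cozy livingroom, boots, christmas tree"
    print(clean_tags(tags, unwanted=["1girl", "white shirt", "boots"]))
```

The cleaned string can then be saved to the text file passed via --text.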

Q. Unwanted objects, e.g., photo frames, are generated everywhere.

  1. Adjust the negative prompt to set harder constraints on the frame object. You may try adding tags like twitter thumbnail, profile image, instagram image, watermark, text to the negative prompt. Negative prompts are the first thing to try when you want something not to appear in the resulting image.
  2. Try other custom checkpoint models, which employ a different pipeline: LaMa inpainting -> ControlNet-inpaint guided image inpainting.

Visualize .ply files

There are multiple available viewers / editors for Gaussian splatting .ply files.

  1. @playcanvas's Super-Splat project (Live demo). This is the viewer we used for debugging, along with MeshLab.

  2. @antimatter15's WebGL viewer for Gaussian splatting (Live demo).

  3. @splinetool's web-based viewer for Gaussian splatting. This is the version we used in our project page's demo.
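
Before loading a file into one of these viewers, it can be useful to check that the .ply is well formed and see how many points it holds. The sketch below reads only the ASCII header that every PLY file starts with, using the standard library. `read_ply_header` is a hypothetical helper, and it assumes the file stores one `vertex` element per Gaussian, which is the common layout for Gaussian splatting exports but is not confirmed here for this repository's output.

```python
def read_ply_header(path):
    """Return (format, vertex_count, vertex_property_names) from a PLY header."""
    fmt = None
    vertex_count = None
    properties = []
    current_element = None
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file: missing 'ply' magic line")
        for raw in f:
            line = raw.decode("ascii", errors="replace").strip()
            if line == "end_header":
                break  # binary or ASCII payload starts after this line
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "format":
                fmt = parts[1]
            elif parts[0] == "element":
                current_element = parts[1]
                if current_element == "vertex":
                    vertex_count = int(parts[2])
            elif parts[0] == "property" and current_element == "vertex":
                # The property name is the last token on the line.
                properties.append(parts[-1])
        else:
            raise ValueError("PLY header is not terminated by 'end_header'")
    return fmt, vertex_count, properties
```

Calling `read_ply_header("scene.ply")` (a placeholder path) returns the storage format, the number of vertices, and the per-vertex property names without reading the payload.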

🚩 Updates

  • ✅ December 12, 2023: We have precompiled wheels for the CUDA-based submodules and put them in submodules/wheels. The Windows installation guide is revised accordingly!
  • ✅ December 11, 2023: We have updated installation guides for Windows. Thank you @Maoku for your great contribution!
  • ✅ December 8, 2023: HuggingFace Space demo is out. We deeply thank all the HF team for their support!
  • ✅ December 7, 2023: Colab implementation is now available thanks to @camenduru!
  • ✅ December 6, 2023: Code release!
  • ✅ November 22, 2023: We have released our paper, LucidDreamer on arXiv.

🌏 Citation

Please cite us if you find our project useful!

@article{chung2023luciddreamer,
    title={LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes},
    author={Chung, Jaeyoung and Lee, Suyoung and Nam, Hyeongjin and Lee, Jaerin and Lee, Kyoung Mu},
    journal={arXiv preprint arXiv:2311.13384},
    year={2023}
}

🤗 Acknowledgement

We deeply appreciate ZoeDepth, Stability AI, and Runway for their models.

📧 Contact

If you have any questions, please email [email protected], [email protected], [email protected].

luciddreamer's People

Contributors

ironjr, esw0116, robot0321, eltociear
