sitcoms3d's Introduction

Reconstructing 3D Humans and Environments in TV Shows

This is the repository for the paper "The One Where They Reconstructed 3D Humans and Environments in TV Shows" in ECCV 2022.
Georgios Pavlakos*, Ethan Weber*, Matthew Tancik, Angjoo Kanazawa

You can find our project page at https://ethanweber.me/sitcoms3D/.

Demo results of our approach

Install environment

conda create --name sitcoms3D -y python=3.8
conda activate sitcoms3D
python -m pip install --upgrade pip
python setup.py develop

Getting the data

To download the metadata related to our paper, please see METADATA.md.

Demo with our data

We provide a demo of using our data in notebooks/data_demo.ipynb. To run this demo, you'll need to install the required packages in requirements.txt.

pip install -r requirements.txt
python download_smpl.py
# now open notebooks/data_demo.ipynb to play with the data

Training NeRF

This is how you'd train with the TBBT-big_living_room environment. The runs will be saved to the data/sparse_reconstruction_and_nerf_data/TBBT-big_living_room/runs folder, which you can visualize with TensorBoard.

python sitcoms3D/nerf/run_train.py --environment_dir data/sparse_reconstruction_and_nerf_data/TBBT-big_living_room

We recommend using nerfstudio for future work with the sitcoms3D data. A fast method with a sitcoms3D dataloader (called the "friends" dataset) is under development here, and it can be used with a real-time viewer.

Register new images to COLMAP sparse reconstructions

See REGISTER_NEW_IMAGES.md for details on how to register new images to our sparse reconstructions (i.e., to obtain new camera parameters for images in our sitcom rooms).

Qualitative system evaluation

We used the codebase https://github.com/ethanweber/anno for our qualitative system evaluation. The code requires data, setup, and webpage hosting, but it is quite general and can be used for many qualitative user study tasks. The basic idea behind the repo is to create HITs (human intelligence tasks) made up of questions, each composed of (1) a question, (2) a list of media (images, videos, etc.), and (3) possible choices. Given the question, the user responds with their answer choice. We enforce consistency by showing the same questions multiple times with different orderings of media/choices, and we keep only the responses of annotators who performed sufficiently well.
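
The consistency check described above can be sketched roughly as follows. This is a minimal illustration and not code from the anno repository: the `Question` class, the `shuffled_copy`, `consistency_rate`, and `consistent_annotators` helpers, and the 0.8 threshold are all hypothetical names and values chosen for this example.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Question:
    """One HIT question: a prompt, a list of media, and possible choices."""
    prompt: str
    media: Tuple[str, ...]
    choices: Tuple[str, ...]


def shuffled_copy(question: Question, rng: random.Random) -> Question:
    """Return the same question with its choices in a random order."""
    choices = list(question.choices)
    rng.shuffle(choices)
    return Question(question.prompt, question.media, tuple(choices))


def consistency_rate(answers: List[Tuple[str, str]]) -> float:
    """Fraction of repeated questions answered identically both times.

    Each pair holds the chosen label (not its on-screen position) for the
    first and second showing of the same question.
    """
    if not answers:
        return 0.0
    agree = sum(1 for first, second in answers if first == second)
    return agree / len(answers)


def consistent_annotators(
    repeats: Dict[str, List[Tuple[str, str]]],
    threshold: float = 0.8,  # hypothetical cutoff, not taken from the paper
) -> List[str]:
    """Keep annotators whose consistency rate meets the threshold."""
    return sorted(
        name for name, pairs in repeats.items()
        if consistency_rate(pairs) >= threshold
    )
```

Comparing chosen labels rather than positions is what lets the repeated question use a different ordering while still being checked for agreement.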

Citing

If you find this code or data useful for your research, please consider citing the following paper:

@Inproceedings{pavlakos2022sitcoms3d,
  Title          = {The One Where They Reconstructed 3D Humans and Environments in TV Shows},
  Author         = {Pavlakos, Georgios and Weber, Ethan and Tancik, Matthew and Kanazawa, Angjoo},
  Booktitle      = {ECCV},
  Year           = {2022}
}

sitcoms3d's Issues

Question about SMPL data

Hi, I'm trying your method on a custom dataset, but I'm having trouble obtaining human_data and human_pairs.
Did you implement multi-shot human reconstruction by extending SPIN (https://github.com/nkolot/SPIN)?
Do you plan to share the code for calibrated multi-shot human reconstruction?

Supplemental materials

The paper refers to the supplemental materials multiple times for more details. Where can I find them?

Getting ValueError: Invalid device ID (0)

# render from the NeRF camera pose

def show_humans(human_obj_meshes, pose, K, image_name):
    image = media.read_image(f"/content/sitcoms3D/data/sparse_reconstruction_and_nerf_data/{sitcom_location}/images/{image_name}")
    color_h, depth_h, alpha_h = render_human(human_obj_meshes, pose, K)
    media.show_image(image, height=200, title="Image we use for camera pose and intrinsics")
    media.show_image(color_h, height=200, title="Image of humans rendered from this camera")
    composited = (color_h * alpha_h[...,None] + image * (1 - alpha_h[...,None])).astype("uint8")
    media.show_image(composited, height=200, title="Composited image")

pose = nerf_image_name_to_info[image_name]["camtoworld"]
K = nerf_image_name_to_info[image_name]["intrinsics"]
show_humans(human_obj_meshes, pose, K, image_name) # image_name to read the background image

OUTPUT :

ValueError Traceback (most recent call last)
in
10 pose = nerf_image_name_to_info[image_name]["camtoworld"]
11 K = nerf_image_name_to_info[image_name]["intrinsics"]
---> 12 show_humans(human_obj_meshes, pose, K, image_name) # image_name to read the background image

4 frames
/usr/local/lib/python3.7/dist-packages/pyrender/platforms/egl.py in get_device_by_index(device_id)
81 devices = query_devices()
82 if device_id >= len(devices):
---> 83 raise ValueError('Invalid device ID ({})'.format(device_id, len(devices)))
84 return devices[device_id]
85

ValueError: Invalid device ID (0)
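
The traceback shows pyrender's EGL backend finding no EGL devices: `query_devices()` returned an empty list, so device ID 0 is out of range. pyrender picks its offscreen backend from the `PYOPENGL_PLATFORM` environment variable, which needs to be set before pyrender is imported. A minimal sketch of one possible workaround follows. The helper name `select_pyopengl_platform` and the `has_gpu` flag are hypothetical, the OSMesa fallback assumes OSMesa is installed in the runtime, and nothing in this thread confirms that it resolves the error.

```python
import os


def select_pyopengl_platform(has_gpu: bool) -> str:
    """Set PYOPENGL_PLATFORM for headless rendering and return the value.

    Hypothetical helper: uses EGL when a GPU is available, otherwise falls
    back to OSMesa (software rendering), which must be installed separately.
    """
    platform = "egl" if has_gpu else "osmesa"
    os.environ["PYOPENGL_PLATFORM"] = platform
    return platform


# Call this before `import pyrender`, e.g. in the first notebook cell.
select_pyopengl_platform(has_gpu=False)
```

In a Colab notebook, switching to a GPU runtime and keeping the EGL backend is the other option this sketch allows for via `has_gpu=True`.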

About the visualization in this project.

Hi, this is amazing work! I am impressed by the 3D reconstruction visualizations on the website (https://ethanweber.me/sitcoms3D/). I tried to use the trained checkpoint to generate the same visualizations, but I cannot find the visualization code. I would appreciate it if you could share the complete visualization code (e.g., temporal reconstruction with novel views).

Missing keyframes.txt for --image_list_path

Hi!
In https://github.com/ethanweber/sitcoms3D/blob/master/REGISTER_NEW_IMAGES.md there is:

python detect.py --height 720 --width 1280 --n 2048 new_images/h5 new_images/images --image-extension jpg

However, this crashes with:

    assert image_list_path is not None
AssertionError

In detect.py, it says --image_list_path needs to point to keyframes.txt or keyframes_reg.txt.
I can't see this file in the repository or downloaded data however. Do you happen to have any advice on how to proceed?
Thank you in advance!

Customizing the code

Hi, thanks for sharing your project.
Is it possible to use your code with another dataset?
I want to try it on another TV show, such as The Office.
Could you guide me if this is possible?
