Coder Social home page Coder Social logo

saafke / romp Goto Github PK

View Code? Open in Web Editor NEW

This project forked from arthur151/romp

0.0 0.0 0.0 54.31 MB

ROMP: Monocular, One-stage, Regression of Multiple 3D People, ICCV21

License: MIT License

Python 95.96% Shell 0.59% Cython 1.13% C++ 1.29% C 1.03%

romp's Introduction

Monocular, One-stage, Regression of Multiple 3D People

Google Colab demo arXiv PWC

ROMP, accepted by ICCV 2021, is a concise one-stage network for multi-person 3D mesh recovery from a single image.

  • Simple. Concise one-stage framework for simultaneous person detection and 3D body mesh recovery.

  • Fast. ROMP can achieve real-time inference on a 1070Ti GPU.

  • Strong. ROMP achieves superior performance on multiple challenging multi-person/occlusion benchmarks.

  • Easy to use. We provide user friendly testing API and webcam demos.

Contact: [email protected]. Feel free to contact me for related questions or discussions! arXiv paper.

Table of contents

Features

  • Running the examples on Google Colab.
  • Real-time online multi-person webcam demo for driving textured SMPL model. We also provide a wardrobe for changing clothes.
  • Batch processing images/videos via command line / jupyter notebook / calling ROMP as a python lib.
  • Exporting the captured single-person motion to FBX file for Blender/Unity usage.
  • Training and evaluation for re-implementing our results presented in paper.
  • Convenient API for 2D / 3D visualization, parsed datasets.

News

2021/12/23: Add Training guidance.
2021/12/20: Upgrade the Pytorch to 1.10.0, Pytorch3D to 0.6.1.
2021/12/2: Add optional renderers (pyrender or pytorch3D). Fix some bugs reported in issues.
✨✨2021/10/10: V1.1 released, including multi-person webcam, extracting , webcam temporal optimization, live blender character animation, interactive visualization. Let's try!
Old logs

Getting started

Try on Google Colab

It allows you to run the project in the cloud, free of charge. Let's give the prepared Google Colab demo a try.

Installation

Please refer to install.md for installation.

Inference

Currently, we support processing images, video or real-time webcam.
Pelease refer to config_guide.md for configurations.
ROMP can be called as a python lib inside the python code, jupyter notebook, or from command line / scripts, please refer to Google Colab demo for examples.

Processing images

To re-implement the demo results, please run

cd ROMP
# change the `inputs` in configs/image.yml to /path/to/your/image folder, then run 
sh scripts/image.sh
# or run the command like
python -m romp.predict.image --inputs=demo/images --output_dir=demo/image_results

Please refer to config_guide.md for saving the estimated mesh/Center maps/parameters dict.

For interactive visualization, please run

python -m romp.predict.image --inputs=demo/images --output_dir=demo/image_results --show_mesh_stand_on_image  --interactive_vis

Caution: To use show_mesh_stand_on_image and interactive_vis, you must run ROMP on a computer with visual desktop to support the rendering. Most remote servers without visual desktop is not supported. Please use save_visualization_on_img instead.

Here, we show an example of calling ROMP as a python lib to process images.

click here to show the code

```bash
# set the absolute path to ROMP
path_to_romp = '/path/to/ROMP'
import os,sys
sys.path.append(path_to_romp)
# set the detailed configurations
from romp.lib.config import ConfigContext, parse_args, args
ConfigContext.parsed_args = parse_args(["--configs_yml=configs/image.yml",'--inputs=/path/to/images_folder', '--output_dir=/path/to/save/image_results', '--save_centermap', False]) # Be caution that setting the bool configs needs two elements, ['--config', True/False]
# import the ROMP image processor
from romp.predict.image import Image_processor
processor = Image_processor(args_set=args())
results_dict = processor.run(args().inputs) # you can change the args().inputs to other /path/to/images_folder
```

Processing videos

cd ROMP
python -m romp.predict.video --inputs=demo/videos/sample_video.mp4 --output_dir=demo/sample_video_results --save_visualization_on_img --save_dict_results

# or you can set all configurations in configs/video.yml, then run 
sh scripts/video.sh

We notice that some users only want to extract the motion of the formost person, like this

To achieve this, please run
python -m romp.predict.video --inputs=demo/videos/demo_video_frames --output_dir=demo/demo_video_fp_results --show_largest_person_only --save_dict_results --show_mesh_stand_on_image 

All functions can be combined or work individually. Welcome to try them.

Here, we show an example of calling ROMP as a python lib to process videos.

click here to show the code

```bash
# set the absolute path to ROMP
path_to_romp = '/path/to/ROMP'
import os,sys
sys.path.append(path_to_romp)
# set the detailed configurations
from romp.lib.config import ConfigContext, parse_args, args
ConfigContext.parsed_args = parse_args(["--configs_yml=configs/video.yml",'--inputs=/path/to/video', '--output_dir=/path/to/save/video_results', '--save_visualization_on_img',False]) # Be caution that setting the bool configs needs two elements, ['--config', True/False]
# import the ROMP image processor
from romp.predict.video import Video_processor
processor = Video_processor(args_set=args())
results_dict = processor.run(args().inputs) # you can change the args().inputs to other /path/to/video
```

Webcam

To do this you just need to run:

cd ROMP
sh scripts/webcam.sh

To drive a character in Blender, please refer to expert.md.

Export

Export to Blender FBX

Please refer to expert.md to export the results to fbx files for Blender usage. Currently, this function only support the single-person video cases. Therefore, please test it with demo/videos/sample_video2_results/sample_video2.mp4, whose results would be saved to demo/videos/sample_video2_results.

Blender Addons

Chuanhang Yan : developing an addon for driving character in Blender.
VLT Media creates a QuickMocap-BlenderAddon to read the .npz file created by ROMP. Clean & smooth the resulting keyframes.

Train

Please prepare the training datasets following dataset.md, and then refer to train.md for training.

Evaluation

Please refer to evaluation.md for evaluation on benchmarks.

Bugs report

Please refer to bug.md for solutions. Welcome to submit the issues for related bugs. I will solve them as soon as possible.

Citation

@InProceedings{ROMP,
author = {Sun, Yu and Bao, Qian and Liu, Wu and Fu, Yili and Michael J., Black and Mei, Tao},
title = {Monocular, One-stage, Regression of Multiple 3D People},
booktitle = {ICCV},
month = {October},
year = {2021}
}

Contributor

This repository is currently maintained by Yu Sun.

ROMP has also benefited from many developers, including

Acknowledgement

We thank Peng Cheng for his constructive comments on Center map training.

Here are some great resources we benefit:

Please consider citing their papers.

romp's People

Contributors

arthur151 avatar gngdb avatar saafke avatar vimalmollyn avatar vltmedia avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.