
psnerf's Introduction

PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo (ECCV 2022)

This repository contains the implementation of the paper:

PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo
Wenqi Yang, Guanying Chen, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong
European Conference on Computer Vision (ECCV), 2022

Abstract

Traditional multi-view photometric stereo (MVPS) methods are often composed of multiple disjoint stages, resulting in noticeable accumulated errors. In this paper, we present a neural inverse rendering method for MVPS based on implicit representation. Given multi-view images of a non-Lambertian object illuminated by multiple unknown directional lights, our method jointly estimates the geometry, materials, and lights. Our method first employs multi-light images to estimate per-view surface normal maps, which are used to regularize the normals derived from the neural radiance field. It then jointly optimizes the surface normals, spatially-varying BRDFs, and lights based on a shadow-aware differentiable rendering layer. After optimization, the reconstructed object can be used for novel-view rendering, relighting, and material editing. Experiments on both synthetic and real datasets demonstrate that our method achieves far more accurate shape reconstruction than existing MVPS and neural rendering methods. Our code and model will be made publicly available.

Setup

Our work is implemented in PyTorch and tested on Ubuntu 18.04/20.04.

  • Python 3.9
  • PyTorch 1.8.0

Create an anaconda environment called psnerf using

conda env create -f environment.yaml
conda activate psnerf

Download

Download the dataset and trained models using

sh download.sh

If the above command does not work, please download the files manually:

  • Released models of SDPS-Net from Google Drive (LCNet and NENet) and put them in preprocessing/data/models/;
  • Trained models of our method from OneDrive (data.tgz) and put the extracted folder data/ under the project path;
  • Preprocessed dataset of both synthetic and real objects from OneDrive (dataset.tgz) and put the extracted folder dataset/ under the project path;
  • Environment maps for relighting application from OneDrive (envmap.tgz) and put the extracted folder envmap/ under stage2/.

Dataset

The real dataset was processed from the DiLiGenT-MV Dataset and contains 5 objects: BEAR, BUDDHA, COW, POT2, and READING.

The synthetic dataset was rendered using Mitsuba and contains 2 objects: BUNNY and ARMADILLO.

After downloading and extracting, you can find the processed datasets in the ./dataset folder.

Model

We release the pretrained models of the 5 real scenes. After downloading and extracting, you can find them in the ./data folder.

Objects

Additionally, if you want to evaluate the mesh, you may run the following commands or manually download the rescaled ground-truth objects from OneDrive (objects.tgz).

# Download objects
wget http://www.visionlab.cs.hku.hk/data/psnerf/objects.tgz
tar -xzvf objects.tgz
rm objects.tgz

The file trans.json stores the transformation parameters. You may also get the rescaled objects from the DiLiGenT-MV Dataset by computing vertices_new = (vertices - center) / scale. (Objects in objects.tgz are already rescaled.)
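
For reference, below is a minimal Python sketch of applying this rescaling yourself. The key names ("center", "scale") and file paths are assumptions, not the confirmed trans.json layout; check the released file, and note that trimesh is just one convenient mesh loader.

# Minimal sketch: rescale an original DiLiGenT-MV ground-truth mesh using the
# parameters from trans.json. The key names ("center", "scale") and file paths
# below are assumptions/placeholders; see the released trans.json for the format.
import json
import numpy as np
import trimesh

with open('trans.json') as f:
    trans = json.load(f)

obj_name = 'bear'                              # hypothetical object key
center = np.asarray(trans[obj_name]['center'])
scale = float(trans[obj_name]['scale'])

mesh = trimesh.load('bearPNG_GT.ply')          # hypothetical path to the original mesh
mesh.vertices = (mesh.vertices - center) / scale
mesh.export('bear_rescaled.ply')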

Test & Evaluation

After downloading the pretrained models, you may run the following commands to test and evaluate the results. (You may specify OBJ_NAME from bear, buddha, cow, pot2, reading.)

## Please replace the `GPU_ID` and `OBJ_NAME` with your choices.
cd stage2
python eval.py --gpu GPU_ID --obj_name OBJ_NAME --expname test_1 --exps_folder ../data/stage2
cd ..

python evaluation.py --obj OBJ_NAME --expname test_1 --test_out_dir stage2/test_out

For testing and evaluating on your own trained models:

## Please replace the `GPU_ID`, `OBJ_NAME` and `EXPNAME` with your choices.
cd stage2
python eval.py --gpu GPU_ID --obj_name OBJ_NAME --expname EXPNAME
cd ..
## other optional arguments
# --exps_folder EXP_FOLDER        # specify exp_folder (default: ./out)
# --test_out_dir TEST_OUT_DIR     # test_out_dir (default: ./test_out)
# --save_npy                      # save npy files
# --timestamp TIMESTAMP           # specify the timestamp (default: latest)
# --checkpoint CHECKPOINT         # specify the checkpoint (default: latest)
# --light_batch N_LIGHT           # modify light batch according to your GPU memory (default: 64)

## Please replace the `OBJ_NAME`, `EXPNAME` and `TEST_OUT_DIR` with your choices.
python evaluation.py --obj OBJ_NAME --expname EXPNAME --test_out_dir TEST_OUT_DIR
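
For intuition about the reported normal metric, the snippet below computes the mean angular error (in degrees) between a predicted and a ground-truth normal map stored as H×W×3 .npy arrays (e.g. saved with --save_npy). It is only an illustrative sketch: the file names and mask handling are hypothetical, and evaluation.py remains the authoritative protocol.

# Illustrative sketch (not the repository's evaluation.py): mean angular error
# between predicted and ground-truth normal maps stored as HxWx3 .npy arrays.
# All file names below are hypothetical placeholders.
import numpy as np

pred = np.load('normal_pred_view_01.npy')   # e.g. dumped via --save_npy
gt = np.load('normal_gt_view_01.npy')       # ground-truth normals for the same view
mask = np.load('mask_view_01.npy') > 0      # HxW object mask

# Normalize both maps to unit length before comparing directions.
pred = pred / (np.linalg.norm(pred, axis=-1, keepdims=True) + 1e-8)
gt = gt / (np.linalg.norm(gt, axis=-1, keepdims=True) + 1e-8)

cos = np.clip((pred * gt).sum(-1), -1.0, 1.0)
mae_deg = np.degrees(np.arccos(cos[mask])).mean()
print(f'Normal MAE: {mae_deg:.2f} degrees')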

Train

To train a model from scratch, you first need to prepare the training data via Preprocessing with the pretrained SDPS-Net. After that, you will obtain coarse normal / light direction / light intensity estimates from the dataset.
Then you need to compute the light-averaged images (see below) for training Stage I. For convenience, we extract the surface/normal for each view and the visibility for each light direction to accelerate Stage II training.
After the Stage II training phase is finished, you may test and evaluate the final results. Please go to the subpages and follow the detailed instructions there.

  • Prepare light-averaged images for Stage I (a minimal sketch of this averaging step is given after this list)

    ## replace `OBJ_NAME` with your choice.
    python light_avg.py --obj OBJ_NAME --path dataset
    ## other optional arguments
    # --train_light          # specify the train_light
    # --light_intnorm        # enable to normalize the images with the GT/SDPS-Net light intensity
    # --sdps                 # enable when using the light intensity predicted by SDPS-Net (also enable `--light_intnorm`)
  • The overall workflow is as follows:

    cd preprocessing
    python test.py xxxx
    cd ..
    python light_avg.py xxxx
    cd stage1
    python train.py xxxx
    python shape_extract.py xxxx
    cd ../stage2
    python train.py xxxx
    python eval.py xxxx
    cd ..
    python evaluation.py xxxx
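
As referenced above, the light-averaging step collapses the multi-light images of each view into a single image for Stage I. Below is a minimal sketch of the idea, assuming the img/view_XX/ layout described in "Prepare your own data" and omitting the intensity normalization handled by --light_intnorm / --sdps; light_avg.py is the authoritative implementation.

# Minimal sketch of the light-averaging idea (not the repository's light_avg.py):
# average all multi-light images of one view into a single Stage I image.
# The paths are placeholders; per-light intensity normalization is omitted.
import glob
import imageio.v2 as imageio
import numpy as np

view_dir = 'dataset/bear/img/view_01'     # hypothetical view directory
imgs = [imageio.imread(p).astype(np.float32) / 255.0
        for p in sorted(glob.glob(f'{view_dir}/*.png'))]

avg = np.mean(imgs, axis=0)               # light-averaged image for this view
imageio.imwrite('view_01_avg.png', (avg * 255.0).clip(0, 255).astype(np.uint8))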

Relighting and Material Editing

Our method jointly estimates surface normals, spatially-varying BRDFs, and lights. After optimization, the reconstructed objects can be used for novel-view rendering, relighting, and material editing. You may render under environment lighting and edit materials with the trained models. Please follow the instructions in Stage II.

Prepare your own data

We provide preprocessed datasets. If you want to try your own data, please prepare your dataset and parameters as follows.

  • Data Structure
└── OBJ_NAME
    ├── params.json
    ├── img
    │   ├── view_01
    │   │   ├── 001.png
    │   │   └── ...
    │   └── ...
    │       ├── 001.png
    │       └── ...
    ├── mask
    │   ├── view_01.png
    │   └── ...
    ├── norm_mask
    │   ├── view_01.png
    │   └── ...
    ├── normal (optional)
    │   ├── img
    │   │   ├── view_01.png
    │   │   └── ...
    │   └── npy
    │       ├── view_01.npy
    │       └── ...
    └── visibility (optional)
        ├── view_01
        │   ├── 001.png
        │   └── ...
        └── ...
            ├── 001.png
            └── ...
  • Parameters in params.json
== PARAM ==             == HELP ==
obj_name                name of the object
n_view                  total number of views (train + test)            
imhw                    resolution of images as [H,W]
gt_normal_world         whether normal is in world coordinate system
view_train              index of training views         
view_test               index of testing views         
K                       K matrix 
pose_c2w                camera-to-world transformation matrix of all views         
light_is_same           whether lights are the same among all views             
light_direction         list of light directions for each view as [L1*3, L2*3, ...], (if `light_is_same`, only one view is provided as L*3)         
light_intensity         (optional) list of light intensity for each view as [L1*3, L2*3, ...], (if `light_is_same`, only one view is provided as L*3)         
view_slt_N              (optional) index of selected N views for training         
light_slt_N             (optional) index of selected N lights for training
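
For illustration, the snippet below writes a minimal params.json for a hypothetical object with 2 views and 3 lights shared across views. All values (resolution, intrinsics, poses, light directions, view indexing) are placeholders rather than working calibration data; the files shipped in dataset/ are the authoritative reference for the exact conventions.

# Illustrative sketch of writing a minimal params.json for a custom object.
# Every value below is a hypothetical placeholder; consult the files under
# dataset/ for real calibration data and the exact indexing conventions.
import json

params = {
    "obj_name": "my_object",
    "n_view": 2,
    "imhw": [512, 512],
    "gt_normal_world": True,
    "view_train": [0],
    "view_test": [1],
    "K": [[500.0, 0.0, 256.0],
          [0.0, 500.0, 256.0],
          [0.0, 0.0, 1.0]],
    "pose_c2w": [   # one 4x4 camera-to-world matrix per view
        [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 2.5], [0, 0, 0, 1]],
        [[0, 0, 1, 2.5], [0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1]],
    ],
    "light_is_same": True,
    # With light_is_same=True, a single L*3 list is shared by all views.
    "light_direction": [[0.0, 0.0, 1.0],
                        [0.5, 0.0, 0.866],
                        [0.0, 0.5, 0.866]],
}

with open('dataset/my_object/params.json', 'w') as f:
    json.dump(params, f, indent=2)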

Citation

If you find this code or the provided models useful in your research, please consider citing:

@inproceedings{yang2022psnerf,
    title={PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo},
    author={Yang, Wenqi and Chen, Guanying and Chen, Chaofeng and 
            Chen, Zhenfang and Wong, Kwan-Yee K.},
    booktitle={European Conference on Computer Vision (ECCV)},
    year={2022}
}

Acknowledgement

Part of our code is based on the awesome SDPS-Net, UNISURF, and PhySG.


psnerf's Issues

stage2 train

Hello, I want to train stage2 (dataset bear), which needs the surface/normal/visibility. I tried to use shape_extract.py from stage1, but it does not work. Should I train a model from scratch and then extract the surface/normal/visibility (bear)? Or can you provide the stage1 extraction results? Thank you.

How to evaluate mesh chamfer distance?

Hi, thanks for your great work!

I want to evaluate the chamfer distance in Table 2 in the paper. It seems that the input data is normalized, so there is a rigid transformation between the extracted mesh and the ground truth mesh in the original DiLiGenT-MV dataset.

Could you please publish the code to calculate chamfer distance, or provide the rigid transformations so we can properly "unnormalize" the extracted mesh? Thank you!

Preparing my own data raised a few questions

Thank you for making your code publicly available.

I'm currently trying to run your pipeline on my own data. To this end I did follow your convention described here. However, this raised a few questions and I'd be happy to know if I understand your convention correctly.

  1. What is the difference between mask and norm_mask? Isn't this the same mask per view?
  2. Should mask and norm_mask be of shape (H, W), (H, W, 1), or (H, W, 3)?
  3. What exactly does the gt_normal_world keyword in params.json do and for which step in your code is it needed? Does it assume that the optional normals are in world coordinates OR does it toggle the final normals from normal_mlp to be in world coordinates OR is it only necessary for the preprocessing step in some way?
    In general, how can I query the normals from normal_mlp to be in world coordinates?
  4. Does your approach/code support different intrinsic cameras per view/image or does it assume a single intrinsic camera for all provided images?
  5. Why is it necessary to provide light_direction in params.json? Isn't your approach uncalibrated? What can I do, if I don't have light directions?
  6. Does your approach/code support colored light, or is it assumed to be achromatic/white, i.e. equal across the RGB channels?
  7. If light_is_same=False in params.json, what is the convention for providing lights for all views? Let's say I have a numpy array lights of shape lights.shape = (num_views, num_images, 3) that describes the scaled light directions per image, per view. In params.json, should I then set light_direction = lights.tolist() or light_direction = lights.reshape(-1, 3).tolist()? Basically, should I provide a nested list of the same shape as lights, or a list of shape (num_views * num_images, 3)?

Inconsistent Chamfer distance results

Hi, thanks again for your great work!

I want to replicate the results in Table 2 in the paper. However, the chamfer distances I calculated are inconsistent with the ones in the paper, even for baselines whose results are publicly available.

For example, I use the following code to calculate BEAR on PJ16:

https://gist.github.com/gerwang/901658f9d238f4e1400fb8326d4b2a5a

I tried chamfer distance code from both pytorch3d and ChamferDistancePytorch, and they both gave me a value of $2.5$. However, it is $19.58$ in the paper.

Could you please tell me where I got it wrong? Thank you!

Prepare my own data

Great work!
Can you provide script files that handle my own dataset (real capture by phone)?
Thanks!
There are some important files, like mask, norm_mask, params.json, and so on.

└── OBJ_NAME
    ├── params.json
    ├── img
    │   ├── view_01
    │   │   ├── 001.png
    │   │   └── ...
    │   └── ...
    │       ├── 001.png
    │       └── ...
    ├── mask
    │   ├── view_01.png
    │   └── ...
    ├── norm_mask
    │   ├── view_01.png
    │   └── ...
    ├── normal (optional)
    │   ├── img
    │   │   ├── view_01.png
    │   │   └── ...
    │   └── npy
    │       ├── view_01.png
    │       └── ...
    └── visibility (optional)
        ├── view_01
        │   ├── 001.png
        │   └── ...
        └── ...
            ├── 001.png
            └── ...

is there a way to export mesh?

I have trained on the cow dataset, following the instructions in the README.
Is there a way to export the mesh? I have got the stage2/test_out/cow folder.
Thank you, please help me.

Which config to choose

I'm currently building the config files of stage1 and stage2 of your approach, while using my own data. I noticed that across your data sets the config files are fairly consistent. However, two major differences popped up and I was wondering if you could clarify when to use which version of the config.

Stage1

For stage1/configs/*.yaml, the non-obvious differences are:
Version 1:

dataloading:
  train_view: 15
  # inten_normalize: sdps

Version 2:

dataloading:
  # train_view: 15
  inten_normalize: sdps

How and when should I decide which version to use? Is it just trial and error or is there some reasoning behind it?

Stage2

For stage2/confs/*.conf, the non-obvious differences are:
Version 1:

dataset{
    train_view = 15
    ## inten_normalize = sdps
}
train{
    ## light_inten_train = True
    ## light_inten_init = same       ## same, gt, pred
}
brdf{
    light_intensity = 4.0
}

Version 2:

dataset{
    ## train_view = 15
    inten_normalize = sdps
}
train{
    light_inten_train = True
    light_inten_init = same       ## same, gt, pred
}
brdf{
    light_intensity = 2.0
}

How and when should I decide which version to use? Is it just trial and error or is there some reasoning behind it?
