
vision-nerf's People

Contributors

ken2576


vision-nerf's Issues

How to export the 3D mesh?

Thank you for your amazing work!
Your code works great for new view synthesis, but how can I export the 3D mesh (as an obj) from your representation?

Confusion about generalization of NeRF

Thank you for your great work!
I am confused about the generalizability of NeRF.

Your paper title says that only a single image is needed to synthesize novel views. What is the function of the pre-trained weights you provide, and how did you obtain them?

Are the pre-trained weights used to extract the global and local features of a single input image, which the NeRF MLP then uses to render the target view?

The original NeRF needs dozens to hundreds of images of a scene as input, and after training it can render any new view of that scene. Although you input only a single image, you train a network on an image dataset to extract global and local features. How does this differ from feeding many images to the original NeRF?

Sorry, I don't understand the generalizability of NeRF. I'd appreciate your reply, thanks!

Question about the number of input images

Hi,

In your paper, you say the input is a single image.
However, after reading your code, it seems that you use all images except the target view as input to the transformer, in both the training phase and the generation phase.
This is not consistent with what you describe in the paper.
I'm curious why that is.

Single image input for NeRF

Hi there! Really cool model. I managed to get it working on my own input images, but I had to resort to a bit of a workaround to get there.

I've been trying to run your model on my own data, but I can't get it to take a single image as input as described in the paper. The only way I've found to make it work is to duplicate the input image 100 times and add a set of poses from the training SRN files.

This is the error I get when running the SRN, NMR and gen_real models on a single image:
(screenshot of the error; image not included)

I also tried preparing the data with Pixel-NeRF's method, as suggested. I managed to get detectron working but not Pixel-NeRF itself, and both are required to prepare the data that way. Could you clarify the input data format for the Vision-NeRF model?

visualization

Could you release the code that renders the visualization video and calculates the metrics? It would save me a lot of time.
Thanks!

index is out of bounds

Hello,
When reading the ynz of the srn_cars dataset, I get the error "index is out of bounds for dimension with size 0". The dataset was downloaded from PixelNeRF as you described. How can I solve it?

Question about reproducing.

Hi,
I am trying to reproduce the training of your ckpt. However, the code does not seem to support DDP or DP training and evaluation. I therefore used the default training config with batchsize=1 and trained for 500K iterations, but the performance is significantly worse than that of your provided ckpt. Do you have any idea why?
Best,

Generate Multi-level Feature Maps

Hello, may I ask what the difference is between using a ViT encoder with a convolutional decoder to generate multi-level feature maps, and using a PVT encoder to generate multi-level feature maps directly?

A problem about cutting the Stanford automobile data set

Thank you for your great work!
When I tried to use the Stanford Automobile Dataset for experiments on real scenes, I could not get the dataset cut correctly. Could you please help? I need the cut dataset.

Seems to still need the pose information of the single input image

Hi, thanks for sharing this work.

As you mention in the paper, Vision-NeRF can synthesize novel views conditioned on a single unposed input image.
However, from the code in render_ray.py, it seems that the pose of the source image is still required.

Could you point out whether I have misunderstood something?

Weights are not available

Dear author, my access to your pretrained weights link has recently been denied. Could you please grant me permission to download the weights? My Google account is [email protected]. I promise not to use them for any commercial purpose, and I very much look forward to receiving your permission.
Best wishes!

question about reproducing this paper

When I tried to reproduce this paper with the NMR dataset, I got the following error:
File "<array_function internals>", line 200, in stack
File "/home4T/cxj/anaconda3/envs/VF/lib/python3.8/site-packages/numpy/core/shape_base.py", line 460, in stack
raise ValueError('need at least one array to stack')

I have tried many methods but can't resolve it. Can you give me some suggestions? Thanks.
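
The traceback above is what NumPy's stack raises when it is handed an empty sequence. One plausible cause, which is an assumption and not confirmed anywhere in this thread, is that the data loader found no files under the dataset path. Below is a minimal sketch of a check that fails earlier with a clearer message. The helper name find_files and the "*.png" pattern are hypothetical and are not taken from the vision-nerf code.

```python
from pathlib import Path


def find_files(root, pattern="*.png"):
    """Return sorted paths matching `pattern` under `root`, or raise if none.

    Hypothetical helper, not part of vision-nerf. It only illustrates
    checking for an empty file list before any arrays are stacked.
    """
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"dataset directory does not exist: {root}")
    # rglob searches the directory and all of its subdirectories.
    paths = sorted(root.rglob(pattern))
    if not paths:
        # An empty list at this point is the kind of input that would later
        # make numpy.stack raise "need at least one array to stack".
        raise FileNotFoundError(f"no files matching {pattern!r} under {root}")
    return paths
```

Running a check like this on the configured dataset directory before training would show directly whether the path or folder layout is the problem.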

Question about training time.

Hi,
May I ask about the training time and the number of GPUs used by your method on the different datasets, e.g., SRN-chairs, SRN-cars, and NMR?
Best,
