ken2576 / vision-nerf
Official PyTorch Implementation of paper "Vision Transformer for NeRF-Based View Synthesis from a Single Input Image", WACV 2023.
License: MIT License
Thank you for your amazing work!
Your code works great for new view synthesis, but how can I export the 3D mesh (as an obj) from your representation?
Thank you for your great work!
I am confused about NeRF generalizability.
Your paper title says that only a single image is needed to synthesize novel views. What, then, is the function of the pre-trained weights you provide, and how did you obtain them?
Are the pre-trained weights used to extract the global and local features of a single input image, which the NeRF MLP then uses to render the target view?
The original NeRF takes dozens to hundreds of images of a scene as input and, after training, can render the scene from any new viewpoint. Although you input only a single image, you train a network on an image dataset to extract global and local features. How does this differ from feeding many images to the original NeRF?
Sorry, I don't understand the generalizability of NeRF. I'd appreciate your reply, thanks!
Hi,
In your paper, you say the input is a single image.
However, after reading your code, it seems that you use all images except the target view as input to the transformer, in both the training phase and the generation phase.
This is not consistent with what you describe in your paper.
I'm curious why that is.
Hi there! Really cool model. I managed to get it working on my own input images, but I've had to resort to a bit of a workaround to get there.
I've been trying to run your model on my own data, and don't seem to be able to get the model to take in a single image as an input as described in the paper. The only way I've found the model to work is by duplicating the input image 100 times and adding a set of poses from the training SRN files.
This is the error code I get when running the SRN, NMR and gen_real models on a single image:
I also tried preparing the data as suggested by using Pixel-NeRF's method, and managed to get detectron working but not Pixel-NeRF itself, which are both required to prepare the data as suggested. Would you be able to clarify the format of the input data for the Vision-NeRF model?
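The duplication workaround described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name `duplicate_view`, the image size, and the 4x4 pose shape are hypothetical, the identity matrices are placeholders for real camera poses, and none of it is the repository's own data-loading code.

```python
import numpy as np

def duplicate_view(image, poses):
    """Repeat a single image so that there is one copy per pose (hypothetical helper)."""
    poses = np.asarray(poses)
    # Add a leading axis, then repeat the image once for every pose.
    images = np.repeat(image[None, ...], len(poses), axis=0)
    return images, poses

if __name__ == "__main__":
    # Placeholder for the single input image (height, width, channels are assumptions).
    image = np.zeros((128, 128, 3), dtype=np.float32)
    # Placeholder poses: 100 identity matrices standing in for real camera poses.
    poses = np.tile(np.eye(4, dtype=np.float32), (100, 1, 1))
    images, poses = duplicate_view(image, poses)
    print(images.shape, poses.shape)
```

Whether the model accepts such a stack unchanged depends on the data loader, which this sketch does not model.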
Could you release the code that renders the visualization video and calculates metrics? It will save me a lot of time.
Thanks!
Hello,
When reading the ynz of the srn_cars dataset, an "index is out of bounds for dimension with size 0" error appears. The dataset was downloaded from PixelNeRF as you described. How can I solve it?
Hi,
I am trying to reproduce the training of your checkpoint. However, the code does not seem to support DDP or DP training and evaluation. Therefore, I used the default training config with batchsize=1 and trained for 500K iterations. The resulting performance is significantly worse than that of your provided checkpoint. Do you have any idea why?
Best,
Is there any chance I could see the rendering results on the SRN-cars dataset using only the ViT features?
Hello, may I ask what is the difference between using ViT Encoder and Convolutional Decoder to generate Multi-level Feature Maps, and using PVT Encoder to generate Multi-level Feature Maps directly?
Thank you for your great work!
When I tried to use the Stanford Automobile Dataset for experiments in real scenarios, I found that the dataset could not be cut correctly. Could you please help? I need the cut dataset.
Hi, could you provide details on how to train your model? Many thanks
Hi, thanks for sharing this work.
As you mention in the paper, Vision-NeRF can synthesize novel views conditioned on a single unposed input image.
However, from the code in render_ray.py, it seems that the pose information of the source image is still required.
Could you point out whether I have misunderstood something?
Dear author, my access to your pretrained weights link has recently been denied. Could you please grant me permission to download the weights? My Google account is [email protected]. I promise not to use them for any commercial purpose, and I very much look forward to receiving your permission.
Best wishes!
When I run the code with python eval_nmr.py --config [config path], I always get the error: No module named 'configargparse'
Is my config file wrong, or did I miss some setting?
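The message above is what Python reports when a package cannot be imported in the active environment, which is separate from the contents of the config file. A minimal sketch for checking this with the standard library only; the helper name `missing_modules` is hypothetical and not part of the repository:

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    # 'configargparse' is the module named in the error message above.
    missing = missing_modules(["configargparse"])
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

If the module is reported as missing, installing it into the same environment that runs eval_nmr.py (typically `pip install ConfigArgParse`) is the usual remedy.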
Excuse me, what part of the code corresponds to local feature extraction?
When I tried to reproduce this paper with the NMR dataset, I ran into this error:
File "<array_function internals>", line 200, in stack
File "/home4T/cxj/anaconda3/envs/VF/lib/python3.8/site-packages/numpy/core/shape_base.py", line 460, in stack
raise ValueError('need at least one array to stack')
I have tried many methods but can't resolve it. Can you give me some suggestions? Thanks.
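NumPy raises this error when `np.stack` receives an empty sequence, so one plausible cause (an assumption, not confirmed in this thread) is that the data loader found no files under the configured dataset path. A minimal sketch that reproduces the error and adds an explicit check; `stack_images` and its `source` argument are hypothetical names, not the repository's code:

```python
import numpy as np

def stack_images(arrays, source="<dataset path>"):
    """Stack equally shaped arrays, failing early with a clearer message if the list is empty."""
    if len(arrays) == 0:
        raise FileNotFoundError(
            f"No arrays to stack; check that {source} contains the expected files."
        )
    return np.stack(arrays)

if __name__ == "__main__":
    # Reproduce the error from the traceback with an empty list.
    try:
        np.stack([])
    except ValueError as err:
        print("np.stack([]) raised:", err)
    # With at least one array, stacking adds a new leading axis.
    batch = stack_images([np.zeros((4, 4, 3)), np.ones((4, 4, 3))])
    print(batch.shape)
```

Printing the list of files that the loader collects just before the failing call is a quick way to confirm or rule out this cause.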
Hi,
May I ask about the specific training time and GPU number of your method on different datasets, e.g., SRN-chairs, SRN-cars, and NMR?
Best,