
BlendedMVS

About

BlendedMVS is a large-scale MVS dataset for generalized multi-view stereo networks. The dataset contains over 17k MVS training samples covering 113 diverse scenes, including architecture, sculptures, and small objects.

Upgrade to BlendedMVG

BlendedMVG, a superset of BlendedMVS, is a multi-purpose large-scale dataset for solving multi-view geometry related problems. In addition to the 113 scenes in the BlendedMVS dataset, we follow its blending procedure to generate 389 more scenes (originally shown in GL3D) for BlendedMVG. The number of training images is increased from 17k to over 110k.

BlendedMVG and its predecessors (BlendedMVS and GL3D) have been applied to several key 3D computer vision tasks, including image retrieval, image feature detection and description, two-view outlier rejection, and multi-view stereo. If you find BlendedMVS or BlendedMVG useful for your research, please cite:

@article{yao2020blendedmvs,
  title={BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks},
  author={Yao, Yao and Luo, Zixin and Li, Shiwei and Zhang, Jingyang and Ren, Yufan and Zhou, Lei and Fang, Tian and Quan, Long},
  journal={Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}

License

BlendedMVS and BlendedMVG are licensed under a Creative Commons Attribution 4.0 International License.

Download

For MVS networks, BlendedMVG is preprocessed and split into 3 smaller subsets (BlendedMVS, BlendedMVS+ and BlendedMVS++):

Dataset      | Resolution (768 x 576) | Resolution (2048 x 1536) | Supplementaries
BlendedMVS   | low-res set (27.5 GB)  | high-res set (156 GB)    | textured meshes (9.42 GB), other images (7.56 GB)
BlendedMVS+  | low-res set (81.5 GB)  | -                        | -
BlendedMVS++ | low-res set (80.0 GB)  | -                        | -

Experiments in the BlendedMVS paper were conducted using the BlendedMVS low-res dataset. In most cases, the low-res dataset is sufficient.

Dataset Structure

The BlendedMVS(G) dataset adopts the MVSNet input format. Please structure your dataset as listed below after downloading the whole dataset:

DATA_ROOT
├── BlendedMVG_list.txt
├── BlendedMVS_list.txt
├── BlendedMVS+_list.txt
├── BlendedMVS++_list.txt
├── ...
├── PID0
│   ├── blended_images
│   │   ├── 00000000.jpg
│   │   ├── 00000000_masked.jpg
│   │   ├── 00000001.jpg
│   │   ├── 00000001_masked.jpg
│   │   └── ...
│   ├── cams
│   │   ├── pair.txt
│   │   ├── 00000000_cam.txt
│   │   ├── 00000001_cam.txt
│   │   └── ...
│   └── rendered_depth_maps
│       ├── 00000000.pfm
│       ├── 00000001.pfm
│       └── ...
├── PID1
├── ...
└── PID501

PID here is the unique project ID listed in the BlendedMVG_list.txt file. We provide blended images both with and without masks. For detailed file formats, please refer to MVSNet.
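
Each *_cam.txt follows the MVSNet camera format: a 4x4 world-to-camera extrinsic, a 3x3 intrinsic, and a trailing line that, per the MVSNet format description, carries depth_min, depth_interval, depth_num, and depth_max. Below is a minimal parsing sketch in Python/NumPy; the helper name read_cam is our own, and the MVSNet repo remains the authoritative format reference:

```python
import numpy as np

def read_cam(path):
    # A sketch of an MVSNet-format camera file parser, not official code.
    with open(path) as f:
        words = f.read().split()
    # 'extrinsic' keyword, then 16 values: 4x4 world-to-camera matrix
    extrinsic = np.array(words[1:17], dtype=np.float64).reshape(4, 4)
    # 'intrinsic' keyword, then 9 values: 3x3 calibration matrix
    intrinsic = np.array(words[18:27], dtype=np.float64).reshape(3, 3)
    # trailing line: depth_min, depth_interval, depth_num, depth_max
    depth_min, depth_interval, depth_num, depth_max = map(float, words[27:31])
    return extrinsic, intrinsic, (depth_min, depth_interval, depth_num, depth_max)
```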

What can you do with BlendedMVS(G)?

Please refer to the following repositories on how to apply BlendedMVS(G) to multi-view stereo and feature detector/descriptor networks:

Tasks                   | Repositories
Multi-view stereo       | MVSNet & R-MVSNet
Descriptors & Detectors | GL3D & ASLFeat & ContextDesc & GeoDesc

Beyond the above tasks, we believe BlendedMVS(G) could also be applied to a variety of geometry-related problems, including, but not limited to:

  • Sparse outlier rejection (OANet, tested with the original GL3D)
  • Image retrieval (MIRorR, tested with the original GL3D)
  • Single-view depth/normal estimation
  • Two-view disparity estimation
  • Single/multi-view camera pose regression

Feel free to modify the dataset and adapt it to your own tasks!

Note

  • Online augmentation should be implemented by users themselves. An example for TensorFlow users can be found in MVSNet; an example for PyTorch users can be found in CasMVSNet_pl.
  • The number of selected source images for a given reference image might be smaller than 10 (when parsing pair.txt; see the reader sketch below).
  • The depth_min and depth_max in the ground-truth cameras might be smaller than or equal to zero (very few cases, when parsing *_cam.txt).
  • The rendered depth map and blended images might be empty, as the textured mesh model is not necessarily complete (when dealing with *.pfm and *.jpg files).
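
To cope with these caveats when loading samples, a defensive pair.txt reader might look like the following sketch (Python; it follows the MVSNet pair-file convention, where each source-image index is followed by a view-selection score):

```python
def read_pairs(path, max_src_views=10):
    # A sketch, not official code: parse pair.txt in the MVSNet convention.
    pairs = {}
    with open(path) as f:
        num_views = int(f.readline())
        for _ in range(num_views):
            ref = int(f.readline())                 # reference image index
            tokens = f.readline().split()
            num_src = int(tokens[0])                # may be fewer than 10
            srcs = [(int(tokens[1 + 2 * i]), float(tokens[2 + 2 * i]))
                    for i in range(num_src)]        # (src index, score) pairs
            pairs[ref] = srcs[:max_src_views]
    return pairs
```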

Changelog

2020 April 13:

  • Upgrade to BlendedMVG dataset!

2020 April 13:

  • Upload BlendedMVS textured mesh models
  • Upload BlendedMVS high-res dataset
  • Upload input and rendered images (low-res)
  • Fix bug on multi-texture mesh rendering, update BlendedMVS low-res dataset.

2022 June 8:

  • Fix download links


Issues

About suggestion for training and validation split of BlendedMVG

Hi, YaoYao,

Thank you very much for your work. I am planning to train my network with BlendedMVG (BlendedMVS, BlendedMVS+, BlendedMVS++). I would like to split the whole dataset into a training set and a validation set. Do you have any suggestions on which scenes to use for training, which for validation, and how many scenes should be in each split? I have seen in your paper that you split BlendedMVS; I would like to know what split you used when training on BlendedMVG.

I have another question: I see that the dataset also provides masked images. Do you train with masked or unmasked images?

Thank you very much!

camera

For example:
extrinsic
-0.990307 -0.138896 0.000753455 -2.85708
-0.0992137 0.703562 -0.703675 61.5539
0.0972074 -0.696928 -0.710522 160.384
0.0 0.0 0.0 1.0

intrinsic
575.955 0.0 384.0675 
0.0 575.955 288.286875 
0.0 0.0 1.0 

58.6333 1.46043515625 128.0 245.569

I want to know what "58.6333 1.46043515625 128.0 245.569" means.

What is BlendedMVG?

Do you have a paper describing the changes made for BlendedMVG w.r.t. BlendedMVS? Is the data generation the same, simply with more scenes?
Also thank you very much for adding my implementation in your README.

pair.txt

Hi Yao,

Thanks for your amazing work! Could you please explain the details of pair.txt? I understand the indexes denote the source images of the corresponding ref_image, but what do the floats after the indexes mean?

Thanks in advance!

What does depth-wise pixel mean?

We note that in the depth map validation, end-point error (EPE), >1 pixel error, and >3 pixel error are used to demonstrate the capacity of the BlendedMVS dataset. However, what does a depth-wise pixel mean? Is it simply the difference between the predicted depth map and the ground-truth depth map?

Unit of the ground truth depth.

First of all, thank you for publishing this promising dataset.

I have a question about the unit of the given ground truth depth map.

[figure: two depth maps]

As in the figure above, I include the two depth maps that correspond to the two upper RGB images.
Typically, I include a colorbar to check the range of depth, but the range looks weird.

Do I have to divide the depth maps by 256, as in the KITTI dataset?
I used the same code as in MVSNet but still cannot solve this problem.

As far as I understand, the DTU dataset provides ground-truth depth maps in millimeters.
I have no idea about the unit of this set.

Sincerely.
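
For anyone inspecting the depth values directly, a minimal PFM reader might look like the sketch below (our own implementation of the standard PFM format, similar in spirit to the loaders shipped with MVSNet and CasMVSNet):

```python
import re
import numpy as np

def read_pfm(path):
    # A sketch of a standard PFM reader, not code from the repositories above.
    with open(path, 'rb') as f:
        header = f.readline().decode().rstrip()
        color = header == 'PF'                     # 'PF' color, 'Pf' grayscale
        width, height = map(int, re.findall(r'\d+', f.readline().decode()))
        scale = float(f.readline().decode().rstrip())
        endian = '<' if scale < 0 else '>'         # negative scale: little-endian
        data = np.fromfile(f, endian + 'f')
        shape = (height, width, 3) if color else (height, width)
        return np.flipud(data.reshape(shape))      # PFM rows are bottom-to-top
```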

unpleasant noise points in low-resolution ground-truth depth maps

I think you may have resized the depth maps from high resolution to low resolution, so when I convert the depth maps to disparity maps, there are some unpleasant noise points (points with extremely small depth) in the disparity maps.
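
A defensive conversion that masks suspiciously small depths might look like the following sketch (baseline, focal, and min_depth are placeholder values for whatever stereo geometry and threshold you use, not dataset-provided quantities):

```python
import numpy as np

def depth_to_disparity(depth, baseline, focal, min_depth=1e-3):
    # A sketch: mask near-zero depths before converting, to avoid
    # huge disparity outliers from noisy depth values.
    valid = depth > min_depth
    disp = np.zeros_like(depth)
    disp[valid] = baseline * focal / depth[valid]
    return disp, valid
```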

Info about datasets

Hi,
I would like to ask what the difference is between the three versions of BlendedMVS. I see that the base BlendedMVS is the smallest set, but I am confused to see that BlendedMVS++ is smaller than BlendedMVS+. I would like to know the difference for my paper.

MVSNET camera convention

Thank you for your work. I would like to convert the MVSNet/BlendedMVS camera poses (3x3 rotation matrices) to Euler (roll, pitch, yaw) angles. Are the camera matrices computed with the 'ZXY' or 'XYZ' convention?
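
For what it's worth, since the extrinsic in *_cam.txt is a 4x4 world-to-camera matrix, the conversion itself is a one-liner once a convention is chosen. A sketch with SciPy, using the example matrix from the "camera" issue above (the 'xyz' order here is an explicit choice for illustration, not something the dataset prescribes):

```python
import numpy as np
from scipy.spatial.transform import Rotation

extrinsic = np.array([
    [-0.990307,  -0.138896,   0.000753455,  -2.85708],
    [-0.0992137,  0.703562,  -0.703675,     61.5539],
    [ 0.0972074, -0.696928,  -0.710522,    160.384],
    [ 0.0,        0.0,        0.0,           1.0],
])
R_w2c = extrinsic[:3, :3]  # top-left 3x3 block is the rotation
# Euler angles are convention-dependent; 'xyz' below is one explicit choice.
roll, pitch, yaw = Rotation.from_matrix(R_w2c).as_euler('xyz', degrees=True)
```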

Blending details

Hi Yao,

thanks for the amazing work!

I find your idea of blending in the frequency domain quite cool. I am currently suffering from the huge domain gap between real and synthetic images when training a stereo matching network. I used Blender to render images and depth, but a model trained on the rendered images performed badly when tested with real images of the same object/scene. So I think your blending idea could be quite useful for me.

  1. In your paper, you did not mention any other paper about this blending algorithm. Is this a novel idea of yours, or can I find more details in other papers?

  2. It would be very nice if you could also provide the code/script about your blending implementation.

Thanks!

Question about training / validation split

Hi, thank you for the amazing data! I was wondering why the training/validation split is so uneven (106/7). Is there a particular reason, or would any reasonable 80%/20% split chosen at random work, as for most other datasets?

Download link still doesn't work

Hi, I could not download the BlendedMVS low-resolution set, and the link provided in the MVSNet repo doesn't have the original images, only the blended ones.

labels for objects

Hi! Thank you for your work. Is there any way to determine which scenes are objects (like sculptures or small objects) and which are architecture? In our project we'd like to use only sculptures and objects, but I have not found any labeling or pattern in the filenames that would help determine the kind of object.

Aliasing in images, especially low-resolution images

Thank you for the wonderful dataset! I have noticed significant aliasing in the images in the dataset (especially the low-resolution data, but also the high-resolution data). Here are some examples:

High-resolution image:
https://github.com/kwea123/BlendedMVS_scenes/blob/master/large/5afacb69ab00705d0cefdd5b.jpg

(Aliasing is noticeable in various places, but perhaps most noticeable in the race track in the stadium in the background (top middle) of the image.)

Low-resolution image (5bf26cbbd43923194854b270\blended_images\00000003.jpg):

[image: 00000003.jpg]

(Aliasing is evident in the rooftops, as well as in the wires crossing the water.)

Perhaps there is some aliasing from the rendering process (if anti-aliasing is not used when rendering high-res textures), and perhaps some aliasing results from downsampling the high-resolution images to low-resolution ones.

I'm thinking this aliasing might hinder learning from this dataset. Is it possible to look into this issue and potentially re-render the dataset?

Download links do not function

Could you check the OneDrive download links for the dataset, please? The download did not start when I clicked the download button on OneDrive. All the links have the same issue, while they were functioning a few days ago. Thank you!

Could you please provide other ways to download datasets?

Hi, I would like to thank you for your great work and for sharing your research datasets. However, I have trouble downloading data from the OneDrive links. Could you please provide other ways to download the datasets, such as Google Drive or Baidu disk?

Generalization test

Hello Prof. Yao, thank you for open-sourcing the code and the dataset.
I have some questions about the generalization test in the BlendedMVS paper.
The paper shows that training on DTU and testing on BlendedMVS appears to be feasible.
However, when I tested the DTU-pretrained MVSNet model on BlendedMVS, I found it hard to obtain good reconstruction results. My questions:

  1. Are there any parameter settings that need attention, such as numdepth and interval_scale? Do they need to be adjusted for every scene of every dataset?
  2. In the DTU dataset, depth_min, depth_max, and sample_interval are in millimeters, while the physical scale of BlendedMVS is in meters. When a model trained on DTU is tested on BlendedMVS, is the model output in meters or millimeters?

Thanks

Point cloud evaluation

Currently we can only evaluate the depth maps using EPE, <1px error, etc.
Is there any point cloud evaluation method like those for DTU or Tanks and Temples?

Depth problem with BlendedMVS++

I recently downloaded BlendedMVS++ and tried to project pixels from the reference view to the source views using the ground-truth depth map, and I found significant differences between the original and the projected pixels. Here is the match result:
[figure: match result]
Has anyone encountered the same problem? I know the extrinsic is a world-to-camera pose, and I use the read_pfm function from CasMVSNet.
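
For reference, the standard warping with world-to-camera extrinsics looks like the sketch below (our own illustration, not code from the repo; K_* and E_* denote the 3x3 intrinsics and 4x4 extrinsics parsed from *_cam.txt):

```python
import numpy as np

def reproject(u, v, depth, K_ref, E_ref, K_src, E_src):
    # A sketch of standard pinhole reprojection, not code from the repo.
    # Back-project the reference pixel to the reference camera frame.
    x_cam = depth * np.linalg.inv(K_ref) @ np.array([u, v, 1.0])
    # Camera -> world: invert the world-to-camera extrinsic.
    R, t = E_ref[:3, :3], E_ref[:3, 3]
    x_world = R.T @ (x_cam - t)
    # World -> source camera -> source pixel.
    x_src = E_src[:3, :3] @ x_world + E_src[:3, 3]
    p = K_src @ x_src
    return p[0] / p[2], p[1] / p[2]
```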
