qvpr / patch-netvlad

Code for the CVPR2021 paper "Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition"

License: MIT License

Language: Python (100%)
Topics: place-recognition, descriptors, netvlad, patch-netvlad, fusion, local-features, regions, localization

patch-netvlad's Introduction

Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition


This repository contains code for the CVPR2021 paper "Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition".

The paper is available on arXiv and in the official CVPR proceedings.

Patch-NetVLAD method diagram

License + attribution/citation

When using code from this repository, please cite the following paper in your publications:

@inproceedings{hausler2021patchnetvlad,
  title={Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition},
  author={Hausler, Stephen and Garg, Sourav and Xu, Ming and Milford, Michael and Fischer, Tobias},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={14141--14152},
  year={2021}
}

The code is licensed under the MIT License.

Installation

We recommend using conda (or, better, mamba) to install all dependencies. If you have not yet installed conda/mamba, please download and install mambaforge. Note that issues with recent NumPy versions have been reported; please use numpy=1.21, which is known to work.

# On Linux:
conda create -n patchnetvlad python numpy=1.21 pytorch-gpu torchvision natsort tqdm opencv pillow scikit-learn faiss matplotlib-base -c conda-forge
# On MacOS (x86 Intel processor):
conda create -n patchnetvlad python numpy=1.21 pytorch torchvision natsort tqdm opencv pillow scikit-learn faiss matplotlib-base -c conda-forge
# On MacOS (ARM M1/M2 processor):
conda create -n patchnetvlad python numpy=1.21 pytorch torchvision natsort tqdm opencv pillow scikit-learn faiss matplotlib-base -c conda-forge -c tobiasrobotics
# On Windows:
conda create -n patchnetvlad python numpy=1.21 natsort tqdm opencv pillow scikit-learn faiss matplotlib-base -c conda-forge
conda activate patchnetvlad
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia

conda activate patchnetvlad

We provide several pre-trained models and configuration files. The pre-trained models are downloaded automatically into the pretrained_models folder the first time feature extraction is performed.

Alternatively, you can manually download the pre-trained models into a folder of your choice; click to expand if you want to do so.

We recommend downloading the models into the pretrained_models folder (which is set up in the config files within the configs directory):

# Note: the pre-trained models will be downloaded automatically the first time feature extraction is performed
# the steps below are optional!

# You can use the download script which automatically downloads the models:
python ./download_models.py

# Manual download:
cd pretrained_models
wget -O mapillary_WPCA128.pth.tar https://huggingface.co/TobiasRobotics/Patch-NetVLAD/resolve/main/mapillary_WPCA128.pth.tar?download=true
wget -O mapillary_WPCA512.pth.tar https://huggingface.co/TobiasRobotics/Patch-NetVLAD/resolve/main/mapillary_WPCA512.pth.tar?download=true
wget -O mapillary_WPCA4096.pth.tar https://huggingface.co/TobiasRobotics/Patch-NetVLAD/resolve/main/mapillary_WPCA4096.pth.tar?download=true
wget -O pittsburgh_WPCA128.pth.tar https://huggingface.co/TobiasRobotics/Patch-NetVLAD/resolve/main/pitts_WPCA128.pth.tar?download=true
wget -O pittsburgh_WPCA512.pth.tar https://huggingface.co/TobiasRobotics/Patch-NetVLAD/resolve/main/pitts_WPCA512.pth.tar?download=true
wget -O pittsburgh_WPCA4096.pth.tar https://huggingface.co/TobiasRobotics/Patch-NetVLAD/resolve/main/pitts_WPCA4096.pth.tar?download=true

If you want to use the shortcuts patchnetvlad-match-two, patchnetvlad-feature-match and patchnetvlad-feature-extract (which also lets you use Patch-NetVLAD in a modular way), you also need to run:

pip3 install --no-deps -e .
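
As an illustrative sketch only: assuming the installed shortcuts accept the same arguments as the corresponding scripts (e.g. patchnetvlad-match-two mirroring match_two.py), usage would look like the following.

# Hypothetical shortcut usage; arguments assumed to match match_two.py
patchnetvlad-match-two \
  --config_path patchnetvlad/configs/performance.ini \
  --first_im_path=patchnetvlad/example_images/tokyo_query.jpg \
  --second_im_path=patchnetvlad/example_images/tokyo_db.png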

Quick start

Feature extraction

Replace performance.ini with speed.ini or storage.ini if desired, and adapt the dataset paths. The examples below use the Pittsburgh30k dataset; simply replace pitts30k with tokyo247 or nordland for those datasets.

python feature_extract.py \
  --config_path patchnetvlad/configs/performance.ini \
  --dataset_file_path=pitts30k_imageNames_index.txt \
  --dataset_root_dir=/path/to/your/pitts/dataset \
  --output_features_dir patchnetvlad/output_features/pitts30k_index

Repeat for the query images by replacing _index with _query. Note that you have to adapt dataset_root_dir accordingly; see the example below.
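
For instance, the query-side extraction derived from the index command above would be (the dataset path is a placeholder and must point at your own copy):

python feature_extract.py \
  --config_path patchnetvlad/configs/performance.ini \
  --dataset_file_path=pitts30k_imageNames_query.txt \
  --dataset_root_dir=/path/to/your/pitts/dataset \
  --output_features_dir patchnetvlad/output_features/pitts30k_query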

Feature matching (dataset)

python feature_match.py \
  --config_path patchnetvlad/configs/performance.ini \
  --dataset_root_dir=/path/to/your/pitts/dataset \
  --query_file_path=pitts30k_imageNames_query.txt \
  --index_file_path=pitts30k_imageNames_index.txt \
  --query_input_features_dir patchnetvlad/output_features/pitts30k_query \
  --index_input_features_dir patchnetvlad/output_features/pitts30k_index \
  --ground_truth_path patchnetvlad/dataset_gt_files/pitts30k_test.npz \
  --result_save_folder patchnetvlad/results/pitts30k

Note that providing ground_truth_path is optional.

This will create three output files in the folder specified by result_save_folder:

  • recalls.txt with a plain text output (only if ground_truth_path is specified)
  • NetVLAD_predictions.txt with the top 100 reference images for each query image, obtained using "vanilla" NetVLAD, in Kapture format
  • PatchNetVLAD_predictions.txt with the top 100 reference images from above, re-ranked by Patch-NetVLAD, again in Kapture format (see the parsing sketch below)
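
A minimal parsing sketch, under the assumption that each non-comment line of a prediction file lists a query image, a reference image and a score (comment lines start with '#'), with rows ordered best-first per query; check the actual file header before relying on this:

# Minimal sketch; the column layout is an assumption, verify against the real file.
from collections import OrderedDict

def top1_per_query(predictions_path):
    """Return an ordered mapping {query_image: best_reference_image}."""
    top1 = OrderedDict()
    with open(predictions_path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue  # skip header/comment lines
            parts = line.replace(',', ' ').split()
            query, reference = parts[0], parts[1]
            top1.setdefault(query, reference)  # keep the first (best) hit per query
    return top1

matches = top1_per_query('patchnetvlad/results/pitts30k/PatchNetVLAD_predictions.txt')
for q, r in list(matches.items())[:5]:
    print(q, '->', r)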

Feature matching (two files)

python match_two.py \
--config_path patchnetvlad/configs/performance.ini \
--first_im_path=patchnetvlad/example_images/tokyo_query.jpg \
--second_im_path=patchnetvlad/example_images/tokyo_db.png

We provide the match_two.py script which computes the Patch-NetVLAD features for two given images and then determines the local feature matching between these images. While we provide example images, any image pair can be used.

The script prints a score, where a larger score indicates more similar images and a lower score indicates dissimilar images. It also produces a matching figure showing the patch correspondences (after RANSAC) between the two images; the figure is saved as results/patchMatchings.png.

Training

python train.py \
--config_path patchnetvlad/configs/train.ini \
--cache_path=/path/to/your/desired/cache/folder \
--save_path=/path/to/your/desired/checkpoint/save/folder \
--dataset_root_dir=/path/to/your/mapillary/dataset

To begin, request, download and unzip the Mapillary Street-level Sequences dataset (https://github.com/mapillary/mapillary_sls). The provided script will train a new network from scratch; to resume training, add --resume_path and set it to the full path (including filename and extension) of an existing checkpoint file, as in the example below. Note that to resume from our provided models, you first need to remove the WPCA layers.
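
For example, resuming might look like this (all paths, including the checkpoint filename, are placeholders):

python train.py \
--config_path patchnetvlad/configs/train.ini \
--cache_path=/path/to/your/desired/cache/folder \
--save_path=/path/to/your/desired/checkpoint/save/folder \
--dataset_root_dir=/path/to/your/mapillary/dataset \
--resume_path=/full/path/to/your/existing/checkpoint.pth.tar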

After training a model, PCA can be added using add_pca.py.

python add_pca.py \
--config_path patchnetvlad/configs/train.ini \
--resume_path=full/path/with/extension/to/your/saved/checkpoint \
--dataset_root_dir=/path/to/your/mapillary/dataset

This will add an additional checkpoint file to the same folder as resume_path, now including a WPCA layer.

FAQ

Patch-NetVLAD qualitative results

How to Create New Ground Truth Files

We provide three ready-to-go ground truth files in the dataset_gt_files folder; however, for evaluation on other datasets you will need to create your own .npz ground truth data files. Each .npz file stores three variables: utmQ (a numpy array of floats), utmDb (a numpy array of floats) and posDistThr (a scalar numpy float).

Each element of utmQ and utmDb must correspond to the respective row of the query and index image list files. posDistThr is the ground truth tolerance value (typically in metres).

The following mock example details the steps required to create a new ground truth file:

  1. Collect GPS data for your query and database traverses and convert it to UTM format. Ensure the data is sampled at the same rate as your images.
  2. Select your own choice of posDistThr value.
  3. Save these variables using NumPy, for example: np.savez('dataset_gt_files/my_dataset.npz', utmQ=my_utmQ, utmDb=my_utmDb, posDistThr=my_posDistThr) (see the sketch below).
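
A minimal sketch of these steps, assuming the GPS fixes are (latitude, longitude) pairs already aligned row-by-row with the image list files, that utmQ/utmDb hold one (easting, northing) pair per row, and using the third-party utm package for the conversion; all variable names and values below are placeholders:

# Minimal sketch, assuming aligned (lat, lon) lists and the third-party "utm" package.
import numpy as np
import utm  # pip install utm

def latlon_to_utm_array(latlon_list):
    """Convert [(lat, lon), ...] to an (N, 2) array of (easting, northing) in metres."""
    coords = []
    for lat, lon in latlon_list:
        easting, northing, _zone_number, _zone_letter = utm.from_latlon(lat, lon)
        coords.append((easting, northing))
    return np.asarray(coords, dtype=np.float64)

# Placeholder GPS fixes; one entry per row of the corresponding image list file.
query_latlon = [(-27.4772, 153.0281), (-27.4775, 153.0285)]
db_latlon = [(-27.4771, 153.0280), (-27.4774, 153.0284), (-27.4778, 153.0290)]

my_utmQ = latlon_to_utm_array(query_latlon)
my_utmDb = latlon_to_utm_array(db_latlon)
my_posDistThr = np.float64(25.0)  # ground truth tolerance in metres (placeholder)

np.savez('dataset_gt_files/my_dataset.npz',
         utmQ=my_utmQ, utmDb=my_utmDb, posDistThr=my_posDistThr)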

Acknowledgements

We would like to thank Gustavo Carneiro, Niko Suenderhauf and Mark Zolotas for their valuable comments in preparing this paper. This work received funding from the Australian Government, via grant AUSMURIB000001 associated with ONR MURI grant N00014-19-1-2571. The authors acknowledge continued support from the Queensland University of Technology (QUT) through the Centre for Robotics.

Related works

Please check out this collection of related works on place recognition.

patch-netvlad's People

Contributors

doraemon96, michaelschleiss, oravus, stephenhausler, tobias-fischer


patch-netvlad's Issues

PackageNotFoundError

Hi, before I ask my question, thanks for sharing this cool Patch-NetVLAD.

I am trying to use it, but it is not easy (I am a beginner).

I typed the commands below in the conda prompt (following your installation instructions):

conda create -n patchnetvlad python=3.8 numpy pytorch-gpu torchvision natsort tqdm opencv pillow scikit-learn faiss matplotlib-base -c conda-forge
conda activate patchnetvlad

but it doesn't work.
I got this message: Solving environment: failed with repodata from current_repodata.json,
and then it finally failed.

Please let me know how to fix this problem, and,
as far as you know (with a good heart), please tell me how to use this NetVLAD 👍

I want to test like below:

  1. one street image (image no. 1)
  2. five random place images (one of them is near image no. 1)
  3. I want to match them.

Have a nice day

Reproduce the results of DELG in the paper

Hi, thank you for sharing this great work. In the official paper, DELG achieves the top performance on some datasets. May I know how to reproduce those results of DELG? Do you use the pre-trained model or fine-tune it on Pitts30k/MSLS?

How to download the Nordland test dataset

Hi, I really appreciate the work you've done.
I want to test Patch-NetVLAD on the Nordland dataset, so I downloaded the Nordland dataset from here, but it does not seem to match the nordland_imageNames_query.txt and nordland_imageNames_index.txt files you provide. Besides, the dataset I downloaded does not include a ground truth file, so could you please tell me how I can get the Nordland test dataset?

can't assign a tuple to a torch.cuda.FloatTensor

Hi,
I get the error 'can't assign a tuple to a torch.cuda.FloatTensor' when I run train.py with pooling = patchnetvlad in train.ini.
The error occurs here:
vlad_encoding = net.pool(image_encoding)
pvecs[i * bs:(i + 1) * bs, :] = vlad_encoding
and
vlad_encoding:
([tensor([[[ 0.0000e+00, 0.0000e+00, 0.0000e+00, ...,

But when pooling = netvlad in train.ini, vlad_encoding is:
tensor([[[-5.1483e-03, -2.0104e-02, 0.0000e+00, ...,
and the above error does not appear.

Why does the MSLS filter for im2im use "idx[len(idx) // 2]"?

When processing the MSLS dataset for the im2im training task, I understand the sequences should be filtered for this subtask, but why use "idx[len(idx) // 2]" (original code)?

    @staticmethod
    def filter(seqKeys, seqIdxs, center_frame_condition): #center_frame_conditions is the frame idxs of origin dataset #TODO, filter in how condition
        keys, idxs = [], []
        for key, idx in zip(seqKeys, seqIdxs):
            if idx[len(idx) // 2] in center_frame_condition: # TODO: why shoud use [len(idx)//2]
                keys.append(key)
                idxs.append(idx)
        return keys, np.asarray(idxs)

What is its physical meaning? Thanks sincerely!

Code is missing

Please add an example showing how to train on a custom dataset and how to make predictions.

[Important] Comparison to SOTA

Hi,

First, thank you for your contribution and congratulations!

I have a question concerning Table 1 and Table 2 in your paper.

I think the true SOTA methods are missing here. Please refer to Table 1 in the paper [Self-supervising Fine-grained Region Similarities for Large-scale Image Localization].

Adding the full results to Table 1 would make this paper friendlier to follow-up works.

Furthermore, when I check Table 2, I see that the reported numbers of the proposed method are based on the use of RANSAC, i.e. Ours (Multi-RANSAC-Patch-NetVLAD).

I strongly expect a fair comparison, as the reported recalls of ALL methods in the paper [Self-supervising Fine-grained Region Similarities for Large-scale Image Localization] did not use RANSAC.

It is well known that using RANSAC [two-view matching] can boost recalls. To validate the effectiveness of the proposed `Patch-NetVLAD' descriptor, please refrain from using RANSAC, or use RANSAC for all baseline descriptors of the SOTA methods.

Please make fair comparisons.

Last, even for the outdated NetVLAD baseline, its recalls on Tokyo 24/7 are much better than the numbers reported in your paper. Please refer to Table 1 in the paper [Self-supervising Fine-grained Region Similarities for Large-scale Image Localization].

Please reflect these changes in your arXiv and final versions.

Without comparing to the true SOTA methods, the contribution of the proposed `Patch-NetVLAD' is questionable.

Congratulations again!

Detect number of clusters and PCA dimensions from model

Hi,

I'm trying to replicate the NetVLAD scores on the Pitts30k and Tokyo247 datasets but am getting significantly lower scores than reported. I've modified the code slightly by commenting out the Patch-NetVLAD portions to improve speed, i.e. I'm not extracting any patches in feature extraction nor am I re-ranking the results.

For example, I'm getting:
Recall NetVLAD@1: 0.7031 for Pitts30k
Recall NetVLAD@1: 0.3778 for Tokyo247

For reference, you reported:
Recall NetVLAD@1: 0.835 for Pitts30k
Recall NetVLAD@1: 0.648 for Tokyo247

I'm using performance.ini, which uses the Mapillary pre-trained model. I have tried using the Pittsburgh pre-trained model, but I get a size mismatch error (copied and pasted below). Any help would be appreciated. Thanks!

Raj

=> loading checkpoint 'C:\Patch-NetVLAD\patchnetvlad./pretrained_models/pittsburgh_WPCA4096.pth.tar'
Traceback (most recent call last):
File ".\feature_extract.py", line 179, in
main()
File ".\feature_extract.py", line 165, in main
model.load_state_dict(checkpoint['state_dict'])
File "C:\Anaconda3\envs\patchnetvlad\lib\site-packages\torch\nn\modules\module.py", line 1051, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Module:
size mismatch for pool.centroids: copying a param with shape torch.Size([64, 512]) from checkpoint, the shape in current model is torch.Size([16, 512]).
size mismatch for pool.conv.weight: copying a param with shape torch.Size([64, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 512, 1, 1]).
size mismatch for WPCA.0.weight: copying a param with shape torch.Size([4096, 32768, 1, 1]) from checkpoint, the shape in current model is torch.Size([4096, 8192, 1, 1]).

How to convert poses to UTM

Hi,
how can I convert ground truth camera poses to utmQ or utmDb? I only have pose data; can you help me?

Shape of Local features

Hi,
thanks for the interesting work.

I'm curious about the shape of the local features.
The local features seem to have shape (1131, 4096) when the patch size is 3, and 1131 seems to be the number of patches.
Still, I don't understand how 1131 is computed.
May I ask how the number 1131 is computed?

Regards

Code Release Date

Hi there,

I'm working on a related project for submission to CoRL 2021 and would like to use this system for comparison. The deadline for CoRL submissions is 18 June 2021; I was just wondering whether the code is on track to be available by 1 June?

Regards,

Will

Sorry, how can I get the dataset?

Sorry, I cannot find the full Pittsburgh30k or Pittsburgh250k dataset despite searching Google many times. I'm at the end of my rope, so I have no choice but to ask for your help. Could you please tell me how I can get the dataset? Thank you very much, and excuse me for such a question.

question about training loss

Hi,
I really appreciate the work! Your instructions made it easy for me to try it out.
But I have a question about training. I tried to retrain the network (with the original VGG16 backbone, or changing it to ResNet18 cropped appropriately, keeping most of the parameters from the pretrained weights). I found that the loss went down rapidly in the first epoch and almost reached 0.00. Could you describe how your loss changed when you trained this network? Thank you very much!

Best regards

Can't reproduce Robotcar results

Hi, I tried to reproduce your result on the RobotCar Seasons V2 test set by submitting to the challenge submission server. I used the released performance-focused model, which is pre-trained on the MSLS dataset, but I got this incorrect result:
image
I also tried the model pre-trained on Pitts30k, and the results are not correct either.
image
Besides, the results on other datasets are normal. Is the model version that I used wrong? Could you possibly release the model weights that achieve the results on the RobotCar dataset shown in the paper? Or could you provide the results on the test set split by condition, like Supplementary Table 1? Thank you so much.

Best regards,

Indoor experiment

Hi, thanks for your great contribution. Here I want to consult some questions about indoor experiment.

In the paper, all experiments are conducted on outdoor datasets. I wonder what the difference would be when applying it to a real indoor environment, such as a home, an office, or even a room with repeated regions (such as a server room where many server computers with the same appearance are placed)? How can I obtain a training dataset without GPS in an indoor room?
Thanks for your attention; I am always looking forward to your kind response and any advice.

Best regards,
Slamer

validation / evaluation wrong indices

Hello QVPR team,

In the evaluation function (line 88), I think the all_pos_indices list has wrong indices; this can be adjusted by adding the offset _lenDb in the msls class (line 210).

Thank you

Feature matching (two files)

Hi, I really appreciate the work you've done.
I ran match_two.py, but the score I got was only about 0.02 (tokyo_query.jpg and tokyo_db.png). I also matched two identical images, and the score was only 0.25. Is this because I set something up incorrectly? Thank you again for your work; I look forward to your reply!

About calculating keypoints

Hi,
I really appreciate your work.
I'm reading your code for calculating keypoint centers from patches [./models/local_matcher.calc_keypoint_centers_from_patches], but I am confused.
In the for loop:
keypoints[0, k] = ((boxes[j+(i*W), 0] + boxes[(j+(patch_size[1]-1))+(i*W), 2]) / 2)
keypoints[1, k] = ((boxes[j+((i+1)*W), 1] + boxes[j+((i+(patch_size[0]-1))*W), 3]) / 2)
You select four points in each patch to calculate the keypoints; why these points? I mean,

  • boxes[j+(i*W), 0] is the y_min of the top left
  • boxes[(j+(patch_size[1]-1))+(i*W), 2] is the y_max of the top right
  • boxes[j+((i+1)*W), 1] is the x_min of the directly below top left
  • boxes[j+((i+(patch_size[0]-1))*W), 3] is the x_max of the bottom left

Could you please give me a brief explanation? Thank you in advance and look forward to your reply.

Training Time

I was wondering what hardware you used for training and how much time it took to train the model on the full dataset.

About reproducing the results in the paper

Hi,
I'm trying to reproduce Single-Spatial-Patch-NetVLAD on the Pitts30k dataset. I used pitts_WPCA512.pth.tar to calculate the NetVLAD descriptor of each patch. I got recall@1=87.09, recall@5=94.05, recall@10=95.67. Recall@5 and recall@10 are the same as in the paper (Table 2), but recall@1=88.0 in the paper. Besides, in the paper, Single-Spatial-Patch-NetVLAD recall@1=88 is higher than Single-RANSAC-Patch-NetVLAD recall@1=87.3. Could you please reconfirm the accuracy of this number (recall@1=88 in Table 2, Pitts30k, Single-Spatial-Patch-NetVLAD), and could you please tell me how I can reproduce the results in the paper?

Downloading Nordland dataset

Hi, will you release your version of the Nordland dataset?
It would be great to have the chance to download the dataset directly, as it would avoid possible inconsistencies.
Also, there shouldn't be any legal issue given that it is licensed under Creative Commons :-)

'can only test a child process'

Hi,
I encountered the error assert self._parent_pid == os.getpid(), 'can only test a child process' when I ran train.py with the MSLS dataset.

Incorrect Aachen Day-Night result

Hi, I tried to use your pretrained pittsburgh_WPCA4096 model to test on the Aachen Day-Night dataset, but got a totally incorrect result.
Here is my complete process:

  1. Generate Aachen_db_path.txt and Aachen_query_path.txt.

  2. Run feature_extract.py for the db and query images using the following two commands:

python feature_extract.py \
  --config_path patchnetvlad/configs/speed.ini \
  --dataset_file_path=Aachen_db_path.txt \
  --output_features_dir patchnetvlad/output_features/aachen_index

python feature_extract.py \
  --config_path patchnetvlad/configs/speed.ini \
  --dataset_file_path=Aachen_query_path.txt \
  --output_features_dir patchnetvlad/output_features/aachen_query
  3. Run feature_match.py for feature matching, without providing ground_truth_path:
python feature_match.py \
  --config_path patchnetvlad/configs/speed.ini \
  --query_file_path=Aachen_query_path.txt \
  --index_file_path=Aachen_db_path.txt \
  --query_input_features_dir patchnetvlad/output_features/aachen_query \
  --index_input_features_dir patchnetvlad/output_features/aachen_index \
  --result_save_folder patchnetvlad/results/aachen_subset
  4. Then process PatchNetVLAD_predictions.txt; for each query image it outputs the top 100 reference images. I just used the pose of the top-1 reference image as the estimated pose of each query image and submitted the result to the benchmark, but got a totally wrong result. Here is the result:

image

I want to know what is wrong with my steps and why it outputs such unacceptable results. Is the model version that I used wrong? Or have you tested your models on the Aachen dataset? If you did, could you please provide the result?

keywords: 'optimizer' and 'epoch'

Hi :)
Can you tell me why there are no 'optimizer' and 'epoch' keys when I load your model (mapillary_WPCA4096.pth.tar)?
Thanks

Pytorch GPU Installation

It would be good to change the installation instructions for conda/mamba to include the GPU version of PyTorch instead of the CPU one, as the CPU version throws an error out of the box.

Also minor typo:

tqdm.write('====> Plotting Local Features and save them to ' + str(join(opt.plot_save_path, 'patchMatchings.png')))

opt.plot_save_path -> plot_save_path?

permission for loading the model [certificate verify failed]

Hi :)
I installed all packages and tried to run the match_two.py file.
It asked me whether to auto-download the pretrained models, so I typed "YES",
but I got an error message: [certificate verify failed].

I am using a Windows PC; please let me know how to solve this.
Thanks

image

About the hard disk memory

Hi, thanks for sharing your work.
I have a question. After I ran feature_extract.py, the size of the produced files is over 800 GB; is that normal?
I cannot even extract the query images' features due to a lack of disk space.

About the result on Mapillary

Thank you for this nice code base and paper.
I am trying to reproduce the result on Mapillary following the training-from-scratch instructions. However, after training for multiple weeks, the result is still extremely low:
image

Is the result before PCA whitening (PCAW) supposed to be like this, or is something wrong? I just followed the instructions:
python train.py \
--config_path patchnetvlad/configs/train.ini \
--cache_path=/path/to/your/desired/cache/folder \
--save_path=/path/to/your/desired/checkpoint/save/folder \
--dataset_root_dir=/path/to/your/mapillary/dataset
and no error is reported.
Thank you so much for your time. Looking forward to hearing from you.

Code has been released

Hi All, just creating this issue to notify everyone that our code has now been released. We've managed to release early, although if anyone encounters bugs please let us know and we will endeavor to fix them.

dataset download URLs

Where can I download the Pittsburgh dataset? Can you give me the URL? Thank you!

About the Tokyo dataset

Hi,
when doing experiments on the Tokyo dataset, Time Machine is used for the training set and Tokyo 24/7 is used for the test set; is that right?

how to use pool=patchnetvlad to train the network

Dear authors,

Thank you for the great work!

I have a question about how to use the output of patchnetvlad to train the network, since vlad_local and vlad_global are generated at the same time. Is it OK to use the vlad_global feature to compute the triplet loss?

Best,
Qiang

training code

Thanks for your work.
I was wondering whether the training code will be released, so that it can be used to train on our custom dataset. Thanks.

Best regards.

how to reproduce Mapillary val recall

Hi, I have a question about reproducing the Mapillary val results: are the results reported in the paper the mean recall per city, or the recall obtained by testing all queries against a database composed of all cities? Could you give more details about this? Thank you very much!

Why only train conv5 of VGG?

Hi! First of all, congratulations on your great work, very inspiring! I would like to know the reasons for only training the last layers of VGG instead of the full network. Have you checked whether the results improve otherwise? I recall the original NetVLAD paper also did this for memory reasons, but GPUs are more powerful now.

Thank you and congrats again!

Store global features image-by-image

Hi @StephenHausler,

Together with Hasini, we found an issue in the case where, at the feature matching stage, only a subset of the images from the feature extraction stage is used. I propose to store the global features in the same way as the local features, i.e. image-by-image. In that case we can read just the required subset of global features. It should not be too hard to code.

What do you think?

/cc @mingu6 @oravus

About RobotCar v2 results

Hi,
I really appreciate your work. I'm trying to reproduce your results on the RobotCar Seasons v2 dataset and have encountered some problems. Which dataset split is used in Tables 1 and 2 of the main paper, test or train? Besides, do these results aggregate ALL conditions, including day and night?

I noticed that Suppl. Table 1 reports results for each condition obtained from the training split. But when I tried to summarise the results in Suppl. Table 1 using statistics on the train query set, I failed to obtain the results in Table 1 of the main paper. So I'm a bit confused about the dataset settings.

I am always looking forward to your kind response.

Best regards,

About an "out of memory" question

Hello, my question is about the error "OSError: Not enough free space to write 18530304 bytes" in the feature extraction process.

It seems that numpy.save() needs more space than I have.

My computer has 16 GB of memory, 200 GB of free disk space and runs Ubuntu 20.04 under WSL2.

WSL2 needs 3 GB of memory to keep working.

Can you give me some advice?

Thank you.

Train on custom dataset

Hi, I really appreciate the work you've done.
I was wondering whether the training code will be released, so that it can be used to train on our custom dataset. Thanks.

Best regards.

reproduce Mapillary val recall

Hi,
can you tell me how to reproduce the val recall? I tried to obtain the val recall with val.py, but the result is bad.
I use the global feature with PCA (4096) and the weights mapillary_WPCA4096.pth.tar:
====> Recall@1: 0.4649
====> Recall@5: 0.5378
====> Recall@10: 0.5581
====> Recall@20: 0.5770
====> Recall@50: 0.6054
====> Recall@100: 0.6162

But in the paper, recall@1 (Single-Spatial-Patch-NetVLAD) = 77.2 and recall@1 (NetVLAD) = 60.8.

Could you tell me some details about reproducing the Mapillary val recall?
thanks

Some questions about loss and GPU memory

Hello Stephen, thanks for your excellent work. I have some questions about Patch-NetVLAD: which loss function did you use? Is it the same triplet loss as in NetVLAD? I think the triplet loss is used to evaluate the full image; is it suitable for patches, and how should it be modified to evaluate patches? Also, how much GPU memory do I need to run the training code of Patch-NetVLAD?
Looking forward to your reply!

Question about Nordland dataset.

Hi, thanks for your work. I have a question about the Nordland dataset.
The Nordland dataset downloaded from [https://nrkbeta.no/2013/01/15/nordlandsbanen-minute-by-minute-season-by-season/]
is a video. At what frequency do you extract frames? Can you provide the related code or the image dataset you used? Thanks.

Best regards.

Why is the performance improved so much after a pca module?

Hello, first of all thank you for your great work! I get the results below. (I am a student and I don't understand this: as far as I know, PCA is used for dimensionality reduction, i.e. for more efficient storage, and it usually loses some accuracy. So why is the performance improved so much after the WPCA module?)

for msls val set:
netvlad val:(non pca)
====> Calculating recall @ N
====> Recall@1: 0.4946
====> Recall@5: 0.6500
====> Recall@10: 0.7176
====> Recall@20: 0.7703
====> Recall@50: 0.8297
====> Recall@100: 0.8676

after WPCA512:
<========= calculte Recall for netvlad
====> Recall netvlad@1: 0.6392
====> Recall netvlad@5: 0.7676
====> Recall netvlad@10: 0.8068
====> Recall netvlad@20: 0.8473
====> Recall netvlad@50: 0.8905
====> Recall netvlad@100: 0.9216
