
yingxin-jia / superglue-pytorch

519.0 9.0 124.0 118.42 MB

[SuperGlue: Learning Feature Matching with Graph Neural Networks] This repo includes PyTorch code for training the SuperGlue matching network on top of SIFT keypoints and descriptors.

License: Other

Python 100.00%
pytorch superglue

superglue-pytorch's People

Contributors

yingxin-jia

superglue-pytorch's Issues

The original MN3 seems to be written incorrectly?

MN2 = np.concatenate([missing1[np.newaxis, :], (len(kp2)) * np.ones((1, len(missing1)), dtype=np.int64)])
MN3 = np.concatenate([missing2[np.newaxis, :], (len(kp1)) * np.ones((1, len(missing2)), dtype=np.int64)])
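
For comparison, here is a minimal sketch of one self-consistent index convention. It assumes that row 0 of the match array indexes keypoints of the first image (with `len(kp1)` as its dustbin slot) and row 1 indexes keypoints of the second image (with `len(kp2)` as its dustbin slot). The thread does not confirm which ordering the author intended, and `dustbin_pairs` and its argument names are hypothetical.

```python
import numpy as np

def dustbin_pairs(missing1, missing2, n_kp1, n_kp2):
    """Pair each unmatched keypoint with the dustbin slot of the other image.

    Assumption: row 0 holds image-1 indices (dustbin = n_kp1) and
    row 1 holds image-2 indices (dustbin = n_kp2).
    """
    missing1 = np.asarray(missing1, dtype=np.int64)
    missing2 = np.asarray(missing2, dtype=np.int64)
    # Unmatched image-1 keypoints point at the dustbin column n_kp2.
    mn2 = np.stack([missing1, np.full(len(missing1), n_kp2, dtype=np.int64)])
    # Unmatched image-2 keypoints point at the dustbin row n_kp1.
    mn3 = np.stack([np.full(len(missing2), n_kp1, dtype=np.int64), missing2])
    return mn2, mn3

mn2, mn3 = dustbin_pairs(missing1=[0, 2], missing2=[1, 3, 4], n_kp1=3, n_kp2=5)
```

Under this convention every entry in row 0 stays at or below `n_kp1` and every entry in row 1 stays at or below `n_kp2`, which is a quick bounds check to run on either ordering.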

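Training loss differs from the original paper

Hello, and thank you very much for your open-source work. The loss function in your code is:
[image: train1]

The loss function in the paper is:
[image: train]

You appear to have deliberately removed the loss terms for the unmatched keypoints. What was the reasoning behind this? Is the loss in accuracy mainly caused by this change? Thank you for your attention and reply.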
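
Since the two images did not survive, here is the loss from the SuperGlue paper written out for reference, as I recall it. The notation follows the paper: the augmented assignment matrix has M+1 rows and N+1 columns, the set M holds the ground-truth matches, and I and J are the unmatched keypoints of the two images. The last two sums are the unmatched terms this issue asks about.

```latex
\mathcal{L} = -\sum_{(i,j)\in\mathcal{M}} \log \bar{\mathbf{P}}_{i,j}
              -\sum_{i\in\mathcal{I}} \log \bar{\mathbf{P}}_{i,N+1}
              -\sum_{j\in\mathcal{J}} \log \bar{\mathbf{P}}_{M+1,j}
```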

Memory usage surges

Hello, sorry to bother you. While testing the code I found that memory usage surges until it fills almost all available memory (the test machine has 256 GB of RAM), and the run becomes very slow as a result.
On the first run the code raised this error:
Traceback (most recent call last):
File "/home/xp1/workdir/SuperGlue-pytorch/train.py", line 183, in <module>
data = superglue(pred)
File "/home/xp1/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/xp1/workdir/SuperGlue-pytorch/models/superglue.py", line 259, in forward
desc0 = desc0 + self.kenc(kpts0, torch.transpose(data['scores0'], 0, 1))
File "/home/xp1/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/xp1/workdir/SuperGlue-pytorch/models/superglue.py", line 84, in forward
return self.encoder(torch.cat(inputs, dim=1))
RuntimeError: Expected object of scalar type Double but got scalar type Float for sequence element 1 in sequence argument at position #1 'tensors'
I traced the error to the forward function of the KeypointEncoder class.
I then changed
inputs = [kpts.transpose(1, 2), scores.unsqueeze(1)]
to
inputs = [kpts.transpose(1, 2).double(), scores.unsqueeze(1).double()]
The program now runs, but memory usage surges about halfway through the first epoch.
I hope you can find time to help. If it is convenient, you can also add me on QQ: 905181289
I look forward to talking with you. Thank you.

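Some questions about test accuracy

Hello, and thank you for sharing this. I have a few questions.
How did you evaluate your reproduced results? Have you tried measuring accuracy with the officially provided match_pair.py? Also, is the training dataset you used COCO 2014?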

ValueError: Expected more than 1 spatial element when training, got input size torch.Size([1, 32, 1])

@HeatherJiaZG Hello! Thank you for providing the code! When I run python train.py, I get the following error:
Traceback (most recent call last):
  File "/home/jhk/work code/SuperGlue-pytorch-master/train.py", line 172, in <module>
    data = superglue(pred)
  File "/home/jhk/anaconda3/envs/openglue/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jhk/work code/SuperGlue-pytorch-master/models/superglue.py", line 262, in forward
    desc1 = desc1 + self.kenc(kpts1, torch.transpose(data['scores1'], 0, 1))
  File "/home/jhk/anaconda3/envs/openglue/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jhk/work code/SuperGlue-pytorch-master/models/superglue.py", line 83, in forward
    return self.encoder(torch.cat(inputs, dim=1))
  File "/home/jhk/anaconda3/envs/openglue/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jhk/anaconda3/envs/openglue/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/jhk/anaconda3/envs/openglue/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jhk/anaconda3/envs/openglue/lib/python3.8/site-packages/torch/nn/modules/instancenorm.py", line 72, in forward
    return self._apply_instance_norm(input)
  File "/home/jhk/anaconda3/envs/openglue/lib/python3.8/site-packages/torch/nn/modules/instancenorm.py", line 32, in _apply_instance_norm
    return F.instance_norm(
  File "/home/jhk/anaconda3/envs/openglue/lib/python3.8/site-packages/torch/nn/functional.py", line 2482, in instance_norm
    _verify_spatial_size(input.size())
  File "/home/jhk/anaconda3/envs/openglue/lib/python3.8/site-packages/torch/nn/functional.py", line 2449, in _verify_spatial_size
    raise ValueError("Expected more than 1 spatial element when training, got input size {}".format(size))
ValueError: Expected more than 1 spatial element when training, got input size torch.Size([1, 32, 1])
Did you apply any other preprocessing to the dataset beforehand, or do you know how this error can be resolved? Many thanks!
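
The reported size torch.Size([1, 32, 1]) suggests the keypoint encoder received a single keypoint, and instance normalization in training mode needs more than one element along that last dimension. A minimal sketch of a guard that skips such pairs before the forward pass follows. The helper name `has_enough_keypoints` and the threshold of 2 are assumptions for illustration, not part of the repository.

```python
def has_enough_keypoints(kpts0, kpts1, min_kpts=2):
    """Return True if both keypoint sets have at least min_kpts entries.

    Works for anything with len(): lists, NumPy arrays of shape (N, 2),
    or tensors whose first dimension is the keypoint count.
    """
    return len(kpts0) >= min_kpts and len(kpts1) >= min_kpts

# Hypothetical usage: keep only pairs that instance normalization can handle.
pairs = [
    ([(10.0, 12.0)], [(11.0, 13.0), (40.0, 8.0)]),         # one keypoint: skipped
    ([(1.0, 2.0), (3.0, 4.0)], [(1.5, 2.5), (3.5, 4.5)]),  # kept
]
usable = [p for p in pairs if has_enough_keypoints(*p)]
```

In a training loop the same check could be applied per sample, with a `continue` for pairs that fail it.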

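Why does training get slower and slower?

Hello, sorry to bother you. Over multiple epochs, training gets slower and slower: each additional epoch takes longer than the previous one and consumes a large amount of memory.
When I train the model on the Oxford Buildings dataset, the following error appears after roughly 10,000 training images:
cv2.error: OpenCV(4.7.0) D:\a\opencv-python\opencv-python\opencv\modules\core\src\alloc.cpp:73: error: (-4:Insufficient memory) Failed to allocate 3145728 bytes in function 'cv::OutOfMemoryError'
I traced it to this line in load_data: kp1, descs1 = sift.detectAndCompute(image, None)
I would really appreciate an answer. If it is convenient, you can add me on WeChat: 13253618903. I look forward to talking with you, and thank you again for open-sourcing this project.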
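
The thread does not identify the cause. One common source of steadily growing memory in PyTorch training loops is keeping loss tensors in a list instead of plain numbers, so the sketch below shows the safer pattern under that assumption. The tiny model, optimizer, and variable names are placeholders, not the repository's code.

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_history = []  # plain Python floats only

for step in range(3):
    x = torch.randn(8, 4)
    loss = model(x).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # .item() copies the scalar out as a float; appending `loss` itself
    # would keep every loss tensor alive for the whole run.
    loss_history.append(loss.item())
```

If memory still grows with this pattern in place, the data-loading path (for example the per-image SIFT extraction mentioned above) would be the next place to look.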

How to evaluate and test the data after training

After training on the COCO dataset:
1. How do I evaluate and test the accuracy?
2. Training efficiency is too low. Has anyone tried modifying the code for batch training?
3. Single-machine multi-GPU training reports a dimension error when the results are merged. Has anyone tried to solve this?

Thanks to HeatherJiaZG for the training code, but unfortunately the author no longer seems to maintain this repository. If someone can provide some help, thanks a lot.

There are a few issues with running MAKE_pairs

In the SuperGlue model, the input data is a dictionary that contains the keys 'all_matches' and 'file_name'. During training, SparseDataset produces data containing these keys, but the evaluation input contains neither of them, so an error is raised. 'file_name' is not used in the model and can simply be added, but 'all_matches' is used in the model. I hope you can share modified code or give me some pointers.
Thank you!

Question about SIFT+SuperGlue

Thanks for your work. I notice that the training input is a 128-dimensional descriptor. Have you ever combined a SIFT extractor with a SuperGlue model trained with your code?

NaN loss problem (related to nn.InstanceNorm1d(channels[i]))

Hello, @HeatherJiaZG. Thank you for your effort to write these scripts. I respect your energy and time to make it publicly available.

I have the same NaN loss problem that other people have reported.
From my observation (and with my choice of local feature), the NaN loss comes from the MLP and seems to be associated with nn.InstanceNorm1d(channels[i]) on line 58 of superglue.py.

I am not sure whether it is a good idea to switch it back to BatchNorm1d.

At the moment, I am testing whether BatchNorm1d would work.
So, I would like to ask 3 main questions:

  • Why did you switch to nn.InstanceNorm1d? Also, if you just use nn.InstanceNorm1d(x), does this mean that the parameters of InstanceNorm1d are fixed (not trained)?

  • Is it correct that the parameters of nn.InstanceNorm1d(x) should not be trained?

  • Have you solved this problem?
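
On the parameter question, a short check can show what PyTorch creates by default. This sketch only inspects the layer itself and says nothing about whether trainable parameters would fix the NaN loss; the channel count 32 is an arbitrary example.

```python
import torch.nn as nn

default_norm = nn.InstanceNorm1d(32)              # affine=False by default
affine_norm = nn.InstanceNorm1d(32, affine=True)  # adds learnable weight and bias

# Count the learnable values each layer exposes to an optimizer.
n_default = sum(p.numel() for p in default_norm.parameters())
n_affine = sum(p.numel() for p in affine_norm.parameters())
print(n_default, n_affine)
```

With the default arguments the layer has no learnable parameters at all, so there is nothing to train or to freeze; passing `affine=True` adds a per-channel scale and shift.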

RuntimeError: Expected object of scalar type Double but got scalar type Float for sequence element 1 in sequence argument at position #1 'tensors'

Hi, I ran the train.py code as is, just using the COCO 2014 validation set (all images are JPG) instead of the original dataset, and
I always get this error:

File "..\SuperGlue-pytorch\models\superglue.py", line 83, in forward
return self.encoder(torch.cat(inputs, dim=1))
RuntimeError: Expected object of scalar type Double but got scalar type Float for sequence element 1 in sequence argument at position #1 'tensors'

Can you please help me understand what the problem is here?

Thanks
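
The message says torch.cat received tensors of different dtypes (Double and Float). A minimal sketch of casting both pieces to one dtype before concatenating is shown below. Choosing float32 is an assumption: the right target is whatever dtype the model's weights use, and another issue in this list reports getting past the error by casting to double instead. The tensor shapes are made up for illustration.

```python
import torch

kpts = torch.rand(1, 5, 2, dtype=torch.float64)  # e.g. built from a float64 NumPy array
scores = torch.rand(1, 5, dtype=torch.float32)

# Cast both pieces to one dtype before concatenating along the channel axis.
target = torch.float32
inputs = [kpts.transpose(1, 2).to(target), scores.unsqueeze(1).to(target)]
merged = torch.cat(inputs, dim=1)  # shape (1, 3, 5)
```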

IndexError: index 257 is out of bounds for dimension 0 with size 256

While running the script by @skylook in the repo, the following error occurs constantly:

Traceback (most recent call last):
  File "/home/shuhulh/superglue_train/train.py", line 180, in <module>
    for i, pred in enumerate(train_loader):
  File "/opt/conda/envs/superglue/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/opt/conda/envs/superglue/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 671, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/opt/conda/envs/superglue/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/envs/superglue/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/shuhulh/superglue_train/superpoint_dataset.py", line 86, in __getitem__
    desc1, scores1 = self.superpoint.computeDescriptorsAndScores({ 'image': data_warped, 'keypoints': kps1_filtered })
  File "/home/shuhulh/superglue_train/models/superpoint.py", line 231, in computeDescriptorsAndScores
    scores = [s[tuple(k.t())] for s, k in zip(scores, keypoints)]
  File "/home/shuhulh/superglue_train/models/superpoint.py", line 231, in <listcomp>
    scores = [s[tuple(k.t())] for s, k in zip(scores, keypoints)]
IndexError: index 257 is out of bounds for dimension 0 with size 256

How can I overcome this? Every time I execute the code, the number in the IndexError changes.
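
A changing out-of-range index is what you would expect if some keypoints land outside the score map after a random warp. Assuming that is the cause, and assuming keypoints are stored as (x, y), one hedged workaround is to drop out-of-bounds keypoints before indexing. `keep_in_bounds` is a hypothetical helper, and the coordinate order used in superpoint.py should be checked before applying it.

```python
import numpy as np

def keep_in_bounds(kps, height, width):
    """Keep keypoints (x, y) whose rounded coordinates index a height x width map."""
    kps = np.asarray(kps, dtype=np.float64).reshape(-1, 2)
    cols = np.rint(kps[:, 0]).astype(np.int64)
    rows = np.rint(kps[:, 1]).astype(np.int64)
    mask = (cols >= 0) & (cols < width) & (rows >= 0) & (rows < height)
    return kps[mask], mask

kps = [(10.2, 20.7), (257.0, 30.0), (100.0, 255.6)]
kept, mask = keep_in_bounds(kps, height=256, width=256)
```

The returned mask can also be applied to any per-keypoint arrays (descriptors, scores, match indices) so that everything stays aligned.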

Deployment question

Hi, have you deployed SuperGlue to a production environment, for example with TRT or ncnn? How fast is it?

Must I train with the COCO2014 dataset?

Can I use my own dataset to train the model?
(It always fails when I use my own dataset.)
Why is there an error as soon as I change the batch size?
Looking forward to your reply!
Thank you!

Loss: nan

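Hello! I found that Loss: nan appears after several epochs of training. Reading the code carefully, there is one thing I don't quite understand and would like to ask about: in the code below, why does the loss computation apply exp() first and then take the log? If score[0][x][y] is 0 here, the loss becomes NaN. Also, the loss here seems to differ somewhat from the one in the paper. I look forward to your reply.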

    # check if indexed correctly
    loss = []
    for i in range(len(all_matches[0])):
        x = all_matches[0][i][0]
        y = all_matches[0][i][1]
        loss.append(-torch.log( scores[0][x][y].exp() )) # check batch size == 1 ?
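
To illustrate the numerical concern, here is a small NumPy sketch. It assumes `scores` holds log-probabilities, in which case -log(exp(s)) is mathematically just -s, but the round trip through exp() can underflow to zero. The value -1000 is an exaggerated example chosen to trigger the underflow.

```python
import numpy as np

log_score = np.float64(-1000.0)  # a very negative log-probability

with np.errstate(divide="ignore"):
    # exp(-1000) underflows to 0.0, and log(0.0) is -inf, so the loss is inf.
    naive = -np.log(np.exp(log_score))
# The same quantity without the exp/log round trip stays finite.
stable = -log_score

print(naive, stable)
```

An infinite loss term typically turns into NaN once gradients are computed or losses are averaged, which would match the behaviour described in this issue.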

match_pairs.py

In match_pairs.py, line 262, the function is called with 5 parameters, so the function at line 263 of ./models/utils.py should take 5 parameters as well. It takes only 4, which raises an error. Please take note!

AttributeError: 'NoneType' object has no attribute 'shape'

I am training the model on my own dataset instead of the ones mentioned in the GitHub repository (COCO, etc.). While training, the following error occurred:

Traceback (most recent call last):
  File "/home/shuhulh/superglue_train/train.py", line 164, in <module>
    for i, pred in enumerate(train_loader):
  File "/opt/conda/envs/animal-reid/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/opt/conda/envs/animal-reid/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 671, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/opt/conda/envs/animal-reid/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/envs/animal-reid/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/shuhulh/superglue_train/load_data.py", line 30, in __getitem__
    width, height = image.shape[:2]
AttributeError: 'NoneType' object has no attribute 'shape'

I don't understand why this happens during training. What am I supposed to do here? Do I need to change the names of the image files in the image directory (they are usually numerical values)?
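
The traceback shows that `image` is None when `.shape` is accessed. If load_data.py reads images with OpenCV (an assumption, though another issue here mentions `sift.detectAndCompute` in the same file), `cv2.imread` returns None instead of raising when a file cannot be read, for example a non-image file or a wrong path. A minimal sketch for finding such files up front follows; `split_readable` and the stand-in reader are hypothetical.

```python
def split_readable(paths, read_image):
    """Split paths into (readable, unreadable) by whether read_image(path) is None."""
    readable, unreadable = [], []
    for path in paths:
        target = readable if read_image(path) is not None else unreadable
        target.append(path)
    return readable, unreadable

# Stand-in reader for illustration; with OpenCV you would pass something like
# lambda p: cv2.imread(p, cv2.IMREAD_GRAYSCALE), which returns None on failure.
fake_reader = lambda p: None if p.endswith(".txt") else object()
good, bad = split_readable(["1.jpg", "notes.txt", "2.png"], fake_reader)
```

Running this once over the image directory and training only on the readable list avoids hitting the error mid-epoch; numeric file names by themselves should not be a problem.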

How to run `match_pairs.py`?

I have trained the model and now I want to check how it performs on my test dataset. However, training does not generate the pairs file that match_pairs.py requires: it needs a .txt file listing image pairs followed by a series of numbers, such as ./assets/scannet_sample_pairs_with_gt.txt, which contains the file names of the image pairs. I don't understand how to generate this .txt file. Any help?
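
A minimal sketch of generating such a file is given below. It assumes that, as in the upstream SuperGlue script as I understand it, each line may contain just two image names when no ground truth is evaluated, and that the extra numbers in the ScanNet sample are ground-truth data needed only for evaluation. Pairing consecutive images is an arbitrary choice for illustration, and `write_pairs_file` is a hypothetical helper.

```python
import os
import tempfile

def write_pairs_file(image_names, out_path):
    """Write one 'nameA nameB' line per consecutive image pair."""
    names = sorted(image_names)
    with open(out_path, "w") as f:
        for a, b in zip(names, names[1:]):
            f.write(f"{a} {b}\n")
    return max(len(names) - 1, 0)

out_path = os.path.join(tempfile.mkdtemp(), "my_pairs.txt")
n_pairs = write_pairs_file(["0003.jpg", "0001.jpg", "0002.jpg"], out_path)
with open(out_path) as f:
    lines = f.read().splitlines()
```

The resulting file could then be passed to match_pairs.py in place of the ScanNet sample, with the image names resolved relative to the input directory.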

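NaN loss during training

Hello! First of all, thank you very much for releasing the training code. I am training on the phototourism data, and the loss becomes NaN during training, even though it does not look like the loss should be able to become NaN. Have you ever run into this bug during training?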
