
google-research / omniglue


Code release for CVPR'24 submission 'OmniGlue'

Home Page: https://hwjiang1510.github.io/OmniGlue

License: Apache License 2.0

Python 100.00%
image-matching multi-view-geometry

omniglue's Introduction

[CVPR'24] Code release for OmniGlue

Hanwen Jiang, Arjun Karpur, Bingyi Cao, Qixing Huang, Andre Araujo





Official code release for the CVPR 2024 paper: OmniGlue: Generalizable Feature Matching with Foundation Model Guidance.

og_diagram.png

Abstract: The image matching field has been witnessing a continuous emergence of novel learnable feature matching techniques, with ever-improving performance on conventional benchmarks. However, our investigation shows that despite these gains, their potential for real-world applications is restricted by their limited generalization capabilities to novel image domains. In this paper, we introduce OmniGlue, the first learnable image matcher that is designed with generalization as a core principle. OmniGlue leverages broad knowledge from a vision foundation model to guide the feature matching process, boosting generalization to domains not seen at training time. Additionally, we propose a novel keypoint position-guided attention mechanism which disentangles spatial and appearance information, leading to enhanced matching descriptors. We perform comprehensive experiments on a suite of 6 datasets with varied image domains, including scene-level, object-centric and aerial images. OmniGlue’s novel components lead to relative gains on unseen domains of 18.8% with respect to a directly comparable reference model, while also outperforming the recent LightGlue method by 10.1% relatively.

Installation

First, create a conda environment and use pip to install omniglue from source:

conda create -n omniglue pip
conda activate omniglue

git clone https://github.com/google-research/omniglue.git
cd omniglue
pip install -e .

Then, download the following models to ./models/

# Download to ./models/ dir.
mkdir models
cd models

# SuperPoint.
git clone https://github.com/rpautrat/SuperPoint.git
mv SuperPoint/pretrained_models/sp_v6.tgz . && rm -rf SuperPoint
tar zxvf sp_v6.tgz && rm sp_v6.tgz

# DINOv2 - vit-b14.
wget https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_pretrain.pth

# OmniGlue.
wget https://storage.googleapis.com/omniglue/og_export.zip
unzip og_export.zip && rm og_export.zip

Direct download links:

Usage

The code snippet below outlines how to run OmniGlue inference in your own Python codebase.

import omniglue

image0 = ... # load images from file into np.array
image1 = ...

og = omniglue.OmniGlue(
  og_export='./models/og_export',
  sp_export='./models/sp_v6',
  dino_export='./models/dinov2_vitb14_pretrain.pth',
)

match_kp0s, match_kp1s, match_confidences = og.FindMatches(image0, image1)
# Output:
#   match_kp0s: (N, 2) array of (x,y) coordinates in image0.
#   match_kp1s: (N, 2) array of (x,y) coordinates in image1.
#   match_confidences: N-dim array with the confidence score of each of the N matches.
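The three returned arrays are aligned row by row, so they can be post-processed together with NumPy. The following is a minimal sketch, not part of the omniglue API, that keeps only matches above a confidence threshold. The helper name filter_matches, the threshold value, and the synthetic input arrays are assumptions made for illustration.

```python
import numpy as np


def filter_matches(kp0, kp1, confidences, threshold=0.02):
    """Keep only the matches whose confidence is at least `threshold`.

    `threshold` is an arbitrary illustrative value, not a recommended setting.
    """
    kp0 = np.asarray(kp0)
    kp1 = np.asarray(kp1)
    confidences = np.asarray(confidences)
    keep = confidences >= threshold  # boolean mask of length N
    return kp0[keep], kp1[keep], confidences[keep]


# Synthetic stand-ins for the outputs of og.FindMatches(image0, image1).
match_kp0s = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
match_kp1s = np.array([[12.0, 21.0], [33.0, 41.0], [55.0, 66.0]])
match_confidences = np.array([0.90, 0.01, 0.35])

kp0, kp1, conf = filter_matches(match_kp0s, match_kp1s, match_confidences)
print(f"{len(conf)} of {len(match_confidences)} matches kept")
```

Because a single boolean mask indexes all three arrays, the keypoint pairs and their scores stay in correspondence after filtering.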

Demo

demo.py contains example usage of the omniglue module. To try with your own images, replace ./res/demo1.jpg and ./res/demo2.jpg with your own filepaths.

conda activate omniglue
python demo.py ./res/demo1.jpg ./res/demo2.jpg
# <see output in './demo_output.png'>

Expected output: demo_output.png

Repo TODOs

  • Provide demo.py example usage script.
  • Add to image matching webui (credit: @Vincentqyw)
  • Support matching for pre-extracted features.
  • Release eval pipelines for in-domain (MegaDepth).
  • Release eval pipelines for all out-of-domain datasets.

BibTeX

@inproceedings{jiang2024Omniglue,
   title={OmniGlue: Generalizable Feature Matching with Foundation Model Guidance},
   author={Jiang, Hanwen and Karpur, Arjun and Cao, Bingyi and Huang, Qixing and Araujo, Andre},
   booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
   year={2024},
}

This is not an officially supported Google product.

omniglue's People

Contributors

arjunkarpur, eunchan24, qubvel


omniglue's Issues

run problem

I tried to match two of my own pictures, but the following error appeared. How can I solve it?
(image)

Missing matcher code

Can you open-source the code for matcher inference? I'd like to see exactly how the matching is done and how the DINO feature descriptors are combined.

It appears that dinov2_vits14 is not supported

I noticed that the 'descriptors0_dino' network input in omniglue has an embed_dim of 768, which is the size of dinov2_vitb14. Does this imply that the 384 embed_dim size of dinov2_vits14 is not supported?

Inference takes large GPU memory

Hi,

Thanks for your great work. However, when I run your demo on an RTX 3090 (24GB) it takes about 22GB of GPU memory. Is this normal? Compared to LoFTR, which only consumes about 1.7GB, it is really surprising.

Superpoint excessive memory usage

Hello,
As the title indicates, all memory seems to be occupied when loading the superpoint model.

The image below shows the GPU state when only the SuperPoint model is loaded.
(Screenshot 2024-06-20 10:38:36 PM)

superpoint_extract.py
Line 40 : tf1.saved_model.loader.load(self._sess, [tf1.saved_model.tag_constants.SERVING
This line does not just load an amount matching the size of the model; it seems to take up all of the GPU memory.
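By default, TensorFlow reserves nearly all available GPU memory for the process regardless of model size, which may be what is observed here. One thing that may be worth trying is the standard TensorFlow environment variable TF_FORCE_GPU_ALLOW_GROWTH. The sketch below is an assumption-laden illustration: the helper name is hypothetical, and the effect has not been verified against omniglue.

```python
import os


def allow_tf_gpu_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand instead of up front.

    Call this before TensorFlow initializes the GPU; doing it before
    importing tensorflow (or omniglue) is the safest ordering.
    """
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


allow_tf_gpu_memory_growth()
# import omniglue  # import only after the variable has been set
```

With this setting, the memory reported by nvidia-smi should track what the process has actually requested so far rather than the full device capacity, which makes the real footprint easier to judge.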

Check failed: work_element_count > 0

Hello,

I tried to run the demo code with torch=2.1.2 and tensorflow=2.12.0 with cuda=11.8. It shows the error below and the program aborted:

2024-05-23 16:32:35.814947: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags
2024-05-23 16:32:37.248128: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Loading OmniGlue (and its submodules: SuperPoint & DINOv2)...
2024-05-23 16:32:41.019127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 22504 MB memory: -> device: 0, name: NVIDIA RTX A5000, pci bus id: 0000:e1:00.0, compute capability: 8.6
2024-05-23 16:32:52.298761: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 22504 MB memory: -> device: 0, name: NVIDIA RTX A5000, pci bus id: 0000:e1:00.0, compute capability: 8.6
WARNING:tensorflow:From /home/qiaomu/code/3D/omniglue/src/omniglue/superpoint_extract.py:40: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.saved_model.load instead.
2024-05-23 16:32:52.353076: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:353] MLIR V1 optimization pass is not enabled
2024-05-23 16:32:53.883790: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:424] Loaded cuDNN version 8700
2024-05-23 16:32:55.415750: I tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:637] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2024-05-23 16:32:55.428150: F ./tensorflow/core/util/gpu_launch_config.h:129] Check failed: work_element_count > 0 (0 vs. -1222967296)
Aborted (core dumped)

Do you have any idea what causes this error? Which versions of PyTorch, TensorFlow, and CUDA do you use?

Thanks.

Tensorflow with TensorRT required

I get the following error when trying to run demo.py. Do I need to build TensorFlow with TensorRT integration? Also, do I need to install TensorRT?

2024-05-23 10:12:43.968078: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-23 10:12:44.673121: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
error - usage: python demo.py <img1_fp> <img2_fp>
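The last line of this log reads like a usage message rather than a TensorRT failure, which suggests the script was started without the two image paths; the TF-TRT line is logged at warning level (W). Below is a sketch of the kind of argument check that would print such a message. It is an assumption about how a script like demo.py might validate its arguments, not a copy of the actual code, and the function name is hypothetical.

```python
import sys


def parse_image_paths(argv):
    """Return the two image paths, or exit with a usage message."""
    # argv[0] is the script name, so exactly two extra arguments are expected.
    if len(argv) != 3:
        raise SystemExit("error - usage: python demo.py <img1_fp> <img2_fp>")
    return argv[1], argv[2]


# Example with an explicit argument list instead of sys.argv.
img0_fp, img1_fp = parse_image_paths(["demo.py", "./res/demo1.jpg", "./res/demo2.jpg"])
print(img0_fp, img1_fp)
```

Under this assumption, invoking the script as shown in the Demo section, with both file paths on the command line, would get past the check.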

Does your method require training?

I tried your method on my data and found the matching unsatisfactory, so I was wondering whether your method requires training, because I don't see any training code.

Cannot install omniglue

I ran the command pip install -e. After running it, I found that omniglue had not been installed. The omniglue module is still not recognized after installing with the pyproject.toml file in the main folder.

run problem

Once the environment is configured correctly, how do I run matching on an image sequence from my own dataset? Could you explain the specific steps?

Code is difficult to reproduce

Hello, which version of tensorflow and torch are you using? I used torch1.13.0+tensorflow2.5+cuda11.2 and got an error: "FileNotFoundError: Op type not registered 'DisableCopyOnRead' in binary runn", and the model could not be loaded. After changing to tensorflow2.13.0, an error still occurred:
tensorflow.python.framework.errors_impl.UnimplementedError: Graph execution error:

Detected at node 'superpoint/pred_tower0/vgg/conv1_1/conv/Relu' defined at (most recent call last):
Node: 'superpoint/pred_tower0/vgg/conv1_1/conv/Relu'
Detected at node 'superpoint/pred_tower0/vgg/conv1_1/conv/Relu' defined at (most recent call last):
Node: 'superpoint/pred_tower0/vgg/conv1_1/conv/Relu'
2 root error(s) found.
(0) UNIMPLEMENTED: DNN library is not found.
[[{{node superpoint/pred_tower0/vgg/conv1_1/conv/Relu}}]]
[[superpoint/descriptors/_157]]
(1) UNIMPLEMENTED: DNN library is not found.
[[{{node superpoint/pred_tower0/vgg/conv1_1/conv/Relu}}]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'superpoint/pred_tower0/vgg/conv1_1/conv/Relu':
How can I solve this problem? Thank you!

OutOfMemoryError

Why does the model consume so much memory? I followed your suggestions to use CUDA acceleration, but during the demo run, my 6GB VRAM was insufficient, resulting in a torch.cuda.OutOfMemoryError. If I want to match two 700KB images, how much GPU memory would be required?
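Memory use depends on the decoded pixel dimensions of the images rather than their file size on disk, so downscaling the inputs before matching is one thing to try. The following is a minimal sketch under stated assumptions: the helper names and the max_side value are hypothetical, nearest-neighbor sampling is used only to keep the example dependency-free, and whether this brings omniglue under a given memory budget has not been verified.

```python
import numpy as np


def downscale(image, max_side=640):
    """Nearest-neighbor downscale so the longer side is at most `max_side`.

    Returns the resized image and the scale factor that was applied.
    """
    h, w = image.shape[:2]
    scale = min(1.0, max_side / max(h, w))  # never upscale
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    # Map each output row/column back to a source index.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols], scale


def to_original_coords(keypoints, scale):
    """Map (x, y) keypoints found on the resized image back to full resolution."""
    return np.asarray(keypoints, dtype=float) / scale


# Synthetic stand-in for a decoded 1600x1200 RGB photo.
image = np.zeros((1200, 1600, 3), dtype=np.uint8)
small, scale = downscale(image, max_side=640)
print(small.shape, scale)

# Keypoints returned for the small image can be mapped back afterwards.
print(to_original_coords([[64.0, 48.0]], scale))
```

Matching on smaller images may change which keypoints are found, so the size limit is a trade-off between memory and match quality.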

failed call to cuInit: CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE

Hi,

I would like to run omniglue. I have the following setup:

user@e3a1d13a6c45:/workspace$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:06:00.0 Off |                  N/A |
| 23%   33C    P8     9W / 250W |      0MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
user@e3a1d13a6c45:/workspace$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

Still, I'm getting this error:

user@e3a1d13a6c45:/workspace$ python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
2024-06-07 16:13:11.692100: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-07 16:13:12.505518: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-06-07 16:13:13.476204: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:282] failed call to cuInit: CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE: forward compatibility was attempted on non supported HW

I have already run:
pip install tensorflow[and-cuda]

Could you help me, please?

Question: inference takes long time

Thank you for providing the source code for this interesting work. However, I have a question regarding the inference time. On my device (RTX 3090, 24GB), a single inference takes 2.92 seconds (average of 100 runs), whereas the paper reports that it can achieve about 50 fps. I look forward to your response.
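When comparing timings like these, it helps to state how they were measured, since the first calls of a model often include one-time setup costs. The sketch below shows one common way to measure mean per-call time with warm-up runs excluded. The helper name, the warm-up and run counts, and the stand-in workload are assumptions; it does not by itself explain the gap from the reported figure.

```python
import time


def time_inference(fn, warmup=3, runs=10):
    """Return the mean wall-clock seconds per call of `fn`, excluding warm-up."""
    for _ in range(warmup):
        fn()  # discard: may include one-time setup costs
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


# Stand-in workload; in practice this would be something like
#   lambda: og.FindMatches(image0, image1)
mean_s = time_inference(lambda: sum(range(10_000)))
print(f"{mean_s * 1000:.3f} ms per call")
```

Reporting the image resolution alongside the timing also makes results easier to compare across setups.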
