
ovsam's People

Contributors

eltociear, harboryuan, lxtgh, ly015, wookiehangover


ovsam's Issues

How to run inference on a custom dataset

Hi! I solved the environment problem and now want to run inference on my own dataset.
Your code uses COCO data, but how do I test on other data, e.g. images with segmentation label PNGs? The language embeddings are extracted for the COCO classes; will that affect use on other datasets? Can you explain in detail how to run inference on other datasets? Thank you.

Could you provide a detailed environment configuration example? Does CUDA have to be 12.1?

According to your README you installed CUDA 12.1, but according to https://mmcv.readthedocs.io/en/latest/get_started/installation.html#install-with-pip I should install PyTorch 2.1.0 and mmcv 2.1.0. However, that does not seem to satisfy the requirement "Please install mmcv>=2.0.0, <2.1.0." Could you please tell me the correct environment configuration requirements? Thank you!

When I execute "bash tools/dist.sh test seg/configs/sam2clip/sam_vith_dump.py 1", I get this error.

Traceback (most recent call last):
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/lazy.py", line 68, in build
    module = importlib.import_module(self._module)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/workspace/ovsam/seg/models/detectors/__init__.py", line 1, in <module>
    from .sam2clip_distill import BackboneDistillation
  File "/workspace/ovsam/seg/models/detectors/sam2clip_distill.py", line 6, in <module>
    from mmdet.models.detectors.base import ForwardResults
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmdet/models/__init__.py", line 3, in <module>
    from .data_preprocessors import *  # noqa: F401,F403
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmdet/models/data_preprocessors/__init__.py", line 6, in <module>
    from .reid_data_preprocessor import ReIDDataPreprocessor
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmdet/models/data_preprocessors/reid_data_preprocessor.py", line 13, in <module>
    import mmpretrain
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmpretrain/__init__.py", line 18, in <module>
    and mmcv_version < digit_version(mmcv_maximum_version)),
AssertionError: MMCV==2.1.0 is used but incompatible. Please install mmcv>=2.0.0, <2.1.0.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/workspace/ovsam/tools/test.py", line 177, in <module>
    main()
  File "/workspace/ovsam/tools/test.py", line 141, in main
    runner = Runner.from_cfg(cfg)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/runner/runner.py", line 445, in from_cfg
    runner = cls(
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/runner/runner.py", line 412, in __init__
    self.model = self.build_model(model)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/runner/runner.py", line 819, in build_model
    model = MODELS.build(model)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 232, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 96, in build_from_cfg
    obj_type = args.pop('type')
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/config.py", line 182, in pop
    return self.build_lazy(super().pop(key, default))
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/config.py", line 215, in build_lazy
    value = value.build()
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/lazy.py", line 70, in build
    raise type(e)(f'Failed to import {self._module} '
AssertionError: Failed to import seg.models.detectors in seg/configs/sam2clip/sam_vith_dump.py, line 5 for MMCV==2.1.0 is used but incompatible. Please install mmcv>=2.0.0, <2.1.0.
[2024-07-05 10:13:04,386] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 40539) of binary: /root/miniconda3/envs/ovsam/bin/python
Traceback (most recent call last):
  File "/root/miniconda3/envs/ovsam/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.1.0', 'console_scripts', 'torchrun')())
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
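
For reference, a quick way to confirm which versions are actually installed in the environment that raises this assertion (a minimal sketch, not part of the repo):

    import torch
    import mmcv
    import mmdet
    import mmengine

    # Print the versions that the assertion inside mmpretrain checks against.
    print('torch:', torch.__version__, 'cuda:', torch.version.cuda)
    print('mmcv:', mmcv.__version__)
    print('mmdet:', mmdet.__version__)
    print('mmengine:', mmengine.__version__)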

about "Feature-Crop baseline"

Hello, I would like to ask how the "Feature-Crop baseline" mentioned in the paper crops features using a mask? Is there any specific paper that I can refer to?
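
For context, a generic way to "crop" features with a mask is mask-weighted average pooling over the feature map; the sketch below only illustrates that general technique and may not match the paper's exact Feature-Crop baseline:

    import torch

    def masked_pool(feat: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # feat: (C, H, W) feature map; mask: (H, W) binary mask at the same resolution.
        # Returns the average feature vector over the masked region.
        mask = mask.float()
        return (feat * mask).sum(dim=(1, 2)) / mask.sum().clamp(min=1.0)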

Where can I get the SAM dataset?

I downloaded the SA-1B dataset, but the website says: NOTE: There are no class labels for the images or mask annotations.
After downloading and decompressing, it is not a JSON file. Where can I download the SAM dataset used in your project?

Text embedding

Hello author,

I want to ask where the text embedding extraction file is. How do you process the text dataset? Thanks!
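
For readers with the same question, the general recipe for turning class names into CLIP text embeddings looks roughly like the sketch below; the open_clip wrapper, model name, and prompt template here are assumptions for illustration, not the repo's actual pipeline:

    import torch
    import open_clip

    # Assumed setup: the open_clip implementation of OpenAI's RN50x16 CLIP.
    model, _, _ = open_clip.create_model_and_transforms('RN50x16', pretrained='openai')
    tokenizer = open_clip.get_tokenizer('RN50x16')

    class_names = ['person', 'dog', 'car']  # your dataset's category names
    with torch.no_grad():
        tokens = tokenizer([f'a photo of a {name}' for name in class_names])
        text_embed = model.encode_text(tokens)
        text_embed = text_embed / text_embed.norm(dim=-1, keepdim=True)  # L2-normalize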

Prompt comparison with SAM

OVSAM's prompts are points and boxes; is that any different from SAM? How is the prompt obtained during training or inference? When testing on images, must the prompt come from a detector, or can a bounding box/rectangle be assigned manually? Can predictions be made by generating dense boxes?
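
For dense prompting at inference time, a common approach (in the spirit of SAM's automatic mask generation) is to sample a regular grid of point prompts; a minimal sketch, not the repo's code:

    import numpy as np

    def build_point_grid(n_per_side: int, img_h: int, img_w: int) -> np.ndarray:
        # Evenly spaced (x, y) point prompts covering the image, similar in spirit
        # to SAM's default 32x32 automatic grid.
        offset = 1.0 / (2 * n_per_side)
        coords = np.linspace(offset, 1.0 - offset, n_per_side)
        xs, ys = np.meshgrid(coords, coords)
        return np.stack([xs.ravel() * img_w, ys.ravel() * img_h], axis=-1)  # (n*n, 2)

    points = build_point_grid(32, 1024, 1024)  # one point prompt per grid cell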

Object level masks

Hi @HarborYuan,

Thank you for the great work! You mentioned in the paper that you concentrate only on object-level masks when using the Segment Anything Model.

  • What was done specifically to obtain only object-level masks and avoid part masks?
  • What size of point grid was used to train the model? Was it different from the original SAM 32x32 grid?
  • Where can I find the list of all classes? Is it possible to restrict the set of classes to only the ones I need? (A related sketch follows below.)

Thank you!
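
On the class-restriction question, recognition can generally be limited to a custom subset by comparing the predicted label embedding against only the text embeddings of the classes of interest; a hedged sketch with made-up tensor names:

    import torch

    def classify_against_subset(label_embed: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
        # label_embed: (B, C) per-mask label embeddings from the model (hypothetical name).
        # text_embeds: (K, C) CLIP text embeddings for only the classes you care about.
        label_embed = label_embed / label_embed.norm(dim=-1, keepdim=True)
        text_embeds = text_embeds / text_embeds.norm(dim=-1, keepdim=True)
        logits = label_embed @ text_embeds.t()  # cosine similarity, shape (B, K)
        return logits.argmax(dim=-1)            # index into the restricted class list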

Reduce Train Batch Size

Hi! I want to reduce the training batch size from 2 to 1. How can I do that?
I'm looking forward to your early reply. Thanks!
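
In mmengine-based codebases the per-GPU batch size is normally a dataloader setting in the config; a hedged sketch of the kind of override involved (the base config name and key layout in this repo may differ):

    # Hypothetical override config; point '_base_' at the config you actually train with.
    _base_ = ['./sam2clip_vith_rn50x16.py']

    train_dataloader = dict(batch_size=1)  # reduce the per-GPU batch size from 2 to 1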

The key argument of `Registry.get` must be a str

When I run the inference command, I get the following error. How can I solve it?

Traceback (most recent call last):
  File "/maggie.meng/code/ovsam/tools/test.py", line 177, in <module>
    main()
  File "/maggie.meng/code/ovsam/tools/test.py", line 141, in main
    runner = Runner.from_cfg(cfg)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/runner/runner.py", line 445, in from_cfg
    runner = cls(
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/runner/runner.py", line 412, in __init__
    self.model = self.build_model(model)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/runner/runner.py", line 819, in build_model
    model = MODELS.build(model)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 232, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/maggie.meng/code/ovsam/seg/models/detectors/ovsam.py", line 63, in __init__
    self.neck = MODELS.build(neck)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 232, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/maggie.meng/code/ovsam/seg/models/necks/transformer_neck.py", line 43, in __init__
    patch_embed = PatchEmbed(
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmdet/models/layers/transformer/utils.py", line 250, in __init__
    self.projection = build_conv_layer(
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmcv/cnn/bricks/conv.py", line 43, in build_conv_layer
    conv_layer = registry.get(layer_type)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/registry.py", line 441, in get
    raise TypeError(
TypeError: The key argument of `Registry.get` must be a str, got <class 'type'>
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 523118) of binary: /root/anaconda3/envs/ovsam_demo/bin/python
Traceback (most recent call last):
  File "/root/anaconda3/envs/ovsam_demo/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
    return f(*args, **kwargs)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/torch/distributed/run.py", line 761, in main
    run(args)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/torch/distributed/run.py", line 752, in run
    elastic_launch(
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

tools/test.py FAILED

Failures:
[1]:
time : 2024-08-27_16:34:49
host : 9rqdhjcat3fsm-0
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 523119)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2024-08-27_16:34:49
host : 9rqdhjcat3fsm-0
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 523118)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

How to recognize 22,000 classes?

Hi, thank you for your valuable contribution!

I appreciate your work on the ovsam model. In your paper, you mentioned that the model can currently segment and recognize around 22,000 classes. However, when I tested the example provided in the demo, it appears that only approximately 1,000 classes can be recognized. I noticed that the names field is defined in this file.

Could you please clarify whether my understanding is correct? If I have misunderstood, kindly point out the correct information. Thank you very much for your clarification.

How to add a prompt?

I tried the demo on Hugging Face, but it only supports clicking somewhere in the picture to run inference. Can it support a text prompt instead of clicking on the picture?

novel_score is very low (checkpoint evaluation on test)

Hi. Thanks for the code and paper.

When I evaluate the provided checkpoint with the codebase, I am able to reproduce all the COCO values that were reported in the paper.

But I have a question about the values printed in the terminal. Is the novel_score the accuracy on class prediction for the novel classes? Why is novel_score so low compared to base_score?

mmengine - INFO - Epoch(test) [1209/1209]    miou: 0.6791  base_iou: 0.6835  novel_iou: 0.6521 
                                             score: 76.7359  base_score: 87.4120  novel_score: 11.1742 
                                             data_time: 0.0165  time: 0.2565

Thanks.

Missing ovsam.py File - Upload Inquiry

Hello,

I encountered a ModuleNotFoundError related to the absence of the ovsam.py file. Could you please confirm if it will be uploaded?

Error Details:

ModuleNotFoundError: Failed to import seg.models.detectors in seg/configs/ovsam/ovsam_coco_rn50x16_point.py, line 7 for No module named 'seg.models.detectors.ovsam'

Thank you!

Problem regarding to creating the environment

After running conda install pytorch torchvision torchaudio cuda-toolkit pytorch-cuda==12.1 -c pytorch -c "nvidia/label/cuda-12.1.0", there is an error like:

Could not solve for environment specs
The following package could not be installed
└─ pytorch-cuda 12.1 is not installable because it requires
└─ libnvjitlink >=12.1.105,<12.2.0 , which does not exist (perhaps a missing channel).

Can you help me with this? Thank you so much.

RuntimeError: GET was unable to find an engine to execute this computation

I prepared the environment and wanted to run inference, but running bash tools/dist.sh test seg/configs/ovsam/ovsam_coco_rn50x16_point.py 8 fails with a "GET was unable to find an engine" error:

File "/mnt/NewDataShare/D4/common/wbzhou/MLLM/ovsam/seg/models/data_preprocessor/ovsam_preprocessor.py", line 193, in forward
  gt_instances.point_coords = get_center_coords(
File "/mnt/NewDataShare/D4/common/wbzhou/MLLM/ovsam/seg/models/data_preprocessor/ovsam_preprocessor.py", line 24, in

Is this issue due to incorrect environment configuration?
torch==2.1.2+cu121 torchvision==0.16.2+cu121 mmcv==2.1.0 mmdet==3.3.0

How to predict IoU in OVSAM?

Hello, I wonder how to predict IoU in OVSAM.

The paper states that there are three tokens, including iou, label, and mask tokens, but the weights of the iou_token are not found in the model ('clip2sam_coco_rn50x16.pth'). There are only two tokens' weights (mask and label tokens) in that checkpoint.
Besides, by comparing with the SAM decoder code, I found that you replaced the original iou_token position with the label_token. Once I obtain the iou_token, how do I predict the IoU in the code?

My questions are as follows:

  1. Where can I get the iou_token weights?
  2. Given iou_token weights, how should the code for IoU prediction be implemented? (A sketch follows below.)
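
On question 2, a SAM-style IoU prediction head is typically a small MLP applied to the decoder output at the iou_token position; a minimal sketch under that assumption, not the authors' implementation:

    import torch
    import torch.nn as nn

    class IoUHead(nn.Module):
        # Small MLP mapping the iou_token's decoder output to one IoU estimate per candidate mask.
        def __init__(self, dim: int = 256, num_masks: int = 4):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(dim, dim), nn.ReLU(),
                nn.Linear(dim, dim), nn.ReLU(),
                nn.Linear(dim, num_masks),
            )

        def forward(self, iou_token_out: torch.Tensor) -> torch.Tensor:
            # iou_token_out: (B, dim) hidden state at the iou_token position after the mask decoder.
            return self.mlp(iou_token_out)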

How is $Q_{label}$ updated?

Hi. As mentioned in your paper, $Q_{label}$ is the key to CLIP2SAM. I noticed that $Q_{label}$ is a learnable token, am I right? The paper also mentions: 'The final labels are derived by calculating the distance between the refined label token and the CLIP text embedding, as in Equ. (1)'. That means $Q_{label}$ is aligned with the text embeddings, and the class label is then obtained through cosine similarity. However, I found that in your code the RoI embeddings do not include $Q_{label}$, as follows:

roi_feats = roi_feats[:, None] + 0 * cls_embed

So where does $Q_{label}$ get the gradient for updating? This confuses me. Looking forward to your reply. Thank you in advance!
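
As a small illustration of the behaviour being asked about, a `0 * w` term keeps a tensor in the computation graph but contributes zero gradient through that term alone:

    import torch

    w = torch.randn(4, requires_grad=True)  # stands in for cls_embed
    x = torch.randn(3, 4)                   # stands in for roi_feats
    (x[:, None] + 0 * w).sum().backward()
    print(w.grad)                           # all zeros: this term by itself does not update w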

_pickle.UnpicklingError: invalid load key, 'v'.

Hello,

When I attempt to execute the test case using the following command:

python tools/test.py seg/configs/ovsam/ovsam_coco_rn50x16_point.py

I encountered the following error. Could you please guide me on how to resolve it? Any assistance would be greatly appreciated.

Error Details:

Traceback (most recent call last):
  File "tools/test.py", line 177, in <module>
    main()
  File "tools/test.py", line 141, in main
    runner = Runner.from_cfg(cfg)
  ...
  ...
  ...
  _pickle.UnpicklingError: invalid load key, 'v'.

Thank you!
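
An "invalid load key" from torch.load usually means the .pth file on disk is not a real checkpoint, for example an incomplete download or a Git LFS pointer text file (which begins with the word "version", hence the load key 'v'). A quick hedged check, with a hypothetical path:

    # Peek at the first bytes of the checkpoint. A real torch checkpoint starts with a zip
    # signature (b'PK\x03\x04') or a pickle protocol byte; readable text means a bad download.
    ckpt_path = 'models/ovsam_coco_rn50x16.pth'  # hypothetical path; use your actual file
    with open(ckpt_path, 'rb') as f:
        print(f.read(64))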

Minimum CUDA Memory Required for SAM2CLIP Training

Thank you for all your hard work!
I encountered a "CUDA out of memory" error when running "bash tools/dist.sh train seg/configs/sam2clip/sam2clip_vith_rn50x16.py 8" on 8 RTX 2080 Ti GPUs to train SAM2CLIP. What is the minimum CUDA memory required for SAM2CLIP training?
I'm looking forward to your reply.
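
For what it's worth, the usual mmengine-level levers for fitting training into less GPU memory are mixed precision and gradient accumulation; a hedged config sketch (key names follow mmengine conventions, and the optimizer settings are placeholders rather than this repo's defaults):

    # Hypothetical optim_wrapper override in the training config.
    optim_wrapper = dict(
        type='AmpOptimWrapper',  # fp16 autocast to reduce activation memory
        accumulative_counts=2,   # gradient accumulation keeps the effective batch size
        optimizer=dict(type='AdamW', lr=1e-4, weight_decay=0.05),  # placeholder values
    )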

How can I evaluate on LVIS dataset?

Thanks for sharing your excellent work!

As described in the title, could you provide some guidance on validating OVSAM on the LVIS dataset?

time

How long does training and inference take respectively?

FileNotFoundError

Hi, could you please provide the download link for RN50x16_CocoOVDataset.pth? I couldn't find the relevant download link.

Evaluation scripts on LVIS dataset

Hi @HarborYuan,

Can I know when the test scripts for the LVIS dataset will be available? Five days have flown by since you last replied, and three weeks since this issue was created.

Hoping for your updates!

Originally posted by @Dyb3438 in #34 (comment)

Completed

You need to modify [this](https://github.com/HarborYuan/ovsam/blob/1d4dfb287fe113e8ecd60f76b4385c5506f566ca/seg/configs/ovsam/ovsam_coco_rn50x16_point.py#L13) config file ([file](https://github.com/HarborYuan/ovsam/blob/1d4dfb287fe113e8ecd60f76b4385c5506f566ca/seg/configs/_base_/datasets/coco_ov_instance_lsj.py)) to support more datasets.

To write such a config, you may need to write a new dataset class starting from COCO and import it into your config.

Originally posted by @HarborYuan in #27 (comment)
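
Following the reply above, the new dataset class would typically start from mmdet's CocoDataset and be registered so the config can reference it; an illustrative sketch with placeholder names, not code from the repo:

    from mmdet.datasets import CocoDataset
    from mmdet.registry import DATASETS

    @DATASETS.register_module()
    class LVISLikeDataset(CocoDataset):
        # Placeholder metainfo; fill `classes` with the full LVIS category name list.
        METAINFO = dict(classes=('person', 'bicycle'))

mmdet also ships LVIS dataset classes (e.g. LVISV1Dataset), which may be a more direct starting point than subclassing COCO by hand.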
