harboryuan / ovsam
[ECCV 2024] The official code of paper "Open-Vocabulary SAM".
Home Page: https://www.mmlab-ntu.com/project/ovsam
License: Other
As the title says, thank you for your attention and help!
Hi! I solved the environment problem, and now I want to run inference on my own dataset.
Your code uses COCO data, but how do I test on other data, for example (image, segmentation-label PNG)? The language embeddings are extracted for the COCO classes; will that affect use on other datasets? Can you explain in detail how to run inference on other datasets? Thank you.
Hey, the Hugging Face space fails to work.
According to your README, CUDA 12.1 is installed, but according to https://mmcv.readthedocs.io/en/latest/get_started/installation.html#install-with-pip I should then install PyTorch 2.1.0 and mmcv 2.1.0. However, that does not meet the requirement "Please install mmcv>=2.0.0, <2.1.0." Can you please tell me the correct environment configuration? Thank you!!
When I execute "bash tools/dist.sh test seg/configs/sam2clip/sam_vith_dump.py 1", I get this error.
Traceback (most recent call last):
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/lazy.py", line 68, in build
    module = importlib.import_module(self._module)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/workspace/ovsam/seg/models/detectors/__init__.py", line 1, in <module>
    from .sam2clip_distill import BackboneDistillation
  File "/workspace/ovsam/seg/models/detectors/sam2clip_distill.py", line 6, in <module>
    from mmdet.models.detectors.base import ForwardResults
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmdet/models/__init__.py", line 3, in <module>
    from .data_preprocessors import *  # noqa: F401,F403
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmdet/models/data_preprocessors/__init__.py", line 6, in <module>
    from .reid_data_preprocessor import ReIDDataPreprocessor
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmdet/models/data_preprocessors/reid_data_preprocessor.py", line 13, in <module>
    import mmpretrain
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmpretrain/__init__.py", line 18, in <module>
    and mmcv_version < digit_version(mmcv_maximum_version)),
AssertionError: MMCV==2.1.0 is used but incompatible. Please install mmcv>=2.0.0, <2.1.0.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/workspace/ovsam/tools/test.py", line 177, in <module>
    main()
  File "/workspace/ovsam/tools/test.py", line 141, in main
    runner = Runner.from_cfg(cfg)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/runner/runner.py", line 445, in from_cfg
    runner = cls(
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/runner/runner.py", line 412, in __init__
    self.model = self.build_model(model)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/runner/runner.py", line 819, in build_model
    model = MODELS.build(model)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 232, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 96, in build_from_cfg
    obj_type = args.pop('type')
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/config.py", line 182, in pop
    return self.build_lazy(super().pop(key, default))
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/config.py", line 215, in build_lazy
    value = value.build()
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/mmengine/config/lazy.py", line 70, in build
    raise type(e)(f'Failed to import {self._module} '
AssertionError: Failed to import seg.models.detectors in seg/configs/sam2clip/sam_vith_dump.py, line 5 for MMCV==2.1.0 is used but incompatible. Please install mmcv>=2.0.0, <2.1.0.
[2024-07-05 10:13:04,386] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 40539) of binary: /root/miniconda3/envs/ovsam/bin/python
Traceback (most recent call last):
  File "/root/miniconda3/envs/ovsam/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.1.0', 'console_scripts', 'torchrun')())
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/miniconda3/envs/ovsam/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
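The assertion above comes from mmpretrain's import-time version guard. As a minimal sketch (assuming the bounds mmcv>=2.0.0,<2.1.0 quoted in the error; other mmpretrain releases may pin differently), the installed version can be checked before launching:

```python
# Sketch of an mmpretrain-style version check; the bounds below are taken
# from the error message above, not from a specific mmpretrain release.
import mmcv
from mmengine.utils import digit_version

mmcv_min, mmcv_max = '2.0.0', '2.1.0'
ok = (digit_version(mmcv_min) <= digit_version(mmcv.__version__)
      < digit_version(mmcv_max))
print(f'mmcv {mmcv.__version__} compatible: {ok}')
```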
Hello, I would like to ask how the "Feature-Crop baseline" mentioned in the paper crops features using a mask. Is there any specific paper I can refer to?
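(A guess at the mechanism rather than the authors' answer: a common way such baselines "crop" features with a mask is masked average pooling over the backbone feature map. A minimal sketch under that assumption; mask_pool is a hypothetical helper, not code from this repo:)

```python
# Masked average pooling: resize the mask to the feature resolution, then
# average the feature map over the masked region to get one embedding.
import torch
import torch.nn.functional as F

def mask_pool(feats: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """feats: (C, H, W) feature map; mask: (H0, W0) binary mask."""
    mask = F.interpolate(mask[None, None].float(), size=feats.shape[-2:],
                         mode='bilinear', align_corners=False)[0, 0]
    denom = mask.sum().clamp(min=1e-6)  # avoid division by zero
    return (feats * mask).sum(dim=(-2, -1)) / denom  # -> (C,)
```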
I downloaded the SA-1B dataset, but the website says: "NOTE: There are no class labels for the images or mask annotations."
After downloading and decompressing, it is not a JSON file. Where can I download the SAM dataset used in your project?
Thanks for your excellent work!
As a SAM variant, does OV-SAM support SAM's segment-everything (with class) mode, just like SamAutomaticMaskGenerator in SAM?
Hello author,
I want to ask: where is the text embedding extraction file? How do you process the text dataset? Thanks!
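(A hedged sketch of the generic recipe rather than the repo's actual extraction script: per-class CLIP text embeddings are typically computed once and cached. The prompt template, output format, and file name below are assumptions.)

```python
# Sketch: caching per-class CLIP text embeddings (generic recipe only; not
# necessarily what ovsam's own extraction script does).
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model, _ = clip.load('RN50x16', device=device)

class_names = ['person', 'bicycle', 'car']  # e.g. the COCO class list
tokens = clip.tokenize([f'a photo of a {c}' for c in class_names]).to(device)

with torch.no_grad():
    embd = model.encode_text(tokens)
    embd = embd / embd.norm(dim=-1, keepdim=True)  # L2-normalize

# Hypothetical cache path, mirroring the file name mentioned in later issues.
torch.save(embd.cpu(), 'RN50x16_CocoOVDataset.pth')
```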
OVSAM's prompts are points and boxes. Is that any different from SAM? How are prompts obtained during training or inference? When testing on images, must the prompt come from a detector or a manually assigned bbox/rect? Can you make predictions by generating dense boxes?
I want to test the inference results on my data; how do I modify the code or config?
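(On the dense-prompt part: a minimal sketch of the regular point grid that SAM's SamAutomaticMaskGenerator uses, the point analogue of dense boxes; whether OVSAM consumes such grids directly is not confirmed here.)

```python
# Build an n x n grid of point prompts in normalized image coordinates,
# in the spirit of segment_anything's automatic mask generator.
import numpy as np

def build_point_grid(n_per_side: int = 32) -> np.ndarray:
    """Return an (n*n, 2) array of (x, y) points in [0, 1]."""
    offset = 1.0 / (2 * n_per_side)
    coords = np.linspace(offset, 1.0 - offset, n_per_side)
    xs, ys = np.meshgrid(coords, coords)
    return np.stack([xs.ravel(), ys.ravel()], axis=-1)
```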
Hi @HarborYuan,
Thank you for the great work! You mentioned in the paper that you concentrate only on object-level masks using the Segment Anything model.
Thank you!
Hi! I want to reduce the training batch size from 2 to 1. How can I do it?
I'm looking forward to your early reply. Thanks!
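(Not an official answer: in mmengine-style configs the batch size normally lives on the dataloader, so the likely override, assuming ovsam follows the standard train_dataloader key, is:)

```python
# In your config (or one that inherits ovsam's via _base_), override the
# dataloader batch size; the key name follows mmengine conventions and is
# an assumption about this repo's configs.
train_dataloader = dict(
    batch_size=1,  # reduced from 2
)
```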
When I run the inference command, there is an error. How can I solve it?
Traceback (most recent call last):
  File "/maggie.meng/code/ovsam/tools/test.py", line 177, in <module>
    main()
  File "/maggie.meng/code/ovsam/tools/test.py", line 141, in main
    runner = Runner.from_cfg(cfg)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/runner/runner.py", line 445, in from_cfg
    runner = cls(
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/runner/runner.py", line 412, in __init__
    self.model = self.build_model(model)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/runner/runner.py", line 819, in build_model
    model = MODELS.build(model)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 232, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/maggie.meng/code/ovsam/seg/models/detectors/ovsam.py", line 63, in __init__
    self.neck = MODELS.build(neck)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 232, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/maggie.meng/code/ovsam/seg/models/necks/transformer_neck.py", line 43, in __init__
    patch_embed = PatchEmbed(
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmdet/models/layers/transformer/utils.py", line 250, in __init__
    self.projection = build_conv_layer(
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmcv/cnn/bricks/conv.py", line 43, in build_conv_layer
    conv_layer = registry.get(layer_type)
  File "/root/anaconda3/envs/ovsam_demo/lib/python3.10/site-packages/mmengine/registry/registry.py", line 441, in get
TypeError: The key argument of `Registry.get` must be a str, got <class 'type'>
(The same traceback is printed, interleaved, by a second process.)
Thank you very much for your work. Can it work like the SAM full-segmentation (segment everything) process?
Hi, thank you for your valuable contribution!
I appreciate your work on the OVSAM model. In your paper, you mentioned that the model can currently segment and recognize around 22,000 classes. However, when I tested the example provided in the demo, it appears that only approximately 1,000 classes can be recognized. I noticed that the names field is defined in this file.
Could you please clarify whether my understanding is correct? If I have misunderstood, kindly point out the correct information. Thank you very much for your clarification.
I tried both point and box prompts on sample images, but both failed for sky segmentation.
I deployed the demo following the instructions.
Is there anything I missed?
Hi, I want to know: if I want to segment out class A objects, and there is more than one class A object in the image, will entering the class A name segment out all of them?
Looking forward to your reply.
I tried the demo on Hugging Face, but it only supports clicking somewhere in the picture to run inference. Can it support a text prompt instead of clicking on the picture?
How can I use the demo offline?
Hi. Thanks for the code and paper.
When I evaluate the provided checkpoint with the codebase, I am able to reproduce all the COCO values that were reported in the paper.
But I have a question about the values printed in the terminal. Is novel_score the accuracy of class prediction for the novel classes? Why is novel_score so low compared to base_score?
mmengine - INFO - Epoch(test) [1209/1209] miou: 0.6791 base_iou: 0.6835 novel_iou: 0.6521
score: 76.7359 base_score: 87.4120 novel_score: 11.1742
data_time: 0.0165 time: 0.2565
Thanks.
Hello,
I encountered a ModuleNotFoundError related to the absence of the ovsam.py file. Could you please confirm if it will be uploaded?
Error Details:
ModuleNotFoundError: Failed to import seg.models.detectors in seg/configs/ovsam/ovsam_coco_rn50x16_point.py, line 7 for No module named 'seg.models.detectors.ovsam'
Thank you!
No such file or directory: '/root/.cache/embd/RN50x16_CocoOVDataset.pth'. How can I download this model file?
After running conda install pytorch torchvision torchaudio cuda-toolkit pytorch-cuda==12.1 -c pytorch -c "nvidia/label/cuda-12.1.0", there is an error like:
Could not solve for environment specs
The following package could not be installed
└─ pytorch-cuda 12.1 is not installable because it requires
   └─ libnvjitlink >=12.1.105,<12.2.0, which does not exist (perhaps a missing channel).
Can you help me with this? Thank you so much.
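(Once an environment does resolve, a minimal sanity check, using only standard torch attributes, that the installed build actually targets CUDA 12.1:)

```python
# Verify the installed torch build and its CUDA runtime version.
import torch

print(torch.__version__)         # e.g. 2.1.0
print(torch.version.cuda)        # should report 12.1 for a cu121 build
print(torch.cuda.is_available())
```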
I prepared the environment and want to run inference, but running bash tools/dist.sh test seg/configs/ovsam/ovsam_coco_rn50x16_point.py 8 fails with an error at:
  File "/mnt/NewDataShare/D4/common/wbzhou/MLLM/ovsam/seg/models/data_preprocessor/ovsam_preprocessor.py", line 193, in forward
    gt_instances.point_coords = get_center_coords(
  File "/mnt/NewDataShare/D4/common/wbzhou/MLLM/ovsam/seg/models/data_preprocessor/ovsam_preprocessor.py", line 24, in
Is this issue due to an incorrect environment configuration?
torch==2.1.2+cu121 torchvision==0.16.2+cu121 mmcv==2.1.0 mmdet==3.3.0
Hello, I wonder how IoU is predicted in OVSAM?
The paper states that there are three tokens: IoU, label, and mask tokens, but the weights of iou_token are not found in the model ('clip2sam_coco_rn50x16.pth'). There are only two token weights (mask token and label token) in that checkpoint.
Besides, by comparing with the code of the SAM decoder, I found that you replaced the original iou_token position with label_token. If I obtain the iou_token, how do I predict the IoU in the code?
My questions are as follows:
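(A hedged aside on the iou_token observation above: a minimal way to inspect which token weights a checkpoint actually contains. Nothing below is specific to ovsam's code; the path is the checkpoint named in the question.)

```python
# List all parameter keys that look like prompt/output tokens.
import torch

ckpt = torch.load('clip2sam_coco_rn50x16.pth', map_location='cpu')
state = ckpt.get('state_dict', ckpt)  # handle wrapped and raw formats
print([k for k in state if 'token' in k])
```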
Hi. As mentioned in your paper,
ovsam/seg/models/heads/ovsam_head.py, line 219 at commit 137d2c2
Hello,
When I attempt to execute the test case using the following command:
python tools/test.py seg/configs/ovsam/ovsam_coco_rn50x16_point.py
I encounter the following error. Could you please guide me on how to resolve it? Any assistance would be greatly appreciated.
Error Details:
Traceback (most recent call last):
  File "tools/test.py", line 177, in <module>
    main()
  File "tools/test.py", line 141, in main
    runner = Runner.from_cfg(cfg)
  ...
_pickle.UnpicklingError: invalid load key, 'v'.
Thank you!
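(Not an official diagnosis, but a common cause: "invalid load key, 'v'" usually means the downloaded .pth is not a real checkpoint but a Git LFS pointer file, whose content begins with the text "version https://git-lfs...". A quick check, with a placeholder path:)

```python
# Peek at the first bytes of the checkpoint; a Git LFS pointer or an HTML
# error page instead of pickled tensor data produces this UnpicklingError.
with open('path/to/checkpoint.pth', 'rb') as f:  # substitute your file
    print(f.read(64))
```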
Thank you for your great work! I found that the label token in OVSAMHead seems to be useless, because cls_embed is multiplied by 0, as below:
ovsam/seg/models/heads/ovsam_head.py, line 219 at commit 62c8ab5
Do I understand right? Thank you!
Thank you for all your hard work!
I encountered a "Cuda out of memory" error when running "bash tools/dist.sh train seg/configs/sam2clip/sam2clip_vith_rn50x16.py 8" on 8 RTX2080Ti GPUs to train SAM2CLIP. So, what is the minimum cuda memory required for SAM2CLIP training?
I‘m looking forward to your reply.
Thanks for sharing your excellent work!
As described in the title, could you provide some guidance on validating OVSAM on the LVIS dataset?
How long do training and inference take, respectively?
Hi, could you please provide the download link for RN50x16_CocoOVDataset.pth? I couldn't find the relevant download link.
So, can the model be made to work with mmdet 2.0 by changing some code?
It seems like the model's SHA-256 checksum doesn't match?
Hi @HarborYuan,
Can I know when I can see the test scripts for the LVIS dataset? Five days have flown by since you last replied, and three weeks have flown by since this issue was created.
Hoping for your updates!
Originally posted by @Dyb3438 in #34 (comment)
Hi, thank you for your hard work!
I ran your code, but it raises an error:
"FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/.cache/embd/RN50x16_CocoOVDataset.pth'"
Could you tell me how to solve this problem?
I'm looking forward to your reply.
You need to modify [this](https://github.com/HarborYuan/ovsam/blob/1d4dfb287fe113e8ecd60f76b4385c5506f566ca/seg/configs/ovsam/ovsam_coco_rn50x16_point.py#L13) ([file](https://github.com/HarborYuan/ovsam/blob/1d4dfb287fe113e8ecd60f76b4385c5506f566ca/seg/configs/_base_/datasets/coco_ov_instance_lsj.py)) config file to support more datasets.
To write such a config, you may need to write a new dataset class starting from COCO and import it to your config.
Originally posted by @HarborYuan in #27 (comment)
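(Expanding on that answer with a minimal sketch, assuming mmdet 3.x conventions; MyOVDataset and its class list are placeholders, not code from this repo:)

```python
# A new dataset class starting from COCO, registered so that a config can
# refer to it as type='MyOVDataset'.
from mmdet.datasets import CocoDataset
from mmdet.registry import DATASETS


@DATASETS.register_module()
class MyOVDataset(CocoDataset):
    METAINFO = dict(classes=('cat', 'dog'))  # replace with your categories
```

The config then sets dataset=dict(type='MyOVDataset', ...) and must ensure the module gets imported, e.g. via mmengine's custom_imports mechanism.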