decola's People

Contributors

janghyuncho

decola's Issues

ModuleNotFoundError: No module named 'MultiScaleDeformableAttention'

Could you please suggest a solution or a recommendation for the problem below? I have already checked other repositories that use MultiScaleDeformableAttention. I am running the code in a Google Colab environment with a T4 GPU, but I could not get past this error:

! python train_net.py --num-gpus 8 --config-file /content/DECOLA/configs/BoxSup-DeformDETR_Lbase_CLIP_SwinB_4x.yaml

Traceback (most recent call last):
  File "/content/DECOLA/train_net.py", line 50, in <module>
    from decola.config import add_detic_config
  File "/content/DECOLA/decola/__init__.py", line 11, in <module>
    from .modeling.decola import (decola_deformable_transformer, decola_deformable_detr, 
  File "/content/DECOLA/decola/modeling/decola/decola_deformable_transformer.py", line 10, in <module>
    from models.ops.modules import MSDeformAttn
  File "/content/DECOLA/third_party/Deformable-DETR/models/__init__.py", line 10, in <module>
    from .deformable_detr import build
  File "/content/DECOLA/third_party/Deformable-DETR/models/deformable_detr.py", line 38, in <module>
    from .deformable_transformer import build_deforamble_transformer
  File "/content/DECOLA/third_party/Deformable-DETR/models/deformable_transformer.py", line 21, in <module>
    from models.ops.modules import MSDeformAttn
  File "/content/DECOLA/third_party/Deformable-DETR/models/ops/modules/__init__.py", line 9, in <module>
    from .ms_deform_attn import MSDeformAttn
  File "/content/DECOLA/third_party/Deformable-DETR/models/ops/modules/ms_deform_attn.py", line 21, in <module>
    from ..functions import MSDeformAttnFunction
  File "/content/DECOLA/third_party/Deformable-DETR/models/ops/functions/__init__.py", line 9, in <module>
    from .ms_deform_attn_func import MSDeformAttnFunction
  File "/content/DECOLA/third_party/Deformable-DETR/models/ops/functions/ms_deform_attn_func.py", line 18, in <module>
    import MultiScaleDeformableAttention as MSDA
ModuleNotFoundError: No module named 'MultiScaleDeformableAttention'
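
The traceback stops at `import MultiScaleDeformableAttention`, which in Deformable-DETR is a compiled extension rather than a pure-Python package. The upstream Deformable-DETR README describes building it from `models/ops` with `make.sh`; whether that build succeeds in a particular Colab image is an assumption that has to be checked. A minimal sketch for confirming whether the extension is visible to the current interpreter (the helper name `extension_status` is hypothetical):

```python
import importlib.util


def extension_status(module_name="MultiScaleDeformableAttention"):
    """Return a short message saying whether `module_name` can be imported."""
    # find_spec returns None for a top-level module that is not installed.
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return (
            f"{module_name} is not installed in this environment; "
            "the ops extension probably still needs to be built."
        )
    return f"{module_name} found at {spec.origin}"


if __name__ == "__main__":
    print(extension_status())
```

Running this in the same notebook kernel that launches train_net.py shows whether the build landed in the environment that is actually being used.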

Classifier Weights

Hey,

Thank you for the fascinating and excellent work.

I have a question regarding the code in your repository. I noticed that the classifier weights are loaded from existing files, such as datasets/metadata/lvis_v1_clip_a+cname.npy. Does this mean that these weights were optimized as part of the training procedure, or are they equivalent to the CLIP embeddings of the corresponding texts?

Thanks
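
Until the authors answer, one partial check is to inspect the file itself. The sketch below only reports the matrix shape and the range of L2 norms; it assumes one embedding per row along the last axis, and the helper name `summarize_weights` is hypothetical. Unit-norm rows with a CLIP-sized dimension would be consistent with stored text embeddings, but would not prove it.

```python
import numpy as np


def summarize_weights(path):
    """Load an .npy matrix and report its shape and per-row L2 norms."""
    weights = np.load(path)
    # Assumption: each vector lies along the last axis.
    norms = np.linalg.norm(weights, axis=-1)
    return {
        "shape": weights.shape,
        "min_norm": float(norms.min()),
        "max_norm": float(norms.max()),
    }
```

For example, `summarize_weights("datasets/metadata/lvis_v1_clip_a+cname.npy")` could be compared against the same summary for freshly computed CLIP text embeddings.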

IndexError: list index out of range

Thank you for your great work!

I want to run prediction on a video or a list of images using demo.py.

command:
python demo.py --config-file configs/DECOLA_PHASE1_L_CLIP_SwinB_4x.yaml --video-input ./test_videos/output2.mp4 --output ./test_videos/output/result.mkv --vocabulary custom --custom_vocabulary sea\ urchin --confidence-threshold 0.3 --language-condition --opts MODEL.WEIGHTS weights/DECOLA_PHASE1_L_CLIP_SwinB_4x.pth

output:

[04/23 19:55:00 detectron2]: Arguments: Namespace(c2=False, confidence_threshold=0.3, config_file='configs/DECOLA_PHASE1_L_CLIP_SwinB_4x.yaml', cpu=False, custom_vocabulary='sea urchin', input=None, language_condition=True, opts=['MODEL.WEIGHTS', 'weights/DECOLA_PHASE1_L_CLIP_SwinB_4x.pth'], output='./test_videos/output/result.mkv', pred_all_class=False, sam_checkpoint='weights/sam/sam_vit_h_4b8939.pth', use_sam=False, video_input='./test_videos/output2.mp4', vocabulary='custom', webcam=None)
Loading pretrained CLIP
/homes/hilary/anaconda3/envs/decola2/lib/python3.8/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2895.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
[04/23 19:55:13 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from weights/DECOLA_PHASE1_L_CLIP_SwinB_4x.pth ...
[04/23 19:55:13 fvcore.common.checkpoint]: [Checkpointer] Loading from weights/DECOLA_PHASE1_L_CLIP_SwinB_4x.pth ...
custom weight normalized. (shape: torch.Size([2, 512]))
[ERROR:[email protected]] global cap_ffmpeg_impl.hpp:3130 open Could not find encoder for codec_id=27, error: Encoder not found
[ERROR:[email protected]] global cap_ffmpeg_impl.hpp:3208 open VIDEOIO/FFMPEG: Failed to initialize VideoWriter
[ERROR:[email protected]] global cap.cpp:643 open VIDEOIO(CV_IMAGES): raised OpenCV exception:

OpenCV(4.9.0) /io/opencv/modules/videoio/src/cap_images.cpp:430: error: (-215:Assertion failed) !filename_pattern.empty() in function 'open'

0%| | 0/221 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "demo.py", line 240, in <module>
    for vis_frame in tqdm.tqdm(demo.run_on_video(video), total=num_frames):
  File "/homes/hilary/anaconda3/envs/decola2/lib/python3.8/site-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/homes/hilary/marinedet/sota_ovd/DECOLA/decola/predictor.py", line 190, in run_on_video
    yield process_predictions(frame, self.predictor(frame))
  File "/homes/hilary/anaconda3/envs/decola2/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 319, in __call__
    predictions = self.model([inputs])[0]
IndexError: list index out of range

I installed the environment the same way for Detic and DECOLA, and I use similar commands.
Detic works, but DECOLA does not.
Is there anything I have overlooked?

Looking forward to your help!

missing metadata

Your work is excellent. When I was running the first phase of training, I found that the file datasets/metadata/lvis_v1_clip_a+object.npy was missing. Could you provide it? I look forward to your reply.

Can not reproduce Direct zero-shot transfer to LVIS v1.0 results

command: python train_net.py --num-gpus 4 --config-file configs/DECOLA_PHASE2_O365IN21k_CLIP_SwinT.yaml --eval-only MODEL.WEIGHTS DECOLA_PHASE2_O365IN21k_CLIP_SwinT.pth

There are several errors:

  1. there is no config named DECOLA_PHASE1_O365_CLIP_SwinT_4x.yaml
  2. there are some redundant config items: NO_FED_LOSS_LIST, ONLINE_LABELING

I tried to fix the problems by renaming the file to DECOLA_PHASE1_O365_CLIP_SwinT.yaml and deleting the config items. It runs, but the inference results are poor.

Looking forward to your help.
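
Before deleting config items, it can help to list exactly which keys the file sets that the registered defaults do not know about, since yacs-style configs (which detectron2 uses) typically reject such keys at merge time. The following is a minimal sketch on plain dictionaries; the function name `unknown_keys`, the trimmed `defaults`, and the placement of the two keys inside `overrides` are all hypothetical stand-ins, not the real DECOLA config layout.

```python
def unknown_keys(defaults, overrides, prefix=""):
    """List dotted keys present in `overrides` but absent from `defaults`."""
    missing = []
    for key, value in overrides.items():
        path = f"{prefix}{key}"
        if key not in defaults:
            missing.append(path)
        elif isinstance(value, dict) and isinstance(defaults[key], dict):
            # Recurse into nested sections that exist on both sides.
            missing.extend(unknown_keys(defaults[key], value, prefix=path + "."))
    return missing


# Hypothetical, heavily trimmed stand-ins for the registered defaults and
# for the contents of a YAML config file.
defaults = {"MODEL": {"WEIGHTS": ""}, "SOLVER": {"MAX_ITER": 0}}
overrides = {
    "MODEL": {"WEIGHTS": "model.pth", "ONLINE_LABELING": True},
    "NO_FED_LOSS_LIST": [],
}
print(unknown_keys(defaults, overrides))
```

A key that shows up in this list may be unused, or it may be one the checked-out code no longer registers; in the second case removing it could change behaviour, which is one possible (unconfirmed) reason for poor results.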

SwinL-Phase2 weights seem to be empty.

Thanks for your excellent work. I want to use the best SwinL Phase 2 open-vocabulary model to test some images. However, the model_zoo link appears to be empty. Could you provide the weights? Thanks.

No module named 'third_party.DETA'

While following the training guide for Phase 1 training, I hit this error:
python train_net.py --num-gpus 8 --config-file DECOLA_PHASE1_Lbase_CLIP_R5021k_4x.yaml

from third_party.DETA.models.deformable_detr import SetCriterion
ModuleNotFoundError: No module named 'third_party.DETA'

Was DETA here really meant to be Deformable-DETR?
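
The import `third_party.DETA...` can only resolve if a `third_party/DETA` directory exists under a path Python searches. A minimal sketch for checking which expected directories are absent or empty (an uninitialised git submodule often leaves an empty directory); the helper name `missing_packages` is hypothetical, and whether DETA is meant to be fetched separately is an assumption only the authors can confirm.

```python
from pathlib import Path


def missing_packages(repo_root, dotted_names):
    """Return the dotted names whose directories are absent or empty."""
    missing = []
    for name in dotted_names:
        directory = Path(repo_root).joinpath(*name.split("."))
        # An uninitialised git submodule usually leaves an empty directory.
        if not directory.is_dir() or not any(directory.iterdir()):
            missing.append(name)
    return missing
```

Calling `missing_packages(".", ["third_party.DETA", "third_party.Deformable-DETR"])` from the repository root shows which of the two is actually present in the checkout.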
