nvlabs / odise

Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]

Home Page: https://arxiv.org/abs/2303.04803

License: Other

Python 99.51% Dockerfile 0.49%
deep-learning instance-segmentation panoptic-segmentation pytorch semantic-segmentation diffusion-models text-image-retrieval zero-shot-learning open-vocabulary open-vocabulary-segmentation

odise's Introduction

ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models

ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation exploits pre-trained text-to-image diffusion and discriminative models to perform open-vocabulary panoptic segmentation. It leverages the frozen representations of both these models to perform panoptic segmentation of any category in the wild.

This repository is the official implementation of ODISE introduced in the paper:

Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
Jiarui Xu, Sifei Liu*, Arash Vahdat*, Wonmin Byeon, Xiaolong Wang, Shalini De Mello
CVPR 2023 Highlight. (*equal contribution)

For business inquiries, please visit our website and submit the form: NVIDIA Research Licensing.

[teaser figure]

Visual Results

Links

Citation

If you find our work useful in your research, please cite:

@article{xu2023odise,
  title={{Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models}},
  author={Xu, Jiarui and Liu, Sifei and Vahdat, Arash and Byeon, Wonmin and Wang, Xiaolong and De Mello, Shalini},
  journal={arXiv preprint arXiv:2303.04803},
  year={2023}
}

Environment Setup

Install dependencies by running:

conda create -n odise python=3.9
conda activate odise
conda install pytorch=1.13.1 torchvision=0.14.1 pytorch-cuda=11.6 -c pytorch -c nvidia
conda install -c "nvidia/label/cuda-11.6.1" libcusolver-dev
git clone git@github.com:NVlabs/ODISE.git
cd ODISE
pip install -e .
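
After pip install -e . finishes, a quick sanity check of the new environment can save debugging time later (a minimal sketch, not part of the repo; the file name verify_env.py is just an example):

# verify_env.py -- hypothetical helper, a minimal environment sanity check
import torch

print(torch.__version__)           # expect 1.13.1
print(torch.version.cuda)          # expect 11.6
print(torch.cuda.is_available())   # should be True on a GPU machine

# both packages below are installed by the steps above
import detectron2  # noqa: F401
import odise       # noqa: F401
print("detectron2 and odise import cleanly")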

(Optional) Install xformers for a more efficient transformer implementation. You can either install the pre-built version:

pip install xformers==0.0.16

or build it from the latest source:

# (Optional) Makes the build much faster
pip install ninja
# Set TORCH_CUDA_ARCH_LIST if running and building on different GPU types
pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
# (this can take dozens of minutes)
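
Either way, it is worth confirming that the installed build is importable before training (a minimal check, not from the repo):

import xformers
import xformers.ops  # the memory-efficient attention ops used at runtime

print(xformers.__version__)  # e.g. 0.0.16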

Model Zoo

We provide two pre-trained ODISE models, trained on COCO's entire training set with label and caption supervision, respectively. ODISE's pre-trained models are subject to the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license terms. Each model contains 28.1M trainable parameters. The download links for these models are provided in the table below. When you run demo/demo.py or an inference script for the very first time, it will also automatically download ODISE's pre-trained model to your local folder $HOME/.torch/iopath_cache/NVlabs/ODISE/releases/download/v1.0.0/.

A-847 = ADE20K-Full, PC-59 = Pascal Context 59, PC-459 = Pascal Context 459, PAS-21 = Pascal VOC 21.

                 ADE20K (A-150)     COCO               A-847  PC-59  PC-459  PAS-21
                 PQ    mAP   mIoU   PQ    mAP   mIoU   mIoU   mIoU   mIoU    mIoU    download
ODISE (label)    22.6  14.4  29.9   55.4  46.0  65.2   11.1   57.3   14.5    84.6    checkpoint
ODISE (caption)  23.4  13.9  28.7   45.6  38.4  52.4   11.0   55.3   13.8    82.7    checkpoint
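
As a rough illustration of how one of these checkpoints could be loaded programmatically, here is a sketch pieced together from the config path and the instantiate_odise call that appear in the issues below; the odise.config import path and the cfg.train.init_checkpoint field are assumptions, and demo/demo.py remains the authoritative example:

from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import LazyConfig

from odise.config import instantiate_odise  # import path assumed, see the Colab issue below

cfg = LazyConfig.load("configs/Panoptic/odise_label_coco_50e.py")
model = instantiate_odise(cfg.model)
model.to(cfg.train.device)

# first use downloads the checkpoint to $HOME/.torch/iopath_cache/... as noted above
DetectionCheckpointer(model).load(cfg.train.init_checkpoint)  # field name assumed
model.eval()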

Get Started

See Preparing Datasets for ODISE.

See Getting Started with ODISE for detailed instructions on training and inference with ODISE.

Demo

Important Note: When you run the demo/demo.py script for the very first time, besides ODISE's pre-trained models, it will also automatically download the pre-trained models for Stable Diffusion v1.3 and CLIP, from their original sources, to your local directories $HOME/.torch/ and $HOME/.cache/clip, respectively. The pre-trained models for Stable Diffusion and CLIP are subject to their original license terms from Stable Diffusion and CLIP, respectively.

  • To run ODISE's demo from the command line:

    python demo/demo.py --input demo/examples/coco.jpg --output demo/coco_pred.jpg --vocab "black pickup truck, pickup truck; blue sky, sky"

    The output is saved in demo/coco_pred.jpg. For more detailed options for demo/demo.py see Getting Started with ODISE.

  • To run the Gradio demo locally:

    python demo/app.py

Acknowledgement

Code is largely based on Detectron2, Stable Diffusion, Mask2Former, OpenCLIP and GLIDE.

Thank you, all, for the great open-source projects!

odise's People

Contributors

shalinidemello, xvjiarui


odise's Issues

Training Time

Thanks for releasing the code. How long does your method take to train?

Cityscapes evaluation

Could you please indicate the command for evaluating on Cityscapes?
Thanks in advance.

About a difference between the code and the paper

The paper states that ODISE freezes the denoising UNet. However, upon inspecting the code in ODISE/odise/modeling/meta_arch/ldm.py (around line 974), I encountered some aspects that left me uncertain about whether the UNet is actually frozen. [screenshot attached]

Which GPUs were used?

Dear authors,

thank you for your brilliant work!
I have one question concerning the GPUs used. According to NVIDIA, V100s are available with both 16 GB and 32 GB. Which ones did you use?

BR
Thanos

fatal error: cusparse.h: No such file or directory

when run "pip install -e .",
the error happen:
lude/ATen/cuda/CUDAContext.h:6:10: fatal error: cusparse.h: No such file or directory
#include <cusparse.h>
^~~~~~~~~~~~
compilation terminated.
error: command '/home/cheng/ws/miniconda3/envs/odise/bin/nvcc' failed with exit code 1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> detectron2

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Environment installation problem

I followed your install.md step by step, but I still ran into this problem when running pip install -e .:

Failed to build mask2former
ERROR: Could not build wheels for mask2former, which is required to install pyproject.toml-based projects

I'm really confused by all of this and have been trying for a long time. Could you kindly suggest some solutions, or some links where I could download these beautiful datasets?

Much appreciated!!

RuntimeError: CUDA error: invalid argument with 3090 GPU

Thanks for your great work. When I try to train the model with eight 3090 GPUs using the following command,
./tools/train_net.py --config-file configs/Panoptic/odise_label_coco_50e.py --num-gpus 8 --amp --ref 32

The following errors are encountered.

Starting training from iteration 0
 Exception during training:
Traceback (most recent call last):
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/home/zoloz/8T-1/zitong/code/ODISE/odise/engine/train_loop.py", line 297, in run_step
    grad_norm = self.grad_scaler(
  File "/home/zoloz/8T-1/zitong/code/ODISE/odise/engine/train_loop.py", line 207, in __call__
    self._scaler.scale(loss).backward(create_graph=create_graph)
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/ldm/modules/diffusionmodules/util.py", line 142, in backward
    input_grads = torch.autograd.grad(
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/__init__.py", line 300, in grad
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autogr
[06/17 03:38:05 d2.engine.hooks]: Total training time: 0:00:25 (0:00:00 on hooks)
[06/17 03:38:05 d2.utils.events]: odise_label_coco_50e_bs16x8/default  iter: 0/368752    lr: N/A  max_mem: 19297M
Traceback (most recent call last):
  File "/home/zoloz/8T-1/zitong/code/ODISE/./tools/train_net.py", line 392, in <module>
    launch(
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/detectron2/engine/launch.py", line 67, in launch
    mp.spawn(
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/detectron2/engine/launch.py", line 126, in _distributed_worker
    main_func(*args)
  File "/home/zoloz/8T-1/zitong/code/ODISE/tools/train_net.py", line 363, in main
    do_train(args, cfg)
  File "/home/zoloz/8T-1/zitong/code/ODISE/tools/train_net.py", line 309, in do_train
    trainer.train(start_iter, cfg.train.max_iter)
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/home/zoloz/8T-1/zitong/code/ODISE/odise/engine/train_loop.py", line 297, in run_step
    grad_norm = self.grad_scaler(
  File "/home/zoloz/8T-1/zitong/code/ODISE/odise/engine/train_loop.py", line 207, in __call__
    self._scaler.scale(loss).backward(create_graph=create_graph)
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/ldm/modules/diffusionmodules/util.py", line 142, in backward
    input_grads = torch.autograd.grad(
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/__init__.py", line 300, in grad
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/function.py", line 414, in wrapper
    outputs = fn(ctx, *args)
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/xformers/ops/fmha/__init__.py", line 111, in backward
    grads = _memory_efficient_attention_backward(
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/xformers/ops/fmha/__init__.py", line 382, in _memory_efficient_attention_backward
    grads = op.apply(ctx, inp, grad)
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/xformers/ops/fmha/cutlass.py", line 184, in apply
    (grad_q, grad_k, grad_v,) = cls.OPERATOR(
  File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/_ops.py", line 442, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: CUDA error: invalid argument
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Performance of the demo varies dramatically with the CUDA device

Hi, and thank you for your fantastic work! I have encountered a minor issue that I'd like to bring to your attention. I found that when I change the line model.to(cfg.train.device) to model.to("cuda:1") (or any other device) in the demo.ipynb, there is a significant difference in the generated segmentation map compared to the original (please see the attached image below).

[attached image: comparison of segmentation outputs]

The original code runs perfectly fine and produces results consistent with those in the paper. However, when I make this modification, I don't encounter any specific warnings or errors, so I'm uncertain where the issue lies (I suspect that perhaps some modules are not loaded correctly). I'd greatly appreciate your help with this issue. Thank you!
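
One way to narrow such a problem down (a debugging sketch under the assumption that some submodule silently stayed on another device, not a confirmed fix) is to list every parameter or buffer that did not follow the .to() call:

import torch
from torch import nn

def find_stragglers(model: nn.Module, device: str = "cuda:1") -> list:
    """Return names of parameters/buffers that did not end up on `device`."""
    target = torch.device(device)
    names = [n for n, p in model.named_parameters() if p.device != target]
    names += [n for n, b in model.named_buffers() if b.device != target]
    return names

# usage with the demo's model object (assumed in scope):
#   model.to("cuda:1")
#   print(find_stragglers(model, "cuda:1") or "everything is on cuda:1")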

error in gradio demo app.py

I can run demo.py and demo.ipynb, but when I run app.py with python demo/app.py, the following error occurs:

Traceback (most recent call last):
  File "/home/luoc/workspace/ODISE/demo/app.py", line 294, in <module>
    examples_handler = gr.Examples(
  File "/home/luoc/miniconda3/envs/odise/lib/python3.9/site-packages/gradio/helpers.py", line 71, in create_examples
    client_utils.synchronize_async(examples_obj.create)
  File "/home/luoc/miniconda3/envs/odise/lib/python3.9/site-packages/gradio_client/utils.py", line 359, in synchronize_async
    return fsspec.asyn.sync(fsspec.asyn.get_loop(), func, *args, **kwargs)  # type: ignore
  File "/home/luoc/miniconda3/envs/odise/lib/python3.9/site-packages/fsspec/asyn.py", line 100, in sync
    raise return_result
  File "/home/luoc/miniconda3/envs/odise/lib/python3.9/site-packages/fsspec/asyn.py", line 55, in _runner
    result[0] = await coro
  File "/home/luoc/miniconda3/envs/odise/lib/python3.9/site-packages/gradio/helpers.py", line 278, in create
    await self.cache()
  File "/home/luoc/miniconda3/envs/odise/lib/python3.9/site-packages/gradio/helpers.py", line 312, in cache
    prediction = await Context.root_block.process_api(
  File "/home/luoc/miniconda3/envs/odise/lib/python3.9/site-packages/gradio/blocks.py", line 1108, in process_api
    result = await self.call_function(
  File "/home/luoc/miniconda3/envs/odise/lib/python3.9/site-packages/gradio/blocks.py", line 915, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/luoc/miniconda3/envs/odise/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/luoc/miniconda3/envs/odise/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/luoc/miniconda3/envs/odise/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/luoc/workspace/ODISE/demo/app.py", line 253, in inference
    model=models[model_name],
KeyError: None

A weird bug

Thanks for the nice work!
I was playing with some images using the Hugging Face demo, and I found that the model can detect the coffee maker in the scene when I use the LVIS categories. However, if I use just the single category "coffee maker,coffee machine", the model fails to detect the coffee maker in the image. Do you know what the problem might be? BTW, I can provide the image if you want.

Should the input image be in RGB or BGR?

Thanks for the excellent open-source work.
The run_on_image function says the input image should be in BGR order, but in the demo code the input image is in RGB mode. So I'm unsure which mode yields better results.

Additionally, I found that the ODISE (label) model doesn't recognize "poles". What could be the reason? Is the prompt "poles" incorrect?
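
For reference, a minimal conversion sketch (assuming OpenCV-style loading; which ordering ODISE actually expects is exactly the open question here):

import cv2

bgr = cv2.imread("demo/examples/coco.jpg")  # cv2.imread returns BGR
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)  # flip channel order for RGB pipelines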

The performance obtained is not ideal

Hello, thank you for your excellent work. Can you provide the detailed environment configuration for running your code? The results I achieved locally differ significantly from the expected ones you report. [attached screenshot]

Error while installing ODISE

I am running into many errors while installing ODISE. Mainly with compiling Mask2Former.

Errors include:

 fatal error: 'crypt.h' file not found
 fatal error: 'cusparse.h' file not found

Here is my workaround (or at least an attempt).

Install Mask2Former from its repo (https://github.com/facebookresearch/Mask2Former/blob/main/INSTALL.md) -- this is the main issue. However, make sure you use Python 3.9.

Once you are able to install Detectron2 and Mask2Former, you should be set for ODISE.

I had to prepend CUDA_HOME="/usr/local/cuda-11.3" to pip install -e . to install detectron2 and mask2former inside my conda environment.

512x512 configuration as in ablation studies

Hello, could you share the $512\times512$ configuration used in the ablation study? Is there any change other than the resolution?

I've just modified every 1024 to 512 in configs/common/data/coco_panoptic_semseg.py. The diff looks like this:

--- a/configs/common/data/coco_panoptic_semseg.py
+++ b/configs/common/data/coco_panoptic_semseg.py
@@ -49,10 +49,10 @@ dataloader.train = L(build_d2_train_dataloader)(
             L(T.ResizeScale)(
                 min_scale=0.1,
                 max_scale=2.0,
-                target_height=1024,
-                target_width=1024,
+                target_height=512,
+                target_width=512,
             ),
-            L(T.FixedSizeCrop)(crop_size=(1024, 1024)),
+            L(T.FixedSizeCrop)(crop_size=(512, 512)),
         ],
         image_format="RGB",
     ),
@@ -68,7 +68,7 @@ dataloader.test = L(build_d2_test_dataloader)(
     mapper=L(DatasetMapper)(
         is_train=False,
         augmentations=[
-            L(T.ResizeShortestEdge)(short_edge_length=1024, sample_style="choice", max_size=2560),
+            L(T.ResizeShortestEdge)(short_edge_length=512, sample_style="choice", max_size=1280),
diff --git a/configs/common/models/odise_with_caption.py b/configs/common/models/odise_with_caption.py
index e2862cb..03a2bf8 100644
--- a/configs/common/models/odise_with_caption.py
+++ b/configs/common/models/odise_with_caption.py
@@ -25,7 +25,7 @@ model.backbone = L(FeatureExtractorBackbone)(
     ),
     out_features=["s2", "s3", "s4", "s5"],
     use_checkpoint=True,
-    slide_training=True,
+    slide_training=False,

I suppose $512\times512$ does not require sliding windows, so I turned slide_training off as well. I wonder whether these changes are consistent with your configuration.

out of memory

CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 23.69 GiB total capacity; 21.31 GiB already allocated; 12.06 MiB free; 21.46 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting

I am working on a GPU with 24 GB, and I run out of memory even though I am using 512x512 crops with batch size 1.

Is this a memory-leak problem? I am also invoking the garbage collector, but that only delays the 'out of memory' error.
Help is appreciated :)

Conda error while downloading the specified PyTorch CUDA versions

When I try to install these versions with conda install pytorch=1.13.1 torchvision=0.14.1 pytorch-cuda=11.6 -c pytorch -c nvidia, I get the following error:

**"Downloading and Extracting Packages
CondaError: Downloaded bytes did not match Content-Length
url: https://conda.anaconda.org/nvidia/linux-64/libcufft-dev-10.7.1.112-ha5ce4c0_0.tar.bz2
target_path: /home/aub/anaconda3/pkgs/libcufft-dev-10.7.1.112-ha5ce4c0_0.tar.bz2
Content-Length: 206803679
downloaded bytes: 102857120

CancelledError()
CancelledError()
CancelledError()
CancelledError() "**

I have tried updating conda, but it did not work.

Error in the demo

Hello, great job! It looks like your demo shows an error during inference. Do you have any plan to fix it? Looking forward to playing with it :D

Some questions about the code

Thank you for your outstanding work.

I have thoroughly reviewed the paper and the code. Most of it is clear and understandable. However, I find the following sections rather perplexing: self.alpha_cond and self.alpha_cond_time_embed

self.alpha_cond = nn.Parameter(torch.zeros_like(self.ldm_extractor.ldm.uncond_inputs))
self.alpha_cond_time_embed = nn.Parameter(torch.zeros(self.ldm_extractor.ldm.unet.time_embed[-1].out_features))

It appears that self.alpha_cond and self.alpha_cond_time_embed are used to interact with prefixes (as referenced here), which are generated by the Implicit Captioner. Subsequently, the results of this interaction are fed into the Latent Diffusion Model.

I'm curious about the necessity of the following operation (as mentioned here):

batched_inputs["cond_inputs"] = (self.ldm_extractor.ldm.uncond_inputs + torch.tanh(self.alpha_cond) * prefix_embed).

It seems that we could directly feed prefix_embed into the Latent Diffusion Model. I would like to understand the purpose and rationale behind introducing self.alpha_cond and self.alpha_cond_time_embed. Has any previous work employed such an operation?

I eagerly anticipate your response. Thank you very much.
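
For context, a minimal sketch of the gating pattern in question, with hypothetical tensor shapes (this only restates the quoted lines; it is not the authors' answer): because the gate is zero-initialized, tanh(alpha_cond) is 0 at the start of training, so cond_inputs initially equals the frozen model's uncond_inputs, and the image-conditioned prefix is blended in only as the gate is learned.

import torch
import torch.nn as nn

# hypothetical shapes: (batch, tokens, channels) as in Stable Diffusion text conditioning
uncond_inputs = torch.randn(1, 77, 768)                     # frozen unconditional embedding
prefix_embed = torch.randn(1, 77, 768)                      # output of the implicit captioner

alpha_cond = nn.Parameter(torch.zeros_like(uncond_inputs))  # zero-initialized gate

# tanh(0) == 0, so at initialization cond_inputs equals uncond_inputs exactly
cond_inputs = uncond_inputs + torch.tanh(alpha_cond) * prefix_embed
assert torch.equal(cond_inputs, uncond_inputs)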

Can't reduce the batch size

My setup has 8 Titan X GPUs. When I tried to set --ref 32, it gives this error:

/var/spool/slurm/slurmd/job86812/slurm_script: line 50: $benchmarch_logs: ambiguous redirect
Traceback (most recent call last):
  File "/home/mu480317/ODISE/./tools/train_net.py", line 392, in <module>
    launch(
  File "/home/mu480317/.conda/envs/ODISE/lib/python3.9/site-packages/detectron2/engine/launch.py", line 67, in launch
    mp.spawn(
  File "/home/mu480317/.conda/envs/ODISE/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/mu480317/.conda/envs/ODISE/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/home/mu480317/.conda/envs/ODISE/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 5 terminated with the following error:
Traceback (most recent call last):
  File "/home/mu480317/.conda/envs/ODISE/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/home/mu480317/.conda/envs/ODISE/lib/python3.9/site-packages/detectron2/engine/launch.py", line 126, in _distributed_worker
    main_func(*args)
  File "/home/mu480317/ODISE/tools/train_net.py", line 319, in main
    cfg = auto_scale_workers(cfg, comm.get_world_size())
  File "/home/mu480317/ODISE/odise/config/utils.py", line 65, in auto_scale_workers
    assert cfg.dataloader.train.total_batch_size % old_world_size == 0, (
AssertionError: Invalid reference_world_size in config! 8 % 32 != 0

When --ref 8 is used instead, the GPU memory overflows.

Please help me solve this. Thank you.
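
For context, a paraphrase of the divisibility check that fires here (a sketch of the quoted assertion from odise/config/utils.py, not the full auto_scale_workers function): the total batch size must be a multiple of the reference world size, which is why a --ref value larger than the configured batch size cannot work.

def check_reference_world_size(total_batch_size: int, reference_world_size: int) -> None:
    # the config's total batch size is defined relative to reference_world_size GPUs,
    # so it must divide evenly before it can be rescaled; in the report above the two
    # values are 8 and 32, hence "8 % 32 != 0"
    assert total_batch_size % reference_world_size == 0, (
        f"Invalid reference_world_size in config! "
        f"{total_batch_size} % {reference_world_size} != 0"
    )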

Setup issue with mask2former

  running build_ext
  building 'MultiScaleDeformableAttention' extension
  Emitting ninja build file //ODISE/third_party/Mask2Former/build/temp.linux-x86_64-cpython-39/build.ninja...
  error: [Errno 2] No such file or directory: '//ODISE/third_party/Mask2Former/build/temp.linux-x86_64-cpython-39/build.ninja'
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> mask2former

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Is this error related to the ninja build?

Detail about 'background' class

Hi, thanks for your great work.

I have a question about the classifier in this paper: I want to know whether you used a 'background' class in $C_{train}$.

"We encode the names of all the categories in $C_{train}$ with the frozen text encoder and define the set of embeddings of all the training categories' names as:" [Equation 4 from the paper]

If you used a background class, is it learnable or fixed?

System RAM crashes while loading model in Google Colab

Thanks for the great Colab!

I have a problem.

  1. System RAM out of memory
    When executing the code below, it overflows the system RAM and crashes.
    I installed xformers, but that didn't prevent it. Is there any solution?

model = instantiate_odise(cfg.model)

Minimum GPU requirements

I get a CUDA out of memory error when I run python demo/demo.py --input demo/examples/coco.jpg --output demo/coco_pred.jpg --vocab "black pickup truck, pickup truck; blue sky, sky" on an RTX 3060 GPU with 12 GB of VRAM.

The last lines of the error are as follows:

output_features[k] = torch.zeros(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 176.00 MiB (GPU 0; 11.73 GiB total capacity; 8.91 GiB already allocated; 136.75 MiB free; 9.09 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

What are the minimum requirements for running the inference code? Is there a way to prevent these errors on less powerful systems? Is it possible to perform inference on a CPU?

Thanks!
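
Not an answer to the minimum-requirements question, but following the hint in the error message itself, one generic PyTorch knob to try first (an assumption, not an ODISE-specific fix):

import os

# must be set before CUDA is initialized, i.e. before importing the model code;
# equivalently, from the shell:
#   PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 python demo/demo.py ...
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"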

Installation pulls in a too-new version of numpy

When installing following the instructions in the README, we end up with numpy version 1.25.2.

This breaks the detectron2 visualizer, which throws the error: module 'numpy' has no attribute 'bool'.

Therefore, I needed to downgrade numpy to v1.23.*: conda install numpy==1.23.*

Then it works as expected

Installation question about detectron2

Running pip install -e . returns:

ERROR: Could not build wheels for detectron2, which is required to install pyproject.toml-based projects

Details:
39\detectron2\model_zoo\configs\new_baselines
copying detectron2\model_zoo\configs\new_baselines\mask_rcnn_R_50_FPN_50ep_LSJ.py -> build\lib.win-amd64-cpython-39\detectron2\model_zoo\configs\new_baselines
running build_ext
D:\anaconda\envs\nerf\lib\site-packages\torch\utils\cpp_extension.py:358: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
warnings.warn(f'Error checking compiler version for {compiler}: {error}')
building 'detectron2._C' extension
creating C:\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\build\temp.win-amd64-cpython-39
creating C:\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\build\temp.win-amd64-cpython-39\Release
creating C:\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\build\temp.win-amd64-cpython-39\Release\Users
creating C:\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\build\temp.win-amd64-cpython-39\Release\Users\PaXini_035
creating C:\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\build\temp.win-amd64-cpython-39\Release\Users\PaXini_035\AppData
creating C:\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\build\temp.win-amd64-cpython-39\Release\Users\PaXini_035\AppData\Local
creating C:\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\build\temp.win-amd64-cpython-39\Release\Users\PaXini_035\AppData\Local\Temp
creating C:\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\build\temp.win-amd64-cpython-39\Release\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z
creating C:\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\build\temp.win-amd64-cpython-39\Release\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092
creating C:\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\build\temp.win-amd64-cpython-39\Release\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\detectron2
error: could not create 'C:\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\build\temp.win-amd64-cpython-39\Release\Users\PaXini_035\AppData\Local\Temp\pip-install-a7zcsi7z\detectron2_c46d9e951c9e4544ac7db943756ba092\detectron2': The filename or extension is too long.
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for detectron2
Running setup.py clean for detectron2
Building wheel for lvis (setup.py) ... done
Created wheel for lvis: filename=lvis-0.5.3-py3-none-any.whl size=14020 sha256=d9272abdad25f5a6bfe26b3f5bf00c215e352eb030e512cdfc85a1dc3100997e
Stored in directory: C:\Users\PaXini_035\AppData\Local\Temp\pip-ephem-wheel-cache-polfx76g\wheels\56\46\42\dc63fcf42b15c084a2d44b6d6854d3dd27d0f3886363ce582b
Building wheel for panopticapi (setup.py) ... done
Created wheel for panopticapi: filename=panopticapi-0.1-py3-none-any.whl size=9302 sha256=17b9b66051da4a373f6fceff0b62c995b4a3173b8ba6d5a8560729a47abed543
Stored in directory: C:\Users\PaXini_035\AppData\Local\Temp\pip-ephem-wheel-cache-polfx76g\wheels\52\9a\3e\b664fb2d7b0016a15b505840f9d97ece85bbc203b74debcde0
Building wheel for pathtools (setup.py) ... done
Created wheel for pathtools: filename=pathtools-0.1.2-py3-none-any.whl size=8801 sha256=bd9d445360da0cdea47b35a8206cac311954640517ba449e1678f28c5e10878b
Stored in directory: c:\users\paxini_035\appdata\local\pip\cache\wheels\ac\67\0c\7406f4ff2becf8690a173e4ad09fad416c31dd5ddcb23b7f9d
Building wheel for future (setup.py) ... done
Created wheel for future: filename=future-0.18.3-py3-none-any.whl size=492055 sha256=d9023b0844c47de4abc7637c72575b36e901f949ed769d05226bd5478252fbdc
Stored in directory: c:\users\paxini_035\appdata\local\pip\cache\wheels\56\e1\4e\6ceef740e8a6cd23736ece789be212141ec1a451067edcb87f
Successfully built diffdist antlr4-python3-runtime mask2former test-tube lvis panopticapi pathtools future
Failed to build detectron2
ERROR: Could not build wheels for detectron2, which is required to install pyproject.toml-based projects

RuntimeError: expected scalar type Half but found Float

Thanks for your great work!

When I run tools/train_net.py with 2 V100 GPUs, I encounter the following error:

File "/mnt/cap/caijh/app/src/detectron2/detectron2/engine/train_loop.py", line 155, in train
    self.run_step()
  File "/mnt/workspace/code/ODISE/odise/engine/train_loop.py", line 297, in run_step
    grad_norm = self.grad_scaler(
  File "/mnt/workspace/code/ODISE/odise/engine/train_loop.py", line 207, in __call__
    self._scaler.scale(loss).backward(create_graph=create_graph)
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/mnt/workspace/code/ODISE/third_party/stable-diffusion/ldm/modules/diffusionmodules/util.py", line 138, in backward
    output_tensors = ctx.run_function(*shallow_copies)
  File "/mnt/workspace/code/ODISE/third_party/stable-diffusion/ldm/modules/attention.py", line 212, in _forward
    x = self.attn1(self.norm1(x)) + x
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/nn/modules/normalization.py", line 190, in forward
    return F.layer_norm(
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/nn/functional.py", line 2515, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Half but found Float

The arguments are:

./tools/train_net.py --config-file configs/Panoptic/odise_label_coco_50e.py --num-gpus 2 --amp

I'd appreciate any ideas for solving this issue, thank you.
