shi-labs / oneformer
OneFormer: One Transformer to Rule Universal Image Segmentation, arxiv 2022 / CVPR 2023
Home Page: https://praeclarumjj3.github.io/oneformer
License: MIT License
Excellent work! I have a question about loss_contrastive. With distributed training, the loss_contrastive function collects text and image features from the whole batch. I think the following situation can arise: for a semantic segmentation task, two images A and B have the same classes, which means that they have exactly the same
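To make the concern concrete, here is a minimal sketch. It is not the repository's loss_contrastive: it assumes one already-normalized feature vector per image and per text entry, a hypothetical temperature of 0.07, and the usual InfoNCE form in which entry i of the batch is the only positive for entry i.

```python
import math

def info_nce(image_feats, text_feats, temperature=0.07):
    """Image-to-text InfoNCE; row i's only positive is column i (assumption)."""
    losses = []
    for i, img in enumerate(image_feats):
        # Dot-product similarity of image i with every text entry in the batch.
        logits = [sum(a * b for a, b in zip(img, txt)) / temperature
                  for txt in text_feats]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Two images with different class lists: the texts are distinguishable.
distinct = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])

# Images A and B share the same classes, so their text features coincide.
shared = info_nce([[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]])
```

Under these assumptions the loss for the shared case cannot drop below log 2, because the text of image B is counted as a negative for image A even though it is identical to the positive.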
Thank you for sharing good resources.
I am trying to do binary Semantic Segmentation with only one class. The data I have currently are images and masks, and the masks are labeled with 0 and 1 (background 0, foreground 1).
I've looked at the code provided by OneFormer on GitHub, but all the examples are for instance (panoptic) segmentation, and I couldn't find any example of semantic segmentation that uses just images and masks, without the JSON file used for instances. That is why I'm asking this question.
(I've tried training semantic segmentation with the ade20k dataset as an example, but even then, instance annotations were essential.)
Is it possible to do semantic segmentation without a panoptic (or instance) annotations file? If so, what resources should I refer to? It's a bit complicated, so it's a bit difficult to understand.
Thank you.
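As a starting point for the image-plus-mask setup described above, the sketch below builds dataset records with the "file_name" and "sem_seg_file_name" keys that detectron2 uses for semantic segmentation datasets. The function name, the folder names, and the file extensions are assumptions for illustration. Whether OneFormer's unified dataset mapper accepts such records without panoptic annotations is not confirmed in this thread.

```python
import os
import tempfile

def build_sem_seg_dicts(image_dir, mask_dir, image_ext=".jpg", mask_ext=".png"):
    """Pair each image with the mask that shares its file stem."""
    records = []
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() != image_ext:
            continue
        mask_path = os.path.join(mask_dir, stem + mask_ext)
        if not os.path.isfile(mask_path):
            continue  # skip images that have no mask
        records.append({
            "file_name": os.path.join(image_dir, name),
            "sem_seg_file_name": mask_path,
        })
    return records

# Demo on a throwaway folder: two images, only one of them has a mask.
with tempfile.TemporaryDirectory() as root:
    image_dir = os.path.join(root, "images")
    mask_dir = os.path.join(root, "masks")
    os.makedirs(image_dir)
    os.makedirs(mask_dir)
    for name in ("a.jpg", "b.jpg"):
        open(os.path.join(image_dir, name), "wb").close()
    open(os.path.join(mask_dir, "a.png"), "wb").close()
    demo_records = build_sem_seg_dicts(image_dir, mask_dir)
    demo_names = [os.path.basename(r["file_name"]) for r in demo_records]
```

A function like this is the kind of callable that would be registered with detectron2's DatasetCatalog. The 0/1 mask values would then be read as the class ids.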
When doing inference on my own data I get this warning: Attempting to copy inputs of <function sem_seg_postprocess at 0x7fc057578a70> to CPU due to CUDA OOM
Does this in the end affect the performance and is there a way to fix this?
I am using one RTX 2070 for inference.
Hi
Firstly thank you for releasing this amazing work. Not only is the model amazing but the code quality is excellent. Very easy to follow.
I have a question regarding GPU memory requirements for training. In the readme there's a bit of conflicting information.
We train all our models using 8 A6000 (48 GB each) GPUs. We use 8 A100 (80 GB each) for training Swin-L† OneFormer and DiNAT-L† OneFormer on COCO and all
Is it 8xA6000 (384GB) or 8xA100 (640GB)?
Additionally would it be possible to achieve good results with less, say 2xA6000 (96GB), with it just taking longer?
Many Thanks
Tom
Thanks for your great work!! But I found something that confused me.
To make things easier, let's first look at the logic of the Transformer in the code.
self.class_transformer is an instance of Transformer, and its forward should be
src will be fed into the transformer_encoder layers (instances of TransformerEncoderLayer):
OneFormer/oneformer/modeling/transformer_decoder/transformer.py
Lines 99 to 104 in 7611899
which will be further fed into self.with_pos_embed. with_pos_embed is:
OneFormer/oneformer/modeling/transformer_decoder/transformer.py
Lines 186 to 187 in 7611899
Here, in my understanding, tensor denotes the input features of the transformer encoder, and pos denotes the positional embeddings.
However, it seems that the tensor feats is actually the positional embeddings but is treated as the input features, while the tensor self.class_input_proj(mask_features) is actually the input features but is treated as the positional embeddings.
Am I misunderstanding here?
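The referenced lines are not reproduced in this thread, so the following is only a sketch under two assumptions: that with_pos_embed follows the DETR-style form (return the tensor unchanged if pos is None, otherwise tensor + pos), and that the encoder layer uses src + pos for queries and keys but src alone for values. Plain Python lists stand in for tensors.

```python
def with_pos_embed(tensor, pos):
    """Add positional embeddings element-wise, or pass the tensor through."""
    if pos is None:
        return tensor
    return [t + p for t, p in zip(tensor, pos)]

def attention_inputs(src, pos):
    """Assumed DETR-style layer: q and k see src + pos, the value sees src only."""
    q = k = with_pos_embed(src, pos)
    value = src
    return q, k, value

features = [1.0, 2.0, 3.0]
positions = [0.1, 0.2, 0.3]

q_a, k_a, v_a = attention_inputs(features, positions)
q_b, k_b, v_b = attention_inputs(positions, features)  # arguments swapped
```

Because addition is commutative, swapping the two arguments leaves the queries and keys unchanged. Under these assumptions the only difference is which tensor ends up as the attention value, and that is where the question above would matter.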
I have read that similar issue, but the cuda version on my machine is 11.6, and my PyTorch is installed with CUDA 11.3, so it doesn’t feel like it’s caused by the same bug. At the same time, I can run another repo normally in the same environment.
Hello 👋🏻! As requested in #9, I created a dedicated issue. I hope you will get email notifications now.
I started to build a Google Colab notebook, but I ran into some problems:
ImportError: /usr/local/lib/python3.7/dist-packages/MultiScaleDeformableAttention-1.0-py3.7-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNK2at10TensorBase8data_ptrIdEEPT_v
Here is my current version of the notebook: https://colab.research.google.com/drive/1ugQqod5zZLTh9bibOEaflI6QHwkcAM41#scrollTo=Rerjwwk_ZEY_
Do you have any idea how to solve that?
Thanks for your incredible work team. Getting this error on inference:
Weight format of OneFormerHead have changed! Please upgrade your models. Applying automatic conversion now ... WARNING [11/26 13:32:01 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint:
Using this model for config & checkpoint: OneFormer | DiNAT-L† | 896×896
I am running inference on cpu with:
cfg.MODEL.DEVICE = 'cpu'
Any idea where I'm going wrong would be greatly appreciated. Thanks !
Hi team 👋!
First of all, great project! I'm super excited to see that you used Detectron2 as the framework of choice.
I'm trying to train my own model using a custom dataset in COCO format. For now I have 2 questions:
coco/
  annotations/
    instances_{train,val}2017.json
    panoptic_{train,val}2017.json
    caption_{train,val}2017.json
    # evaluate on instance labels derived from panoptic annotations
    panoptic2instances_val2017.json
  {train,val}2017/
    # image files that are mentioned in the corresponding json
  panoptic_{train,val}2017/  # png annotations
  panoptic_semseg_{train,val}2017/  # generated by the script mentioned below
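When adapting a custom dataset to this layout, a quick check of which entries are still missing can save a failed training run. The sketch below expands the {train,val} patterns from the tree above. The helper names are made up for illustration, and the demo runs on a temporary folder rather than a real dataset.

```python
from pathlib import Path
import tempfile

SPLITS = ("train", "val")

def expected_entries():
    """Relative paths from the layout above, with {train,val} expanded."""
    entries = ["annotations/panoptic2instances_val2017.json"]
    for split in SPLITS:
        entries += [
            f"annotations/instances_{split}2017.json",
            f"annotations/panoptic_{split}2017.json",
            f"annotations/caption_{split}2017.json",
            f"{split}2017",
            f"panoptic_{split}2017",
            f"panoptic_semseg_{split}2017",
        ]
    return entries

def missing_entries(coco_root):
    """Return the expected files and folders that do not exist under coco_root."""
    root = Path(coco_root)
    return [rel for rel in expected_entries() if not (root / rel).exists()]

# Demo on a throwaway directory that only contains the two image folders.
with tempfile.TemporaryDirectory() as tmp:
    for split in SPLITS:
        (Path(tmp) / f"{split}2017").mkdir()
    demo_missing = missing_entries(tmp)
```

Calling missing_entries on the real coco/ folder lists exactly what still has to be generated or copied.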
Have you tested how to compute the linear sum assignment, or found an implementation in PyTorch?
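For reference, this is what a linear sum assignment computes: the one-to-one pairing of rows and columns with the smallest total cost. The brute-force sketch below is only an illustration for tiny matrices, since it tries every permutation. It is not a PyTorch implementation, and the function name is made up. On CPU, scipy.optimize.linear_sum_assignment solves the same problem efficiently.

```python
from itertools import permutations

def linear_sum_assignment_bruteforce(cost):
    """Return (row_indices, col_indices, total) of the cheapest assignment."""
    n_rows, n_cols = len(cost), len(cost[0])
    assert n_rows <= n_cols, "sketch assumes no more rows than columns"
    best_cols, best_total = None, float("inf")
    # Try every way of giving each row its own distinct column.
    for cols in permutations(range(n_cols), n_rows):
        total = sum(cost[r][c] for r, c in enumerate(cols))
        if total < best_total:
            best_cols, best_total = cols, total
    return list(range(n_rows)), list(best_cols), best_total

cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
rows, cols, total = linear_sum_assignment_bruteforce(cost)
```

For this matrix the cheapest pairing is row 0 with column 1, row 1 with column 0, and row 2 with column 2, for a total cost of 5.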
I am stuck at this error; any help would be great! @praeclarumjj3
[04/29 21:09:11 oneformer.data.dataset_mappers.oneformer_unified_dataset_mapper]: [OneFormerUnifiedDatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=..., max_size=2048, sample_style='choice'), RandomCrop_CategoryAreaConstraint(crop_type='absolute', crop_size=[1024, 1024], single_category_max_area=1.0, ignored_category=255), <detectron2.projects.point_rend.color_augmentation.ColorAugSSDTransform object at 0x7fa072dcfe20>, RandomFlip()]
[04/29 21:09:11 d2.data.build]: Using training sampler TrainingSampler
[04/29 21:09:11 d2.data.common]: Serializing 6000 elements to byte tensors and concatenating them all ...
[04/29 21:09:11 d2.data.common]: Serialized dataset takes 7.08 MiB
[04/29 21:09:11 fvcore.common.checkpoint]: [Checkpointer] Loading from detectron2://ImageNetPretrained/torchvision/R-50.pkl ...
[04/29 21:09:11 fvcore.common.checkpoint]: Reading a file from 'torchvision'
ERROR [04/29 21:09:11 d2.checkpoint.c2_model_loading]: Ambiguity found for res5.0.conv1.norm.bias in checkpoint!It matches at least two keys in the model (roi_heads.res5.0.conv1.norm.bias and backbone.res5.0.conv1.norm.bias).
Traceback (most recent call last):
File "train_net.py", line 435, in <module>
launch(
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/engine/launch.py", line 82, in launch
main_func(*args)
File "train_net.py", line 424, in main
trainer.resume_or_load(resume=args.resume)
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 412, in resume_or_load
self.checkpointer.resume_or_load(self.cfg.MODEL.WEIGHTS, resume=resume)
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/fvcore/common/checkpoint.py", line 227, in resume_or_load
return self.load(path, checkpointables=[])
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/checkpoint/detection_checkpoint.py", line 52, in load
ret = super().load(path, *args, **kwargs)
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/fvcore/common/checkpoint.py", line 156, in load
incompatible = self._load_model(checkpoint)
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/checkpoint/detection_checkpoint.py", line 97, in _load_model
checkpoint["model"] = align_and_update_state_dicts(
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/checkpoint/c2_model_loading.py", line 287, in align_and_update_state_dicts
raise ValueError("Cannot match one checkpoint key to multiple keys in the model.")
ValueError: Cannot match one checkpoint key to multiple keys in the model.
I'm using a custom dataset for panoptic segmentation.
I wrote a register file and a config file, and used oneformer_unified_dataset_mapper and COCOPanopticEvaluator.
Thanks in advance !
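The error says that one checkpoint key fits two model keys. The sketch below uses simplified suffix matching, which is an assumption about how names are paired and not detectron2's actual code. It shows why a model that contains both backbone.res5 and roi_heads.res5 is ambiguous for a checkpoint that only stores res5.

```python
def matching_model_keys(checkpoint_key, model_keys):
    """Model keys that equal the checkpoint key or end with '.' + checkpoint_key."""
    return [
        key for key in model_keys
        if key == checkpoint_key or key.endswith("." + checkpoint_key)
    ]

# Key names taken from the error message above.
model_keys = [
    "backbone.res5.0.conv1.norm.bias",
    "roi_heads.res5.0.conv1.norm.bias",
]
matches = matching_model_keys("res5.0.conv1.norm.bias", model_keys)

# With a single owner of res5 the match is unique.
unique = matching_model_keys("res5.0.conv1.norm.bias", model_keys[:1])
```

The presence of roi_heads keys may indicate that the built model is an R-CNN style architecture rather than OneFormer, so checking MODEL.META_ARCHITECTURE in the custom config could be a useful first step. This is a guess from the log, not a confirmed diagnosis.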
Greetings,
I have followed your installation and dataset preparation steps exactly. The problem is that training reports that the images are not in the directory even though they are there. Any help would be great. Thanks in advance.
[03/20 15:44:27 oneformer.data.dataset_mappers.oneformer_unified_dataset_mapper]: [OneFormerUnifiedDatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=..., max_size=2560, sample_style='choice'), RandomCrop_CategoryAreaConstraint(crop_type='absolute', crop_size=[640, 640], single_category_max_area=1.0, ignored_category=255), <detectron2.projects.point_rend.color_augmentation.ColorAugSSDTransform object at 0x7f05844cbca0>, RandomFlip()]
Traceback (most recent call last):
File "train_net.py", line 435, in <module>
launch(
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/engine/launch.py", line 82, in launch
main_func(*args)
File "train_net.py", line 423, in main
trainer = Trainer(cfg)
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 378, in __init__
data_loader = self.build_train_loader(cfg)
File "train_net.py", line 162, in build_train_loader
return build_detection_train_loader(cfg, mapper=mapper)
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/config/config.py", line 207, in wrapped
explicit_args = _get_args_from_config(from_config, *args, **kwargs)
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/config/config.py", line 245, in _get_args_from_config
ret = from_config_func(*args, **kwargs)
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/data/build.py", line 337, in _train_loader_from_config
dataset = get_detection_dataset_dicts(
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/data/build.py", line 240, in get_detection_dataset_dicts
dataset_dicts = [DatasetCatalog.get(dataset_name) for dataset_name in names]
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/data/build.py", line 240, in <listcomp>
dataset_dicts = [DatasetCatalog.get(dataset_name) for dataset_name in names]
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/data/catalog.py", line 58, in get
return f()
File "/home/iit29/Desktop/OneFormer/oneformer/data/datasets/register_ade20k_panoptic.py", line 295, in <lambda>
lambda: load_ade20k_panoptic_json(
File "/home/iit29/Desktop/OneFormer/oneformer/data/datasets/register_ade20k_panoptic.py", line 267, in load_ade20k_panoptic_json
assert len(ret), f"No images found in {image_dir}!"
AssertionError: No images found in /home/iit29/Desktop/OneFormer/datasets/ADEChallengeData2016/ADEChallengeData2016/images/training!
wandb: Waiting for W&B process to finish... (failed 1). Press Control-C to abort syncing.
wandb:
Any help would be great, thanks in advance!
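A small diagnostic can narrow this down before rerunning training. The sketch below counts image files in a folder and flags directory names that appear twice in a row, as ADEChallengeData2016 does in the path from the assertion message. Both helper names and the extension list are assumptions for illustration. The check cannot tell whether the folder or the panoptic annotation JSON is the actual cause.

```python
import os
import tempfile
from pathlib import PurePosixPath

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def count_images(image_dir):
    """Number of image files in image_dir, or None if the folder does not exist."""
    if not os.path.isdir(image_dir):
        return None
    return sum(
        1 for name in os.listdir(image_dir)
        if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
    )

def repeated_components(path):
    """Directory names that occur twice in a row, a common unpacking slip."""
    parts = PurePosixPath(path).parts
    return [a for a, b in zip(parts, parts[1:]) if a == b]

# Path copied from the AssertionError above.
logged_dir = ("/home/iit29/Desktop/OneFormer/datasets/ADEChallengeData2016/"
              "ADEChallengeData2016/images/training")
doubled = repeated_components(logged_dir)

# Demo of the counter on a throwaway folder holding one image and one text file.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "ADE_train_00000001.jpg"), "wb").close()
    open(os.path.join(tmp, "notes.txt"), "wb").close()
    demo_count = count_images(tmp)
```

If count_images returns None or 0 for the logged path, the archive was probably unpacked one level too deep or too shallow relative to what the registration code expects.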
Hello, can this structure be used for real-time segmentation or video segmentation?
Requesting to make pre-trained model weights public.
Thanks in advance!
Hello
How are you?
Thanks for contributing to this project.
I want to get one segmentation map containing all the class labels rather than a binary mask for each instance.
How can I get it?
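One way to do this, sketched here on plain nested lists rather than on the model's actual output objects, is to paint each instance mask into a single map using its class id, letting higher-scoring instances overwrite lower-scoring ones where they overlap. The function name, the background id of 0, and the overlap rule are assumptions, and at least one mask is expected.

```python
def masks_to_label_map(masks, class_ids, scores, background=0):
    """Merge per-instance binary masks into one H x W map of class ids."""
    height, width = len(masks[0]), len(masks[0][0])
    label_map = [[background] * width for _ in range(height)]
    # Paint low-score instances first so high-score ones win on overlaps.
    order = sorted(range(len(masks)), key=lambda i: scores[i])
    for i in order:
        for y in range(height):
            for x in range(width):
                if masks[i][y][x]:
                    label_map[y][x] = class_ids[i]
    return label_map

masks = [
    [[1, 1, 0],
     [0, 0, 0]],
    [[0, 1, 1],
     [0, 0, 1]],
]
label_map = masks_to_label_map(masks, class_ids=[3, 7], scores=[0.9, 0.5])
```

If the output already provides a per-class score array of shape (classes, height, width), taking the argmax over the class axis gives the same kind of map directly.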
I know these are very basic queries, thanks in advance!
1. After writing a file similar to https://github.com/SHI-Labs/OneFormer/blob/main/oneformer/data/datasets/register_ade20k_panoptic.py to register custom data, I should run this specific Python file separately to register the dataset. Is that right?
2. I'm using OneFormerUnifiedDatasetMapper: https://github.com/SHI-Labs/OneFormer/blob/main/oneformer/data/dataset_mappers/oneformer_unified_dataset_mapper.py
3. I'm using a config file similar to https://github.com/SHI-Labs/OneFormer/blob/5e04c9aaffd9bc73020d2238757f62346fe778c0/configs/ade20k/Base-ADE20K-UnifiedSegmentation.yaml
4. I have a doubt about the evaluator: can I use any of the provided evaluators for my dataset? If not, kindly guide me through it.
Setting up the environment for the first time can be a bit tricky since this project depends on specific CUDA and GCC versions. Maybe add them to the INSTALL.md file. I wrote down the versions that worked for me.
We use an environment with the following specifications, packages and dependencies:
Hello There,
Thanks for sharing the amazing work!
I have been experimenting with the OneFormer repo for the past few days, and I am able to run training (fine-tuning) for instance segmentation on a custom dataset on 1 GPU (Tesla T4) by reducing the image size to 512.
cfg.INPUT.IMAGE_SIZE = 512
cfg.SOLVER.IMS_PER_BATCH = 1
(Even 16 works)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = <Number Of Classes In My Dataset>
cfg.MODEL.RETINANET.NUM_CLASSES = <Number Of Classes In My Dataset>
cfg.MODEL.SEM_SEG_HEAD.NUM_CLASSES = <Number Of Classes In My Dataset>
cfg.SOLVER.MAX_ITER = 40000
with default Base Learning Rate of 0.0001
COCO DINAT Configuration file : oneformer_dinat_large_bs16_100ep.yaml
MODEL WEIGHTS : 150_16_dinat_l_oneformer_coco_100ep.pth
My dataset has approx 10,000 images in the train set.
I found the Training Settings you have used from the Appendix Section of the Paper. So, a batch size of 16 was used for around 90K or more iterations depending on the datasets.
For example, at a batch size of 1, the starting total loss was 87, which dropped to around 13 in 8000 iterations. After that, the training loss oscillates between 9 and 28.
Thanks for the help !
The OneFormer Colab linked from the GitHub page (the "Open in Colab" option) gives this error: "No module named 'detectron2'". Kindly have a look, please. Thank you.
Using my custom dataset in COCO format for instance segmentation training.
Changed CFG to
cfg.MODEL.TEST.TASK = "instance"
cfg.INPUT.TASK_PROB.SEMANTIC = 0
cfg.INPUT.TASK_PROB.INSTANCE = 1
Still getting an error UnboundLocalError: local variable 'pan_seg_gt' referenced before assignment
From #5 and reading docs I understand I have to somehow prepare my dataset for instance segmentation training.
Hello There,
Firstly, thanks for sharing the amazing work!
I am using OneFormer for an instance segmentation task on a custom dataset.
I read #17 and used the "InstanceCOCOCustomNewBaselineDatasetMapper" from instance_coco_custom_dataset_mapper.py, and I am able to train the model on my dataset.
I was trying to figure out whether I can get inference results on the validation dataset periodically, say every 100 iterations.
I modified the cfg as below -
cfg.DATASETS.TEST = "CustomInstSegVAL"
cfg.TEST.EVAL_PERIOD = 100
I created an overridden version of the build_test_loader function using the InstanceCOCOCustomNewBaselineDatasetMapper as below -
def build_test_loader(cls, cfg, dataset_name):
    val_mapper = InstanceCOCOCustomNewBaselineDatasetMapper(cfg, is_train = True)
    return build_detection_test_loader(DatasetCatalog.get('CustomInstSegVal'), mapper=val_mapper)
NOTE: If I set is_train = False in val_mapper = InstanceCOCOCustomNewBaselineDatasetMapper(cfg, is_train = True), it throws an AssertionError from the build_transform_gen function of InstanceCOCOCustomNewBaselineDatasetMapper.
If I set is_train = True in val_mapper = InstanceCOCOCustomNewBaselineDatasetMapper(cfg, is_train = True), it throws an AssertionError from pycocotools/coco.py stating "AssertionError: Results do not correspond to current coco set".
Can you please guide me on how to use the validation dataset to test the model's performance during training?
Thanks !
cd oneformer/modeling/pixel_decoder/ops
sh make.sh
got
/OneFormer/oneformer/modeling/pixel_decoder/ops/setup.py", line 52, in get_extensions
raise NotImplementedError('CUDA_HOME is None. Please set environment variable CUDA_HOME.')
NotImplementedError: CUDA_HOME is None. Please set environment variable CUDA_HOME.
How can I run this on CPU?
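The build script stops because CUDA_HOME is unset. Before deciding between installing a CUDA toolkit and looking for a CPU path, it can help to check whether nvcc is already on the machine. The sketch below is a hypothetical helper that derives a candidate CUDA_HOME from the location of nvcc. It does not answer whether the ops can be built or replaced for CPU-only use, which this thread leaves open.

```python
import os
import shutil

def guess_cuda_home(env=None, which=shutil.which):
    """Return CUDA_HOME if set, else the toolkit root inferred from nvcc, else None."""
    env = os.environ if env is None else env
    if env.get("CUDA_HOME"):
        return env["CUDA_HOME"]
    nvcc = which("nvcc")
    if nvcc is None:
        return None  # no CUDA compiler found on PATH
    # nvcc is expected at <CUDA_HOME>/bin/nvcc, so go two levels up.
    return os.path.dirname(os.path.dirname(nvcc))

cuda_home = guess_cuda_home()
```

If cuda_home comes back as a path, exporting it as CUDA_HOME before running make.sh is the natural next step. If it is None, no CUDA compiler is visible to the build.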
Hello
How are you?
Thanks for contributing to this project.
Could you guide me on how to get a segmentation label map from the model output?
Hello
How are you?
Thanks for contributing to this project.
I found that the demo script does NOT output bounding boxes for each instance in panoptic/instance segmentation.
Could you guide me on how to get the bounding boxes?
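Since the predictions come as masks, one option is to derive each box from the extent of its mask. The sketch below does this for a single binary mask given as nested lists and returns (x_min, y_min, x_max, y_max) with exclusive maxima. The function name and the box convention are choices made for this illustration. detectron2's BitMasks.get_bounding_boxes() serves a similar purpose on tensors.

```python
def mask_to_box(mask):
    """Tight XYXY box around the nonzero pixels, or None for an empty mask."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    if not ys:
        return None
    # The +1 makes the max side exclusive, so width = x_max - x_min.
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)

mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
box = mask_to_box(mask)
```

Applying this to every predicted instance mask yields one box per instance.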
Hi, I tried to clone the repo in your Google Colab sample and modified the code to include Mapillary. It seems to be failing with a meta dictionary error in get_config, where meta info seems to be missing, i.e. thingstocontiguous id
Hi! I plan to compare OneFormer's panoptic quality results with the results stated in the paper for the COCO dataset. Initially, I need to convert the OneFormer output to the COCO .json format and then use panopticapi to evaluate Panoptic Quality. I do not know if this is the correct approach; I tried it but could not succeed. Could you please tell me the accurate steps and proper links to perform this task? I am using OneFormer's Colab. Thank you in anticipation.
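For orientation, this is the quantity the evaluation reports for a single category, written as a sketch from the standard Panoptic Quality definition: predicted and ground-truth segments count as matched when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by TP + 0.5 FP + 0.5 FN. The function name and inputs are made up for illustration. The real evaluation does the matching itself and averages over categories.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Return (PQ, SQ, RQ) for one category from matched-segment IoUs."""
    true_positives = len(matched_ious)
    denominator = (true_positives
                   + 0.5 * num_false_positives
                   + 0.5 * num_false_negatives)
    if denominator == 0:
        return 0.0, 0.0, 0.0
    # SQ: mean IoU of matched segments. RQ: an F1-style detection score.
    sq = sum(matched_ious) / true_positives if true_positives else 0.0
    rq = true_positives / denominator
    return sq * rq, sq, rq

# Two matched segments, one unmatched prediction, one unmatched ground truth.
pq, sq, rq = panoptic_quality([0.8, 0.6], num_false_positives=1,
                              num_false_negatives=1)
```

PQ factors into segmentation quality times recognition quality, which is why the result tables list PQ, SQ and RQ side by side.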
Hi, thanks for your work.
When I use the demo on Hugging Face, it only shows an error message.
Could you please check that?
Hi, thanks for your great work. I did not find the caption-related annotations on the ADE20K website, could you point out how to get them? or how to generate them?
Thanks for this work. Could you please share the logs for the Swin-L backbone on ADE20K (640×640)? I tried and got numbers similar to #14, and I wonder what the issue is.
Specifically, you reported the following, so I would like to see the logs to understand the issue.
49.8, 35.9, 57.0
Also, it would be great if you could share the logs for the Swin-L backbone on the Cityscapes dataset.
P.S.: Now that this work has been accepted to CVPR, it is crucial to maintain reproducibility.
I cannot get PQ 48 on the ADE20K dataset with the Swin-L backbone; I only get PQ 46. How can I get the result you report in the paper?
Is there a guide/suggestion on how to retrain one class using a custom dataset and also rename the class? Assuming one of the classes trained using Swin-L is tree. Can I retrain the class with a dataset of almond trees and rename the class to almond?
Thank you. @honghuis @rbavery @SkalskiP @praeclarumjj3 @alihassanijr
Hi, I am trying to train OneFormer on a custom dataset and I was able to start the training. But I have a few questions regarding choosing the right settings. Currently I reused the ADE20K config file after editing the number of classes, iterations, and batch size.
Could you please provide detailed instructions on training for a custom dataset? I read the custom training page but couldn't understand it.
According to the paper, the queries Q are only conditioned on "the task is {task}", but {task} only has 3 possible values. So do the queries only have 3 possible values?
Can you please elaborate on the DATASETS section in Base-ADE20K-UnifiedSegmentation.yaml?
DATASETS:
  TRAIN: ("ade20k_panoptic_train",)
  TEST_PANOPTIC: ("ade20k_panoptic_val",)
  TEST_INSTANCE: ("ade20k_instance_val",)
  TEST_SEMANTIC: ("ade20k_sem_seg_val",)
They are ground truth files for train and val, right?
What if instance ground truths are not available?
Kindly clarify, thanks in advance!
@praeclarumjj3
If I just want semantic segmentation training and do not want instance segmentation training, what should I do?
I just prepared the dataset like this:
ADEChallengeData2016/
  images/
  annotations/ # semantic segmentation data
I don't want to download annotations_instance.tar, which contains the instance segmentation data.
What should I do next?
[03/21 22:37:19 d2.data.build]: Using training sampler TrainingSampler
[03/21 22:37:19 d2.data.common]: Serializing 20210 elements to byte tensors and concatenating them all ...
[03/21 22:37:19 d2.data.common]: Serialized dataset takes 18.42 MiB
[03/21 22:37:19 fvcore.common.checkpoint]: [Checkpointer] Loading from dinat_large_in22k_in1k_384_11x11.pkl ...
Traceback (most recent call last):
File "train_net.py", line 435, in <module>
launch(
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/engine/launch.py", line 82, in launch
main_func(*args)
File "train_net.py", line 424, in main
trainer.resume_or_load(resume=args.resume)
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 412, in resume_or_load
self.checkpointer.resume_or_load(self.cfg.MODEL.WEIGHTS, resume=resume)
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/fvcore/common/checkpoint.py", line 227, in resume_or_load
return self.load(path, checkpointables=[])
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/detectron2/checkpoint/detection_checkpoint.py", line 52, in load
ret = super().load(path, *args, **kwargs)
File "/home/iit29/anaconda3/envs/oneformer/lib/python3.8/site-packages/fvcore/common/checkpoint.py", line 153, in load
assert os.path.isfile(path), "Checkpoint {} not found!".format(path)
AssertionError: Checkpoint dinat_large_in22k_in1k_384_11x11.pkl not found!
Similar errors pop up when I try other backbones too, and I have no idea why. Any help would be great, thanks in advance!
Thanks for sharing the awesome work!
I have a minor question.
How many GPU hours does the model training process need on the COCO and ADE20K datasets?
Thanks for sharing your great job! I have several questions from reading the paper and code, and hope to discuss them.
About the contrastive loss: according to the paper, T_pad is a list of representations, one for each mask to be detected in the image. How is this relationship kept during training? I found that "i" in the pairs {qobj_i, xtxt_i} represents an index in the code, so it seems that q^text always matches the q^obj with the same index. But we are not supposed to know which object each q^obj represents before the DETR decoder runs. Did I misunderstand the paper?
Table 6 of the ablation study confuses me. It looks like the ablation is about some kind of prompt engineering (of course it's not). I still don't get why adding "a photo with a" can raise model performance. Does this paper use a pretrained text encoder? Do you have any new idea or explanation for this ablation?
Hi, first of all thank you for sharing your awesome work.
I'm trying to fine-tune the model for instance segmentation with a custom dataset that I have locally in COCO format. The issue that I'm having is that I don't know how exactly to convert the segmentation polygon masks to pixel_values and task_inputs that the model's forward function expects.
This is my data loader script:
import datasets
import os
from pycocotools.coco import COCO
from pathlib import Path

class COCODataset(datasets.GeneratorBasedBuilder):
    def _info(self):
        return datasets.DatasetInfo(
            description="COCO dataset",
            features=datasets.Features({
                # "pixel_values": ...
                # "task_inputs": ...
                "image": datasets.Image(),
                "annotations": datasets.Sequence({
                    "id": datasets.Value("int32"),
                    "image_id": datasets.Value("int32"),
                    "category_id": datasets.Value("int32"),
                    "area": datasets.Value("int32"),
                    "iscrowd": datasets.Value("int32"),
                    "bbox": datasets.Sequence(datasets.Value("float32")),
                    "attributes": {
                        "occluded": datasets.Value("bool"),
                    },
                    "segmentation": datasets.Sequence(datasets.Sequence(datasets.Value("float32"))),
                })
            }),
        )

    def _split_generators(self, dl_manager):
        instances_train_path = dl_manager.download(os.path.join(self.config.data_dir, "annotations/instances_train.json"))
        instances_val_path = dl_manager.download(os.path.join(self.config.data_dir, "annotations/instances_val.json"))
        return [
            datasets.SplitGenerator(name=datasets.Split.TRAIN, gen_kwargs={"images": instances_train_path}),
            datasets.SplitGenerator(name=datasets.Split.VALIDATION, gen_kwargs={"images": instances_val_path}),
        ]

    def _generate_examples(self, images):
        coco = COCO(images)
        for image_id in coco.imgs:
            image = coco.loadImgs(image_id)[0]
            annotations = coco.loadAnns(coco.getAnnIds(image_id))
            # Load the image content as bytes
            image_path = os.path.join(self.config.data_dir, "images", image["file_name"])
            image_content = Path(image_path).read_bytes()
            yield image_id, {
                "image": image_content,
                "annotations": annotations,
                # "pixel_values": ...,
                # "task_inputs": ...
            }
I know that I'm supposed to use OneFormerProcessor, but the examples provided are only for inference and don't specify how to process input masks. What exactly am I supposed to do in the _generate_examples method? Any tips are greatly appreciated!
Just for reference, here is my train script as well:
import numpy as np
import evaluate
from transformers import OneFormerForUniversalSegmentation, TrainingArguments, Trainer
import datasets
import os

script_dir = os.path.dirname(os.path.abspath(__file__))
data_dir = os.path.join(script_dir, "..", "data/datasets/archviz-600-v2-coco")
ds = datasets.load_dataset(os.path.join(script_dir, "dataset_loader.py"), data_dir=data_dir)
print("Length of train dataset:", len(ds['train']))
print("Length of validation dataset:", len(ds['validation']))
model = OneFormerForUniversalSegmentation.from_pretrained("shi-labs/oneformer_cityscapes_swin_large")
training_args = TrainingArguments(output_dir=os.path.join(script_dir, 'output'), evaluation_strategy="epoch")
metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=ds['train'],
    eval_dataset=ds['validation'],
    compute_metrics=compute_metrics,
)
trainer.train()
And this is the output:
Length of train dataset: 472
Length of validation dataset: 118
/usr/local/lib/python3.8/dist-packages/transformers/optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
0%|
| 0/177 [00:00<?, ?it/s]Traceback (most recent call last):
File "oneformer-hugging/train.py", line 32, in <module>
trainer.train()
File "/usr/local/lib/python3.8/dist-packages/transformers/trainer.py", line 1662, in train
return inner_training_loop(
File "/usr/local/lib/python3.8/dist-packages/transformers/trainer.py", line 1899, in _inner_training_loop
for step, inputs in enumerate(epoch_iterator):
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 635, in __next__
data = self._next_data()
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 679, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 56, in fetch
data = self.dataset.__getitems__(possibly_batched_index)
File "/usr/local/lib/python3.8/dist-packages/datasets/arrow_dataset.py", line 2782, in __getitems__
batch = self.__getitem__(keys)
File "/usr/local/lib/python3.8/dist-packages/datasets/arrow_dataset.py", line 2778, in __getitem__
return self._getitem(key)
File "/usr/local/lib/python3.8/dist-packages/datasets/arrow_dataset.py", line 2762, in _getitem
pa_subtable = query_table(self._data, key, indices=self._indices if self._indices is not None else None)
File "/usr/local/lib/python3.8/dist-packages/datasets/formatting/formatting.py", line 578, in query_table
_check_valid_index_key(key, size)
File "/usr/local/lib/python3.8/dist-packages/datasets/formatting/formatting.py", line 531, in _check_valid_index_key
_check_valid_index_key(int(max(key)), size=size)
File "/usr/local/lib/python3.8/dist-packages/datasets/formatting/formatting.py", line 521, in _check_valid_index_key
raise IndexError(f"Invalid key: {key} is out of bounds for size {size}")
IndexError: Invalid key: 375 is out of bounds for size 0
0%|
Hi, I want to use a pretrained model for new training. How should I do that? I want to train on dataset B, using model_0159999.pth, which was produced by training on dataset A, as the pretrained model.
I've been trying to build a docker image by following the steps from INSTALL.md, but I'm stuck on this:
# Setup MSDeformAttn
cd oneformer/modeling/pixel_decoder/ops
sh make.sh
I tried installing CUDA toolkit globally, I also tried without using conda at all. No luck, I keep getting all kinds of errors. Please help, I've been pulling my hair with this all day. Here is my Dockerfile so far:
# Use the official Ubuntu 20.04 LTS image as the base image
FROM ubuntu:20.04
# Set environment variables to avoid interaction during package installation
ENV DEBIAN_FRONTEND=noninteractive
# Update the package index and install required packages
RUN apt-get update && apt-get install -y --no-install-recommends \
wget \
ca-certificates \
bzip2 \
build-essential \
git
# Set the working directory
WORKDIR /opt
# Download and install Miniconda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
&& chmod +x Miniconda3-latest-Linux-x86_64.sh \
&& ./Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda \
&& rm Miniconda3-latest-Linux-x86_64.sh
# Add conda to the system PATH
ENV PATH="/opt/conda/bin:${PATH}"
# Create the "oneformer" virtual environment
RUN conda create -y -n oneformer
# Activate the "oneformer" virtual environment and run any further commands within it
SHELL ["conda", "run", "-n", "oneformer", "/bin/bash", "-c"]
RUN git clone https://github.com/SHI-Labs/OneFormer.git /OneFormer
RUN cd /OneFormer
WORKDIR /OneFormer
# Install Pytorch
RUN conda install -y pytorch==1.10.1 -c pytorch
RUN conda install -y torchvision==0.11.2 -c pytorch
RUN conda install -y cudatoolkit=11.3 -c pytorch
# Install opencv (required for running the demo)
RUN pip3 install -U opencv-python
# Install detectron2
RUN python -m pip install detectron2 -f \
https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
# Install other dependencies
RUN pip3 install git+https://github.com/cocodataset/panopticapi.git
RUN pip3 install git+https://github.com/mcordts/cityscapesScripts.git
RUN pip3 install -r requirements.txt
# Setup wand
RUN pip3 install wandb
#ENV WANDB_API_KEY=...
#RUN wandb login
# Setup MSDeformAttn
# THIS IS WHERE IT BREAKS
# ENV CUDA_HOME=/opt/conda/envs/oneformer/lib/python3.9/site-packages/torch/cuda
# ENV FORCE_CUDA=1
RUN cd oneformer/modeling/pixel_decoder/ops && \
sh ./make.sh
# Set the entrypoint to use the "oneformer" virtual environment by default
ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "oneformer"]
# Set the default command to run when starting the container
CMD ["/bin/bash"]
And this is the error that I'm getting:
[19/19] RUN cd oneformer/modeling/pixel_decoder/ops && sh ./make.sh:
#0 1.605 /opt/conda/envs/oneformer/lib/python3.9/site-packages/torch/utils/cpp_extension.py:381: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
#0 1.605 warnings.warn(msg.format('we could not find ninja.'))
#0 1.605 error: [Errno 2] No such file or directory: '/opt/conda/envs/oneformer/lib/python3.9/site-packages/torch/cuda/bin/nvcc'
#0 1.605
#0 1.605 ERROR conda.cli.main_run:execute(47): `conda run /bin/bash -c cd oneformer/modeling/pixel_decoder/ops && sh ./make.sh` failed. (See above for error)
#0 1.605 No CUDA runtime is found, using CUDA_HOME='/opt/conda/envs/oneformer/lib/python3.9/site-packages/torch/cuda'
#0 1.605 running build
#0 1.605 running build_py
#0 1.605 creating build
#0 1.605 creating build/lib.linux-x86_64-3.9
#0 1.605 creating build/lib.linux-x86_64-3.9/functions
#0 1.605 copying functions/__init__.py -> build/lib.linux-x86_64-3.9/functions
#0 1.605 copying functions/ms_deform_attn_func.py -> build/lib.linux-x86_64-3.9/functions
#0 1.605 creating build/lib.linux-x86_64-3.9/modules
#0 1.605 copying modules/__init__.py -> build/lib.linux-x86_64-3.9/modules
#0 1.605 copying modules/ms_deform_attn.py -> build/lib.linux-x86_64-3.9/modules
#0 1.605 running build_ext
#0 1.605
------
failed to solve: executor failed running [conda run -n oneformer /bin/bash -c cd oneformer/modeling/pixel_decoder/ops && sh ./make.sh]: exit code: 1
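The log above shows the build looking for `nvcc` under `CUDA_HOME=/opt/conda/envs/oneformer/lib/python3.9/site-packages/torch/cuda`, which is a Python package directory rather than a CUDA toolkit install. PyTorch wheels generally do not ship `nvcc`, so the base image probably needs a full CUDA toolkit, with `CUDA_HOME` pointing at it. A minimal sketch for checking this before running `make.sh`; the helper name `find_nvcc` is hypothetical and not part of OneFormer:

```python
import os
import shutil


def find_nvcc(cuda_home=None):
    """Return the path to an nvcc executable, or None if none is found.

    Looks at <cuda_home>/bin/nvcc first (cuda_home defaults to $CUDA_HOME),
    then falls back to searching PATH.
    """
    cuda_home = cuda_home or os.environ.get("CUDA_HOME")
    if cuda_home:
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        # The file must exist and be executable to be usable by the build.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return shutil.which("nvcc")
```

Running `print(find_nvcc())` inside the container should print a real path; if it prints `None`, the MSDeformAttn build is likely to fail the same way as in the log.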
Thanks for your excellent work. I used the pretrained weights of the Swin backbone, evaluated the model on ADE20k, and got the following results.
[03/25 23:29:06 d2.evaluation.panoptic_evaluation]: Panoptic Evaluation Results:
| | PQ | SQ | RQ | #categories |
|---|---|---|---|---|
| All | 0.000 | 0.000 | 0.000 | 150 |
| Things | 0.000 | 0.000 | 0.000 | 100 |
| Stuff | 0.000 | 0.000 | 0.000 | 50 |
[03/25 23:46:54 d2.evaluation.testing]: copypaste: Task: sem_seg
[03/25 23:46:54 d2.evaluation.testing]: copypaste: mIoU,fwIoU,mACC,pACC
[03/25 23:46:54 d2.evaluation.testing]: copypaste: 0.0029,0.0005,0.2547,0.0320
[03/25 23:46:54 d2.evaluation.testing]: copypaste: Task: panoptic_seg
[03/25 23:46:54 d2.evaluation.testing]: copypaste: PQ,SQ,RQ,PQ_th,SQ_th,RQ_th,PQ_st,SQ_st,RQ_st
[03/25 23:46:54 d2.evaluation.testing]: copypaste: 0.0000,0.0000,0.0000,0.0000,0.0000,0.0000,0.0000,0.0000,0.0000
[03/25 23:46:54 d2.evaluation.testing]: copypaste: Task: bbox
[03/25 23:46:54 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[03/25 23:46:54 d2.evaluation.testing]: copypaste: 0.0000,0.0000,0.0000,0.0000,0.0000,0.0000
[03/25 23:46:54 d2.evaluation.testing]: copypaste: Task: segm
[03/25 23:46:54 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[03/25 23:46:54 d2.evaluation.testing]: copypaste: 0.0000,0.0000,0.0000,0.0000,0.0000,0.0000
Any help would be appreciated. Thanks in advance!
In "configs/ade20k/convnext/oneformer_convnext_xlarge_bs16_160k.yaml":
change
_BASE_: ../oneformerrrr_R50_bs16_160k.yaml
to
_BASE_: ../oneformer_R50_bs16_160k.yaml
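A mistyped `_BASE_` entry like this one can be caught by scanning the config tree for references to files that do not exist. The sketch below assumes `_BASE_` paths are resolved relative to the directory of the config that declares them, which is how detectron2-style configs normally behave. It uses a simple regular expression rather than a YAML parser, and `find_broken_bases` is a hypothetical helper, not part of the repository:

```python
import os
import re

# Matches lines such as: _BASE_: ../oneformer_R50_bs16_160k.yaml
_BASE_RE = re.compile(r"^_BASE_:\s*[\"']?([^\"'#\s]+)[\"']?\s*(?:#.*)?$")


def find_broken_bases(config_root):
    """Return (config_path, missing_base_path) pairs found under config_root."""
    broken = []
    for dirpath, _dirnames, filenames in os.walk(config_root):
        for name in filenames:
            if not name.endswith((".yaml", ".yml")):
                continue
            cfg_path = os.path.join(dirpath, name)
            with open(cfg_path, encoding="utf-8") as fh:
                for line in fh:
                    match = _BASE_RE.match(line.strip())
                    if not match:
                        continue
                    # Resolve the base relative to the declaring config's directory.
                    base = os.path.normpath(os.path.join(dirpath, match.group(1)))
                    if not os.path.isfile(base):
                        broken.append((cfg_path, base))
    return broken
```

Calling `find_broken_bases("configs")` from the repository root returns an empty list when every `_BASE_` reference resolves to an existing file.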
Hi there, thanks very much for publishing this repo, it looks very interesting.
I'm trying to follow the installation instructions, but because of the CPU architecture of the system I'm using, I don't think I'll be able to use wandb
(I don't have an account with them, so I thought I'd try running it locally):
$ wandb server start
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/ppc64le) and no specific platform was requested
Is it possible to run the code without wandb?
Thanks for any help! :)
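One possible workaround, assuming the `wandb` Python package itself installs on the platform: the wandb client reads the `WANDB_MODE` environment variable, where `disabled` turns logging calls into no-ops and `offline` writes runs to local files without contacting a server. Whether this is enough for OneFormer's training script has not been verified here, and `disable_wandb` below is a hypothetical helper rather than anything in the repository:

```python
import os


def disable_wandb(offline=False):
    """Set WANDB_MODE so the wandb client does not contact a server.

    offline=False -> "disabled" (wandb calls become no-ops)
    offline=True  -> "offline"  (runs are written to local files only)
    Must be called before wandb.init() runs in the same process.
    """
    mode = "offline" if offline else "disabled"
    os.environ["WANDB_MODE"] = mode
    return mode
```

The same effect can be had without any code by exporting `WANDB_MODE=disabled` in the shell before launching training.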
Thanks for sharing this work. I tested the model and got similar results for the panoptic and semantic tasks, but poor performance on the instance task.
I tried to test the model with the Swin backbone, but I got this error:
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
I would like to train the model on custom datasets for instance segmentation. Could you please provide a demo for training on custom datasets?
Thanks in advance
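The `PytorchStreamReader ... failed finding central directory` error above often points to a truncated or corrupted checkpoint download, since recent PyTorch versions save checkpoints as zip archives. A minimal sketch for checking the file before loading it, using only the standard library; `check_checkpoint` and its return strings are hypothetical names chosen for illustration:

```python
import os
import zipfile


def check_checkpoint(path):
    """Return a short diagnosis for a zip-format checkpoint file."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    # is_zipfile looks for the end-of-central-directory record,
    # which is absent when a download was cut short.
    if not zipfile.is_zipfile(path):
        return "not_zip"
    with zipfile.ZipFile(path) as zf:
        bad = zf.testzip()  # name of the first corrupt member, or None
    return "ok" if bad is None else f"corrupt:{bad}"
```

A result of `not_zip` suggests re-downloading the weights and comparing the file size against the published one. Checkpoints saved in the legacy (non-zip) format would also report `not_zip`, so the result is a hint rather than proof.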
Hello!
I am trying to run the demo on a single image. I use "oneformer_dinat_large_IN21k_384_bs16_160k.yaml" as the config file and "250_16_dinat_l_oneformer_ade20k_160k.pth" as the model weights. When I run demo.py, I see the following message: "error in ms_deformable_im2col_cuda: too many resources requested for launch", and the code ends up saving {task}.jpg without any meaningful content.
The error occurs when executing line 81 of demo/defaults.py: "predictions = self.model([inputs])[0]"