
youngwanlee / centermask2

769 stars · 16 watching · 159 forks · 112 KB

[CVPR 2020] CenterMask : Real-time Anchor-Free Instance Segmentation

License: Other

Python 99.72% Shell 0.28%
centermask detectron2 object-detection instance-segmentation anchor-free vovnet vovnetv2 real-time pytorch cvpr2020

centermask2's Introduction

👋 Hi there! I'm Youngwan, a senior researcher at ETRI and a Ph.D. student in the Graduate School of AI at KAIST, where I'm advised by Prof. Sung Ju Hwang in the Machine Learning and Artificial Intelligence (MLAI) lab.

My research interest is in how computers understand the world, including efficient 2D/3D neural network design, object detection, instance segmentation, semantic segmentation, and video classification. 🖥️🌐

Representative publications and code

See Google scholar for full list.

  • RC-MAE: Exploring the Role of Mean Teachers in Self-supervised Masked Auto-Encoders, ICLR 2023.
  • MPViT : Multi-Path Vision Transformer for Dense Prediction, CVPR 2022.
  • CenterMask : Real-Time Anchor-Free Instance Segmentation, CVPR 2020.
  • 2D convolutional neural network : VoVNet
  • 3D convolutional neural network : VoV3D

About me

  • ๐Ÿ“ I enjoy teaching talking what I know learn, so I am giving lectures on AI as an AI Facilitator at ETRI AI Academy.
  • ๐ŸŒ๐ŸŒฑ๐ŸŒฒ๐ŸŒŠ โ›ฐ๏ธ I love to appreciate the beautiful nature.
  • ๐ŸŽพ ๐Ÿ€ I enjoy playing tennis and basket ball.
  • ๐Ÿ“ซ How to reach me: [email protected] | [email protected]

💪 Skills

Platforms & Languages

Python, PyTorch, TensorFlow, Java, Android

centermask2's People

Contributors

stigma0617, yacobby, youngwanlee


centermask2's Issues

Getting error during training

Hi,
I am getting an error in mask_head.py. According to the stack trace, this line causes the problem: mask_ratios = torch.max(mask_ratios, value_eps)

The error is:
RuntimeError: Expected object of scalar type double but got scalar type float for argument 'other'

Could you please take a look at it?

The only thing I changed is registering my own dataset for training:


def setup(args):
    """
    Create configs and perform basic setups.
    """
    cfg = get_cfg()
    cfg.merge_from_file(args.config_file)
    cfg.merge_from_list(args.opts)


    from detectron2.data.datasets import register_coco_instances

    register_coco_instances("bad", {}, "datasets/coco/annotations/instances_train2017.json",
                            "datasets/coco/annotations/images")

    cfg.DATASETS.TRAIN = ("bad",)

    cfg.SOLVER.IMS_PER_BATCH = 8
    cfg.SOLVER.BASE_LR = 0.001
    cfg.SOLVER.MAX_ITER = 2000
    cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1


    cfg.freeze()
    default_setup(cfg, args)
    return cfg

###########################################

/opt/conda/conda-bld/pytorch_1587428266983/work/torch/csrc/utils/python_arg_parser.cpp:756: UserWarning: This overload of nonzero is deprecated:
nonzero(Tensor input, *, Tensor out)
Consider using one of the following signatures instead:
nonzero(Tensor input, *, bool as_tuple)
Traceback (most recent call last):
File "/home/tom/PycharmProjects/centermask2/train_net.py", line 247, in
args=(args,),
File "/home/tom/miniconda3/envs/detectron2/lib/python3.7/site-packages/detectron2/engine/launch.py", line 57, in launch
main_func(*args)
File "/home/tom/PycharmProjects/centermask2/train_net.py", line 235, in main
return trainer.train()
File "/home/tom/PycharmProjects/centermask2/train_net.py", line 97, in train
self.train_loop(self.start_iter, self.max_iter)
File "/home/tom/PycharmProjects/centermask2/train_net.py", line 86, in train_loop
self.run_step()
File "/home/tom/miniconda3/envs/detectron2/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 215, in run_step
loss_dict = self.model(data)
File "/home/tom/miniconda3/envs/detectron2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/home/tom/miniconda3/envs/detectron2/lib/python3.7/site-packages/detectron2/modeling/meta_arch/rcnn.py", line 130, in forward
_, detector_losses = self.roi_heads(images, features, proposals, gt_instances)
File "/home/tom/miniconda3/envs/detectron2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/home/tom/PycharmProjects/centermask2/centermask/modeling/centermask/center_heads.py", line 401, in forward
losses, mask_features, selected_mask, labels, maskiou_targets = self._forward_mask(features, proposals)
File "/home/tom/PycharmProjects/centermask2/centermask/modeling/centermask/center_heads.py", line 476, in _forward_mask
loss, selected_mask, labels, maskiou_targets = mask_rcnn_loss(mask_logits, proposals, self.maskiou_on)
File "/home/tom/PycharmProjects/centermask2/centermask/modeling/centermask/mask_head.py", line 153, in mask_rcnn_loss
mask_ratios = torch.max(mask_ratios, value_eps)
RuntimeError: Expected object of scalar type double but got scalar type float for argument 'other'
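A minimal sketch of a likely fix, assuming (from the trace above) that value_eps is created as float64 while mask_ratios is float32: align the dtypes before calling torch.max.

# Hedged sketch (mask_head.py, around line 153): torch.max(a, b) requires both
# tensors to share a dtype, so cast value_eps to match mask_ratios.
value_eps = value_eps.to(dtype=mask_ratios.dtype)
mask_ratios = torch.max(mask_ratios, value_eps)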

Problems with single class (Balloon example from detectron2)

Hi,
I tried to rewrite the balloon example from detectron2.

It seems like the centermask2 version of mask_rcnn_loss (in mask_head.py) does not handle the case of training a single class correctly. I get the following error:

/content/centermask2/centermask/modeling/centermask/mask_head.py in mask_rcnn_loss(pred_mask_logits, instances, maskiou_on)
     95         gt_masks.append(gt_masks_per_image)
     96 
---> 97     gt_classes = cat(gt_classes, dim=0)
     98 
     99     if len(gt_masks) == 0:

/usr/local/lib/python3.6/dist-packages/detectron2/layers/wrappers.py in cat(tensors, dim)
     23     if len(tensors) == 1:
     24         return tensors[0]
---> 25     return torch.cat(tensors, dim)
     26 
     27 

RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat.  This usually means that this function requires a non-empty list of Tensors.  Available functions are [CUDATensorId, CPUTensorId, VariableTensorId]

Colab Notebook

Here is how I set it up.

Install dependencies

# install dependencies: (use cu100 because colab is on CUDA 10.0)
!pip install -U torch==1.4+cu100 torchvision==0.5+cu100 -f https://download.pytorch.org/whl/torch_stable.html 
!pip install cython pyyaml==5.1
!pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
import torch, torchvision
torch.__version__
!gcc --version
# opencv is pre-installed on colab

# install detectron2:
!pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu100/index.html

# clone CenterMask2
!git clone https://github.com/youngwanLEE/centermask2.git
%cd centermask2

# Download weight
!wget https://dl.dropbox.com/s/uwc0ypa1jvco2bi/centermask2-lite-V-39-eSE-FPN-ms-4x.pth

# download, decompress the data
!wget https://github.com/matterport/Mask_RCNN/releases/download/v2.1/balloon_dataset.zip
!unzip balloon_dataset.zip > /dev/null

Register the balloon dataset

Same as in the detectron2 balloon example.

%cd /content/centermask2/

import os
import numpy as np
import json
from detectron2.structures import BoxMode
import cv2

def get_balloon_dicts(img_dir):
    json_file = os.path.join(img_dir, "via_region_data.json")
    with open(json_file) as f:
        imgs_anns = json.load(f)

    dataset_dicts = []
    for idx, v in enumerate(imgs_anns.values()):
        record = {}
        
        filename = os.path.join(img_dir, v["filename"])
        height, width = cv2.imread(filename).shape[:2]
        
        record["file_name"] = filename
        record["image_id"] = idx
        record["height"] = height
        record["width"] = width
      
        annos = v["regions"]
        objs = []
        for _, anno in annos.items():
            assert not anno["region_attributes"]
            anno = anno["shape_attributes"]
            px = anno["all_points_x"]
            py = anno["all_points_y"]
            poly = [(x + 0.5, y + 0.5) for x, y in zip(px, py)]
            poly = [p for x in poly for p in x]

            obj = {
                "bbox": [np.min(px), np.min(py), np.max(px), np.max(py)],
                "bbox_mode": BoxMode.XYXY_ABS,
                "segmentation": [poly],
                "category_id": 0,
                "iscrowd": 0
            }
            objs.append(obj)
        record["annotations"] = objs
        dataset_dicts.append(record)
    return dataset_dicts

from detectron2.data import DatasetCatalog, MetadataCatalog
for d in ["train", "val"]:
    DatasetCatalog.register("balloon_" + d, lambda d=d: get_balloon_dicts("balloon/" + d))
    MetadataCatalog.get("balloon_" + d).set(thing_classes=["balloon"])
balloon_metadata = MetadataCatalog.get("balloon_train")

Train

from centermask.config import get_cfg
from detectron2.engine import default_setup, hooks  # hooks is used in train() below
from train_net import Trainer
import torch

class Args(object):
    pass

def init_config():
  args = Args()
  args.config_file = 'configs/centermask/centermask_lite_V_39_eSE_FPN_ms_4x.yaml'
  args.dist_url = 'tcp://127.0.0.1:50152'
  args.eval_only = False
  args.machine_rank = 0
  args.num_gpus = 1
  args.num_machines = 1
  args.opts = ['MODEL.WEIGHTS', 'centermask2-lite-V-39-eSE-FPN-ms-4x.pth']
  args.resume = False

  cfg = get_cfg()
  cfg.merge_from_file(args.config_file)
  cfg.merge_from_list(args.opts)
  cfg.DATASETS.TEST = ("balloon_val",)
  cfg.DATASETS.TRAIN = ("balloon_train",)
  cfg.DATALOADER.NUM_WORKERS = 2
  cfg.SOLVER.IMS_PER_BATCH = 2
  cfg.SOLVER.BASE_LR = 0.00025  # pick a good LR
  cfg.SOLVER.MAX_ITER = 300    # 300 iterations seems good enough for this toy dataset; you may need to train longer for a practical dataset
  # cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128   # faster, and good enough for this toy dataset (default: 512)
  cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # only has one class (ballon)
  cfg.freeze()
  default_setup(cfg, args)

  return cfg

cfg = init_config()

def train():
  cfg = init_config()
  os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
  trainer = Trainer(cfg) 
  trainer.resume_or_load(resume=False)
  if cfg.TEST.AUG.ENABLED:
      trainer.register_hooks(
          [hooks.EvalHook(0, lambda: trainer.test_with_TTA(cfg, trainer.model))]
      )
  trainer.train()

# Make training start from iteration 0
pth = torch.load("centermask2-lite-V-39-eSE-FPN-ms-4x.pth")
pth['iteration']=0
torch.save(pth, "centermask2-lite-V-39-eSE-FPN-ms-4x.pth")

train()

Warning & Slow problem

First, thanks for releasing your code.

I'm trying to run the model with demo.py, using three Titan GPUs.
Video prediction takes much longer than I expected, and after quite a long time I get a result, but there are no bounding boxes in my video.

I used the centermask_lite_V_39_eSE_FPN_ms_4x.yaml weights, and when I start the code there is a warning:
"WARNING [07/29 19:58:36 d2.config.compat]: Config '../configs/centermask/centermask_lite_V_39_eSE_FPN_ms_4x.yaml' has no VERSION. Assuming it to be compatible with latest v2."

V_19 models do not work

Dear author:
Thanks for the neat and well-documented repo. It's really helpful!
I found that the V_19 series .pth files cannot find any instances, while the V_39 series .pth models work pretty well. Is there something wrong, or did I miss something? Please give any tips. Thank you.

Where do the improvements come from?

From the table given in the README, the AP increases by 1.2 points from the maskrcnn-benchmark implementation to the Detectron2 one.

Could you please explain where the improvements come from?

Thank you.

'Non-existent config key: MODEL.MASKIOU_ON' while running demo.py

Hey @youngwanLEE,
I'm trying to run demo.py but I'm getting an error. Can you help me solve it?

[03/11 17:23:10 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='../configs/centermask/centermask_lite_V_39_eSE_FPN_ms_4x.yaml', input=None, opts=[], output=None, video_input=None, webcam=True)
WARNING [03/11 17:23:10 d2.config.compat]: Config '../configs/centermask/centermask_lite_V_39_eSE_FPN_ms_4x.yaml' has no VERSION. Assuming it to be compatible with latest v2.
Traceback (most recent call last):
  File "demo.py", line 72, in <module>
    cfg = setup_cfg(args)
  File "demo.py", line 23, in setup_cfg
    cfg.merge_from_file(args.config_file)
  File "/home/media4us/PycharmProjects/detectron2/detectron2/config/config.py", line 47, in merge_from_file
    self.merge_from_other_cfg(loaded_cfg)
  File "/home/media4us/anaconda3/lib/python3.7/site-packages/fvcore/common/config.py", line 121, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/home/media4us/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/home/media4us/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 460, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/media4us/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 473, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.MASKIOU_ON' 

I have cloned and installed detectron2 and adet successfully without getting any errors.

This is the command that I've used to run demo.py

python demo.py --webcam --config-file ../configs/centermask/centermask_lite_V_39_eSE_FPN_ms_4x.yaml

Thanks!

config not right

python3 demo/demo.py --config-file ./configs/centermask/centermask_V_39_eSE_FPN_ms_3x.yaml --opts MODEL.WEIGHTS=./weights/centermask2-V-39-eSE-FPN-ms-3x.pth

 File "/usr/local/lib/python3.6/dist-packages/fvcore/common/config.py", line 121, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/usr/local/lib/python3.6/dist-packages/yacs-0.1.6-py3.6.egg/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/usr/local/lib/python3.6/dist-packages/yacs-0.1.6-py3.6.egg/yacs/config.py", line 460, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/usr/local/lib/python3.6/dist-packages/yacs-0.1.6-py3.6.egg/yacs/config.py", line 473, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.MASKIOU_ON'
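Both KeyError reports above are consistent with merging the YAML into detectron2's stock config, which does not define CenterMask-specific keys such as MODEL.MASKIOU_ON. A minimal sketch of the usual remedy, assuming a centermask2 checkout is importable: build the config with CenterMask2's extended get_cfg (the same import used in the Colab example earlier on this page).

# Hedged: centermask.config.get_cfg registers the extra keys (e.g. MODEL.MASKIOU_ON,
# MODEL.VOVNET) on top of detectron2's defaults before the YAML is merged.
from centermask.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file("../configs/centermask/centermask_lite_V_39_eSE_FPN_ms_4x.yaml")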

Error while training on one-class dataset

Thanks for your great work.
When I train the model on my own dataset, which has only one class (i.e. person), I get the following error:

Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/detectron2_repo/detectron2/engine/launch.py", line 84, in _distributed_worker
main_func(*args)
File "/home/liuyoucun/human_segmentation/centermask2-master/train_net.py", line 234, in main
return trainer.train()
File "/home/liuyoucun/human_segmentation/centermask2-master/train_net.py", line 113, in train
self.train_loop(self.start_iter, self.max_iter)
File "/home/liuyoucun/human_segmentation/centermask2-master/train_net.py", line 102, in train_loop
self.run_step()
File "/detectron2_repo/detectron2/engine/train_loop.py", line 215, in run_step
loss_dict = self.model(data)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/distributed.py", line 442, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/detectron2_repo/detectron2/modeling/meta_arch/rcnn.py", line 130, in forward
_, detector_losses = self.roi_heads(images, features, proposals, gt_instances)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/liuyoucun/human_segmentation/centermask2-master/centermask/modeling/centermask/center_heads.py", line 321, in forward
losses, mask_features, selected_mask, labels, maskiou_targets = self._forward_mask(features_list, proposals)
File "/home/liuyoucun/human_segmentation/centermask2-master/centermask/modeling/centermask/center_heads.py", line 387, in _forward_mask
loss, selected_mask, labels, maskiou_targets = mask_rcnn_loss(mask_logits, proposals, self.maskiou_on)
File "/home/liuyoucun/human_segmentation/centermask2-master/centermask/modeling/centermask/mask_head.py", line 97, in mask_rcnn_loss
gt_classes = cat(gt_classes, dim=0)
File "/detectron2_repo/detectron2/layers/wrappers.py", line 25, in cat
return torch.cat(tensors, dim)
RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors. Available functions are [CUDATensorId, CPUTensorId, VariableTensorId]

I suppose there is a bug at line 63 in mask_head.py:
cls_agnostic_mask = pred_mask_logits.size(1) == 1
It treats a one-class class-specific mask as a class-agnostic mask and therefore skips the code at lines 76-78:

[screenshot of mask_head.py lines 76-78]

and this finally leads to the torch.cat RuntimeError.

However, setting cls_agnostic_mask to False makes the code work.
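A minimal sketch of the workaround described above, assuming the line numbering in the report (mask_head.py, around line 63); forcing the flag is only valid when the head is actually class-specific.

# Original check: with a single foreground class, a class-specific mask head also
# has pred_mask_logits.size(1) == 1, so it is misclassified as class-agnostic and
# gt_classes is never populated (hence the empty cat(gt_classes, dim=0)).
cls_agnostic_mask = pred_mask_logits.size(1) == 1

# Workaround reported above for one-class training:
cls_agnostic_mask = False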

About the inference time

I evaluated the model centermask2-lite-V-39-eSE-FPN-ms-4x.pth on a 1080Ti. The results show a pure inference computation time of about 50 ms per image, which is much slower than the result in the paper. Is something wrong, or am I adding extra computation time? Could you help me? Thank you.

Installing Detectron2 does NOT install centermask :(

I have tried

pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.5/index.html

and

python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'

and

git clone https://github.com/facebookresearch/detectron2.git
python -m pip install -e detectron2

None of them installs centermask.

Is there a wheel somewhere for this module?
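For what it's worth, the Colab setup earlier on this page clones the repository instead of installing a wheel; a minimal sketch of using centermask2 from a checkout (the path is illustrative):

# Hedged: centermask2 is used as a standalone repo, not a pip package; make the
# checkout importable so "import centermask" works alongside detectron2.
import sys
sys.path.insert(0, "/path/to/centermask2")  # illustrative path

from centermask.config import get_cfg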

Accuracy threshold not working

I have a problem: the accuracy threshold is not applied in my evaluation. I configured this:

cfg.MODEL.RETINANET.SCORE_THRESH_TEST = 0.5
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.PANOPTIC_FPN.COMBINE.INSTANCES_CONFIDENCE_THRESH = 0.5

but no threshold is applied during the visualization of my test data.

This is how I visualize my test data:

img = cv2.imread(d["file_name"])
outputs = predictor(img)
visualizer = Visualizer(img[:, :, ::-1], metadata=can_metadata, scale=0.8, instance_mode=ColorMode.SEGMENTATION)
vis = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imwrite("/volume/processed/" + d["file_name"][36:], vis.get_image()[:, :, ::-1])

[attached output image]

Do you know a way to solve this issue?
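One possibility (an assumption, not confirmed in the thread): CenterMask's detections come from its FCOS proposal generator, so the RetinaNet/ROI-heads/panoptic thresholds above may not be the keys this model reads at inference time. A hedged sketch; the exact key name below is an assumption and should be verified in centermask/config/defaults.py.

# Assumption: centermask2's FCOS config exposes an inference score threshold;
# verify the actual key name in centermask/config/defaults.py.
cfg.MODEL.FCOS.INFERENCE_TH_TEST = 0.5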

Input size of dataset

Hi @youngwanLEE
If I use a dataset with a very large resolution and do not modify the input size setting, will it affect my training? For example, when the images are 4000x6000 and the input size is set to 600x800?
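For context, detectron2 controls training-time resizing with the INPUT keys below, so a 4000x6000 image is downscaled before it reaches the network; a minimal sketch (values are illustrative):

# Standard detectron2 resize controls: the shorter side is scaled to a value in
# MIN_SIZE_TRAIN and the longer side is capped at MAX_SIZE_TRAIN.
cfg.INPUT.MIN_SIZE_TRAIN = (800,)
cfg.INPUT.MAX_SIZE_TRAIN = 1333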

CUDA out of memory

Hi @youngwanLEE
I was trying centermask2 on a dataset other than COCO.
I use a single V100 GPU.

I set the batch size to 8 and left MIN_SIZE_TRAIN unchanged.
The config file I used is centermask_V_39_eSE_FPN_ms_3x.yaml.

Yet I still got a CUDA OOM error.

I couldn't see any other factor that could lead to this OOM error.

Could you please give me some tips?
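A few common memory levers in detectron2-style configs, sketched here as hedged suggestions rather than a confirmed diagnosis:

# Illustrative values only; each reduces peak GPU memory at some cost in accuracy
# or training speed.
cfg.SOLVER.IMS_PER_BATCH = 4                    # smaller global batch
cfg.INPUT.MIN_SIZE_TRAIN = (640,)               # train at a lower resolution
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128  # fewer RoI samples per image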

Question about training own datasets

Thanks for sharing!
I met a problem when training with my own dataset. In the beginning, the losses seem very small, and then they become zero.
And there was no output during the test.

[screenshot of training log]

Can someone tell me where the problem is?

Pretrained models still contain training information

The pretrained models you uploaded still contain all of the training state, so it is not possible to restart a training from scratch without directly modifying the downloaded file.

For example:

[fvcore.common.download][INFO] - Downloading from https://dl.dropbox.com/s/v64mknwzfpmfcdh/faster_V_99_eSE_ms_3x.pth?dl=1
...
...
...
[detectron2.engine.train_loop][INFO] - Starting training from iteration 270000

After some research, it seems there is no argument to avoid this; the file must be modified by hand.
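A minimal sketch of the by-hand edit described above, assuming the detectron2 checkpoint layout (keys such as 'model', 'optimizer', 'scheduler', 'iteration'); the balloon example earlier on this page uses the same trick for the 'iteration' key alone.

import torch

# Hedged: keep only the weights so training restarts from iteration 0 with a
# fresh optimizer and schedule.
ckpt = torch.load("faster_V_99_eSE_ms_3x.pth", map_location="cpu")
for key in ("iteration", "optimizer", "scheduler"):
    ckpt.pop(key, None)
torch.save(ckpt, "faster_V_99_eSE_ms_3x_weights_only.pth")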

FloatingPointError: Loss became infinite or NaN at iteration=137!

Hi! I got the error FloatingPointError: Loss became infinite or NaN at iteration=137!
Any help is appreciated! Thanks!


-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/alfrid/.virtualenvs/jupyter/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/home/alfrid/.virtualenvs/jupyter/lib/python3.6/site-packages/detectron2/engine/launch.py", line 84, in _distributed_worker
    main_func(*args)
  File "/home/alfrid/Documents/repos/centermask2/train_net.py", line 338, in main
    return trainer.train()
  File "/home/alfrid/Documents/repos/centermask2/train_net.py", line 201, in train
    self.train_loop(self.start_iter, self.max_iter)
  File "/home/alfrid/Documents/repos/centermask2/train_net.py", line 190, in train_loop
    self.run_step()
  File "/home/alfrid/.virtualenvs/jupyter/lib/python3.6/site-packages/detectron2/engine/train_loop.py", line 216, in run_step
    self._detect_anomaly(losses, loss_dict)
  File "/home/alfrid/.virtualenvs/jupyter/lib/python3.6/site-packages/detectron2/engine/train_loop.py", line 239, in _detect_anomaly
    self.iter, loss_dict
FloatingPointError: Loss became infinite or NaN at iteration=137!
loss_dict = {'loss_mask': tensor(0.7000, device='cuda:1', grad_fn=<BinaryCrossEntropyWithLogitsBackward>), 'loss_maskiou': tensor(0., device='cuda:1', grad_fn=<MulBackward0>), 'loss_fcos_cls': tensor(nan, device='cuda:1', grad_fn=<DivBackward0>), 'loss_fcos_loc': tensor(nan, device='cuda:1', grad_fn=<DivBackward0>), 'loss_fcos_ctr': tensor(nan, device='cuda:1', grad_fn=<DivBackward0>)}

python train_net.py --config-file configs/centermask/centermask_V_99_eSE_FPN_ms_3x.yaml --num-gpus 2


cfg.DATASETS.TRAIN = tuple(DATASETS)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
cfg.MODEL.FCOS.NUM_CLASSES = 1
cfg.MODEL.RETINANET.NUM_CLASSES = 1
cfg.OUTPUT_DIR = OUTPUT_DIR
cfg.SOLVER.CHECKPOINT_PERIOD = 500
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
cfg.DATALOADER.NUM_WORKERS = 2
cfg.SOLVER.MAX_ITER = 1250000
# cfg.VIS_PERIOD = 100
cfg.TEST.DETECTIONS_PER_IMAGE = 500

python environment

Hi @youngwanLEE
Thanks for the nice work.
The current dependency for PyTorch mentioned in the README is 1.3.1.
However, the dependency for detectron2 is pytorch >= 1.4, as mentioned HERE.
Can you clarify this? Maybe it would be better to hardcode the dependencies into a requirements.txt or setup.py.
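For illustration, a minimal requirements.txt along the lines the reporter suggests (the lower bounds are taken from this page, not from the repo):

torch>=1.4
torchvision>=0.5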

KeyError: 'Non-existent config key: MODEL.VOVNET'

https://github.com/youngwanLEE/centermask2#installation

All you need to use centermask2 is detectron2. It's easy!
you just install detectron2 following INSTALL.md.
Prepare for coco dataset following this instruction.

No, it's not easy. It doesn't work.

Here is what I did:

  1. Followed the install instructions; detectron2-0.1.3+cu101-cp37-cp37m-linux_x86_64.whl works.
  2. Downloaded centermask2-V-99-eSE-FPN-ms-3x.pth and this repo, and set:

cfg.merge_from_file("/home/pc/centermask2/configs/centermask/centermask_V_99_eSE_FPN_ms_3x.yaml")

cfg.MODEL.WEIGHTS = "/home/pc/dataset/centermask2-V-99-eSE-FPN-ms-3x.pth"

  3. Trained.
  4. Got the message KeyError: 'Non-existent config key: MODEL.VOVNET'.

So, are there any paths that need to be set in the environment?

Thanks a lot
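As with the MODEL.MASKIOU_ON errors earlier on this page, this KeyError points to the YAML being merged into detectron2's stock config rather than to a missing environment path; a hedged one-line check:

# Hedged: cfg must come from centermask.config.get_cfg, which defines MODEL.VOVNET;
# detectron2.config.get_cfg does not.
from centermask.config import get_cfg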

ImportError CityscapesEvaluator (with recent Detectron2 git version)

Hello, thank you for your very interesting work with CenterMask2!

While giving CenterMask2 a test shot with the latest Detectron2 version (from GitHub), train_net.py ended with an ImportError:

> python train_net.py --config-file "configs/centermask/centermask_lite_V_19_eSE_FPN_ms_4x.yaml"
Traceback (most recent call last):
  File "train_net.py", line 13, in <module>
    from detectron2.evaluation import (
ImportError: cannot import name 'CityscapesEvaluator' from 'detectron2.evaluation' (/home/daniel/git/detectron2/detectron2/evaluation/__init__.py)

I did a little digging and found that there was a change a couple of days ago in Detectron2. As far as I understand, the class CityscapesEvaluator still exists but is no longer exported from __init__.py. Its functionality seems to have partly moved into CityscapesInstanceEvaluator, and an additional class, CityscapesSemSegEvaluator, has been introduced.

I'm not sure whether the Detectron2 version has changed, but if I compile my version (from GitHub) it says:

>>> print(detectron2.__version__)
0.1.2

I'm not sure how to best handle this - possibly by adjusting the code and adding a requirements.txt file when fixing the issue. Locally, I will replace the import of CityscapesEvaluator with CityscapesInstanceEvaluator. I will try to create a GitHub pull request, but as I'm not familiar with this system, please correct me if I'm doing something wrong.

Thank you and greetings from Hongdae, Seoul.
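A minimal sketch of the local fix the reporter describes; CityscapesInstanceEvaluator is exported by recent detectron2 versions.

# In train_net.py, replace the removed export with its successor:
from detectron2.evaluation import CityscapesInstanceEvaluator  # was: CityscapesEvaluator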

inference / training only based on bounding boxes

Hi,

according to the results listed in the paper for the COCO dataset, CenterMask is among the current state-of-the-art architectures for object detection. For my use case I only have bounding-box-labeled training data. I'm wondering: is it possible to train CenterMask on bounding boxes only?

Hope you can provide some information, thank you so much!

Error while running evaluation using "--eval-only" flag

We have trained an instance segmentation model: 6 classes and a V2_99 backbone. The detectron2 and centermask2 documentation is pretty comprehensive, and we were able to run demo.py and visualize the bbox and mask predictions without any issues.

However, we haven't been able to run the evaluation and get a quantitative assessment of the model thus far. This is what we did and observed:

We ran train_net.py with the --eval-only flag on cfg.DATASETS.TEST, which has 240 images. The evaluation process executed. We printed the predictions and could see multiple instances predicted in each of the images. However, when the process reached image_id 5, it threw an error, despite the fact that the model had predicted 22 instances in that image. There is nothing special about the image - everything is the same as the first set of images on which the evaluation ran. Here are the details and output:

Model loading:
[screenshot]

Dataset loading:
[screenshot]

The evaluation runs fine for the first 4 images but starts to give the following error after that:

[error screenshots]

Can you please help us with this issue?

How to export CenterMask to ONNX?

Hi, because CenterMask is built on detectron2, I'm a little confused about how to export the model to ONNX. Hope you can give some help, thank you so much!

Issues with pretrained weights

  • Dimensions mismatch
Config '/opt/detectron2/projects/CenterMask2/configs/centermask/centermask_lite_V_19_slim_dw_eSE_FPN_ms_4x.yaml' has no VERSION. 
Assuming it to be compatible with latest v2.

Pretrained weights on coco dataset found 
(at `https://dl.dropbox.com/s/tczecsdxt10uai5/vsvhwtqm6ko1c7m/centermask-lite-V-19-eSE-slim-dw-FPN-ms-4x.pth`).
'backbone.fpn_lateral3.weight' has shape (256, 512, 1, 1) in the checkpoint but (128, 256, 1, 1) in the model! Skipped.
'backbone.fpn_lateral3.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'backbone.fpn_output3.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'backbone.fpn_output3.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'backbone.fpn_lateral4.weight' has shape (256, 768, 1, 1) in the checkpoint but (128, 384, 1, 1) in the model! Skipped.
'backbone.fpn_lateral4.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'backbone.fpn_output4.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'backbone.fpn_output4.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'backbone.fpn_lateral5.weight' has shape (256, 1024, 1, 1) in the checkpoint but (128, 512, 1, 1) in the model! Skipped.
'backbone.fpn_lateral5.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'backbone.fpn_output5.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'backbone.fpn_output5.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'backbone.top_block.p6.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'backbone.top_block.p6.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'backbone.top_block.p7.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'backbone.top_block.p7.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'backbone.bottom_up.stage2.OSA2_1.concat.OSA2_1_concat/conv.weight' has shape (256, 768, 1, 1) in the checkpoint but (112, 256, 1, 1) in the model! Skipped.
'backbone.bottom_up.stage2.OSA2_1.concat.OSA2_1_concat/norm.weight' has shape (256,) in the checkpoint but (112,) in the model! Skipped.
'backbone.bottom_up.stage2.OSA2_1.concat.OSA2_1_concat/norm.bias' has shape (256,) in the checkpoint but (112,) in the model! Skipped.
'backbone.bottom_up.stage2.OSA2_1.concat.OSA2_1_concat/norm.running_mean' has shape (256,) in the checkpoint but (112,) in the model! Skipped.
'backbone.bottom_up.stage2.OSA2_1.concat.OSA2_1_concat/norm.running_var' has shape (256,) in the checkpoint but (112,) in the model! Skipped.
'backbone.bottom_up.stage2.OSA2_1.ese.fc.weight' has shape (256, 256, 1, 1) in the checkpoint but (112, 112, 1, 1) in the model! Skipped.
'backbone.bottom_up.stage2.OSA2_1.ese.fc.bias' has shape (256,) in the checkpoint but (112,) in the model! Skipped.
'backbone.bottom_up.stage3.OSA3_1.concat.OSA3_1_concat/conv.weight' has shape (512, 1056, 1, 1) in the checkpoint but (256, 352, 1, 1) in the model! Skipped.
'backbone.bottom_up.stage3.OSA3_1.concat.OSA3_1_concat/norm.weight' has shape (512,) in the checkpoint but (256,) in the model! Skipped.
'backbone.bottom_up.stage3.OSA3_1.concat.OSA3_1_concat/norm.bias' has shape (512,) in the checkpoint but (256,) in the model! Skipped.
'backbone.bottom_up.stage3.OSA3_1.concat.OSA3_1_concat/norm.running_mean' has shape (512,) in the checkpoint but (256,) in the model! Skipped.
'backbone.bottom_up.stage3.OSA3_1.concat.OSA3_1_concat/norm.running_var' has shape (512,) in the checkpoint but (256,) in the model! Skipped.
'backbone.bottom_up.stage3.OSA3_1.ese.fc.weight' has shape (512, 512, 1, 1) in the checkpoint but (256, 256, 1, 1) in the model! Skipped.
'backbone.bottom_up.stage3.OSA3_1.ese.fc.bias' has shape (512,) in the checkpoint but (256,) in the model! Skipped.
'backbone.bottom_up.stage4.OSA4_1.concat.OSA4_1_concat/conv.weight' has shape (768, 1472, 1, 1) in the checkpoint but (384, 544, 1, 1) in the model! Skipped.
'backbone.bottom_up.stage4.OSA4_1.concat.OSA4_1_concat/norm.weight' has shape (768,) in the checkpoint but (384,) in the model! Skipped.
'backbone.bottom_up.stage4.OSA4_1.concat.OSA4_1_concat/norm.bias' has shape (768,) in the checkpoint but (384,) in the model! Skipped.
'backbone.bottom_up.stage4.OSA4_1.concat.OSA4_1_concat/norm.running_mean' has shape (768,) in the checkpoint but (384,) in the model! Skipped.
'backbone.bottom_up.stage4.OSA4_1.concat.OSA4_1_concat/norm.running_var' has shape (768,) in the checkpoint but (384,) in the model! Skipped.
'backbone.bottom_up.stage4.OSA4_1.ese.fc.weight' has shape (768, 768, 1, 1) in the checkpoint but (384, 384, 1, 1) in the model! Skipped.
'backbone.bottom_up.stage4.OSA4_1.ese.fc.bias' has shape (768,) in the checkpoint but (384,) in the model! Skipped.
'backbone.bottom_up.stage5.OSA5_1.concat.OSA5_1_concat/conv.weight' has shape (1024, 1888, 1, 1) in the checkpoint but (512, 720, 1, 1) in the model! Skipped.
'backbone.bottom_up.stage5.OSA5_1.concat.OSA5_1_concat/norm.weight' has shape (1024,) in the checkpoint but (512,) in the model! Skipped.
'backbone.bottom_up.stage5.OSA5_1.concat.OSA5_1_concat/norm.bias' has shape (1024,) in the checkpoint but (512,) in the model! Skipped.
'backbone.bottom_up.stage5.OSA5_1.concat.OSA5_1_concat/norm.running_mean' has shape (1024,) in the checkpoint but (512,) in the model! Skipped.
'backbone.bottom_up.stage5.OSA5_1.concat.OSA5_1_concat/norm.running_var' has shape (1024,) in the checkpoint but (512,) in the model! Skipped.
'backbone.bottom_up.stage5.OSA5_1.ese.fc.weight' has shape (1024, 1024, 1, 1) in the checkpoint but (512, 512, 1, 1) in the model! Skipped.
'backbone.bottom_up.stage5.OSA5_1.ese.fc.bias' has shape (1024,) in the checkpoint but (512,) in the model! Skipped.
'proposal_generator.fcos_head.cls_tower.0.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'proposal_generator.fcos_head.cls_tower.0.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.cls_tower.1.weight' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.cls_tower.1.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.cls_tower.3.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'proposal_generator.fcos_head.cls_tower.3.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.cls_tower.4.weight' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.cls_tower.4.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.bbox_tower.0.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'proposal_generator.fcos_head.bbox_tower.0.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.bbox_tower.1.weight' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.bbox_tower.1.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.bbox_tower.3.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'proposal_generator.fcos_head.bbox_tower.3.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.bbox_tower.4.weight' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.bbox_tower.4.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'proposal_generator.fcos_head.cls_logits.weight' has shape (80, 256, 3, 3) in the checkpoint but (80, 128, 3, 3) in the model! Skipped.
'proposal_generator.fcos_head.bbox_pred.weight' has shape (4, 256, 3, 3) in the checkpoint but (4, 128, 3, 3) in the model! Skipped.
'proposal_generator.fcos_head.ctrness.weight' has shape (1, 256, 3, 3) in the checkpoint but (1, 128, 3, 3) in the model! Skipped.
'roi_heads.mask_head.mask_fcn1.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'roi_heads.mask_head.mask_fcn1.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'roi_heads.mask_head.mask_fcn2.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'roi_heads.mask_head.mask_fcn2.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'roi_heads.mask_head.deconv.weight' has shape (256, 256, 2, 2) in the checkpoint but (128, 128, 2, 2) in the model! Skipped.
'roi_heads.mask_head.deconv.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'roi_heads.mask_head.predictor.weight' has shape (80, 256, 1, 1) in the checkpoint but (80, 128, 1, 1) in the model! Skipped.
'roi_heads.maskiou_head.maskiou_fcn1.weight' has shape (256, 257, 3, 3) in the checkpoint but (128, 129, 3, 3) in the model! Skipped.
'roi_heads.maskiou_head.maskiou_fcn1.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'roi_heads.maskiou_head.maskiou_fcn2.weight' has shape (256, 256, 3, 3) in the checkpoint but (128, 128, 3, 3) in the model! Skipped.
'roi_heads.maskiou_head.maskiou_fcn2.bias' has shape (256,) in the checkpoint but (128,) in the model! Skipped.
'roi_heads.maskiou_head.maskiou_fc1.weight' has shape (1024, 12544) in the checkpoint but (1024, 6272) in the model! Skipped.

This mismatch only happens with the lite and slim pretrained weights.

  • CenterMask V-99

Something is not working with the pretrained weights of centermask2-V-99-eSE-FPN-ms-3x. The model doesn't detect cars in this image:

[image of multiple cars]

while centermask2-V-39-eSE-FPN-ms-3x and centermask2-V-57-eSE-FPN-ms-3x detect cars correctly.

The problem of MaskIoU occupying excessive memory

I am training centermask2 with a resnest50 backbone on a single 1080 GPU. I find that if I set MASK_IOU = False, IMS_PER_BATCH can go up to 4. However, IMS_PER_BATCH is limited to 2 with MASK_IOU == True. Is this phenomenon normal? Does anyone know what causes this?

NUM_CLASSES on custom dataset?

I'm trying to change the number of classes the network can detect. Using the exact same pipeline with another network I can get results, but with CenterMask I get nothing at all. It looks like the network stays on the 80 COCO classes.

I've changed all the cfg keys containing NUM_CLASSES:

cfg.MODEL.ROI_HEADS.NUM_CLASSES = num_classes
cfg.MODEL.SEM_SEG_HEAD.NUM_CLASSES = num_classes
cfg.MODEL.RETINANET.NUM_CLASSES = num_classes
cfg.MODEL.FCOS.NUM_CLASSES = num_classes

I guess I'm missing something?

Error using pretrained weights to train on a one-class dataset

I want to use this weight to fine-tune on my dataset; I use 2 GPUs.
The weight I use is centermask-V2-99-FPN-ms-3x.pth.
What I changed is as follows:

  1. In centermask_V_99_eSE_FPN_ms_3x.yaml, changed MAX_ITER=300000 and changed MODEL.WEIGHT to the path of centermask-V2-99-FPN-ms-3x.pth.
  2. In Base-CenterMask-VoVNet.yaml, changed IMS_PER_BATCH: 2, BASE_LR: 0.002.
  3. Registered my dataset, which has only one class, and adjusted the import from .dataset_mapper import DatasetMapperWithBasis because it raised an error.
  4. In defaults.py, changed _C.MODEL.FCOS.NUM_CLASSES = 1.

Then I ran train_net.py, and it shows:
Traceback (most recent call last):
  File "/home/ql-b423/Desktop/TXH/CenterMask/new/centermask2/train_net.py", line 233, in <module>
    args=(args,),
  File "/home/ql-b423/anaconda3/envs/Kiruto/lib/python3.6/site-packages/detectron2/engine/launch.py", line 49, in launch
    daemon=False,
  File "/home/ql-b423/anaconda3/envs/Kiruto/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
    while not spawn_context.join():
  File "/home/ql-b423/anaconda3/envs/Kiruto/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception: 

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/ql-b423/anaconda3/envs/Kiruto/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/home/ql-b423/anaconda3/envs/Kiruto/lib/python3.6/site-packages/detectron2/engine/launch.py", line 84, in _distributed_worker
    main_func(*args)
  File "/home/ql-b423/Desktop/TXH/CenterMask/new/centermask2/train_net.py", line 218, in main
    return trainer.train()
  File "/home/ql-b423/Desktop/TXH/CenterMask/new/centermask2/train_net.py", line 97, in train
    self.train_loop(self.start_iter, self.max_iter)
  File "/home/ql-b423/Desktop/TXH/CenterMask/new/centermask2/train_net.py", line 86, in train_loop
    self.run_step()
  File "/home/ql-b423/anaconda3/envs/Kiruto/lib/python3.6/site-packages/detectron2/engine/train_loop.py", line 234, in run_step
    self.optimizer.step()
  File "/home/ql-b423/anaconda3/envs/Kiruto/lib/python3.6/site-packages/torch/optim/lr_scheduler.py", line 66, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/ql-b423/anaconda3/envs/Kiruto/lib/python3.6/site-packages/torch/optim/sgd.py", line 100, in step
    buf.mul_(momentum).add_(1 - dampening, d_p)
RuntimeError: The size of tensor a (3) must match the size of tensor b (512) at non-singleton dimension 1


Process finished with exit code 1

Could you tell me what I am doing wrong?

loss_fcos_ctr is always around 0.6

These days I have trained the CenterMask (lite_mv2 and vov99) models on COCO and on my own dataset. However, training does not make loss_fcos_ctr go down; it stays around 0.6, and the total loss even gets to about 0.7. Both training experiments gave the same result, and I checked my data again to make sure it is correct. I don't know how to solve this problem. Has anyone else run into it?

Enable support for RLE encoded segmentations

Hi there! Thanks for the amazing repository.

I tried training my model with custom data (following detectron2 guidelines) with annotation masks encoded in RLE. I also used cfg.INPUT.MASK_FORMAT='bitmask'. All the detectron2 machinery works - loading the data, building the model, etc. - but the code crashes when the model first tries to evaluate the losses. Attached is (the relevant part of) the error message, from which I would deduce that somewhere in the code a conversion of masks from BitMasks to Polygons is needed.

Traceback (most recent call last):
 ...

  File "../centermask/modeling/centermask/center_heads.py", line 401, in forward
    losses, mask_features, selected_mask, labels, maskiou_targets = self._forward_mask(features, proposals)
  File "../centermask/modeling/centermask/center_heads.py", line 476, in _forward_mask
    loss, selected_mask, labels, maskiou_targets = mask_rcnn_loss(mask_logits, proposals, self.maskiou_on)
  File "../centermask/modeling/centermask/mask_head.py", line 80, in mask_rcnn_loss
    cropped_mask = crop(instances_per_image.gt_masks.polygons, instances_per_image.proposal_boxes.tensor)
AttributeError: 'BitMasks' object has no attribute 'polygons'
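One hedged workaround, if centermask2's polygon-based mask_rcnn_loss is kept unchanged: convert each RLE annotation to polygon format before registering the dataset, and leave cfg.INPUT.MASK_FORMAT at its default "polygon". A minimal sketch, assuming COCO-compressed RLE dicts (the function name and thresholds are illustrative):

import numpy as np
import cv2
import pycocotools.mask as mask_util

def rle_to_polygons(rle):
    """Illustrative: decode a compressed RLE mask and trace its outer contours."""
    m = mask_util.decode(rle).astype(np.uint8)  # HxW binary mask
    contours, _ = cv2.findContours(m, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Keep contours with at least 3 points (6 coordinates), as COCO polygons require.
    return [c.flatten().astype(float).tolist() for c in contours if c.size >= 6]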

Three undefined names

flake8 testing of https://github.com/youngwanLEE/centermask2 on Python 3.8.1

$ flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics

./centermask/modeling/backbone/vovnet.py:283:17: F821 undefined name 'freeze_bn_params'
                freeze_bn_params(m)
                ^
./centermask/modeling/backbone/vovnet.py:343:19: F821 undefined name 'LastLevelMaxPool'
        top_block=LastLevelMaxPool(),
                  ^
./centermask/modeling/centermask/mask_head.py:102:62: F821 undefined name 'labels'
            selected_mask = pred_mask_logits[selected_index, labels]
                                                             ^
3     F821 undefined name 'freeze_bn_params'
3

https://flake8.pycqa.org/en/latest/user/error-codes.html

On the flake8 test selection: this PR does not focus on "style violations" (the majority of flake8 error codes that psf/black can autocorrect). Instead these tests focus on runtime safety and correctness:

  • E9 tests are about Python syntax errors usually raised because flake8 can not build an Abstract Syntax Tree (AST). Often these issues are a sign of unused code or code that has not been ported to Python 3. These would be compile-time errors in a compiled language but in a dynamic language like Python they result in the script halting/crashing on the user.
  • F63 tests are usually about the confusion between identity and equality in Python. Use ==/!= to compare str, bytes, and int literals is the classic case. These are areas where a == b is True but a is b is False (or vice versa). Python >= 3.8 will raise SyntaxWarnings on these instances.
  • F7 tests logic errors and syntax errors in type hints
  • F82 tests are almost always undefined names which are usually a sign of a typo, missing imports, or code that has not been ported to Python 3. These also would be compile-time errors in a compiled language but in Python a NameError is raised which will halt/crash the script on the user.
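Of the three F821 findings above, the LastLevelMaxPool one has a straightforward fix, since that class ships with detectron2; the other two (freeze_bn_params, labels) look like repo-local typos that need author input. A hedged sketch of the import fix:

# In centermask/modeling/backbone/vovnet.py: LastLevelMaxPool comes from
# detectron2's FPN module.
from detectron2.modeling.backbone.fpn import LastLevelMaxPool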

Example for a simple inference

Hey,

Could anyone point me to a simple inference example for one of the models? I cannot get demo/demo.py or the evaluation example to work.

Thank you.
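A minimal single-image inference sketch, assuming a centermask2 checkout as the working directory and a downloaded checkpoint (paths and the threshold are illustrative):

import cv2
from detectron2.engine import DefaultPredictor
from centermask.config import get_cfg  # CenterMask2's extended config

cfg = get_cfg()
cfg.merge_from_file("configs/centermask/centermask_V_39_eSE_FPN_ms_3x.yaml")
cfg.MODEL.WEIGHTS = "centermask2-V-39-eSE-FPN-ms-3x.pth"  # downloaded checkpoint
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.freeze()

predictor = DefaultPredictor(cfg)
img = cv2.imread("input.jpg")  # BGR, as detectron2's DefaultPredictor expects
outputs = predictor(img)
instances = outputs["instances"].to("cpu")
print(instances.pred_classes, instances.scores)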

total_loss: 0.6 Bad results

@youngwanLEE

  • I trained both centermask_V_39_eSE_FPN_ms_3x and centermask_lite_V_19_slim_dw_eSE_FPN_ms_4x
  • After ~70,000 iterations, the results were bad, and the total_loss doesn't stabilize below 0.6

How to reproduce:

Both V19 and V39 get bad results, and the total_loss won't decline below ~0.6.

Samples:
[sample inference images]

Total loss doesn't go down below 1.2 for small objects

I'm training the model for small objects, currently using the ResNet backbone, and the loss doesn't go below 1.2. I also tried the VoVNet backbone; same story.

Logs from training:
loss_fcos_loc: 0.175 loss_fcos_ctr: 0.633 time: 0.6485 data_time: 0.0014 lr: 0.000050 max_mem: 4282M
[05/25 12:13:24 d2.utils.events]: eta: 4:25:05 iter: 5019 total_loss: 1.281 loss_mask: 0.393 loss_maskiou: 0.005 loss_fcos_cls: 0.105 loss_fcos_loc: 0.168 loss_fcos_ctr: 0.621 time: 0.6485 data_time: 0.0014 lr: 0.000050 max_mem: 4282M
[05/25 12:13:36 d2.utils.events]: eta: 4:24:51 iter: 5039 total_loss: 1.318 loss_mask: 0.338 loss_maskiou: 0.005 loss_fcos_cls: 0.117 loss_fcos_loc: 0.190 loss_fcos_ctr: 0.617 time: 0.6485 data_time: 0.0014 lr: 0.000050 max_mem: 4282M
[05/25 12:13:49 d2.utils.events]: eta: 4:24:26 iter: 5059 total_loss: 1.216 loss_mask: 0.375 loss_maskiou: 0.005 loss_fcos_cls: 0.097 loss_fcos_loc: 0.148 loss_fcos_ctr: 0.627 time: 0.6485 data_time: 0.0014 lr: 0.000050 max_mem: 4282M

Do you have any idea why this is happening?

Assertion error while running demo.py

(torchtest) C:\Users\User\Downloads\centermask2\demo>python demo.py
[03/11 10:18:37 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='C:/Users/User/Downloads/centermask2/demo/detectron2/configs/quick_schedules/fast_rcnn_R_50_FPN_inference_acc_test.yaml', input='C:/Users/User/Downloads/CenterMask-master/demo/images', opts=[], output='C:/Users/User/Downloads/centermask2/output', video_input=None, webcam=False)
0%| | 0/134 [00:00<?, ?it/s]Process _PredictWorker-1:
Traceback (most recent call last):
File "C:\Users\User\Anaconda3\envs\torchtest\lib\multiprocessing\process.py", line 297, in _bootstrap
self.run()
File "C:\Users\User\Downloads\centermask2\demo\predictor.py", line 179, in run
result = predictor(data)
File "C:\Users\User\Anaconda3\envs\torchtest\lib\site-packages\detectron2-0.1.1-py3.7-win-amd64.egg\detectron2\engine\defaults.py", line 196, in call
predictions = self.model([inputs])[0]
File "C:\Users\User\Anaconda3\envs\torchtest\lib\site-packages\torch\nn\modules\module.py", line 532, in call
result = self.forward(*input, **kwargs)
File "C:\Users\User\Anaconda3\envs\torchtest\lib\site-packages\detectron2-0.1.1-py3.7-win-amd64.egg\detectron2\modeling\meta_arch\rcnn.py", line 108, in forward
return self.inference(batched_inputs)
File "C:\Users\User\Anaconda3\envs\torchtest\lib\site-packages\detectron2-0.1.1-py3.7-win-amd64.egg\detectron2\modeling\meta_arch\rcnn.py", line 167, in inference
assert "proposals" in batched_inputs[0]
AssertionError
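A hedged reading of this trace: the Namespace above shows a detectron2 quick-schedules Fast R-CNN config (fast_rcnn_R_50_FPN_inference_acc_test.yaml); Fast R-CNN models have no proposal generator and expect precomputed proposals in the inputs, which is exactly the failing assert. Pointing --config-file at a CenterMask YAML should avoid that code path, e.g.:

python demo.py --config-file ../configs/centermask/centermask_lite_V_39_eSE_FPN_ms_4x.yaml --input C:/Users/User/Downloads/CenterMask-master/demo/images --output C:/Users/User/Downloads/centermask2/output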

No instances detected on input image while running demo.py

I ran the following command to run a demo:
"python demo.py --input trio.jpg --output results/ --config-file ../configs/centermask/centermask_V_99_eSE_FPN_ms_3x.yaml"
But I get the following:
[03/17 07:05:17 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='../configs/centermask/Base-CenterMask-VoVNet.yaml', input=['trio.jpg'], opts=[], output='results/', video_input=None, webcam=False)
WARNING [03/17 07:05:17 d2.config.compat]: Config '../configs/centermask/Base-CenterMask-VoVNet.yaml' has no VERSION. Assuming it to be compatible with latest v2.
0% 0/1 [00:00<?, ?it/s][03/17 07:05:22 detectron2]: trio.jpg: detected 0 instances in 0.16s
100% 1/1 [00:00<00:00, 2.61it/s]
The output is the same as the input; there are no segmentation masks painted on the result image.
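One plausible cause (an assumption, not confirmed in the thread): no MODEL.WEIGHTS override is passed, so the model runs without the downloaded CenterMask checkpoint and predicts nothing above the confidence threshold. The demo accepts overrides via --opts, e.g.:

python demo.py --input trio.jpg --output results/ --config-file ../configs/centermask/centermask_V_99_eSE_FPN_ms_3x.yaml --opts MODEL.WEIGHTS centermask2-V-99-eSE-FPN-ms-3x.pth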
