
jfzhang95 / pytorch-deeplab-xception

DeepLab v3+ model in PyTorch. Supports different backbones.

License: MIT License

Python 99.69% Shell 0.31%
deeplab-v3-plus pytorch xception resnet mobilenetv2 drn

pytorch-deeplab-xception's Introduction

pytorch-deeplab-xception

Update on 2018/12/06. Provided a model trained on the VOC and SBD datasets.

Update on 2018/11/24. Released the newest version of the code, which fixes some previous issues and adds support for new backbones and multi-GPU training. For the previous code, please see the previous branch.

TODO

  • Support different backbones
  • Support VOC, SBD, Cityscapes and COCO datasets
  • Multi-GPU training

Backbone  | train/eval os | mIoU in val | Pretrained Model
ResNet    | 16/16         | 78.43%      | google drive
MobileNet | 16/16         | 70.81%      | google drive
DRN       | 16/16         | 78.87%      | google drive

Introduction

This is a PyTorch (0.4.1) implementation of DeepLab-V3-Plus. It can use Modified Aligned Xception and ResNet as the backbone. Currently, we train DeepLab V3 Plus using the Pascal VOC 2012, SBD and Cityscapes datasets.

Results

Installation

The code was tested with Anaconda and Python 3.6. After installing the Anaconda environment:

  1. Clone the repo:

    git clone https://github.com/jfzhang95/pytorch-deeplab-xception.git
    cd pytorch-deeplab-xception
  2. Install dependencies:

    For PyTorch dependency, see pytorch.org for more details.

    For custom dependencies:

    pip install matplotlib pillow tensorboardX tqdm

Training

Follow the steps below to train your model:

  1. Configure your dataset path in mypath.py.

  2. Input arguments (run python train.py --help for the full list):

    usage: train.py [-h] [--backbone {resnet,xception,drn,mobilenet}]
                [--out-stride OUT_STRIDE] [--dataset {pascal,coco,cityscapes}]
                [--use-sbd] [--workers N] [--base-size BASE_SIZE]
                [--crop-size CROP_SIZE] [--sync-bn SYNC_BN]
                [--freeze-bn FREEZE_BN] [--loss-type {ce,focal}] [--epochs N]
                [--start_epoch N] [--batch-size N] [--test-batch-size N]
                [--use-balanced-weights] [--lr LR]
                [--lr-scheduler {poly,step,cos}] [--momentum M]
                [--weight-decay M] [--nesterov] [--no-cuda]
                [--gpu-ids GPU_IDS] [--seed S] [--resume RESUME]
                [--checkname CHECKNAME] [--ft] [--eval-interval EVAL_INTERVAL]
                [--no-val]
    
  3. To train deeplabv3+ using the Pascal VOC dataset and ResNet as the backbone:

    bash train_voc.sh
  4. To train deeplabv3+ using the COCO dataset and ResNet as the backbone:

    bash train_coco.sh
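
As a rough illustration of steps 1 and 2, the sketch below shows how a dataset root lookup and a subset of the command-line options listed above might fit together. It is not the repository's train.py or mypath.py: the DATASET_ROOTS mapping, the dataset_root helper, the placeholder paths and every default value are assumptions made for this example. Only the flag names and their choices come from the usage text.

```python
import argparse

# Hypothetical dataset roots. In the repository this configuration lives in
# mypath.py; its real layout is not shown here, so these paths are placeholders.
DATASET_ROOTS = {
    "pascal": "/path/to/VOCdevkit/VOC2012/",
    "coco": "/path/to/coco/",
    "cityscapes": "/path/to/cityscapes/",
}


def dataset_root(name):
    """Return the configured root directory for a dataset name."""
    if name not in DATASET_ROOTS:
        raise NotImplementedError("Dataset {} is not configured.".format(name))
    return DATASET_ROOTS[name]


def build_parser():
    """Build a parser for a subset of the options in the usage text.

    All default values below are assumptions, not the repository's defaults.
    """
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("--backbone", default="resnet",
                        choices=["resnet", "xception", "drn", "mobilenet"])
    parser.add_argument("--out-stride", type=int, default=16)
    parser.add_argument("--dataset", default="pascal",
                        choices=["pascal", "coco", "cityscapes"])
    parser.add_argument("--use-sbd", action="store_true")
    parser.add_argument("--epochs", type=int, default=None, metavar="N")
    parser.add_argument("--batch-size", type=int, default=None, metavar="N")
    parser.add_argument("--lr", type=float, default=None, metavar="LR")
    parser.add_argument("--lr-scheduler", default="poly",
                        choices=["poly", "step", "cos"])
    parser.add_argument("--gpu-ids", type=str, default="0")
    parser.add_argument("--checkname", type=str, default=None)
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(
    ["--backbone", "resnet", "--dataset", "pascal",
     "--batch-size", "16", "--gpu-ids", "0,1"]
)
# A comma-separated --gpu-ids string is split into integer device ids.
gpu_ids = [int(s) for s in args.gpu_ids.split(",")]
print(args.backbone, args.dataset, args.batch_size, gpu_ids,
      dataset_root(args.dataset))
```

argparse turns a dashed flag such as --batch-size into the attribute args.batch_size, which is why the usage text and the code spell the names differently.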

Acknowledgement

PyTorch-Encoding

Synchronized-BatchNorm-PyTorch

drn

pytorch-deeplab-xception's Issues

Unable to reproduce result on VOC with small batch size

Hi, thanks for releasing this great repo. I have a problem reproducing the result on the VOC dataset.

I noticed that your released pretrained model with ResNet | 16/16 | gives 78.43%. But when I trained with a batch size of 2 (because I only have one GPU), the mIoU is really bad. I'm wondering whether a large batch size is necessary when training deeplabv3+.

Why is DW not followed by BN in Xception?

I have some questions about the network architecture.

  1. Why is there no BN after the depthwise (DW) convolution in the Xception backbone?
  2. In the ResNet backbone, when os=8, the dilation rate of all convs in block3 should be 2, but only the first conv layer's rate is set to 2.
  3. For os=16, when ResNet has only 4 blocks, blocks[1,2,4] is a better choice. But I don't think blocks[1, 2, 1] for os=8 is the right choice.
  4. Maybe more options for the ResNet structure should be added, for example the 7 blocks proposed in the DeepLabv3 paper.
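
To make the relationship between output stride (os) and dilation in questions 2 and 3 concrete, here is a minimal sketch of the usual atrous bookkeeping: once the target output stride is reached, later stages keep stride 1 and their dilation is multiplied by the stride they would have applied. The function names, the four-stage layout [1, 2, 2, 2] and the stem stride of 4 are assumptions for a standard ResNet, not code from this repository. The multi-grid helper follows the DeepLabv3 paper's rule that each unit's rate is the stage rate times the grid value.

```python
def atrous_schedule(stage_strides, output_stride, stem_stride=4):
    """Convert per-stage strides into (strides, dilations) for a target output stride.

    stage_strides: stride each stage would apply in the plain classification net.
    stem_stride:   downsampling already applied before the first stage (assumed 4).
    """
    current_stride = stem_stride
    dilation = 1
    strides, dilations = [], []
    for s in stage_strides:
        if current_stride >= output_stride:
            # Target resolution reached: stop downsampling, dilate instead.
            dilation *= s
            strides.append(1)
        else:
            current_stride *= s
            strides.append(s)
        dilations.append(dilation)
    return strides, dilations


def unit_rates(stage_dilation, multi_grid):
    """Per-unit atrous rates inside one stage: stage rate times each grid value."""
    return [stage_dilation * g for g in multi_grid]


resnet_stages = [1, 2, 2, 2]  # assumed strides of the four residual stages
print(atrous_schedule(resnet_stages, 16))  # ([1, 2, 2, 1], [1, 1, 1, 2])
print(atrous_schedule(resnet_stages, 8))   # ([1, 2, 1, 1], [1, 1, 2, 4])
print(unit_rates(2, [1, 2, 4]))            # [2, 4, 8]
```

Under these assumptions, os=8 gives the third stage a dilation of 2 and the last stage a dilation of 4, which is the configuration question 2 refers to.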

Xception backbone mIoU low

I can get mIoU = 74.4% with the ResNet backbone after 50 epochs,
but I still get a low mIoU when I train with Xception as the backbone.

Skip Connection Middle Flow

Using the condition if planes != inplanes or stride != 1: doesn't create a skip connection for the middle flow.
It should create a skip connection as in the original paper, or am I missing something?
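
One possible reading of that condition, which the issue text does not confirm, is that it only decides whether the shortcut needs a projection, while the forward pass still adds the input back as an identity shortcut when no projection exists. The toy sketch below uses plain floats instead of tensors. ToyBlock, its body and its skip lambdas are hypothetical stand-ins, not the repository's Block class.

```python
class ToyBlock:
    """Toy residual block on floats; stands in for a conv block on tensors."""

    def __init__(self, inplanes, planes, stride=1):
        # Stand-in for the separable-conv stack of the block.
        self.body = lambda x: 0.5 * x
        if planes != inplanes or stride != 1:
            # Shapes differ, so the shortcut needs a projection
            # (a 1x1 conv plus BN in a real network; here a scaling).
            self.skip = lambda x: 2.0 * x
        else:
            # Shapes match: no projection module is created.
            self.skip = None

    def forward(self, x):
        # With no projection, the input itself is the shortcut (identity skip).
        shortcut = x if self.skip is None else self.skip(x)
        return self.body(x) + shortcut


middle = ToyBlock(inplanes=728, planes=728, stride=1)  # middle-flow style block
entry = ToyBlock(inplanes=128, planes=256, stride=2)   # entry-flow style block
print(middle.skip is None, middle.forward(4.0))  # True 6.0
print(entry.skip is None, entry.forward(4.0))    # False 10.0
```

If the forward pass in the actual code instead drops the shortcut when skip is None, the middle flow would indeed have no residual connection, so the forward method is the place to check.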

A problem loading weights

load trained weights
weightPath: /media/files/pytorch-deeplab-xception-v2/run/deeplab-xception/model_best.pth.tar
Traceback (most recent call last):
File "pre.py", line 65, in <module>
net.load_state_dict(torch.load(weightPath))
File "/home/Software/anaconda3/envs/torch-py3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
Missing key(s) in state_dict: "module.backbone.conv1.weight", "module.backbone.bn1.weight", "module.backbone.bn1.bias", "module.backbone.bn1.running_mean", "module.backbone.bn1.running_var", "module.backbone.conv2.weight", "module.backbone.bn2.weight", "module.backbone.bn2.bias", "module.backbone.bn2.running_mean", "module.backbone.bn2.running_var", "module.backbone.block1.skip.weight", "module.backbone.block1.skipbn.weight", "module.backbone.block1.skipbn.bias", "module.backbone.block1.skipbn.running_mean", "module.backbone.block1.skipbn.running_var", "module.backbone.block1.rep.0.conv1.weight", "module.backbone.block1.rep.0.bn.weight", "module.backbone.block1.rep.0.bn.bias", "module.backbone.block1.rep.0.bn.running_mean", "module.backbone.block1.rep.0.bn.running_var", "module.backbone.block1.rep.0.pointwise.weight", "module.backbone.block1.rep.1.weight", "module.backbone.block1.rep.1.bias", "module.backbone.block1.rep.1.running_mean", "module.backbone.block1.rep.1.running_var", "module.backbone.block1.rep.3.conv1.weight", "module.backbone.block1.rep.3.bn.weight", "module.backbone.block1.rep.3.bn.bias", "module.backbone.block1.rep.3.bn.running_mean", "module.backbone.block1.rep.3.bn.running_var", "module.backbone.block1.rep.3.pointwise.weight", "module.backbone.block1.rep.4.weight", "module.backbone.block1.rep.4.bias", "module.backbone.block1.rep.4.running_mean", "module.backbone.block1.rep.4.running_var", "module.backbone.block1.rep.6.conv1.weight", "module.backbone.block1.rep.6.bn.weight", "module.backbone.block1.rep.6.bn.bias", "module.backbone.block1.rep.6.bn.running_mean", "module.backbone.block1.rep.6.bn.running_var", "module.backbone.block1.rep.6.pointwise.weight", "module.backbone.block1.rep.7.weight", "module.backbone.block1.rep.7.bias", "module.backbone.block1.rep.7.running_mean", "module.backbone.block1.rep.7.running_var", "module.backbone.block2.skip.weight", "module.backbone.block2.skipbn.weight", "module.backbone.block2.skipbn.bias", 
"module.backbone.block2.skipbn.running_mean", "module.backbone.block2.skipbn.running_var", "module.backbone.block2.rep.0.conv1.weight", "module.backbone.block2.rep.0.bn.weight", "module.backbone.block2.rep.0.bn.bias", "module.backbone.block2.rep.0.bn.running_mean", "module.backbone.block2.rep.0.bn.running_var", "module.backbone.block2.rep.0.pointwise.weight", "module.backbone.block2.rep.1.weight", "module.backbone.block2.rep.1.bias", "module.backbone.block2.rep.1.running_mean", "module.backbone.block2.rep.1.running_var", "module.backbone.block2.rep.3.conv1.weight", "module.backbone.block2.rep.3.bn.weight", "module.backbone.block2.rep.3.bn.bias", "module.backbone.block2.rep.3.bn.running_mean", "module.backbone.block2.rep.3.bn.running_var", "module.backbone.block2.rep.3.pointwise.weight", "module.backbone.block2.rep.4.weight", "module.backbone.block2.rep.4.bias", "module.backbone.block2.rep.4.running_mean", "module.backbone.block2.rep.4.running_var", "module.backbone.block2.rep.6.conv1.weight", "module.backbone.block2.rep.6.bn.weight", "module.backbone.block2.rep.6.bn.bias", "module.backbone.block2.rep.6.bn.running_mean", "module.backbone.block2.rep.6.bn.running_var", "module.backbone.block2.rep.6.pointwise.weight", "module.backbone.block2.rep.7.weight", "module.backbone.block2.rep.7.bias", "module.backbone.block2.rep.7.running_mean", "module.backbone.block2.rep.7.running_var", "module.backbone.block3.skip.weight", "module.backbone.block3.skipbn.weight", "module.backbone.block3.skipbn.bias", "module.backbone.block3.skipbn.running_mean", "module.backbone.block3.skipbn.running_var", "module.backbone.block3.rep.1.conv1.weight", "module.backbone.block3.rep.1.bn.weight", "module.backbone.block3.rep.1.bn.bias", "module.backbone.block3.rep.1.bn.running_mean", "module.backbone.block3.rep.1.bn.running_var", "module.backbone.block3.rep.1.pointwise.weight", "module.backbone.block3.rep.2.weight", "module.backbone.block3.rep.2.bias", "module.backbone.block3.rep.2.running_mean", 
"module.backbone.block3.rep.2.running_var", "module.backbone.block3.rep.4.conv1.weight", "module.backbone.block3.rep.4.bn.weight", "module.backbone.block3.rep.4.bn.bias", "module.backbone.block3.rep.4.bn.running_mean", "module.backbone.block3.rep.4.bn.running_var", "module.backbone.block3.rep.4.pointwise.weight", "module.backbone.block3.rep.5.weight", "module.backbone.block3.rep.5.bias", "module.backbone.block3.rep.5.running_mean", "module.backbone.block3.rep.5.running_var", "module.backbone.block3.rep.7.conv1.weight", "module.backbone.block3.rep.7.bn.weight", "module.backbone.block3.rep.7.bn.bias", "module.backbone.block3.rep.7.bn.running_mean", "module.backbone.block3.rep.7.bn.running_var", "module.backbone.block3.rep.7.pointwise.weight", "module.backbone.block3.rep.8.weight", "module.backbone.block3.rep.8.bias", "module.backbone.block3.rep.8.running_mean", "module.backbone.block3.rep.8.running_var", "module.backbone.block4.rep.1.conv1.weight", "module.backbone.block4.rep.1.bn.weight", "module.backbone.block4.rep.1.bn.bias", "module.backbone.block4.rep.1.bn.running_mean", "module.backbone.block4.rep.1.bn.running_var", "module.backbone.block4.rep.1.pointwise.weight", "module.backbone.block4.rep.2.weight", "module.backbone.block4.rep.2.bias", "module.backbone.block4.rep.2.running_mean", "module.backbone.block4.rep.2.running_var", "module.backbone.block4.rep.4.conv1.weight", "module.backbone.block4.rep.4.bn.weight", "module.backbone.block4.rep.4.bn.bias", "module.backbone.block4.rep.4.bn.running_mean", "module.backbone.block4.rep.4.bn.running_var", "module.backbone.block4.rep.4.pointwise.weight", "module.backbone.block4.rep.5.weight", "module.backbone.block4.rep.5.bias", "module.backbone.block4.rep.5.running_mean", "module.backbone.block4.rep.5.running_var", "module.backbone.block4.rep.7.conv1.weight", "module.backbone.block4.rep.7.bn.weight", "module.backbone.block4.rep.7.bn.bias", "module.backbone.block4.rep.7.bn.running_mean", 
"module.backbone.block4.rep.7.bn.running_var", "module.backbone.block4.rep.7.pointwise.weight", "module.backbone.block4.rep.8.weight", "module.backbone.block4.rep.8.bias", "module.backbone.block4.rep.8.running_mean", "module.backbone.block4.rep.8.running_var", "module.backbone.block5.rep.1.conv1.weight", "module.backbone.block5.rep.1.bn.weight", "module.backbone.block5.rep.1.bn.bias", "module.backbone.block5.rep.1.bn.running_mean", "module.backbone.block5.rep.1.bn.running_var", "module.backbone.block5.rep.1.pointwise.weight", "module.backbone.block5.rep.2.weight", "module.backbone.block5.rep.2.bias", "module.backbone.block5.rep.2.running_mean", "module.backbone.block5.rep.2.running_var", "module.backbone.block5.rep.4.conv1.weight", "module.backbone.block5.rep.4.bn.weight", "module.backbone.block5.rep.4.bn.bias", "module.backbone.block5.rep.4.bn.running_mean", "module.backbone.block5.rep.4.bn.running_var", "module.backbone.block5.rep.4.pointwise.weight", "module.backbone.block5.rep.5.weight", "module.backbone.block5.rep.5.bias", "module.backbone.block5.rep.5.running_mean", "module.backbone.block5.rep.5.running_var", "module.backbone.block5.rep.7.conv1.weight", "module.backbone.block5.rep.7.bn.weight", "module.backbone.block5.rep.7.bn.bias", "module.backbone.block5.rep.7.bn.running_mean", "module.backbone.block5.rep.7.bn.running_var", "module.backbone.block5.rep.7.pointwise.weight", "module.backbone.block5.rep.8.weight", "module.backbone.block5.rep.8.bias", "module.backbone.block5.rep.8.running_mean", "module.backbone.block5.rep.8.running_var", "module.backbone.block6.rep.1.conv1.weight", "module.backbone.block6.rep.1.bn.weight", "module.backbone.block6.rep.1.bn.bias", "module.backbone.block6.rep.1.bn.running_mean", "module.backbone.block6.rep.1.bn.running_var", "module.backbone.block6.rep.1.pointwise.weight", "module.backbone.block6.rep.2.weight", "module.backbone.block6.rep.2.bias", "module.backbone.block6.rep.2.running_mean", 
"module.backbone.block6.rep.2.running_var", "module.backbone.block6.rep.4.conv1.weight", "module.backbone.block6.rep.4.bn.weight", "module.backbone.block6.rep.4.bn.bias", "module.backbone.block6.rep.4.bn.running_mean", "module.backbone.block6.rep.4.bn.running_var", "module.backbone.block6.rep.4.pointwise.weight", "module.backbone.block6.rep.5.weight", "module.backbone.block6.rep.5.bias", "module.backbone.block6.rep.5.running_mean", "module.backbone.block6.rep.5.running_var", "module.backbone.block6.rep.7.conv1.weight", "module.backbone.block6.rep.7.bn.weight", "module.backbone.block6.rep.7.bn.bias", "module.backbone.block6.rep.7.bn.running_mean", "module.backbone.block6.rep.7.bn.running_var", "module.backbone.block6.rep.7.pointwise.weight", "module.backbone.block6.rep.8.weight", "module.backbone.block6.rep.8.bias", "module.backbone.block6.rep.8.running_mean", "module.backbone.block6.rep.8.running_var", "module.backbone.block7.rep.1.conv1.weight", "module.backbone.block7.rep.1.bn.weight", "module.backbone.block7.rep.1.bn.bias", "module.backbone.block7.rep.1.bn.running_mean", "module.backbone.block7.rep.1.bn.running_var", "module.backbone.block7.rep.1.pointwise.weight", "module.backbone.block7.rep.2.weight", "module.backbone.block7.rep.2.bias", "module.backbone.block7.rep.2.running_mean", "module.backbone.block7.rep.2.running_var", "module.backbone.block7.rep.4.conv1.weight", "module.backbone.block7.rep.4.bn.weight", "module.backbone.block7.rep.4.bn.bias", "module.backbone.block7.rep.4.bn.running_mean", "module.backbone.block7.rep.4.bn.running_var", "module.backbone.block7.rep.4.pointwise.weight", "module.backbone.block7.rep.5.weight", "module.backbone.block7.rep.5.bias", "module.backbone.block7.rep.5.running_mean", "module.backbone.block7.rep.5.running_var", "module.backbone.block7.rep.7.conv1.weight", "module.backbone.block7.rep.7.bn.weight", "module.backbone.block7.rep.7.bn.bias", "module.backbone.block7.rep.7.bn.running_mean", 
"module.backbone.block7.rep.7.bn.running_var", "module.backbone.block7.rep.7.pointwise.weight", "module.backbone.block7.rep.8.weight", "module.backbone.block7.rep.8.bias", "module.backbone.block7.rep.8.running_mean", "module.backbone.block7.rep.8.running_var", "module.backbone.block8.rep.1.conv1.weight", "module.backbone.block8.rep.1.bn.weight", "module.backbone.block8.rep.1.bn.bias", "module.backbone.block8.rep.1.bn.running_mean", "module.backbone.block8.rep.1.bn.running_var", "module.backbone.block8.rep.1.pointwise.weight", "module.backbone.block8.rep.2.weight", "module.backbone.block8.rep.2.bias", "module.backbone.block8.rep.2.running_mean", "module.backbone.block8.rep.2.running_var", "module.backbone.block8.rep.4.conv1.weight", "module.backbone.block8.rep.4.bn.weight", "module.backbone.block8.rep.4.bn.bias", "module.backbone.block8.rep.4.bn.running_mean", "module.backbone.block8.rep.4.bn.running_var", "module.backbone.block8.rep.4.pointwise.weight", "module.backbone.block8.rep.5.weight", "module.backbone.block8.rep.5.bias", "module.backbone.block8.rep.5.running_mean", "module.backbone.block8.rep.5.running_var", "module.backbone.block8.rep.7.conv1.weight", "module.backbone.block8.rep.7.bn.weight", "module.backbone.block8.rep.7.bn.bias", "module.backbone.block8.rep.7.bn.running_mean", "module.backbone.block8.rep.7.bn.running_var", "module.backbone.block8.rep.7.pointwise.weight", "module.backbone.block8.rep.8.weight", "module.backbone.block8.rep.8.bias", "module.backbone.block8.rep.8.running_mean", "module.backbone.block8.rep.8.running_var", "module.backbone.block9.rep.1.conv1.weight", "module.backbone.block9.rep.1.bn.weight", "module.backbone.block9.rep.1.bn.bias", "module.backbone.block9.rep.1.bn.running_mean", "module.backbone.block9.rep.1.bn.running_var", "module.backbone.block9.rep.1.pointwise.weight", "module.backbone.block9.rep.2.weight", "module.backbone.block9.rep.2.bias", "module.backbone.block9.rep.2.running_mean", 
"module.backbone.block9.rep.2.running_var", "module.backbone.block9.rep.4.conv1.weight", "module.backbone.block9.rep.4.bn.weight", "module.backbone.block9.rep.4.bn.bias", "module.backbone.block9.rep.4.bn.running_mean", "module.backbone.block9.rep.4.bn.running_var", "module.backbone.block9.rep.4.pointwise.weight", "module.backbone.block9.rep.5.weight", "module.backbone.block9.rep.5.bias", "module.backbone.block9.rep.5.running_mean", "module.backbone.block9.rep.5.running_var", "module.backbone.block9.rep.7.conv1.weight", "module.backbone.block9.rep.7.bn.weight", "module.backbone.block9.rep.7.bn.bias", "module.backbone.block9.rep.7.bn.running_mean", "module.backbone.block9.rep.7.bn.running_var", "module.backbone.block9.rep.7.pointwise.weight", "module.backbone.block9.rep.8.weight", "module.backbone.block9.rep.8.bias", "module.backbone.block9.rep.8.running_mean", "module.backbone.block9.rep.8.running_var", "module.backbone.block10.rep.1.conv1.weight", "module.backbone.block10.rep.1.bn.weight", "module.backbone.block10.rep.1.bn.bias", "module.backbone.block10.rep.1.bn.running_mean", "module.backbone.block10.rep.1.bn.running_var", "module.backbone.block10.rep.1.pointwise.weight", "module.backbone.block10.rep.2.weight", "module.backbone.block10.rep.2.bias", "module.backbone.block10.rep.2.running_mean", "module.backbone.block10.rep.2.running_var", "module.backbone.block10.rep.4.conv1.weight", "module.backbone.block10.rep.4.bn.weight", "module.backbone.block10.rep.4.bn.bias", "module.backbone.block10.rep.4.bn.running_mean", "module.backbone.block10.rep.4.bn.running_var", "module.backbone.block10.rep.4.pointwise.weight", "module.backbone.block10.rep.5.weight", "module.backbone.block10.rep.5.bias", "module.backbone.block10.rep.5.running_mean", "module.backbone.block10.rep.5.running_var", "module.backbone.block10.rep.7.conv1.weight", "module.backbone.block10.rep.7.bn.weight", "module.backbone.block10.rep.7.bn.bias", "module.backbone.block10.rep.7.bn.running_mean", 
"module.backbone.block10.rep.7.bn.running_var", "module.backbone.block10.rep.7.pointwise.weight", "module.backbone.block10.rep.8.weight", "module.backbone.block10.rep.8.bias", "module.backbone.block10.rep.8.running_mean", "module.backbone.block10.rep.8.running_var", "module.backbone.block11.rep.1.conv1.weight", "module.backbone.block11.rep.1.bn.weight", "module.backbone.block11.rep.1.bn.bias", "module.backbone.block11.rep.1.bn.running_mean", "module.backbone.block11.rep.1.bn.running_var", "module.backbone.block11.rep.1.pointwise.weight", "module.backbone.block11.rep.2.weight", "module.backbone.block11.rep.2.bias", "module.backbone.block11.rep.2.running_mean", "module.backbone.block11.rep.2.running_var", "module.backbone.block11.rep.4.conv1.weight", "module.backbone.block11.rep.4.bn.weight", "module.backbone.block11.rep.4.bn.bias", "module.backbone.block11.rep.4.bn.running_mean", "module.backbone.block11.rep.4.bn.running_var", "module.backbone.block11.rep.4.pointwise.weight", "module.backbone.block11.rep.5.weight", "module.backbone.block11.rep.5.bias", "module.backbone.block11.rep.5.running_mean", "module.backbone.block11.rep.5.running_var", "module.backbone.block11.rep.7.conv1.weight", "module.backbone.block11.rep.7.bn.weight", "module.backbone.block11.rep.7.bn.bias", "module.backbone.block11.rep.7.bn.running_mean", "module.backbone.block11.rep.7.bn.running_var", "module.backbone.block11.rep.7.pointwise.weight", "module.backbone.block11.rep.8.weight", "module.backbone.block11.rep.8.bias", "module.backbone.block11.rep.8.running_mean", "module.backbone.block11.rep.8.running_var", "module.backbone.block12.rep.1.conv1.weight", "module.backbone.block12.rep.1.bn.weight", "module.backbone.block12.rep.1.bn.bias", "module.backbone.block12.rep.1.bn.running_mean", "module.backbone.block12.rep.1.bn.running_var", "module.backbone.block12.rep.1.pointwise.weight", "module.backbone.block12.rep.2.weight", "module.backbone.block12.rep.2.bias", 
"module.backbone.block12.rep.2.running_mean", "module.backbone.block12.rep.2.running_var", "module.backbone.block12.rep.4.conv1.weight", "module.backbone.block12.rep.4.bn.weight", "module.backbone.block12.rep.4.bn.bias", "module.backbone.block12.rep.4.bn.running_mean", "module.backbone.block12.rep.4.bn.running_var", "module.backbone.block12.rep.4.pointwise.weight", "module.backbone.block12.rep.5.weight", "module.backbone.block12.rep.5.bias", "module.backbone.block12.rep.5.running_mean", "module.backbone.block12.rep.5.running_var", "module.backbone.block12.rep.7.conv1.weight", "module.backbone.block12.rep.7.bn.weight", "module.backbone.block12.rep.7.bn.bias", "module.backbone.block12.rep.7.bn.running_mean", "module.backbone.block12.rep.7.bn.running_var", "module.backbone.block12.rep.7.pointwise.weight", "module.backbone.block12.rep.8.weight", "module.backbone.block12.rep.8.bias", "module.backbone.block12.rep.8.running_mean", "module.backbone.block12.rep.8.running_var", "module.backbone.block13.rep.1.conv1.weight", "module.backbone.block13.rep.1.bn.weight", "module.backbone.block13.rep.1.bn.bias", "module.backbone.block13.rep.1.bn.running_mean", "module.backbone.block13.rep.1.bn.running_var", "module.backbone.block13.rep.1.pointwise.weight", "module.backbone.block13.rep.2.weight", "module.backbone.block13.rep.2.bias", "module.backbone.block13.rep.2.running_mean", "module.backbone.block13.rep.2.running_var", "module.backbone.block13.rep.4.conv1.weight", "module.backbone.block13.rep.4.bn.weight", "module.backbone.block13.rep.4.bn.bias", "module.backbone.block13.rep.4.bn.running_mean", "module.backbone.block13.rep.4.bn.running_var", "module.backbone.block13.rep.4.pointwise.weight", "module.backbone.block13.rep.5.weight", "module.backbone.block13.rep.5.bias", "module.backbone.block13.rep.5.running_mean", "module.backbone.block13.rep.5.running_var", "module.backbone.block13.rep.7.conv1.weight", "module.backbone.block13.rep.7.bn.weight", 
"module.backbone.block13.rep.7.bn.bias", "module.backbone.block13.rep.7.bn.running_mean", "module.backbone.block13.rep.7.bn.running_var", "module.backbone.block13.rep.7.pointwise.weight", "module.backbone.block13.rep.8.weight", "module.backbone.block13.rep.8.bias", "module.backbone.block13.rep.8.running_mean", "module.backbone.block13.rep.8.running_var", "module.backbone.block14.rep.1.conv1.weight", "module.backbone.block14.rep.1.bn.weight", "module.backbone.block14.rep.1.bn.bias", "module.backbone.block14.rep.1.bn.running_mean", "module.backbone.block14.rep.1.bn.running_var", "module.backbone.block14.rep.1.pointwise.weight", "module.backbone.block14.rep.2.weight", "module.backbone.block14.rep.2.bias", "module.backbone.block14.rep.2.running_mean", "module.backbone.block14.rep.2.running_var", "module.backbone.block14.rep.4.conv1.weight", "module.backbone.block14.rep.4.bn.weight", "module.backbone.block14.rep.4.bn.bias", "module.backbone.block14.rep.4.bn.running_mean", "module.backbone.block14.rep.4.bn.running_var", "module.backbone.block14.rep.4.pointwise.weight", "module.backbone.block14.rep.5.weight", "module.backbone.block14.rep.5.bias", "module.backbone.block14.rep.5.running_mean", "module.backbone.block14.rep.5.running_var", "module.backbone.block14.rep.7.conv1.weight", "module.backbone.block14.rep.7.bn.weight", "module.backbone.block14.rep.7.bn.bias", "module.backbone.block14.rep.7.bn.running_mean", "module.backbone.block14.rep.7.bn.running_var", "module.backbone.block14.rep.7.pointwise.weight", "module.backbone.block14.rep.8.weight", "module.backbone.block14.rep.8.bias", "module.backbone.block14.rep.8.running_mean", "module.backbone.block14.rep.8.running_var", "module.backbone.block15.rep.1.conv1.weight", "module.backbone.block15.rep.1.bn.weight", "module.backbone.block15.rep.1.bn.bias", "module.backbone.block15.rep.1.bn.running_mean", "module.backbone.block15.rep.1.bn.running_var", "module.backbone.block15.rep.1.pointwise.weight", 
"module.backbone.block15.rep.2.weight", "module.backbone.block15.rep.2.bias", "module.backbone.block15.rep.2.running_mean", "module.backbone.block15.rep.2.running_var", "module.backbone.block15.rep.4.conv1.weight", "module.backbone.block15.rep.4.bn.weight", "module.backbone.block15.rep.4.bn.bias", "module.backbone.block15.rep.4.bn.running_mean", "module.backbone.block15.rep.4.bn.running_var", "module.backbone.block15.rep.4.pointwise.weight", "module.backbone.block15.rep.5.weight", "module.backbone.block15.rep.5.bias", "module.backbone.block15.rep.5.running_mean", "module.backbone.block15.rep.5.running_var", "module.backbone.block15.rep.7.conv1.weight", "module.backbone.block15.rep.7.bn.weight", "module.backbone.block15.rep.7.bn.bias", "module.backbone.block15.rep.7.bn.running_mean", "module.backbone.block15.rep.7.bn.running_var", "module.backbone.block15.rep.7.pointwise.weight", "module.backbone.block15.rep.8.weight", "module.backbone.block15.rep.8.bias", "module.backbone.block15.rep.8.running_mean", "module.backbone.block15.rep.8.running_var", "module.backbone.block16.rep.1.conv1.weight", "module.backbone.block16.rep.1.bn.weight", "module.backbone.block16.rep.1.bn.bias", "module.backbone.block16.rep.1.bn.running_mean", "module.backbone.block16.rep.1.bn.running_var", "module.backbone.block16.rep.1.pointwise.weight", "module.backbone.block16.rep.2.weight", "module.backbone.block16.rep.2.bias", "module.backbone.block16.rep.2.running_mean", "module.backbone.block16.rep.2.running_var", "module.backbone.block16.rep.4.conv1.weight", "module.backbone.block16.rep.4.bn.weight", "module.backbone.block16.rep.4.bn.bias", "module.backbone.block16.rep.4.bn.running_mean", "module.backbone.block16.rep.4.bn.running_var", "module.backbone.block16.rep.4.pointwise.weight", "module.backbone.block16.rep.5.weight", "module.backbone.block16.rep.5.bias", "module.backbone.block16.rep.5.running_mean", "module.backbone.block16.rep.5.running_var", "module.backbone.block16.rep.7.conv1.weight", 
"module.backbone.block16.rep.7.bn.weight", "module.backbone.block16.rep.7.bn.bias", "module.backbone.block16.rep.7.bn.running_mean", "module.backbone.block16.rep.7.bn.running_var", "module.backbone.block16.rep.7.pointwise.weight", "module.backbone.block16.rep.8.weight", "module.backbone.block16.rep.8.bias", "module.backbone.block16.rep.8.running_mean", "module.backbone.block16.rep.8.running_var", "module.backbone.block17.rep.1.conv1.weight", "module.backbone.block17.rep.1.bn.weight", "module.backbone.block17.rep.1.bn.bias", "module.backbone.block17.rep.1.bn.running_mean", "module.backbone.block17.rep.1.bn.running_var", "module.backbone.block17.rep.1.pointwise.weight", "module.backbone.block17.rep.2.weight", "module.backbone.block17.rep.2.bias", "module.backbone.block17.rep.2.running_mean", "module.backbone.block17.rep.2.running_var", "module.backbone.block17.rep.4.conv1.weight", "module.backbone.block17.rep.4.bn.weight", "module.backbone.block17.rep.4.bn.bias", "module.backbone.block17.rep.4.bn.running_mean", "module.backbone.block17.rep.4.bn.running_var", "module.backbone.block17.rep.4.pointwise.weight", "module.backbone.block17.rep.5.weight", "module.backbone.block17.rep.5.bias", "module.backbone.block17.rep.5.running_mean", "module.backbone.block17.rep.5.running_var", "module.backbone.block17.rep.7.conv1.weight", "module.backbone.block17.rep.7.bn.weight", "module.backbone.block17.rep.7.bn.bias", "module.backbone.block17.rep.7.bn.running_mean", "module.backbone.block17.rep.7.bn.running_var", "module.backbone.block17.rep.7.pointwise.weight", "module.backbone.block17.rep.8.weight", "module.backbone.block17.rep.8.bias", "module.backbone.block17.rep.8.running_mean", "module.backbone.block17.rep.8.running_var", "module.backbone.block18.rep.1.conv1.weight", "module.backbone.block18.rep.1.bn.weight", "module.backbone.block18.rep.1.bn.bias", "module.backbone.block18.rep.1.bn.running_mean", "module.backbone.block18.rep.1.bn.running_var", 
"module.backbone.block18.rep.1.pointwise.weight", "module.backbone.block18.rep.2.weight", "module.backbone.block18.rep.2.bias", "module.backbone.block18.rep.2.running_mean", "module.backbone.block18.rep.2.running_var", "module.backbone.block18.rep.4.conv1.weight", "module.backbone.block18.rep.4.bn.weight", "module.backbone.block18.rep.4.bn.bias", "module.backbone.block18.rep.4.bn.running_mean", "module.backbone.block18.rep.4.bn.running_var", "module.backbone.block18.rep.4.pointwise.weight", "module.backbone.block18.rep.5.weight", "module.backbone.block18.rep.5.bias", "module.backbone.block18.rep.5.running_mean", "module.backbone.block18.rep.5.running_var", "module.backbone.block18.rep.7.conv1.weight", "module.backbone.block18.rep.7.bn.weight", "module.backbone.block18.rep.7.bn.bias", "module.backbone.block18.rep.7.bn.running_mean", "module.backbone.block18.rep.7.bn.running_var", "module.backbone.block18.rep.7.pointwise.weight", "module.backbone.block18.rep.8.weight", "module.backbone.block18.rep.8.bias", "module.backbone.block18.rep.8.running_mean", "module.backbone.block18.rep.8.running_var", "module.backbone.block19.rep.1.conv1.weight", "module.backbone.block19.rep.1.bn.weight", "module.backbone.block19.rep.1.bn.bias", "module.backbone.block19.rep.1.bn.running_mean", "module.backbone.block19.rep.1.bn.running_var", "module.backbone.block19.rep.1.pointwise.weight", "module.backbone.block19.rep.2.weight", "module.backbone.block19.rep.2.bias", "module.backbone.block19.rep.2.running_mean", "module.backbone.block19.rep.2.running_var", "module.backbone.block19.rep.4.conv1.weight", "module.backbone.block19.rep.4.bn.weight", "module.backbone.block19.rep.4.bn.bias", "module.backbone.block19.rep.4.bn.running_mean", "module.backbone.block19.rep.4.bn.running_var", "module.backbone.block19.rep.4.pointwise.weight", "module.backbone.block19.rep.5.weight", "module.backbone.block19.rep.5.bias", "module.backbone.block19.rep.5.running_mean", 
"module.backbone.block19.rep.5.running_var", "module.backbone.block19.rep.7.conv1.weight", "module.backbone.block19.rep.7.bn.weight", "module.backbone.block19.rep.7.bn.bias", "module.backbone.block19.rep.7.bn.running_mean", "module.backbone.block19.rep.7.bn.running_var", "module.backbone.block19.rep.7.pointwise.weight", "module.backbone.block19.rep.8.weight", "module.backbone.block19.rep.8.bias", "module.backbone.block19.rep.8.running_mean", "module.backbone.block19.rep.8.running_var", "module.backbone.block20.skip.weight", "module.backbone.block20.skipbn.weight", "module.backbone.block20.skipbn.bias", "module.backbone.block20.skipbn.running_mean", "module.backbone.block20.skipbn.running_var", "module.backbone.block20.rep.1.conv1.weight", "module.backbone.block20.rep.1.bn.weight", "module.backbone.block20.rep.1.bn.bias", "module.backbone.block20.rep.1.bn.running_mean", "module.backbone.block20.rep.1.bn.running_var", "module.backbone.block20.rep.1.pointwise.weight", "module.backbone.block20.rep.2.weight", "module.backbone.block20.rep.2.bias", "module.backbone.block20.rep.2.running_mean", "module.backbone.block20.rep.2.running_var", "module.backbone.block20.rep.4.conv1.weight", "module.backbone.block20.rep.4.bn.weight", "module.backbone.block20.rep.4.bn.bias", "module.backbone.block20.rep.4.bn.running_mean", "module.backbone.block20.rep.4.bn.running_var", "module.backbone.block20.rep.4.pointwise.weight", "module.backbone.block20.rep.5.weight", "module.backbone.block20.rep.5.bias", "module.backbone.block20.rep.5.running_mean", "module.backbone.block20.rep.5.running_var", "module.backbone.block20.rep.7.conv1.weight", "module.backbone.block20.rep.7.bn.weight", "module.backbone.block20.rep.7.bn.bias", "module.backbone.block20.rep.7.bn.running_mean", "module.backbone.block20.rep.7.bn.running_var", "module.backbone.block20.rep.7.pointwise.weight", "module.backbone.block20.rep.8.weight", "module.backbone.block20.rep.8.bias", "module.backbone.block20.rep.8.running_mean", 
"module.backbone.block20.rep.8.running_var", "module.backbone.conv3.conv1.weight", "module.backbone.conv3.bn.weight", "module.backbone.conv3.bn.bias", "module.backbone.conv3.bn.running_mean", "module.backbone.conv3.bn.running_var", "module.backbone.conv3.pointwise.weight", "module.backbone.bn3.weight", "module.backbone.bn3.bias", "module.backbone.bn3.running_mean", "module.backbone.bn3.running_var", "module.backbone.conv4.conv1.weight", "module.backbone.conv4.bn.weight", "module.backbone.conv4.bn.bias", "module.backbone.conv4.bn.running_mean", "module.backbone.conv4.bn.running_var", "module.backbone.conv4.pointwise.weight", "module.backbone.bn4.weight", "module.backbone.bn4.bias", "module.backbone.bn4.running_mean", "module.backbone.bn4.running_var", "module.backbone.conv5.conv1.weight", "module.backbone.conv5.bn.weight", "module.backbone.conv5.bn.bias", "module.backbone.conv5.bn.running_mean", "module.backbone.conv5.bn.running_var", "module.backbone.conv5.pointwise.weight", "module.backbone.bn5.weight", "module.backbone.bn5.bias", "module.backbone.bn5.running_mean", "module.backbone.bn5.running_var", "module.aspp.aspp1.atrous_conv.weight", "module.aspp.aspp1.bn.weight", "module.aspp.aspp1.bn.bias", "module.aspp.aspp1.bn.running_mean", "module.aspp.aspp1.bn.running_var", "module.aspp.aspp2.atrous_conv.weight", "module.aspp.aspp2.bn.weight", "module.aspp.aspp2.bn.bias", "module.aspp.aspp2.bn.running_mean", "module.aspp.aspp2.bn.running_var", "module.aspp.aspp3.atrous_conv.weight", "module.aspp.aspp3.bn.weight", "module.aspp.aspp3.bn.bias", "module.aspp.aspp3.bn.running_mean", "module.aspp.aspp3.bn.running_var", "module.aspp.aspp4.atrous_conv.weight", "module.aspp.aspp4.bn.weight", "module.aspp.aspp4.bn.bias", "module.aspp.aspp4.bn.running_mean", "module.aspp.aspp4.bn.running_var", "module.aspp.global_avg_pool.1.weight", "module.aspp.global_avg_pool.2.weight", "module.aspp.global_avg_pool.2.bias", "module.aspp.global_avg_pool.2.running_mean", 
"module.aspp.global_avg_pool.2.running_var", "module.aspp.conv1.weight", "module.aspp.bn1.weight", "module.aspp.bn1.bias", "module.aspp.bn1.running_mean", "module.aspp.bn1.running_var", "module.decoder.conv1.weight", "module.decoder.bn1.weight", "module.decoder.bn1.bias", "module.decoder.bn1.running_mean", "module.decoder.bn1.running_var", "module.decoder.last_conv.0.weight", "module.decoder.last_conv.1.weight", "module.decoder.last_conv.1.bias", "module.decoder.last_conv.1.running_mean", "module.decoder.last_conv.1.running_var", "module.decoder.last_conv.4.weight", "module.decoder.last_conv.5.weight", "module.decoder.last_conv.5.bias", "module.decoder.last_conv.5.running_mean", "module.decoder.last_conv.5.running_var", "module.decoder.last_conv.8.weight", "module.decoder.last_conv.8.bias".
Unexpected key(s) in state_dict: "epoch", "state_dict", "optimizer", "best_pred".
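
The unexpected keys "epoch", "state_dict", "optimizer" and "best_pred" suggest the file is a full training checkpoint rather than a bare set of weights, and the "module." prefix on the missing keys suggests the model was wrapped for multi-GPU use. A minimal sketch of unpacking such a file, assuming exactly that layout; `extract_state_dict` is a hypothetical helper, not part of the repository:

```python
def extract_state_dict(checkpoint, want_module_prefix):
    """Return weights from a checkpoint dict, with the 'module.' prefix adjusted.

    Assumption: the checkpoint is either a raw state_dict or a dict that holds
    the weights under the "state_dict" key, as the error message suggests.
    """
    weights = checkpoint.get("state_dict", checkpoint)
    fixed = {}
    for key, value in weights.items():
        bare = key[len("module."):] if key.startswith("module.") else key
        fixed["module." + bare if want_module_prefix else bare] = value
    return fixed


# Toy stand-in for the dict that torch.load(path, map_location="cpu") would return.
checkpoint = {
    "epoch": 50,
    "state_dict": {"backbone.conv1.weight": 1.0, "aspp.conv1.weight": 2.0},
    "optimizer": {},
    "best_pred": 0.78,
}
state = extract_state_dict(checkpoint, want_module_prefix=True)
print(sorted(state))
```

In real use the result would be passed to `model.load_state_dict(state)`, with `want_module_prefix` set to match whether the model is wrapped in `torch.nn.DataParallel`.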

A problem about encoding mask

VOC label masks look like RGB images, but when I read one with Image.open() it comes back as label indices. Why?
When I read masks from my own dataset with Image.open(), they stay RGB instead of becoming label indices. So to train on my own data, I need to write code that encodes the RGB mask as label indices, right?
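
For context: the VOC segmentation PNGs are stored in palette mode, so Image.open() yields the palette indices, which are the class labels. A true-colour RGB mask carries no palette and has to be mapped explicitly. A minimal sketch of that mapping, assuming a hypothetical colour table (`COLOR_TO_INDEX` is made up for illustration and must be replaced with the dataset's own colours):

```python
import numpy as np

# Hypothetical colour map for a custom dataset: RGB tuple -> class index.
COLOR_TO_INDEX = {
    (0, 0, 0): 0,        # background
    (128, 0, 0): 1,      # first object class
    (0, 128, 0): 2,      # second object class
}
IGNORE_INDEX = 255       # colours not in the table are marked as "ignore"


def encode_rgb_mask(rgb_mask):
    """Convert an (H, W, 3) uint8 RGB mask into an (H, W) uint8 index mask."""
    rgb_mask = np.asarray(rgb_mask, dtype=np.uint8)
    index_mask = np.full(rgb_mask.shape[:2], IGNORE_INDEX, dtype=np.uint8)
    for color, index in COLOR_TO_INDEX.items():
        matches = np.all(rgb_mask == np.array(color, dtype=np.uint8), axis=-1)
        index_mask[matches] = index
    return index_mask


demo = np.array([[[0, 0, 0], [128, 0, 0]],
                 [[0, 128, 0], [7, 7, 7]]], dtype=np.uint8)
encoded = encode_rgb_mask(demo)
print(encoded)
```

With Pillow, the input array could come from `np.array(Image.open(path).convert("RGB"))`.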

A problem about transform code

Hi,

Thanks for the great library. In custom_transforms.py, class ToTensor sets mask[mask == 255] = 0 (line 127).
Why set pixels equal to 255 to 0? 255 is the ignore index and should not be mapped to background. Am I right?

Program does not run

I have the parameters set as follows
image

I get the result of the following figure after executing the program.
image

GPU memory fills up, but training never starts.
I don't know what to do. Can you give me some tips?

About pascal voc 2012 SBD training

I am training this model on the Pascal VOC 2012 SBD dataset and found that the mIoU improves slowly: 12 hours of training produced 0.41 mIoU at epoch 143. For training I set the learning rate to 1e-3, momentum to 0.90, batch size to 8 and weight decay to 5e-4. I also use data augmentation: random rotation by 10 degrees and random horizontal flip.
I modified the model to follow the TensorFlow version. The details are in my forked repository.
Is this training speed normal?

high resolution input image

Greetings to all experts here,
I have been using DeepLab for some time now, but I am always limited by the low-resolution image input. My images are usually rather high resolution (2k-3k). Does anyone know if there is a higher-resolution version of DeepLab? If not, is it even feasible to create and train one that supports much higher resolution, i.e. is the training time going to be exponentially longer?
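
One common workaround, not something this repository provides, is to run inference on overlapping tiles and stitch the predictions. For a fully convolutional network, compute and memory grow roughly with the pixel count (quadratically in the side length), not exponentially. A minimal sketch of the tile layout only, where `tile_boxes` and its parameters are hypothetical names and `overlap` is assumed to be smaller than `tile`:

```python
def tile_boxes(height, width, tile, overlap):
    """Yield (top, left, bottom, right) boxes that cover a height x width image."""
    step = tile - overlap
    tops = list(range(0, max(height - tile, 0) + 1, step))
    lefts = list(range(0, max(width - tile, 0) + 1, step))
    # Make sure the last row/column of tiles reaches the image border.
    if tops[-1] + tile < height:
        tops.append(height - tile)
    if lefts[-1] + tile < width:
        lefts.append(width - tile)
    for top in tops:
        for left in lefts:
            yield top, left, min(top + tile, height), min(left + tile, width)


boxes = list(tile_boxes(height=2048, width=3072, tile=1024, overlap=128))
print(len(boxes), boxes[0], boxes[-1])
```

Each box would be cropped, passed through the model, and the overlapping predictions merged, for example by averaging the class scores.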

Wrong padding of "label"

During training, the RandomScaleCrop augmentation may downscale the image and the target label image. It then pads both the image and the label with self.fill, which is zero.
This conflicts with the "ignore value" of the loss, which is set to 255.
As a result, the loss treats the padded region as valid "class 0" pixels and computes a loss for it.

self.fill of the augmentation functions should be equal to self.ignore_index of the loss function.
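
The proposed fix can be sketched as follows. This is a NumPy stand-in, not the repository's augmentation code, and `pad_pair` and `IGNORE_INDEX` are hypothetical names:

```python
import numpy as np

IGNORE_INDEX = 255  # assumed to match the ignore_index passed to the loss


def pad_pair(image, label, crop_size):
    """Pad image with 0 and label with IGNORE_INDEX up to crop_size x crop_size."""
    h, w = label.shape
    pad_h = max(crop_size - h, 0)
    pad_w = max(crop_size - w, 0)
    image = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), constant_values=0)
    label = np.pad(label, ((0, pad_h), (0, pad_w)), constant_values=IGNORE_INDEX)
    return image, label


image = np.ones((2, 3, 3), dtype=np.uint8)   # tiny 2x3 RGB image
label = np.zeros((2, 3), dtype=np.uint8)     # all pixels are class 0
padded_image, padded_label = pad_pair(image, label, crop_size=4)
print(padded_label)
```

The padded label pixels carry 255, so a loss configured with ignore_index=255 skips them instead of counting them as background.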

Single GPU

Hello, I want to train the network on a single GPU. But if I set gpu-ids=1 or 2, the following error appears:
image
If I set gpu-ids=0, the code runs fine.
Why? Can I use a single GPU with an id other than 0?
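
The screenshot is not available here, so the actual cause is unknown. A general workaround for running on one specific card is to hide the other GPUs from the process with the CUDA_VISIBLE_DEVICES environment variable, so that the chosen card becomes device 0. A minimal sketch, where `restrict_to_gpu` is a hypothetical helper:

```python
import os


def restrict_to_gpu(physical_id):
    """Expose only one physical GPU to this process.

    Must run before CUDA is initialised (in practice, before importing torch).
    The chosen card then appears inside the process as device 0.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_id)
    return os.environ["CUDA_VISIBLE_DEVICES"]


visible = restrict_to_gpu(1)
print(visible)
```

Training would then be started with gpu-ids=0, which now refers to physical GPU 1.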

test

Hi,
Could you provide a test script for running inference on a single image?

mIoU error using xception as backbone

When I use ResNet as the backbone, the mIoU is in the 0.XX range, for example 0.38 after the first epoch and 0.74 after 50 epochs.

However, when I use Xception as the backbone, the mIoU is in the 0.0X range, for example 0.03 after the first epoch.

Is there something wrong with the code?

Result Miou

Could anyone share their training and testing results?
My results are too poor to be useful.

Is there any problem with exit flow of Xception?

In the example code the output size is 256x256 for a 512x512 input.

I found some differences from the official TensorFlow version. That version uses dilated convolution (rate 2, stride 1) in the exit flow and a stride-1 convolution in the skip connection.

As a result, the output stride of their Xception is 16 instead of 32, and the whole model produces an output the same size as the input instead of half the input size.

Am I misunderstanding something?

Validation is only estimated on part of the images

Validation images go through FixScaleCrop, so only the central rectangular region of each validation image participates in evaluating the model. The borders of the image are ignored.

When the entire validation image is taken into account, the performance of the trained model (ResNet backbone) drops from mIoU=78.43% to 77.89%.

Questions about the experiment on Cityscapes dataset

Hello, thanks for sharing your PyTorch implementation of DeepLabv3+. However, when I use your code to run an experiment on the Cityscapes dataset, I cannot obtain results comparable to those published in the paper. My result on the val subset is about 36% mIoU, which is much lower than the published result (greater than 75%). All training configurations follow the papers (DeepLabv3+ and DeepLabv3). Have you run similar experiments on Cityscapes? Or can you offer any suggestions? Thanks a lot!

about loss and mIou

Sorry to bother you. I just downloaded VOC2012 and ran train.py, but the loss is very high (about 20000+) and the mIoU is about 39%. Is that because I only train for 200 epochs (too few)? By the way, there are 1400+ training images and 1560 validation images.

SGD Batch_size = 1 problem

First, thank you for sharing this code.
I found a problem when I use batch_size = 1:

self.global_avg_pool = nn.Sequential(nn.AdaptiveAvgPool2d((1, 1)),
                                            nn.Conv2d(2048, 256, 1, stride=1, bias=False),
                                            nn.BatchNorm2d(256),
                                            nn.ReLU())

AdaptiveAvgPool2d gives (1, 2048, 1, 1)
nn.Conv2d => (1, 256, 1, 1)
nn.BatchNorm2d(256) => the problem is here
With batch size 1, every dimension except the channel dimension is 1, so BatchNorm2d gets a single value per channel and raises an error in training mode.

nn.BatchNorm2d(256),  # comment out if batch size == 1
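
The arithmetic behind this can be shown without PyTorch. The sketch below is plain Python, not the library's implementation, and the helper names are hypothetical. Batch normalization computes a mean and variance per channel over N*H*W values, and with a single value the normalized result is always zero:

```python
import math


def batch_norm_1d(values, eps=1e-5):
    """Normalise one channel's values with their own batch mean and variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def values_per_channel(shape):
    """Number of values BatchNorm2d averages over for an (N, C, H, W) input."""
    n, _, h, w = shape
    return n * h * w


single = values_per_channel((1, 256, 1, 1))   # batch size 1 after global pooling
several = values_per_channel((8, 256, 1, 1))  # batch size 8
collapsed = batch_norm_1d([3.7])              # a single value always normalises to 0
print(single, several, collapsed)
```

Besides commenting out the layer, other options that avoid a lone sample are dropping an incomplete last batch (`drop_last=True` on the PyTorch DataLoader) or keeping the batch size at 2 or more.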

Good job!

This PyTorch implementation is great and exciting. Thanks for your work. How soon will you report its performance (such as mIoU)?

Low mIoU on Cityscapes

Hi, thanks for sharing your code. I trained the model on the Cityscapes dataset (btw, you missed a self here) without code edits and I can get only 68% mIoU. Do you have any pretrained models, or can you describe your training strategy? I trained on a single Tesla V100 GPU, with lr = 0.007 and batch size 8.
Thanks in advance

didn't converge

Hi, I tried your code without loading a pretrained model, but it did not converge at all. What should I do to make it work?

[Epoch: 0, numImages: 1464]
Loss: 754343.768581
Execution time: 61.669747280015144

[Epoch: 1, numImages: 1464]
Loss: 754388.596284
Execution time: 44.01166869502049

[Epoch: 2, numImages: 1464]
Loss: 753789.062500
Execution time: 44.40118463395629

[Epoch: 3, numImages: 1464]
Loss: 754601.996622
Execution time: 43.993270942009985

[Epoch: 4, numImages: 1464]
Loss: 754301.511824
Execution time: 44.05079116899287

[Epoch: 5, numImages: 1464]
Loss: 754124.653716
Execution time: 44.40390331699746

[Epoch: 6, numImages: 1464]
Loss: 753820.405405
Execution time: 44.04822350398172

[Epoch: 7, numImages: 1464]
Loss: 754613.310811
Execution time: 44.43823240103666

[Epoch: 8, numImages: 1464]
Loss: 754287.202703
Execution time: 44.060883720987476

[Epoch: 9, numImages: 1464]
Loss: 754111.920608
Execution time: 44.11198482802138

[Epoch: 9, numImages: 1449]
Loss: 756997.688530
[Epoch: 10, numImages: 1464]
Loss: 754386.648649
Execution time: 44.27616982697509

[Epoch: 11, numImages: 1464]
Loss: 754753.653716
Execution time: 44.208685109973885

[Epoch: 12, numImages: 1464]
Loss: 753667.771959
Execution time: 44.36111898702802

[Epoch: 13, numImages: 1464]
Loss: 754298.302365
Execution time: 44.42224766005529

[Epoch: 14, numImages: 1464]
Loss: 754134.364865
Execution time: 44.61775282601593

[Epoch: 15, numImages: 1464]
Loss: 754144.538851
Execution time: 44.4529153269832

[Epoch: 16, numImages: 1464]
Loss: 754300.746622
Execution time: 44.196999565989245

[Epoch: 17, numImages: 1464]
Loss: 753685.072635
Execution time: 44.14125015295576
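
One possible reading of a loss that stays near 754,000: it is the same order of magnitude as the cross-entropy of an untrained 21-class model when the per-pixel loss is summed over an image instead of averaged. This is an assumption about how the loss was reduced and about the crop size, not something the log confirms. The arithmetic:

```python
import math

NUM_CLASSES = 21          # Pascal VOC: 20 object classes + background
CROP_SIZE = 513           # assumed crop size; not stated in the post

# Cross-entropy of a uniform prediction over NUM_CLASSES classes.
per_pixel_loss = math.log(NUM_CLASSES)

# Summing instead of averaging multiplies by the number of pixels.
summed_per_image = per_pixel_loss * CROP_SIZE * CROP_SIZE

print(round(per_pixel_loss, 3))
print(round(summed_per_image))
```

If that reading is right, the raw number mainly reflects the reduction mode, and the real concern is that the value does not decrease from epoch to epoch.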

mIOU is low

Thanks for sharing the code.
When I used this project to train Pascal VOC for nearly 200 epochs, I got only 0.32 mIOU.

The result on tensorboardX is:
wrong_result

My configuration is:
learning rate=0.0001
backbone=xception
use_sbd=False

And I have set size_average=True in nn.CrossEntropyLoss function.

pretrained models?

Hi,

Thanks for the great library. Will you be providing pretrained models for Cityscapes?

Models are not in tar format

The models linked via README.md have a .tar file extension but are not in .tar format.

For example deeplab-resnet.pth.tar starts with 0x80 0x02 and is a PyTorch .pth pickle file, not a tar file.
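
The observation can be checked with the standard library alone. A minimal sketch, where `describe_checkpoint` is a hypothetical helper and the temporary file stands in for a downloaded model:

```python
import os
import pickle
import tarfile
import tempfile


def describe_checkpoint(path):
    """Classify a file as a tar archive, a protocol-2 pickle, a zip, or unknown."""
    if tarfile.is_tarfile(path):
        return "tar archive"
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x80\x02":
        return "pickle (protocol 2)"
    if magic == b"PK":
        return "zip archive"
    return "unknown"


# Stand-in for a downloaded model: any object pickled with protocol 2.
with tempfile.NamedTemporaryFile(suffix=".pth.tar", delete=False) as tmp:
    tmp.write(pickle.dumps({"best_pred": 0.78}, protocol=2))
kind = describe_checkpoint(tmp.name)
os.remove(tmp.name)
print(kind)
```

Such a file is meant to be opened with torch.load() rather than unpacked with tar. The zip branch is included because newer PyTorch releases save checkpoints in a zip-based format.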

How to output result image from output?

After I finished training the model, I loaded it and fed it a single image like this:

import cv2
import numpy as np
import torch

# image_path, device and Model are defined elsewhere in my script.

# Image preprocessing
image = cv2.imread(image_path, cv2.IMREAD_COLOR).astype(float)
image = cv2.resize(image, dsize=(512, 512))
image_original = image.astype(np.uint8)
image = torch.from_numpy(image.transpose(2, 0, 1)).float().unsqueeze(0)
image = image.to(device)

# Inference
output = Model(image)

The output is a tensor of shape (5, 512, 512), like:

[[[  4916.116    5713.0825   6510.0493 ...   8198.43     7449.5635
     6700.7207]
  [  5698.8423   6637.15     7575.4585 ...   9693.137    8787.99
     7882.872 ]
  [  6481.569    7561.218    8640.867  ...  11187.843   10126.417
     9065.023 ]
  ...

If I want to convert the tensor into a label image, what should I do?

If I process the output with softmax, argmax or torch.max, the result is an empty image like:

[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]

It's a weird question!
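
The usual conversion is an argmax over the class dimension followed by a colour lookup. A minimal NumPy sketch for the (5, H, W) shape described above, with a hypothetical `PALETTE` and helper name:

```python
import numpy as np

# Hypothetical palette for the 5 classes in the post: class index -> RGB colour.
PALETTE = np.array([[0, 0, 0], [128, 0, 0], [0, 128, 0],
                    [128, 128, 0], [0, 0, 128]], dtype=np.uint8)


def scores_to_label_image(scores):
    """Turn (C, H, W) class scores into an (H, W) index map and an RGB image."""
    labels = np.argmax(scores, axis=0)      # class dimension is axis 0 here
    return labels, PALETTE[labels]


# Toy scores: 5 classes on a 2x2 image.
scores = np.zeros((5, 2, 2), dtype=np.float32)
scores[3, 0, 0] = 9.0
scores[1, 1, 1] = 4.0
labels, colour = scores_to_label_image(scores)
print(labels)
```

For a PyTorch tensor of shape (N, C, H, W), the equivalent step is `output.argmax(dim=1)` followed by `.cpu().numpy()`. If the argmax is still all zeros, one thing worth checking, as a guess only, is whether the input is normalized the same way as during training, since scores in the thousands are unusually large.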

DRN model cannot be downloaded

You don't have permission to access /~fy/drn/models/ on this server.
Could you provide download links for DRN weights?

CuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Hello, I want to train on my own dataset. However, when I run the code, the following error occurs:
Namespace(backbone='resnet', base_size=513, batch_size=8, checkname='deeplab-resnet', crop_size=513, cuda=True, dataset='pascal', epochs=50, eval_interval=1, freeze_bn=False, ft=False, gpu_ids=[0], loss_type='ce', lr=0.007, lr_scheduler='poly', momentum=0.9, nesterov=False, no_cuda=False, no_val=False, out_stride=16, resume=None, seed=1, start_epoch=0, sync_bn=False, test_batch_size=8, use_balanced_weights=False, use_sbd=False, weight_decay=0.0005, workers=4)
Number of images in train: 3184
Number of images in val: 797
Using poly LR Scheduler!
Starting Epoch: 0
Total Epoches: 50
0%| | 0/398 [00:00<?, ?it/s]
=>Epoches 0, learning rate = 0.0070, previous best = 0.0000
/home/image/anaconda3/envs/ajy/lib/python3.6/site-packages/torch/nn/functional.py:52: UserWarning: size_average and reduce args will be deprecated, please use reduction='elementwise_mean' instead.
  warnings.warn(warning.format(ret))
Train loss: 0.288: 1%|▏ | 3/398 [00:03<07:59, 1.21s/it]
Traceback (most recent call last):
  File "train.py", line 305, in <module>
    main()
  File "train.py", line 298, in main
    trainer.training(epoch)
  File "train.py", line 109, in training
    loss.backward()
  File "/home/image/anaconda3/envs/ajy/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/image/anaconda3/envs/ajy/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True) # allow_unreachable flag
RuntimeError: CuDNN error: CUDNN_STATUS_EXECUTION_FAILED
/opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:99: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int, long) [with T = float, AccumT = float]: block: [13,0,0], thread: [457,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:99: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int, long) [with T = float, AccumT = float]: block: [13,0,0], thread: [458,0,0] Assertion `t >= 0 && t < n_classes` failed.

loss value is too high, any explanation? thank you.

Thank you for sharing!
Could you please provide your results, for example a screenshot?
I found no errors in your source code, but the loss stays high the whole time (even after 300 epochs).
What is wrong?

Thank you for answering. @jfzhang95
