Generative Semantic Segmentation

Paper

Generative Semantic Segmentation,
Jiaqi Chen, Jiachen Lu, Xiatian Zhu, and Li Zhang
CVPR 2023

Abstract

We present Generative Semantic Segmentation (GSS), a generative framework for semantic segmentation. Unlike previous methods, which treat segmentation as a per-pixel classification problem, we cast semantic segmentation as an image-conditioned mask generation problem. This is achieved by replacing the conventional per-pixel discriminative learning with a latent prior learning process. Specifically, we model the variational posterior distribution of latent variables given the segmentation mask. This is done by expressing the segmentation mask as a special type of image (dubbed maskige). This posterior distribution allows segmentation masks to be generated unconditionally. To implement semantic segmentation, we further introduce a conditioning network (e.g., an encoder-decoder Transformer) optimized by minimizing the divergence between the posterior distribution of maskige (i.e., segmentation masks) and the latent prior distribution of input images on the training set. Extensive experiments on standard benchmarks show that GSS performs competitively with prior-art alternatives in the standard semantic segmentation setting, whilst achieving a new state of the art in the more challenging cross-domain setting.
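
The objective described in the abstract can be sketched as a conditional evidence lower bound. This is a reading of the abstract rather than a formula quoted from the paper, and all symbols are assumptions introduced here: x is the input image, c the maskige, z the latent variables, q_phi the variational posterior, p_theta the mask decoder, and p_psi the image-conditioned latent prior produced by the conditioning network. The bound also assumes that c depends on x only through z.

```latex
% Sketch only: symbols x, c, z, \phi, \theta, \psi are assumed, not taken from the source.
\log p(c \mid x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid c)}\bigl[\log p_\theta(c \mid z)\bigr]
  \;-\; D_{\mathrm{KL}}\bigl(q_\phi(z \mid c) \,\|\, p_\psi(z \mid x)\bigr)
```

Under this reading, the first term corresponds to modelling the posterior of the latent variables given the mask, and the second term is the divergence that the conditioning network minimizes on the training set.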

TODO List

  • Upload model weights and DALL-E VQVAE weight
  • Complete install.md
  • Add dataset link

Results

Cityscapes

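| Name     | Backbone | Iterations | mIoU  | mAcc  | Config | Checkpoint   |
|----------|----------|------------|-------|-------|--------|--------------|
| GSS-FF   | R101     | 80k        | 77.76 | 85.9  | config | google drive |
| GSS-FF   | Swin-L   | 80k        | 78.90 | 87.03 | config | google drive |
| GSS-FT-W | ResNet   | 80k        | 78.46 | 85.92 | config | google drive |
| GSS-FT-W | Swin-L   | 80k        | 80.05 | 87.32 | config | google drive |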

ADE20K

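| Name     | Backbone | Iterations | mIoU  | mAcc  | Config | Checkpoint   |
|----------|----------|------------|-------|-------|--------|--------------|
| GSS-FF   | Swin-L   | 160k       | 46.29 | 57.84 | config | google drive |
| GSS-FT-W | Swin-L   | 160k       | 48.54 | 58.94 | config | google drive |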

MSeg

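| Name     | Backbone  | Iterations | h.mean | Config | Checkpoint   |
|----------|-----------|------------|--------|--------|--------------|
| GSS-FF   | HRNet-W48 | 160k       | 52.60  | config | google drive |
| GSS-FF   | Swin-L    | 160k       | 59.49  | config | google drive |
| GSS-FT-W | HRNet-W48 | 160k       | 55.20  | config | google drive |
| GSS-FT-W | Swin-L    | 160k       | 61.94  | config | google drive |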

Get Started

Environment

This implementation is built upon mmsegmentation. Please follow the steps in install.md to prepare the environment.

Train & Test

# train with 8 GPUs
bash tools/dist_train.sh configs/gss/cityscapes/gss-ff_r101_768x768_80k_cityscapes.py 8
# test with 8 GPUs
bash tools/dist_test.sh configs/gss/cityscapes/gss-ff_r101_768x768_80k_cityscapes.py ./ckp_dir/iter_80000.pth 8 --eval mIoU
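
For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how mIoU and mAcc are conventionally computed from a confusion matrix. It is not the evaluation code used by this repository or by mmsegmentation. The function names, the flat label lists, and the `ignore_index=255` default are assumptions made for illustration.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count (target, prediction) pairs over two flat sequences of class ids."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels without a ground-truth label (assumed convention)
        matrix[t][p] += 1  # rows: ground truth, columns: prediction
    return matrix


def miou_and_macc(matrix):
    """Return (mIoU, mAcc) in percent, averaged over the classes that occur."""
    n = len(matrix)
    ious, accs = [], []
    for c in range(n):
        tp = matrix[c][c]                                # correctly labeled pixels
        gt = sum(matrix[c])                              # pixels whose true class is c
        predicted = sum(matrix[r][c] for r in range(n))  # pixels predicted as c
        union = gt + predicted - tp
        if union > 0:
            ious.append(tp / union)
        if gt > 0:
            accs.append(tp / gt)
    if not ious or not accs:
        raise ValueError("no labeled pixels to evaluate")
    return 100 * sum(ious) / len(ious), 100 * sum(accs) / len(accs)


# Toy example: six pixels, three classes, the last pixel is unlabeled.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
miou, macc = miou_and_macc(confusion_matrix(pred, target, num_classes=3))
print(round(miou, 2), round(macc, 2))  # prints: 72.22 88.89
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, and the per-class accuracies are 1, 2/3, and 1, which gives the averages shown.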

Reference

@inproceedings{chen2023generative,
  title={Generative Semantic Segmentation},
  author={Chen, Jiaqi and Lu, Jiachen and Zhu, Xiatian and Zhang, Li},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023}
}
