forkbabu / semask-segmentation

This project forked from picsart-ai-research/semask-segmentation

[Preprint] SeMask: Semantically Masked Transformers for Semantic Segmentation.

Home Page: https://arxiv.org/abs/2112.12782

License: Other

Python 87.60% Shell 0.14% C++ 3.52% C 0.48% Cuda 8.26%

SeMask: Semantically Masked Transformers

Framework: PyTorch

Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi

[arXiv] [pdf] [BibTeX]

This repo contains the code for our paper SeMask: Semantically Masked Transformers for Semantic Segmentation.

Contents

  1. Results
  2. Setup Instructions
  3. Citing SeMask

1. Results

Note: † denotes the backbones were pretrained on ImageNet-22k and 384x384 resolution images.

ADE20K

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | config | Checkpoint |
|---|---|---|---|---|---|---|---|
| SeMask-T FPN | SeMask Swin-T | 512x512 | 42.11 | 43.16 | 35M | config | TBD |
| SeMask-S FPN | SeMask Swin-S | 512x512 | 45.92 | 47.63 | 56M | config | checkpoint |
| SeMask-B FPN | SeMask Swin-B | 512x512 | 49.35 | 50.98 | 96M | config | checkpoint |
| SeMask-L FPN | SeMask Swin-L | 640x640 | 51.89 | 53.52 | 211M | config | checkpoint |
| SeMask-L MaskFormer | SeMask Swin-L | 640x640 | 54.75 | 56.15 | 219M | config | checkpoint |
| SeMask-L Mask2Former | SeMask Swin-L | 640x640 | 56.41 | 57.52 | 222M | config | checkpoint |
| SeMask-L Mask2Former FAPN | SeMask Swin-L | 640x640 | 56.68 | 58.00 | 227M | config | TBD |
| SeMask-L Mask2Former MSFAPN | SeMask Swin-L | 640x640 | 56.54 | 58.22 | 224M | config | checkpoint |

Cityscapes

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | config | Checkpoint |
|---|---|---|---|---|---|---|---|
| SeMask-T FPN | SeMask Swin-T | 768x768 | 74.92 | 76.56 | 34M | config | checkpoint |
| SeMask-S FPN | SeMask Swin-S | 768x768 | 77.13 | 79.14 | 56M | config | checkpoint |
| SeMask-B FPN | SeMask Swin-B | 768x768 | 77.70 | 79.73 | 96M | config | checkpoint |
| SeMask-L FPN | SeMask Swin-L | 768x768 | 78.53 | 80.39 | 211M | config | checkpoint |
| SeMask-L Mask2Former | SeMask Swin-L | 512x1024 | 83.97 | 84.98 | 222M | config | checkpoint |

COCO-Stuff 10k

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | config | Checkpoint |
|---|---|---|---|---|---|---|---|
| SeMask-T FPN | SeMask Swin-T | 512x512 | 37.53 | 38.88 | 35M | config | checkpoint |
| SeMask-S FPN | SeMask Swin-S | 512x512 | 40.72 | 42.27 | 56M | config | checkpoint |
| SeMask-B FPN | SeMask Swin-B | 512x512 | 44.63 | 46.30 | 96M | config | checkpoint |
| SeMask-L FPN | SeMask Swin-L | 640x640 | 47.47 | 48.54 | 211M | config | checkpoint |
2. Setup Instructions

We provide the codebase with SeMask incorporated into various models. Please check the setup instructions inside the corresponding folders:

3. Citing SeMask

@article{jain2021semask,
  title={SeMask: Semantically Masking Transformer Backbones for Effective Semantic Segmentation},
  author={Jitesh Jain and Anukriti Singh and Nikita Orlov and Zilong Huang and Jiachen Li and Steven Walton and Humphrey Shi},
  journal={arXiv},
  year={2021}
}

Acknowledgements

The code is based heavily on the following repositories: Swin-Transformer-Semantic-Segmentation, Mask2Former, MaskFormer, and FaPN-full.
